# Key concepts in Armature: workflows, runs, and more

> The vocabulary you need to use Armature effectively: MCP servers, workflows, runs, criteria, monitors, coverage, versions, policies, alerts, org roles, and API keys.

Armature has a small set of core concepts that appear throughout the dashboard, API, and documentation. Understanding how they relate to each other will help you set up tests confidently and interpret results correctly.

<AccordionGroup>
  <Accordion title="MCP Server">
    An **MCP server** is any HTTP endpoint that speaks the Model Context Protocol. In Armature, an MCP server is the target you are testing. You register it by providing its base URL, transport type (`streamable_http` or `sse`), and authentication credentials.

    When you first connect a server, Armature probes it live — calling `initialize` and `listTools` — to discover the tools it exposes. The discovered tools are stored in Armature's **tool catalog** for that server and updated whenever you run a live probe or sync the catalog.

    Each MCP server can have one or more **auth profiles**: named credential sets (bearer token or API key header) that workflows and monitors can reference. Credentials are stored encrypted and are never written to the dashboard database.
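
    As a rough illustration of the registration fields described above, the sketch below models a server and its auth profiles as plain Python data. The class names, the `kind` labels, and the `X-API-Key` default header are hypothetical; they are not Armature's API or storage schema.

    ```python
    from dataclasses import dataclass, field
    from typing import Dict, List

    # The two transport types named in this section.
    SUPPORTED_TRANSPORTS = {"streamable_http", "sse"}


    @dataclass(frozen=True)
    class AuthProfile:
        """A named credential set (hypothetical shape)."""
        name: str
        kind: str                       # "bearer" or "api_key_header" (assumed labels)
        secret: str
        header_name: str = "X-API-Key"  # only used for api_key_header profiles

        def headers(self) -> Dict[str, str]:
            """Return the HTTP headers this profile would contribute."""
            if self.kind == "bearer":
                return {"Authorization": f"Bearer {self.secret}"}
            if self.kind == "api_key_header":
                return {self.header_name: self.secret}
            raise ValueError(f"unknown auth profile kind: {self.kind}")


    @dataclass
    class McpServerRegistration:
        """Base URL, transport type, and credentials for one server."""
        base_url: str
        transport: str
        auth_profiles: List[AuthProfile] = field(default_factory=list)

        def __post_init__(self) -> None:
            # Reject anything other than the two documented transports.
            if self.transport not in SUPPORTED_TRANSPORTS:
                raise ValueError(f"unsupported transport: {self.transport}")
    ```

    Keeping auth profiles as named objects mirrors the idea that workflows and monitors reference a profile by name rather than carrying raw credentials.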

    See [Connecting an MCP server](/mcp-servers/connecting) for setup instructions.
  </Accordion>

  <Accordion title="Workflow">
    A **workflow** is a repeatable test that dispatches an AI agent against one of your MCP servers, either on a schedule or on demand. Every workflow has two required parts:

    * **Prompt** — The natural-language instruction the agent receives. Write it the way you would write a task for a human: be explicit about what the agent should do and which tools it should use.
    * **Criteria** — One or more plain-English statements about observable outcomes the agent must produce. Each criterion is evaluated independently by a separate evaluator model after the agent finishes.

    Workflows also have a **schedule** (a cron expression, or manual-only), a **model selection** (one or more tester models), and an optional **tool policy** (see below).
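
    To make the parts of a workflow concrete, here is a minimal sketch of a workflow held as plain data, with a check for the required pieces. `WorkflowDraft`, its field names, and the `problems` helper are hypothetical and do not reflect Armature's actual schema.

    ```python
    from dataclasses import dataclass, field
    from typing import List, Optional


    @dataclass
    class WorkflowDraft:
        """Hypothetical in-memory shape of a workflow; not Armature's schema."""
        prompt: str
        criteria: List[str]
        models: List[str]
        schedule: Optional[str] = None  # cron expression, or None for manual-only
        allowed_tools: List[str] = field(default_factory=list)
        blocked_tools: List[str] = field(default_factory=list)

        def problems(self) -> List[str]:
            """Return human-readable problems; an empty list means nothing is missing."""
            found = []
            if not self.prompt.strip():
                found.append("prompt is required")
            if not any(c.strip() for c in self.criteria):
                found.append("at least one criterion is required")
            if not self.models:
                found.append("at least one tester model is required")
            return found
    ```

    For example, a draft with a prompt, one criterion, one tester model, and the schedule `"0 * * * *"` reports no problems, while an empty draft reports all three.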

    When a workflow fires, Armature creates a **run** — a single execution of the workflow. The workflow itself is a container for runs; the run is where the actual evidence lives.

    See [Workflows overview](/workflows/overview) for authoring guidance.
  </Accordion>

  <Accordion title="Run">
    A **run** is a single execution of a workflow. Runs are created automatically on the workflow's schedule, or manually when you click **Run now**. Each run records:

    * The tester model used
    * The full prompt sent to the agent
    * Every tool call the agent made, including inputs and outputs
    * The evaluator's verdict on each criterion
    * The overall run status

    **Run statuses:**

    | Status    | Meaning                                                                                 |
    | --------- | --------------------------------------------------------------------------------------- |
    | `pass`    | All required criteria evaluated to pass.                                                |
    | `partial` | Some criteria passed, but at least one required criterion was only partially met.       |
    | `fail`    | One or more required criteria failed.                                                   |
    | `error`   | The agent itself encountered an unrecoverable error before the evaluator could run.     |

    Runs are immutable after they complete. You can inspect a run's tool call trace and criterion results at any time from the **Run history** page or from the workflow's **History** tab.
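
    One way to read the status table is as a rollup over criterion verdicts. The sketch below assumes that every criterion is required and that `fail` takes precedence over `partial`; this page does not spell out the precedence, and `rollup_status` is a hypothetical name, not an Armature function.

    ```python
    from typing import Iterable, Optional


    def rollup_status(verdicts: Iterable[str], agent_error: Optional[str] = None) -> str:
        """Derive a run status from per-criterion verdicts (illustrative only)."""
        if agent_error is not None:
            # The agent failed before the evaluator could run.
            return "error"
        verdicts = list(verdicts)
        if any(v == "fail" for v in verdicts):
            return "fail"  # assumed precedence: fail beats partial
        if any(v == "partial" for v in verdicts):
            return "partial"
        return "pass"
    ```

    Under these assumptions, verdicts of `pass`, `partial`, `pass` roll up to `partial`, and any single `fail` makes the whole run `fail`.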

    See [Runs](/workflows/runs) for filtering and navigation.
  </Accordion>

  <Accordion title="Criteria">
    **Criteria** are the assertions that define what a successful run looks like. Each criterion is a single plain-English statement written in the workflow editor's **Success criteria** tab. The evaluator model reads the agent's trace and grades each criterion independently.

    **Criterion verdicts:**

    | Verdict   | Meaning                                                                                           |
    | --------- | ------------------------------------------------------------------------------------------------- |
    | `pass`    | The criterion is clearly satisfied by the evidence in the run.                                    |
    | `partial` | The criterion is partially satisfied — some evidence supports it, but not all conditions are met. |
    | `fail`    | The criterion is not satisfied.                                                                   |

    Write criteria as externally checkable outcomes. Include the specific evidence the evaluator should look for: which tool was called, what field appeared in the response, what was absent. Avoid vague criteria like "works correctly" — prefer "The agent calls `create_order` exactly once and reports the returned order ID."

    Write one criterion per line. Each line is evaluated separately, so a single failing criterion does not prevent the others from passing.

    See [Workflow authoring guidelines](/workflows/authoring) for examples and best practices.
  </Accordion>

  <Accordion title="Tool Monitor">
    A **tool monitor** is a lightweight, recurring health check for a single MCP tool. Unlike a workflow, a tool monitor does not run an AI agent — it calls the tool directly with a fixed set of arguments and records whether it succeeded.

    **Monitor intervals:** 1 minute, 5 minutes, 15 minutes, or 1 hour.

    **Monitor statuses:**

    | Status  | Meaning                                                                                                            |
    | ------- | ------------------------------------------------------------------------------------------------------------------ |
    | `pass`  | The tool responded without throwing and without returning `isError: true`.                                         |
    | `fail`  | The tool returned `isError: true` in its response.                                                                 |
    | `error` | The transport itself failed — the server was unreachable or returned an unexpected error before MCP-level parsing. |

    Monitors are configured from the **MCP Servers** page. After you connect a server, the monitor wizard discovers available tools and lets you select which ones to monitor and at what interval. For tools that require arguments, the wizard asks for a JSON object of arguments before saving.

    You can have multiple monitors for the same server — one per tool you want to track. Monitors run independently and report individually.
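
    The three monitor statuses can be thought of as a simple classification of one direct tool call. This is a sketch, not Armature's implementation: `classify_monitor_check` is a hypothetical name, the tool call is passed in as a plain callable, and treating every raised exception as a transport failure is an assumption.

    ```python
    from typing import Any, Callable, Dict


    def classify_monitor_check(call_tool: Callable[[], Dict[str, Any]]) -> str:
        """Map one direct tool call onto a monitor status (illustrative only)."""
        try:
            result = call_tool()
        except Exception:
            # Assumption: any exception stands in for a transport-level failure.
            return "error"
        if result.get("isError") is True:
            # The server answered, but the tool reported an error.
            return "fail"
        return "pass"
    ```

    The distinction matters when you triage: `error` points at reachability or transport, while `fail` points at the tool's own behavior with the fixed arguments.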

    See [Tool monitors](/mcp-servers/tool-monitors) for setup and management.
  </Accordion>

  <Accordion title="Coverage Report">
    A **coverage report** shows which tools in your MCP server's catalog are exercised by your workflows and which are not. Coverage is calculated from the most recent 50 runs for the server, across up to 100 active tools in the catalog.

    A tool counts as covered if:

    * It was successfully called in at least one recent run, **or**
    * It is listed in a workflow's `allowed_tools` policy (even if not called in recent runs).

    A tool is **uncovered** if it appears in the catalog but has not been successfully called in any recent workflow run and is not in any `allowed_tools` policy.
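
    The covered/uncovered rule can be sketched as a set computation. The function name and input shapes below are hypothetical, and the caller is assumed to have already limited the inputs to the recent runs and active catalog tools described above.

    ```python
    from typing import Dict, Iterable, List, Set


    def coverage_report(
        catalog: Iterable[str],
        recent_successful_calls: Iterable[str],
        allowed_tools_policies: Iterable[Iterable[str]],
    ) -> Dict[str, List[str]]:
        """Split catalog tools into covered and uncovered (illustrative only)."""
        called: Set[str] = set(recent_successful_calls)
        allowed: Set[str] = set()
        for policy in allowed_tools_policies:
            allowed.update(policy)  # an empty policy adds nothing

        covered: List[str] = []
        uncovered: List[str] = []
        for tool in sorted(set(catalog)):
            if tool in called or tool in allowed:
                covered.append(tool)
            else:
                uncovered.append(tool)
        return {"covered": covered, "uncovered": uncovered}
    ```

    With a catalog of `create_order`, `get_order`, and `cancel_order`, one successful call to `create_order`, and one policy that lists `get_order`, only `cancel_order` comes back as uncovered.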

    Coverage reports help you identify gaps in your test suite before they become production regressions. Use the report to decide which tools need new workflows.

    Coverage is also accessible through the MCP repair API as a resource: `armature://mcp-servers/{id}/coverage-report`.

    See [Coverage reports](/mcp-servers/coverage) for details.
  </Accordion>

  <Accordion title="Workflow Version">
    Workflow definitions in Armature are **versioned, and each version is immutable**. When you edit a workflow's prompt, criteria, or tool policy and save, Armature does not overwrite the existing definition. Instead, it creates a new **workflow version** and makes it the active version going forward.

    Past runs are always associated with the version that was active when they ran, so you can compare results before and after a change without losing historical data.

    The MCP repair API's `propose_workflow_patch` and `apply_workflow_change` tools interact with workflow versions explicitly: a proposal targets a specific `baseVersionId`, and applying the proposal creates a new version. If the workflow has moved to a newer version before you apply, Armature returns `base_version_stale` with the current version ID so you can rebase your proposal.
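
    The stale-base check described above is a form of optimistic concurrency. The toy model below illustrates the idea only: `WorkflowHistory`, the integer version IDs, and the `currentVersionId` response field are assumptions, not the repair API's actual request or response shape.

    ```python
    from dataclasses import dataclass, field
    from typing import Dict, List


    @dataclass
    class WorkflowHistory:
        """Append-only list of version definitions; the last one is active."""
        versions: List[Dict[str, str]] = field(default_factory=list)

        @property
        def active_version_id(self) -> int:
            return len(self.versions)  # toy IDs: 1, 2, 3, ...

        def apply_change(
            self, base_version_id: int, new_definition: Dict[str, str]
        ) -> Dict[str, object]:
            if base_version_id != self.active_version_id:
                # The workflow moved on since the proposal was written.
                return {
                    "error": "base_version_stale",
                    "currentVersionId": self.active_version_id,
                }
            # Earlier versions are never modified; a new one is appended.
            self.versions.append(dict(new_definition))
            return {"versionId": self.active_version_id}
    ```

    A client that receives `base_version_stale` would re-read the current version, rebuild its proposal against it, and apply again.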
  </Accordion>

  <Accordion title="Tool Policy">
    A **tool policy** constrains which tools a workflow's agent is permitted to use. You can set it in the workflow editor's advanced settings or through the MCP repair API.

    * **`allowed_tools`** — An explicit list of tools the agent is expected to use. If you set `allowed_tools`, only those tools count toward coverage for this workflow. An empty `allowed_tools` means "open policy" — no restriction, and no coverage credit from the policy alone.
    * **`blocked_tools`** — Tools the agent must never call during this workflow. If the agent calls a blocked tool, the run is automatically flagged.

    Tool policies are validated against the server's tool catalog at save time. If a policy entry references a tool that no longer exists in the catalog, Armature warns you.
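
    Both checks described in this section can be sketched in a few lines: validating policy entries against the catalog at save time, and finding blocked-tool calls in a run's trace. The function names and the list-of-names inputs are hypothetical simplifications.

    ```python
    from typing import Iterable, List


    def unknown_policy_entries(
        catalog: Iterable[str],
        allowed_tools: Iterable[str],
        blocked_tools: Iterable[str],
    ) -> List[str]:
        """Return policy entries that are missing from the tool catalog."""
        known = set(catalog)
        return sorted({t for t in [*allowed_tools, *blocked_tools] if t not in known})


    def blocked_calls(tool_calls: Iterable[str], blocked_tools: Iterable[str]) -> List[str]:
        """Return the calls in a run trace that hit a blocked tool, in call order."""
        blocked = set(blocked_tools)
        return [name for name in tool_calls if name in blocked]
    ```

    A non-empty result from the first function corresponds to the save-time warning; a non-empty result from the second corresponds to a run that would be flagged.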

    See [Workflow authoring guidelines](/workflows/authoring) for examples.
  </Accordion>

  <Accordion title="Alert Rule">
    An **alert rule** triggers a notification when a workflow or tool monitor reaches a failing state. Armature supports two delivery channels:

    * **Slack** — Sends a message to a channel or user in your connected Slack workspace. Public channels can be entered as `#channel-name`. Private channels require you to invite the Armature bot first, then use the channel ID.
    * **Email** — Sends an email to a specified address when a rule fires.

    Alert rules are scoped: you can create one rule that alerts on all workflow failures, or narrow rules that alert only on specific workflows or specific monitors. This lets you route critical workflow failures to a dedicated alert channel while keeping lower-priority monitors quieter.
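
    Rule scoping can be pictured as a filter over rules when something fails. In this sketch, `AlertRule`, its fields, and the example IDs are hypothetical; a rule with no target IDs is assumed to match every workflow or every monitor.

    ```python
    from dataclasses import dataclass
    from typing import FrozenSet, List, Optional


    @dataclass(frozen=True)
    class AlertRule:
        """Hypothetical rule shape; not Armature's schema."""
        channel: str                                 # e.g. "#mcp-alerts" or an email address
        source: str                                  # "workflow" or "monitor"
        target_ids: Optional[FrozenSet[str]] = None  # None means every source of that kind


    def channels_to_notify(rules: List[AlertRule], source: str, source_id: str) -> List[str]:
        """Return the channels to notify for one failing workflow or monitor."""
        return [
            r.channel
            for r in rules
            if r.source == source and (r.target_ids is None or source_id in r.target_ids)
        ]
    ```

    A broad rule and a narrow rule can both match the same failure, which is how a critical workflow can reach a dedicated channel in addition to the general one.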

    See [Slack alerts](/alerts/slack) and [Email alerts](/alerts/email) for setup instructions.
  </Accordion>

  <Accordion title="Org Roles">
    Every member of an Armature workspace has one of four roles. Roles control what actions that member can take in the dashboard and through the MCP repair API.

    | Role       | What they can do                                                                                                                              |
    | ---------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
    | **viewer** | Read workflows, runs, monitors, coverage reports, and MCP API resources. Propose workflow patches. Compare runs.                              |
    | **editor** | Everything a viewer can do, plus: apply workflow change proposals, trigger manual runs, create and edit monitors, and create alert rules.     |
    | **admin**  | Everything an editor can do, plus: connect and delete MCP servers, sync tool catalogs, manage team invitations, and configure alert channels. |
    | **owner**  | Everything an admin can do, plus: manage billing, change workspace settings, and transfer ownership.                                          |

    The workspace owner is the account that created the workspace. Owners and admins can invite teammates from **Settings → Team**.
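
    Because each role includes everything the role below it can do, a permission check reduces to comparing positions in an ordered list. The action names below are hypothetical labels for a few rows of the table, not identifiers from Armature's API.

    ```python
    ROLE_ORDER = ["viewer", "editor", "admin", "owner"]  # lowest to highest

    # Minimum role for a few actions from the table (action names are hypothetical).
    MINIMUM_ROLE = {
        "propose_workflow_patch": "viewer",
        "apply_workflow_change": "editor",
        "trigger_manual_run": "editor",
        "connect_mcp_server": "admin",
        "manage_billing": "owner",
    }


    def can(role: str, action: str) -> bool:
        """True if `role` is at or above the minimum role for `action`."""
        return ROLE_ORDER.index(role) >= ROLE_ORDER.index(MINIMUM_ROLE[action])
    ```

    For instance, `can("viewer", "apply_workflow_change")` is false while `can("admin", "trigger_manual_run")` is true.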

    <Note>
      API keys are frozen to the issuing user's role at the time of creation. If your role is later upgraded, existing keys do not automatically gain new permissions — create a new key to pick up the expanded role.
    </Note>
  </Accordion>

  <Accordion title="API Key">
    An **API key** is a bearer token that authenticates requests to the Armature MCP API (`/api/mcp`). Keys have the format `amt_<key-id>_<secret>` and are passed in the `Authorization` header:

    ```text
    Authorization: Bearer amt_<key-id>_<secret>
    ```

    Keys are:

    * **Shown once** — The full token is displayed only when it is created. Armature stores only a scrypt hash server-side. Copy the token before closing the creation dialog.
    * **Scoped to your organization** — A key only grants access to the workspace it was created in.
    * **Frozen to your role** — The key inherits the permissions of your role at creation time.
    * **Revocable** — Deleting a key from the dashboard immediately invalidates it. Subsequent MCP requests with that key return `401 unauthenticated`.
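
    A client might sanity-check a token's shape before sending it. The helper names below are hypothetical, and the sketch assumes the key ID contains no underscore (the secret may); the documented format does not say either way.

    ```python
    from typing import Dict, Tuple


    def split_api_key(token: str) -> Tuple[str, str]:
        """Split `amt_<key-id>_<secret>` into (key_id, secret)."""
        prefix, _, rest = token.partition("_")
        key_id, _, secret = rest.partition("_")  # assumes no underscore in the key ID
        if prefix != "amt" or not key_id or not secret:
            raise ValueError("expected amt_<key-id>_<secret>")
        return key_id, secret


    def auth_headers(token: str) -> Dict[str, str]:
        """Build the Authorization header for a request to /api/mcp."""
        split_api_key(token)  # fail early on a malformed token
        return {"Authorization": f"Bearer {token}"}
    ```

    Since the full token is shown only once, load it from a secret store or environment variable rather than hard-coding it in client code.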

    Create and manage API keys from [**Settings → API keys**](/settings/api-keys).

    See [MCP API authentication](/mcp-api/authentication) for how to use a key with an MCP client.
  </Accordion>
</AccordionGroup>

## How the concepts connect

A typical Armature setup looks like this: you connect one or more **MCP servers**, then create **workflows** (agent tests) and **tool monitors** (direct tool pings) against those servers. Each time a workflow fires, Armature produces a **run** with a status and individual **criteria** verdicts. **Coverage reports** roll up recent runs and `allowed_tools` policies to show which tools are tested. When something fails, **alert rules** notify your team on Slack or email. If you use the **MCP repair API**, you authenticate with an **API key**, which carries your org **role**, to triage failures and propose fixes.

<CardGroup cols={2}>
  <Card title="Get started" icon="rocket" href="/quickstart">
    Walk through connecting a server, setting up monitors, and creating your first workflow.
  </Card>

  <Card title="Workflows in depth" icon="layers" href="/workflows/overview">
    Learn how to author effective prompts and criteria.
  </Card>

  <Card title="MCP Servers" icon="plug" href="/mcp-servers/connecting">
    Connect servers, manage auth profiles, and run live probes.
  </Card>

  <Card title="MCP Repair API" icon="wrench" href="/mcp-api/overview">
    Use the agent-facing API to investigate and fix failing workflows.
  </Card>
</CardGroup>