AIShell Labs LLC  ·  MCP Server Reference  ·  aishell-gate-mcp  ·  version derived from policy binary

Using the MCP
aishell-gate-mcp — Reference and Manual

The AIShell-Gate MCP server exposes the policy engine and execution gateway to Claude Code, Cursor, and any MCP-compatible AI environment. Eight tools. Two editions. One configuration file. This document covers installation, configuration, every tool, common workflows, and the robustness design that keeps evaluation-copy warnings from corrupting responses. There is also something unexpected here: a plan built by an AI and submitted through the MCP is, in a meaningful sense, a policy-governed alternative to shell scripting — auditable and confirmable in ways a shell script never could be.

policy 1.10 · exec 0.43.0 · 6 standard tools · 2 enterprise tools · Claude Code · Cursor · stdio transport · no extra dependencies

01 What the MCP Does

The AIShell-Gate MCP server is a Python stdio process that translates MCP tool calls from an AI coding environment into subprocess invocations of the two AIShell-Gate binaries. It has no policy logic of its own. It does not execute commands on its own. It is a structured bridge between the AI agent and the existing policy and execution architecture.

Claude Code / Cursor (or any MCP-compatible environment)
        │
        │  MCP tool call (JSON-RPC 2.0 over stdio)
        ▼
aishell-gate-mcp.py (this server — translates tool calls to subprocess invocations)
        ├──▶ aishell-gate-exec      evaluate_plan · execute_plan
        └──▶ aishell-gate-policy    evaluate_command · get_policy_template · verify_policy · dump_policy · verify_audit_log
        │
        │  Structured JSON response (returned to the AI agent as tool result)
        ▼

The server exposes eight tools. Six are available in both Standard and Enterprise editions. Two require Enterprise. The AI agent calls get_version on first connection to discover which tools are available and which edition is installed.

evaluate_plan

Multi-command plan assessment. No execution. The primary pre-flight tool.

execute_plan

Live execution of a plan. Pre-flight runs automatically before any command touches the system.

get_version

Detect edition, version strings, and available tools. Call this first on every new connection.

evaluate_command

Single-command quick assessment with full flag catalog analysis. Faster than a full plan for ad-hoc checks.

get_policy_template

Emit the built-in policy layer as an editable JSON file. Starting point for custom policy authoring.

verify_policy

Run a test suite of command/expected pairs. PASS/FAIL per assertion. Regression testing after policy changes.

dump_policy

Resolved operator overlay layers as JSON — base, project, and user overrides. Enterprise only. The standard_policy layer and built-in command/flag catalog are not included.

verify_audit_log

HMAC chain integrity verification for exec or policy audit logs. Enterprise only.

Design intent: The MCP server is a thin, stateless translator. Every response it returns is derived directly from the binary that produced it. No caching, no inference, no state between calls. The binary is always the authority.
Something you may not have expected

An AI-built plan is a policy-governed alternative to a shell script.

A shell script is a sequence of commands toward a goal. So is an AIShell-Gate plan. The difference is everything between the commands and the kernel — policy evaluation, risk scoring, confirmation gates, and a tamper-evident audit trail that a shell script cannot provide. The plan model is limited: no pipes, no branching, no conditionals. But for sequences of known actions toward a stated goal, it is not just safer than a shell script. It is auditable in ways a shell script never can be. See §08 →

02 The Plan Model — Rebuilding the Reflex

If you have arrived at this section because the plan model is not clicking, you are in good company. The program's most common point of confusion is not the policy layer, the audit chain, or the two-binary split — those land quickly. The confusion is more fundamental: a decades-old reflex about how AI agents work with Unix is silently breaking, and it is not always obvious that it is the reflex that is wrong.

This section is written bluntly. It is here to name the reflex, show you why it fails, and replace it with a mental model that fits what the system actually does.

The Reflex That Fails You

Every other MCP server you have used follows the same shape. The server exposes a list of capability-shaped tools: write_file, read_file, edit_file, list_directory, run_command. The AI calls them one at a time. Each call does one small thing. Composition is the AI's responsibility — it strings the calls together in whatever order makes sense.

When you open AIShell-Gate's MCP and see only evaluate_plan and execute_plan, your reflex is to look for the file-writing tool. You scan the tool list again. You check the documentation for what you must have missed. You start to suspect something is wrong with the installation.

Nothing is wrong with the installation. The reflex is the problem. There is no write_file tool because AIShell-Gate is not a filesystem MCP. It is a policy-gated execution MCP that happens to be able to affect files as a side effect of running commands. Those are different categories of thing, and conflating them is where understanding breaks down.

The reframing in one sentence: AIShell-Gate does not expose filesystem operations. It exposes a disciplined way to submit and run Unix commands. Everything a command can do — including creating files — is available. Nothing a shell can do that a command alone cannot — redirection, pipes, substitution, conditionals — is available.

Why the Reflex Is Worth Unlearning

The reflex comes from a real tradeoff in how existing tools were designed. A capability-shaped MCP server is easy to use safely because it is narrow. write_file can only write files. It cannot escalate privilege, spawn a shell, or touch the network, because the tool surface simply does not expose those things. Safety comes from what the tool cannot do.

AIShell-Gate takes the opposite approach. It exposes the whole operating system as a submission surface and derives safety from what the policy will allow, not from what the tool surface can address. This is a more powerful model — a single MCP can govern anything the OS can do — but it requires you to think in commands and policy, not in capabilities and tools.

The payoff is that you stop building bespoke MCP servers for every capability you want to expose. One MCP, one policy file, one audit chain, and the AI can do anything you are willing to permit. The cost is that you have to think about the system the way a Unix operator thinks about it: in commands and their effects, with a deterministic policy as the gate.

The Constraints That Define the Model

Before showing what the model can do, it helps to be explicit about what it cannot do. These constraints are not limitations to work around — they are the definition of the model. Every one of them exists because allowing it would reintroduce the attack surface the gate is built to eliminate.

A plan has no                              Because
Pipes (|)                                  Pipes require a shell. The execution path has no shell. Rejected as a metacharacter.
Redirection (>, >>, <)                     Redirection is shell behaviour, not command behaviour. Rejected as a metacharacter.
Heredocs (<<EOF)                           Heredocs are shell syntax for stdin redirection. Rejected as a metacharacter.
Command substitution ($(...), backticks)   Substitution runs an inner command and captures its output. A second execution path outside policy review. Rejected.
Conditionals (&&, ||)                      Branching on exit codes is shell control flow. Rejected. (See "State between plans" below for the right approach.)
Loops                                      Shell control flow. Not expressible in a plan.
Variable expansion (${VAR})                Environment-dependent behaviour makes audit records non-deterministic. Rejected.
Quoted arguments with spaces               Quotes are shell syntax. Supporting them would require implementing a shell grammar subset, which is a bypass surface.
State between plans                        Each plan is stateless and self-contained. No accumulated context, no long-lived process. This is by design.

These nine constraints are the mental model. Internalise them and the rest of the system's behaviour follows.
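To make the parse-level rejection concrete, here is a minimal sketch of what a metacharacter pre-check might look like. This is a mental model only — the engine's actual implementation and exact character set are not documented here:

```python
# Illustrative only: a parse-level metacharacter check in the spirit of the
# constraints above. Not the real engine's code or its exact character set.
SHELL_METACHARACTERS = set("|><&;`$\"'")

def parse_level_check(cmd):
    """Return (ok, reason). Runs before any rule lookup."""
    for ch in cmd:
        if ch in SHELL_METACHARACTERS:
            return False, f"shell metacharacter {ch!r} has no meaning without a shell"
    return True, "no shell syntax present"
```

Note that a command like find . -name *.py passes such a check: with no shell in the execution path there is no glob expansion, so the bare pattern reaches find literally.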

The Model You Should Build Instead

Three analogies that capture the plan model better than "MCP tool":

A plan is a CI job submission, not a terminal session. When you push a commit to a CI server, you are submitting a declarative description of work to be done. The CI server evaluates it against rules (branch protections, required reviewers, merge policy), requires approvals where necessary, runs the job, and produces an immutable record. You do not ssh into the runner and vim a file. You submit a job; the job runs. AIShell-Gate is the same shape, scaled down to the individual command level.

A plan is a signed transaction, not an API call. A database transaction is atomic: a group of operations that either all succeed under the committed rules or all fail together. You cannot reach into the middle of a transaction and change it based on intermediate state. A plan is the same. All the confirmations are collected before any action runs; once execution starts, the plan proceeds deterministically or stops cleanly.

A plan is a work order, not a conversation. When you hand a work order to a contractor, it specifies the goal and the sequence of tasks. If the contractor discovers midway that the plan does not fit the site, they do not improvise — they come back and get a new work order. That is exactly the plan segmentation pattern. Work requiring intermediate judgment is split across multiple plans, with the calling environment reading results and submitting the next plan.

Worked Examples of the Reflex Failing

The fastest way to build the new model is to see the old reflex fail against specific tasks and see what the right answer looks like.

Task 1 — "Write a Dockerfile"

The failing reflex. You reach for a write_file tool that does not exist, then try:

# Plan action — this will be denied at evaluation
{"cmd": "echo 'FROM python:3.11' > Dockerfile"}

The policy engine rejects this at step 1 of evaluation, before any rule lookup. > is a shell metacharacter. There is no shell, so the character has no meaning the executor can honour. The action is denied with a parse-level error.

The model-fit answer. There are three idiomatic approaches. All of them work because none of them need a shell:

  1. Stage the file outside the gate, then place it with a command. The AI writes the Dockerfile to a staging location it already controls (its own working directory, a scratch mount, a git checkout it has direct access to). The plan contains a single cp or mv command to place it in the target location. Policy evaluates cp ./staging/Dockerfile ./Dockerfile like any other file operation — subject to writable_dirs, confirmation level, and audit record.
  2. Commit to version control, pull through the gate. The AI commits the Dockerfile to a branch it already has access to outside the gated environment. The plan runs git pull or git checkout. The writing happens in git; the gate is only involved in the pull. This is the most common real-world pattern.
  3. Use install or cp with a pre-written source file. If the Dockerfile content is reproducible (a template with no per-run variation), keep it in a known location and install -m 0644 /path/to/Dockerfile.template ./Dockerfile. One command. No shell. Fully auditable.
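Approach 1 can be sketched in a few lines. The plan shape ({"goal": ..., "actions": [{"cmd": ...}]}) follows the examples in this manual; the helper itself, its paths, and its names are illustrative:

```python
# Sketch of approach 1: stage the file outside the gate, place it with one
# command. Only the plan shape is taken from this manual; the rest is
# hypothetical orchestration code.
import json
import os

def build_placement_plan(staging_dir, filename, content):
    # 1. Write the file in a location the AI already controls. No gate involved.
    os.makedirs(staging_dir, exist_ok=True)
    staged = os.path.join(staging_dir, filename)
    with open(staged, "w") as f:
        f.write(content)
    # 2. The plan contains a single cp command: flat, no shell syntax.
    return {
        "goal": f"place staged {filename} into the working directory",
        "actions": [{"cmd": f"cp {staged} ./{filename}"}],
    }

plan = build_placement_plan("./staging", "Dockerfile", "FROM python:3.11\n")
print(json.dumps(plan, indent=2))
```

The cp command in the resulting plan is evaluated like any other file operation: writable_dirs, confirmation level, audit record.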

Task 2 — "Check if the build passed, then deploy"

The failing reflex. You reach for the conditional:

# Plan action — this will be denied
{"cmd": "make test && ./deploy.sh"}

&& is a shell metacharacter. Denied at parse.

The model-fit answer. Plan segmentation. Submit plan A, read its result in your orchestration code, then submit plan B based on what came back:

# Plan A — run the test
{"goal": "run test suite", "actions": [{"cmd": "make test"}]}

# Calling code inspects plan A's exit code.
# If zero, submit plan B. If not, surface the failure and stop.

# Plan B — deploy, submitted only on success
{"goal": "deploy build artifact", "actions": [{"cmd": "./deploy.sh"}]}

This is not a workaround. It is the model working correctly. Each plan is a signed transaction. Branching logic lives in the calling environment where it belongs — the Python script, the orchestration agent, the CI pipeline definition — not inside a plan where it would be hidden from policy review.
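The segmentation pattern above can be sketched as orchestration code. Here submit_plan is a hypothetical stand-in for whatever mechanism your environment uses to call the execute_plan tool; the result fields (executed, exit_code) follow the execute_plan reference later in this manual:

```python
# Sketch of plan segmentation: branching lives in the calling code, never
# inside a plan. `submit_plan` is a hypothetical callable, not a real API.
def run_test_then_deploy(submit_plan):
    result_a = submit_plan({"goal": "run test suite",
                            "actions": [{"cmd": "make test"}]})
    if not result_a.get("executed") or result_a.get("exit_code") != 0:
        # Surface the failure and stop. Plan B is never submitted.
        return "tests failed: deploy not attempted"
    result_b = submit_plan({"goal": "deploy build artifact",
                            "actions": [{"cmd": "./deploy.sh"}]})
    return "deployed" if result_b.get("executed") else "deploy blocked"
```

Each call submits a complete, self-contained plan; the exit-code branch that would have been && now sits in reviewable calling code.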

Task 3 — "Find all Python files and count their lines"

The failing reflex. The pipe:

# Denied — pipe is a metacharacter
{"cmd": "find . -name '*.py' | xargs wc -l"}

The model-fit answer. Use a single command that does the composition internally. Many Unix utilities have flags that replace common pipeline patterns:

# find with -exec: no shell, no pipe, composition done inside find
{"cmd": "find . -name *.py -exec wc -l {} +"}

This works because find spawns wc directly via its own fork/exec machinery — no shell, no pipe. The composition is semantic, not syntactic. This pattern applies broadly: grep -r instead of find | xargs grep, sort -u instead of sort | uniq, awk instead of cut | head.

The general principle: Before reaching for a pipe, ask whether a single command with a richer flag set can do the same work. Most of the time, one can. When none can, you have identified a real case for plan segmentation.

What the Model Asks of You

Three habits replace the ones you are unlearning:

Think in commands, not tools. When you want to affect the system, ask "what Unix command does this?" — not "what MCP tool does this?" There is no one-to-one mapping between capabilities and tools here. The mapping is between capabilities and allowed commands under the active policy.

Move composition out of the command string. Anything that would be a pipe, a conditional, or a substitution in a shell script moves up one level — into the orchestration code that submits plans. The plan itself stays flat and declarative.

Treat each plan as atomic. A plan is not a script; it is a transaction. Compose transactions, do not embed logic inside them. The calling code decides what transaction to submit next based on what the previous one returned.

If you remember one thing: The plan model is not a worse shell. It is a different abstraction. A shell composes by letting commands manipulate each other's input and output at runtime. A plan composes by submitting a validated sequence to a policy-governed execution gateway. The first is powerful and unreviewable. The second is constrained and auditable. They are not in competition; they are different tools for different parts of the problem. Once you stop expecting a plan to be a shell, it stops being frustrating and starts being useful.

03 Installation

The MCP server is a single Python 3 script with no external dependencies. Python 3.8 or later is required. The AIShell-Gate binaries (aishell-gate-exec and aishell-gate-policy) must be installed and accessible before the server will function.

Verify Prerequisites

# Python version — 3.8 or later required
python3 --version

# Verify both binaries are reachable
aishell-gate-policy --version
aishell-gate-exec --version

# Test the MCP server itself
python3 /usr/bin/aishell-gate-mcp --version

The --version output from the MCP server lists the protocol version, all registered tools, and which edition tools require. If the binaries are not found, the server will start but all tool calls will return structured error responses rather than crashing — making misconfiguration diagnosable from within Claude Code.

Place the Script

The MCP server can run from any directory. For system-wide availability, install it alongside the binaries:

install -m 755 aishell-gate-mcp.py /usr/bin/aishell-gate-mcp
Note: The MCP server does not need to be in the same directory as the binaries. Binary paths are resolved at startup via PATH or from absolute paths set in aishell-mcp.json.

Connect Claude Code or Cursor

Add the server to your project's .mcp.json file. The AI coding environment discovers and launches it automatically on startup.

{ "mcpServers": { "aishell-gate": { "command": "python3", "args": ["/usr/bin/aishell-gate-mcp"], "env": {} } } }

Restart Claude Code or Cursor after adding this entry. The eight tools will appear in the available tool list. Use get_version to confirm the connection and edition.

04 Configuration

The server reads aishell-mcp.json from the working directory, or from the path given via --config. All fields are optional. A missing file is not an error — defaults are used throughout.

{
  // Binary paths — resolved via PATH if not absolute
  "exec_binary": "aishell-gate-exec",
  "policy_binary": "aishell-gate-policy",

  // Policy settings
  "preset": "ops_safe",
  "jail_root": null,
  "sandbox": null,
  "policy_base": null,
  "policy_project": null,
  "policy_user": null,

  // Source identity — set by deployer; never read from the AI plan
  "source": "ai",

  // Audit — exec log and policy log must be different files
  "audit_log": null,
  "policy_audit_log": null,

  // Enterprise: HMAC key file for audit chain verification
  "audit_key": null,

  // Timeouts in seconds — 0 to disable
  "eval_timeout": 30,
  "input_timeout": 30,

  // Response byte cap — 0 uses binary default (8 MiB)
  "max_response_bytes": 0,

  // Extra flags forwarded verbatim to aishell-gate-exec
  "extra_flags": []
}

Field Reference

Field Default Description
exec_binary "aishell-gate-exec" Path to the executor binary. Resolved via PATH if not absolute. Must contain a '/' to prevent PATH substitution attacks if set explicitly.
policy_binary "aishell-gate-policy" Path to the policy engine binary. Resolved via PATH if not absolute.
preset "ops_safe" Built-in policy preset. Values: read_only, ops_safe, dev_sandbox, ci_build, ci_deploy, ci_admin, danger_zone. Applied to all tool calls that involve policy evaluation.
jail_root null Restrict write-like commands to this path tree. Forwarded to the policy engine as --jail-root. Required when using CI presets for path containment.
sandbox null Sandbox mode hint forwarded to the policy engine. Values: none, cwd_jail, chroot, container, userns. Advisory only except cwd_jail, which is actively enforced.
policy_base null Path to the base policy override file. Applied below preset in evaluation order.
policy_project null Path to the project policy override file. Applied above base, below user.
policy_user null Path to the per-user policy override file. Highest priority layer. A deny at any layer is final.
source "ai" Source identity label applied to all plans. Set by the deployer — the AI plan cannot override this. Tells the policy engine who is submitting commands. Values: ai, human, raw, web, unknown.
audit_log null Path for the executor audit log (chain_hmac format). Written by aishell-gate-exec. Do not point this at the same file as policy_audit_log — the formats are incompatible.
policy_audit_log null Path for the policy engine audit log (entry_hash format). Written by aishell-gate-policy. Separate file from audit_log.
audit_key (enterprise) null Path to HMAC key file for verify_audit_log. Exec key format: 64 ASCII hex characters (32 bytes). Policy key format: 64 raw binary bytes. These two formats are not interchangeable — use the correct key for the correct log type.
eval_timeout 30 Seconds before the policy engine subprocess is killed. 0 disables. The binary default is also 30; this field only sends the flag if the value differs from the default.
input_timeout 30 Seconds to wait for stdin plan data before failing. 0 disables. Has no effect when reading from a file.
max_response_bytes 0 Kill the policy engine and fail closed if its response exceeds this many bytes. 0 uses the binary default of 8 MiB.
extra_flags [] List of additional flags forwarded verbatim to aishell-gate-exec. Use for flags not covered by the standard config fields.
CI presets require batch mode: The ci_build and ci_deploy presets are designed for unattended pipelines and require --mode batch to function correctly. This flag must reach aishell-gate-policy, which means it goes after the -- separator in extra_flags. Add the following to aishell-mcp.json when using either CI preset:

"extra_flags": ["--", "--mode", "batch"]

The -- is required. Everything after it is forwarded verbatim to the policy engine. Flags before -- are consumed by the executor. See the executor man page for the full forwarding convention.
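The forwarding convention can be pictured as a small partition function. This is an illustration of the rule described above, not the server's actual code:

```python
# Illustrative: how the "--" separator partitions extra_flags. Flags before
# "--" are consumed by the executor; everything after is forwarded verbatim
# to the policy engine. Sketch only, not the server's implementation.
def split_extra_flags(extra_flags):
    """Return (executor_flags, policy_engine_flags)."""
    if "--" in extra_flags:
        i = extra_flags.index("--")
        return extra_flags[:i], extra_flags[i + 1:]
    return extra_flags, []
```

With the CI example above, ["--", "--mode", "batch"] yields nothing for the executor and --mode batch for the policy engine.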
Audit log warning: The executor and policy engine write audit logs in incompatible internal formats — chain_hmac and entry_hash respectively. Never point audit_log and policy_audit_log at the same file. Mixing them will produce an unverifiable log and will cause verify_audit_log to report errors.

05 Standard Edition Tools

These six tools are available in both Standard and Enterprise editions. All edition detection is automatic — the server calls --version on both binaries at startup and caches the result.

get_version
standard enterprise

Report the installed edition, version strings for both binaries, a map of which tools are available, and a summary of active configuration. Call this first on every new connection. The response tells the AI agent exactly what it can and cannot do before it attempts any other tool call.

No arguments required.

  • edition"standard", "enterprise", or "unknown"
  • exec_version and policy_version — first line of each binary's --version output
  • available_tools — map of tool name to boolean availability
  • enterprise_features — map of Enterprise-only feature availability
  • config_summary — active preset, jail root, audit log paths, key status

Use at the start of every session. If the result shows "edition": "unknown", the binaries may not be installed or reachable — check exec_binary and policy_binary in aishell-mcp.json.
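A session can gate itself on this response before trying anything else. The field names (edition, available_tools) come from the response description above; the sample response itself is invented for illustration:

```python
# Sketch: derive the callable tool set from a get_version response.
# Field names follow this manual; the sample data is hypothetical.
def usable_tools(version_response):
    if version_response.get("edition") == "unknown":
        return []  # binaries unreachable: check exec_binary / policy_binary
    return sorted(name for name, ok
                  in version_response.get("available_tools", {}).items() if ok)

sample = {"edition": "standard",
          "available_tools": {"evaluate_plan": True, "execute_plan": True,
                              "dump_policy": False, "verify_audit_log": False}}
print(usable_tools(sample))  # the two standard tools in this sample
```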

evaluate_plan
standard enterprise

Submit a goal and a list of commands to the policy engine for assessment without executing anything. The internal mechanism is --dry-run-json, which runs the full policy evaluation for every action and returns a machine-readable JSON document. Call this before execute_plan to identify denied actions and required confirmation levels before any command reaches the system.

  • goal (required) — human-readable description of what the plan is trying to achieve
  • commands (required) — list of shell command strings, maximum 24
  • strategy (optional) — "fail_fast" (default) or "best_effort"
  • overall_decision"allow" or "deny"
  • actions — per-action array: decision, confirm level, risk score, blast radius, resolved binary path, argv, reason, flag assessment
  • summary — total, allowed, denied, max confirm level, blocking count
  • guidance — human-readable string summarising what needs attention

Use before every execute_plan call. Also useful when helping a user understand why a command would be denied, or when building a plan that needs to stay within a particular confirmation level.

execute_plan
standard enterprise

Submit a goal and list of commands for live execution via aishell-gate-exec. A pre-flight policy evaluation runs automatically before any command reaches the system. Plans with denied actions are rejected immediately. Plans where any action requires confirm level action or typed are blocked — these require a human operator at the terminal, which cannot be satisfied through the MCP interface.

  • goal (required) — human-readable description of intent
  • commands (required) — list of shell command strings, maximum 24
  • strategy (optional) — "fail_fast" (default) or "best_effort"
  • executed — boolean; true only if all actions ran successfully
  • exit_code — the executor's exit code (0 = all succeeded)
  • outcome — human-readable outcome string
  • blocked_reason — present when not executed: "policy_denied" or "confirmation_required"
  • denied_actions or actions_requiring_confirmation — detail on what blocked execution
Exit codes from the executor:
  • 0 — all actions executed successfully
  • 1 — one or more actions denied by policy
  • 2 — operator confirmation refused
  • 3 — policy engine error
  • 4 — JSON parse error
  • 5 — usage or configuration error
  • 6 — binary execution failed after allow
On confirmation levels: Commands at confirm level none or plan can proceed through the MCP. Commands at action or typed cannot — they require a human operator at a terminal. If execute_plan is blocked for this reason, the response lists the specific commands that require confirmation so the operator can either lower the confirm level in their policy file or run those commands manually.
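The confirmation rule can be checked against an evaluate_plan result before attempting execution. The field names (overall_decision, summary.max_confirm) follow the evaluate_plan reference above; the check itself is an illustrative sketch:

```python
# Sketch of the MCP confirmation rule: levels none and plan can proceed;
# action and typed need a human at a terminal. Illustrative, not server code.
MCP_SATISFIABLE = {"none", "plan"}

def mcp_can_execute(evaluation):
    if evaluation.get("overall_decision") != "allow":
        return False  # plans with denied actions never run
    return evaluation["summary"]["max_confirm"] in MCP_SATISFIABLE
```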
evaluate_command
standard enterprise

Evaluate a single raw command string directly through the policy engine with --json. Returns the full JSON assessment including per-flag catalog analysis, risk score, blast radius, IO classification, confirmation level, taint status, and deny suggestions. Faster and simpler than evaluate_plan for ad-hoc single-command checks — no goal or envelope required.

  • command (required) — the shell command string to evaluate, e.g. "rm -rf /tmp/old"
  • assessment — full JSON output from the policy engine, including:
    • decision (allow / deny), layer, reason
    • confirm level, risk score, blast radius, IO class
    • flag_assessment — per-flag disposition: safe, warn, danger, or unassessed
    • suggestions on deny — commands the policy would permit instead
    • busy_summary_text — 3–5 line plain-text summary suitable for display

Use when a user asks why a specific command was denied, what confirmation level a command requires, or what flags the policy considers dangerous. Also useful when building or debugging policy rules — evaluate a command against a modified policy file to see the effect immediately.

get_policy_template
standard enterprise

Emit the built-in policy layer as a ready-to-edit JSON override file via --dump-standard-template. The output includes all rule arrays — cmd_allow, cmd_deny, arg_rules, path_rules, net_rules — along with a schema header documenting every available key and the _replace flags. Use the result as a starting point for custom policy files rather than writing rules from scratch.

  • preset (optional) — which preset to template. Defaults to the active configured preset. Values: read_only, ops_safe, dev_sandbox, ci_build, ci_deploy, ci_admin, danger_zone
  • template — the parsed JSON policy object, ready to use
  • header — the plain-text schema documentation that precedes the JSON in raw output
  • usage — instructions on which config fields to set to activate the file

Call get_policy_template, save the template JSON to a file (e.g. my-project-policy.json), edit the relevant rules, then set policy_project in aishell-mcp.json to point at it. The new rules take effect immediately on the next tool call.
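The save-and-activate step can be sketched as a small helper. The response fields (template) and the policy_project config key come from this manual; the file names and the helper itself are examples:

```python
# Sketch of the authoring loop above: save the template, then point
# policy_project at the saved file. File names are illustrative.
import json

def save_template(response, policy_path, config_path):
    with open(policy_path, "w") as f:
        json.dump(response["template"], f, indent=2)  # editable override file
    with open(config_path) as f:
        config = json.load(f)
    config["policy_project"] = policy_path            # takes effect on next call
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
```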

_replace flags: By default, rule lists in override files append to the preset's lists. To replace a list entirely, set the corresponding _replace flag: "cmd_allow_replace": true. This prevents unexpected accumulation of rules across layers.
verify_policy
standard enterprise

Run a JSON policy test suite against the active policy and report PASS or FAIL per test case. Each test case states a command and the expected decision — "allow" or "deny". If reality matches expectation, the case passes. This is a verification tool, not a discovery tool: you already know what the policy should do, and you are confirming it still does so after a change.

  • tests (required) — list of test case objects, each with:
    • cmd (required) — the command to evaluate
    • expected (required) — "allow" or "deny"
    • label (optional) — human-readable description for output
    • preset (optional) — override the active preset for this case only
  • preset (optional) — active preset for all cases that do not specify their own
  • all_passed — boolean; true only if every test case passed
  • total, passed, failed — counts
  • results — per-case array with "PASS" or "FAIL" and the output line
  • exit_code — 0 = all pass, 1 = any fail, 2 = file or parse error
  • raw_output — the full text output from the policy binary

evaluate_command is a discovery tool — it tells you what the policy does to a command you are exploring. verify_policy is a verification tool — it tells you whether the policy behaves the way you have specified it should. Each test case is evaluated independently with no relationship between commands.

Commit a test suite file alongside your policy override files. After any policy change, call verify_policy with your test cases. If all_passed is false, the change broke a specified expectation — find and fix the regression before deploying. The exit_code field (0 / 1 / 2) maps directly to CI pass/fail conventions.

// Example test suite
[
  { "cmd": "git status", "expected": "allow", "label": "git read ok" },
  { "cmd": "git push", "expected": "deny", "label": "git push blocked on ops_safe" },
  { "cmd": "rm -rf /", "expected": "deny", "label": "destructive rm denied" },
  { "cmd": "make", "expected": "allow", "preset": "dev_sandbox", "label": "make ok in dev" },
  { "cmd": "curl http://x", "expected": "deny", "label": "network denied under net-default-deny" }
]
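Mapping a verify_policy result onto a CI step can be sketched as follows. The fields used (all_passed, total, failed, results, exit_code) come from the tool reference above; the per-case results are treated here simply as lines containing PASS or FAIL, which is an assumption about their exact shape:

```python
# Sketch: turn a verify_policy result into a CI pass/fail summary.
# Field names follow this manual; the per-case line format is assumed.
def ci_summary(result):
    """Return (exit_code, message) for a CI log."""
    if result["exit_code"] == 2:
        return 2, "policy test suite could not be loaded (file or parse error)"
    if result["all_passed"]:
        return 0, f"all {result['total']} policy tests passed"
    failures = [str(line) for line in result["results"] if "FAIL" in str(line)]
    return 1, f"{result['failed']} regression(s): " + "; ".join(failures)
```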

06 Enterprise Edition Tools

These two tools require the Enterprise edition binary. Calling them on a Standard installation returns a clear structured error — not a crash or silent failure — mirroring the binary's own behaviour. Call get_version first to confirm edition before attempting enterprise tools.

Edition detection

How the server knows which edition is installed.

At startup, the server calls --version on both binaries and parses the output for the word "standard" or "enterprise". The result is cached for the session. If both binaries report different editions, "enterprise" takes priority. If neither binary is found, the edition is "unknown" and all enterprise tools return an error. The get_version response always shows the detected edition and which features it enables.
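The detection rule just described can be sketched as a pure function. This is an illustration of the stated behaviour, not the server's actual code:

```python
# Sketch of edition detection: parse each binary's --version output for
# "standard" or "enterprise"; enterprise wins on a mixed install; "unknown"
# when neither binary answered. Illustrative only.
from typing import Optional

def detect_edition(exec_version: Optional[str], policy_version: Optional[str]) -> str:
    editions = set()
    for text in (exec_version, policy_version):
        if text is None:
            continue  # binary not found
        lower = text.lower()
        if "enterprise" in lower:
            editions.add("enterprise")
        elif "standard" in lower:
            editions.add("standard")
    if "enterprise" in editions:
        return "enterprise"  # mixed installs report enterprise
    if "standard" in editions:
        return "standard"
    return "unknown"
```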

dump_policy
enterprise only

Dump the operator-supplied policy overlay layers as JSON via --dump-policy. The output reflects the base, project, and user override layers — the rules you have added on top of the standard policy. The standard_policy layer and the built-in command/flag catalog are intentionally omitted; they constitute proprietary AIShell Labs intellectual property. To inspect how the engine evaluates a specific command, use evaluate_command or evaluate_plan — those return the full per-command decision including which rule and layer matched.

  • preset (optional) — which preset stack to dump. Defaults to the active configured preset.
  • policy — the JSON dump from --dump-policy, including:
    • overlay_layers — array of operator-supplied layer objects (base, project, user) each with their cmd_allow, cmd_deny, arg_rules, path_rules, net_rules. Empty layers are omitted.
    • preset, default_deny, net_default_deny, version
    • note — explains what is omitted and directs to evaluate_command for decision inspection

Use when verifying that an override file was applied correctly — that your custom rules appear in the expected layer with the expected values — or when conducting a security review of operator-added rules. For inspecting why a specific command was allowed or denied, use evaluate_command instead — it returns the matched rule, layer, risk score, and reason directly.

verify_audit_log
enterprise only

Verify the HMAC-SHA256 chain integrity of an audit log file. Detects any gap, truncation, reordering, or post-hoc modification. Each audit log entry is linked to the previous by an HMAC of its content — breaking the chain requires either the key or knowledge of every previous entry. This tool runs the correct verifier binary for the log type you specify.

Arguments
  • log_path (required) — absolute path to the audit log file to verify
  • log_type (optional) — "exec" (default) or "policy"

Response
  • chain_intact — boolean; true means no tampering detected
  • outcome — human-readable result string
  • exit_code — 0 = chain intact, 1 = tampering detected, 2 = file error
  • detail — per-entry report from the verifier binary
  • key_used — whether an HMAC key was used for verification

The executor and policy engine write logs in incompatible internal formats. Exec logs use the chain_hmac field and are verified by aishell-gate-exec --audit-verify. Policy logs use the entry_hash field and are verified by aishell-gate-policy --audit-verify. Passing the wrong log to the wrong verifier produces an error or a false broken-chain result. The log_type argument selects the correct verifier automatically.
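When the log's provenance is unclear, the distinguishing field names above make its type easy to sniff before choosing log_type. A sketch only — prefer passing log_type explicitly when you know the source:

```python
import json

def detect_log_type(first_line: str) -> str:
    """Guess which verifier a log belongs to from its first entry:
    exec logs carry the chain_hmac field, policy logs carry
    entry_hash (per the field names above)."""
    entry = json.loads(first_line)
    if "chain_hmac" in entry:
        return "exec"
    if "entry_hash" in entry:
        return "policy"
    return "unknown"
```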

Set audit_key in aishell-mcp.json to the path of your key file for keyed HMAC verification. Without a persistent key, the binary uses a per-session ephemeral key and cross-session chain verification is not possible. Key file formats differ between the two binaries — see the Configuration section.

Multi-session logs: Each aishell-gate-exec invocation is an independent process with its own in-memory chain state. When multiple sessions write to the same log file, the file contains one internally consistent chain per session_id. Verify chains per session_id rather than treating the entire file as a single linear sequence.
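Per-session verification is easiest if the shared file is first split into one file per session. A sketch, assuming only what the paragraph above states: the log is JSON Lines and each entry carries a session_id field.

```python
import json
from collections import defaultdict
from pathlib import Path

def split_log_by_session(log_path, out_dir):
    """Split a multi-session JSON Lines audit log into one file per
    session_id so each chain can be verified on its own."""
    sessions = defaultdict(list)
    for line in Path(log_path).read_text().splitlines():
        if line.strip():
            # Keep the raw line so entry bytes are unchanged
            sessions[json.loads(line)["session_id"]].append(line)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for sid, lines in sorted(sessions.items()):
        path = out / f"audit-{sid}.jsonl"
        path.write_text("\n".join(lines) + "\n")
        paths.append(path)
    return paths
```

Each resulting file can then be passed to verify_audit_log as a single linear chain.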

07 Common Workflows

First Connection

01
Call get_version. Confirm edition and that both binaries are reachable. Note which tools are available.
02
Check config_summary.preset in the response. Confirm the active preset is appropriate for the current workflow.
03
If in Enterprise edition and audit logging is configured, note config_summary.audit_log and config_summary.audit_key_set.

Running a Safe Plan

01
Call evaluate_plan with the goal and command list. Inspect overall_decision and summary.max_confirm.
02
If any actions are denied, surface the guidance string to the user. Remove or replace denied commands and re-evaluate.
03
If summary.max_confirm is action or typed, inform the operator — execution through the MCP is not possible for those actions. They must either lower the confirm level in their policy file or run those commands manually.
04
When overall_decision is "allow" and max_confirm is "none" or "plan", call execute_plan with the same goal and commands.
05
Check executed and exit_code in the response. Surface outcome and stderr_tail to the user if execution failed.
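The decision in steps 03 and 04 reduces to a single predicate over the evaluate_plan response. A sketch using only the fields named in the steps above:

```python
def can_execute(evaluation: dict) -> bool:
    """Execution through the MCP is appropriate only when
    overall_decision is "allow" and summary.max_confirm is
    "none" or "plan"; action and typed cannot be satisfied
    through the MCP interface."""
    return (evaluation.get("overall_decision") == "allow"
            and evaluation.get("summary", {}).get("max_confirm")
            in ("none", "plan"))
```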

Debugging a Policy Denial

01
Call evaluate_command with the denied command. Read assessment.reason and assessment.layer — these identify exactly which rule and which policy layer caused the denial.
02
Read assessment.flag_assessment for per-flag analysis. Flags marked danger or warn show the specific reasoning for any confirmation escalation.
03
If a policy exception is warranted, call get_policy_template to retrieve the active baseline. Add a targeted cmd_allow rule for the specific command pattern and save as an override file.
04
Set the override file path in aishell-mcp.json under policy_project or policy_user. Re-evaluate the command to confirm the rule takes effect.
05
Add a test case to your verify_policy suite asserting the new command is now allowed, and the commands you do not want allowed are still denied.
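Surfacing the denial to the user can be sketched as a small formatter over the evaluate_command response. The assessment.reason, assessment.layer, and flag_assessment fields come from the steps above; the shape of each flag_assessment entry (flag, level, detail) is an illustrative assumption.

```python
def explain_denial(response: dict) -> str:
    """Summarise an evaluate_command denial for the user.
    The per-flag entry shape is assumed for illustration."""
    a = response["assessment"]
    lines = [f"denied by the {a['layer']} layer: {a['reason']}"]
    for flag in a.get("flag_assessment", []):
        if flag.get("level") in ("danger", "warn"):
            lines.append(f"  {flag['flag']} ({flag['level']}): "
                         f"{flag.get('detail', '')}")
    return "\n".join(lines)
```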

Authoring and Testing a Policy Override

01
Call get_policy_template with the desired preset. Save the template JSON to a file.
02
Edit the file — add rules to cmd_allow, cmd_deny, arg_rules, path_rules, or net_rules. Use _replace: true flags to replace lists entirely rather than appending.
03
Point policy_project in aishell-mcp.json at the new file. The change takes effect on the next tool call — no restart required.
04
Call verify_policy with your test suite. Confirm all_passed: true. If any case fails, review the failed case detail and adjust the rule.
05
Enterprise only: call dump_policy to inspect your operator overlay layers and confirm the new rule appears in the correct layer with the expected values. To verify the decision the rule produces, call evaluate_command with a representative command — it shows the matched rule, layer, and reason directly.
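An override file, then, is a JSON object carrying some of the five rule lists. The shape below is illustrative only: the list names and the _replace flag come from step 02, but the exact rule syntax is whatever the template from step 01 contains, so always start from that file rather than from this sketch.

```json
{
  "cmd_allow": ["example-rule-copied-from-your-template"],
  "path_rules": []
}
```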

08 Plans as an Alternative to Shell Scripting

A shell script is a sequence of commands toward a goal. So is an AIShell-Gate plan. The difference is everything that happens between the commands and the kernel.

A shell script carries implicit trust — whoever wrote it is assumed to have gotten it right. There is no policy layer, no risk assessment, no mandatory confirmation, and no tamper-evident record of what ran. The script and the shell are a single execution path with nothing in between.

When an AI builds a plan and submits it through the MCP, the structure is the same — a sequence of commands toward a stated goal — but the execution path runs through the policy engine first. Every command is evaluated, risk-scored, assigned a confirmation level, and logged before a single byte reaches the kernel. The AI is doing what a script author does. The environment is fundamentally different from what a shell provides.

Where Plans Are Better

A plan submitted through evaluate_plan is inspectable before it runs. The goal is stated, every action is listed, and the policy engine's assessment of each one is available before the operator commits to execution. A shell script is opaque until it runs.

Policy is external and declarable. A shell script embeds its own permissions implicitly — whatever the executing user can do, the script can do. An AIShell-Gate plan runs against a declared policy stack that is versioned, auditable, and independent of the plan itself. The same plan submitted under a stricter policy produces a different outcome without changing a line of the plan.

The audit record is tamper-evident. Shell history is not. In the Enterprise edition, every plan evaluation and execution is HMAC-chained — a verifiable record that cannot be silently modified after the fact.

Confirmation gates are structural. A destructive shell script runs. A plan with a destructive action stops at the appropriate confirmation level and requires explicit operator acknowledgement before proceeding. The gate is not advisory — it is enforced by the executor.

Where Shell Scripts Are Still Better

Shell scripts compose. They pipe the output of one command into the input of the next. They branch on exit codes. They loop. They handle intermediate state inline. A shell script that checks the output of git status before deciding whether to run git push is straightforward to write and straightforward to read.

The plan model is sequential and flat. Commands in a plan are evaluated independently — there are no conditionals, no loops, no pipes, and no shell metacharacters. Each command is a discrete action. For workflows that depend on intermediate output to decide what to do next, the plan model does not replace a shell script. The right approach in those cases is to break the workflow into plan segments, evaluate intermediate results in the calling environment, and submit subsequent plans based on what was returned.
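The segmented approach can be sketched as follows. Here `call_tool` is a stand-in for whatever MCP tool-call mechanism the calling environment provides, and the `stdout` field on the result is an assumption; only executed, exit_code, outcome, and stderr_tail are documented in this manual.

```python
def push_if_clean(call_tool) -> str:
    """Segment a conditional workflow across two plans: one plan
    checks state, the calling environment branches on the result,
    and a second plan acts. `call_tool` and the stdout field are
    hypothetical stand-ins for illustration."""
    status = call_tool("execute_plan", goal="check working tree state",
                       commands=["git status --porcelain"])
    if status.get("exit_code") != 0 or status.get("stdout", "").strip():
        return "aborted: working tree not clean"
    push = call_tool("execute_plan", goal="push committed work",
                     commands=["git push"])
    return "pushed" if push.get("executed") else "blocked"
```

The branching logic lives in the caller, not in the plan; each plan remains flat, declarative, and individually policy-evaluated.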

The practical boundary

Use a plan when the sequence of actions is known in advance, the goal can be stated clearly, and auditability and policy enforcement matter. Use a shell script when the workflow requires branching on intermediate output, inline composition, or logic that depends on runtime state the plan model cannot express. The two are not in competition — they address different parts of the problem.

The Adoption Angle

For teams already using Claude Code to generate shell commands, the MCP path requires no change in how they think about the problem. The AI still proposes a sequence of commands toward a goal. The difference is that the sequence now passes through a policy gate, gets confirmed at the appropriate level, and produces an audit record. The cognitive model is identical. The execution environment is safer.

09 The Preamble Guard

All binary stdout is scanned for the first { character before JSON parsing. Any text that precedes the opening brace is silently discarded. This is called the preamble guard.

Why it exists

Evaluation copies of AIShell-Gate write a banner to stdout before the JSON response — for example, an expiry warning such as "Evaluation copy — expires in 23 days." If this text precedes the JSON object, a naive json.loads(stdout) call will fail with a parse error, making the MCP server appear broken when the binary is working correctly.

The guard was added not because the binaries misbehave, but because future binary versions, custom builds, or deployment scenarios could introduce preamble text in stdout without the MCP developer expecting it. The guard makes the server resilient regardless of what a binary writes before its JSON output.

How it works

# In _strip_preamble() — called before every json.loads() on binary stdout
brace = stdout.find('{')

# No JSON object found at all — return a structured error
if brace == -1:
    return error_response("binary output contained no JSON object")

# Preamble present — log how many bytes were stripped, then discard
if brace > 0:
    log.debug("stripped %d bytes of preamble from stdout", brace)

# Return only the JSON portion
return stdout[brace:]

Where it is applied

The guard runs at four points — every location where the server calls json.loads() on binary stdout.

The get_policy_template tool is not subject to the guard because it already scans for the first line beginning with { by design — the template output has a documented plain-text header block before the JSON body.

What the error looks like

If the guard finds no { in the binary output at all, it returns a structured error response rather than crashing:

{
  "tool": "evaluate_plan",
  "error": true,
  "message": "binary output contained no JSON object — possible eval copy warning or binary error. stderr: ..."
}

This makes the failure diagnosable from within Claude Code without inspecting server logs. The stderr snippet in the message typically reveals whether the issue is a missing binary, a license problem, or an unexpected runtime error.

Design principle

The guard is an example of defensive programming that costs nothing when everything works and makes failures explicit when something goes wrong. A future developer adding a new binary invocation to this server should apply the same pattern — call _strip_preamble() before every json.loads() on binary stdout, and pass the stderr string so the error message is informative.

10 Troubleshooting

Edition shows as "unknown"

The server could not find or run one of the binaries at startup. Check that exec_binary and policy_binary in aishell-mcp.json point to the correct locations, or that both are on PATH. Run each binary with --version directly to confirm they are executable.

All tool calls return "binary not found"

The binaries are not reachable from the working directory the MCP server started in. Use absolute paths in aishell-mcp.json rather than relative paths, or install the binaries to a system PATH location.

evaluate_plan returns "exec produced no output"

The executor ran but wrote nothing to stdout. This usually means a startup security check failed — the binary refuses to run as root, or the binary itself has setuid or setgid bits set. Check the stderr field in the error response for the specific check that failed.

execute_plan is blocked with "confirmation_required"

One or more commands in the plan require confirm level action or typed. These cannot be satisfied through the MCP interface. Options: lower the confirm level for the specific commands in a policy override file, break the plan into smaller pieces that exclude those commands, or run the high-risk commands manually at the terminal.

verify_audit_log reports a broken chain

The audit log has been modified, truncated, or reordered since it was written. If you are using an ephemeral key (no audit_key configured), cross-session verification is not possible by design — each session generates its own key. For reliable cross-session verification, configure a persistent audit_key file before the first session that should be verifiable.

dump_policy returns "not available in standard edition"

The installed binary is the Standard edition. dump_policy and verify_audit_log require the Enterprise edition binary. Contact www.aishellgate.com for Enterprise licensing.

Server does not appear in Claude Code after adding to .mcp.json

Claude Code must be restarted after changes to .mcp.json. Ensure the Python 3 interpreter path is correct and that the script is executable. Run the server manually in a terminal to confirm it starts without error: python3 /path/to/aishell-gate-mcp --version.

Enabling debug logging

Pass --debug to the server to enable verbose stderr logging. In Claude Code, stderr from MCP servers is typically available in the developer console or log output. Debug output includes binary invocations, response sizes, preamble strip events, and edition detection results.

# In .mcp.json — add --debug to args
{
  "mcpServers": {
    "aishell-gate": {
      "command": "python3",
      "args": ["/usr/bin/aishell-gate-mcp", "--debug"]
    }
  }
}

11 Edition Summary

Standard Edition
  • get_version
  • evaluate_plan
  • execute_plan
  • evaluate_command
  • get_policy_template
  • verify_policy
  • All 7 built-in presets
  • Confirmation gates (none / plan / action / typed)
  • Audit logging (JSON Lines, no HMAC chain)
  • Jail-root path enforcement
  • Custom policy file layers
Enterprise Edition
  • All Standard features plus:
  • dump_policy
  • verify_audit_log
  • HMAC-SHA256 tamper-evident audit chain
  • --audit-key (keyed HMAC)
  • Cryptographic session ID correlation

Run get_version to identify the installed edition. Calling an Enterprise tool on a Standard installation returns a structured error response with a clear message and a link to licensing information — not a crash or an unknown error.