
ado-aw audit

ado-aw audit inspects one completed Azure DevOps build at a time. It downloads the three audit artifact families (agent outputs, detection outputs, safe outputs), runs the built-in analyzers (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, and missing-tool / missing-data / noop extraction), and renders a structured console report or the raw AuditData JSON.

ado-aw audit <build-id-or-url> [options]
| Input | Example |
| --- | --- |
| Numeric build ID | 12345 |
| dev.azure.com URL | https://dev.azure.com/my-org/My%20Project/_build/results?buildId=12345 |
| dev.azure.com URL with job/step anchors | ...?buildId=12345&j=<guid>&t=<guid> (accepted; the build-level audit still runs) |
| Legacy visualstudio.com URL | https://my-org.visualstudio.com/proj/_build/results?buildId=12345 |
| On-prem Azure DevOps Server URL | https://onprem.example.com/DefaultCollection/MyProject/_build/results?buildId=12345 |

URL-encoded project segments are decoded automatically. Both t= and s= are accepted as step-anchor parameters.
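The accepted input forms can be illustrated with a small parser. This is a minimal sketch, not the CLI's actual implementation: the function name `parse_build_ref` and the shape of the returned dictionary are assumptions made for illustration, and the sketch does not try to tell organization, collection, and project segments apart.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def parse_build_ref(ref: str) -> dict:
    """Hypothetical helper: split a build ID or build URL into its parts."""
    if ref.isdigit():
        # Bare numeric ID: org and project must come from flags or the git remote.
        return {"build_id": int(ref), "host": None, "segments": [],
                "job": None, "step": None}

    parts = urlsplit(ref)
    query = parse_qs(parts.query)
    # Everything before "/_build" names the org or collection and the project.
    prefix = parts.path.split("/_build", 1)[0]
    segments = [unquote(s) for s in prefix.split("/") if s]
    # The step anchor may arrive as t= or s=; the job anchor arrives as j=.
    step = (query.get("t") or query.get("s") or [None])[0]
    job = (query.get("j") or [None])[0]
    return {
        "build_id": int(query["buildId"][0]),
        "host": parts.netloc,
        "segments": segments,
        "job": job,
        "step": step,
    }
```

For the dev.azure.com example above, the sketch yields `segments == ["my-org", "My Project"]`, with the `%20` decoded.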

| Flag | Default | Behavior |
| --- | --- | --- |
| -o, --output <dir> | ./logs | Directory under which <dir>/build-<id>/ is written. Non-CLI entry points (ado-aw trace and the mcp-author tools) default to the shared ${TEMP}/ado-aw/audit cache root so they do not scatter ./logs/ directories under arbitrary working directories. |
| --json | off | Emit the full AuditData as JSON to stdout. Suppresses the trailing Audit complete stderr line. |
| --org <url> | auto | ADO organization override for bare build IDs. Full build URLs supply this directly. |
| --project <name> | auto | ADO project override for bare build IDs. Full build URLs supply this directly. |
| --pat <token> | env | Personal Access Token. Also reads AZURE_DEVOPS_EXT_PAT. Falls back to the Azure CLI auth chain when omitted. |
| --artifacts <set,...> | all | Restrict download + analysis to a subset. Valid values: agent, detection, safe-outputs (safe_outputs is also accepted). |
| --no-cache | off | Force re-processing even if <dir>/build-<id>/run-summary.json already exists. |

  • Input resolution. Bare IDs use --org / --project or git-remote auto-detection. Full build URLs contribute host, org, and project — those URL-derived values win over CLI flags.
  • Artifact scope. Only agent_outputs*, analyzed_outputs*, and safe_outputs* are fetched. All other published build artifacts are ignored.
  • Artifact refresh. If a local artifact directory already exists, it is renamed aside before re-download and restored if the download fails — no data is lost on a network error.
  • Analyzer failures are soft. The command records a warning, keeps any successfully-derived sections, and still renders the report.
  • Multiple directories. When multiple local directories share one recognized prefix, the lexicographically last match wins.
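
The "lexicographically last match wins" rule can be surprising when directory suffixes are numeric, because the comparison is on strings rather than numbers. The following is a minimal sketch of that selection rule only; `pick_artifact_dirs` and `RECOGNIZED_PREFIXES` are hypothetical names, and the real command may match prefixes differently.

```python
from pathlib import Path

# Prefixes of the three artifact families the audit recognizes.
RECOGNIZED_PREFIXES = ("agent_outputs", "analyzed_outputs", "safe_outputs")


def pick_artifact_dirs(build_dir: Path) -> dict:
    """Hypothetical helper: choose one local directory per recognized prefix."""
    chosen = {}
    for prefix in RECOGNIZED_PREFIXES:
        matches = sorted(
            p.name
            for p in build_dir.iterdir()
            if p.is_dir() and p.name.startswith(prefix)
        )
        if matches:
            # String ordering: "agent_outputs_99" sorts after "agent_outputs_100".
            chosen[prefix] = matches[-1]
    return chosen
```

Under this rule a directory named `agent_outputs_99` is preferred over `agent_outputs_100`, since `"9"` sorts after `"1"`.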
<output>/build-<id>/
├── run-summary.json                    # Cached AuditData, CLI-version-keyed
├── agent_outputs[_<BuildId>]/          # Agent stage artifacts
│   ├── staging/
│   │   ├── safe_outputs.ndjson         # Agent's safe-output proposals
│   │   ├── aw_info.json                # Runtime engine / agent / source metadata
│   │   └── otel.jsonl                  # Copilot OTel (when emitted)
│   └── logs/
│       ├── firewall/                   # AWF Squid proxy logs
│       ├── mcpg/                       # MCP Gateway logs
│       ├── safeoutputs.log             # SafeOutputs HTTP server log
│       └── agent-output.txt            # Filtered agent stdout
├── analyzed_outputs[_<BuildId>]/       # Detection stage artifacts
│   ├── threat-analysis.json            # Aggregate verdict + reasons
│   └── threat-analysis-output.txt
└── safe_outputs[_<BuildId>]/           # SafeOutputs stage artifacts
    └── safe-outputs-executed.ndjson    # Per-item execution log

aw_info.json, otel.jsonl, and safe_outputs.ndjson are searched in staging/ first, then at the artifact top level, so older artifact layouts still audit cleanly.
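
The lookup order described above (staging/ first, then the artifact top level) can be sketched as follows. `find_artifact_file` is a hypothetical helper written for illustration, not a function exposed by ado-aw.

```python
from pathlib import Path
from typing import Optional


def find_artifact_file(artifact_dir: Path, name: str) -> Optional[Path]:
    """Hypothetical helper: locate a file in staging/ first, then at top level."""
    for candidate in (artifact_dir / "staging" / name, artifact_dir / name):
        if candidate.is_file():
            return candidate
    # Neither layout contains the file.
    return None
```

When both locations contain the file, the staging/ copy is returned; an older layout with only a top-level copy still resolves.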

Optional sections are omitted from --json output when empty.

| Key | Source |
| --- | --- |
| overview | ADO build metadata + aw_info.json (engine, model, agent name, source, target). |
| task_domain | Audit heuristics over the run’s prompts and outputs. |
| behavior_fingerprint | Higher-level heuristics over the run’s behavior patterns. |
| agentic_assessments | Higher-level assessments emitted by the analyzers. |
| metrics | OTel JSONL (otel.jsonl) plus audit-time warning/error counts. |
| key_findings | Heuristic rules + analyzer findings (e.g. aggregate-gate rejection). |
| recommendations | Follow-up actions derived from findings. |
| performance_metrics | Derived from metrics, runtime duration, tool usage, and firewall counts. |
| engine_config | Runtime engine configuration from aw_info.json. |
| safe_output_summary | Counts of proposed / executed / rejected / not-processed items. |
| safe_output_execution | Per-item trace joining proposal + detection + execution. |
| rejected_safe_outputs | Rollup of rejections by reason/threat flag. |
| detection_analysis | Contents of threat-analysis.json. |
| mcp_server_health | MCPG logs aggregated per server. |
| mcp_tool_usage | MCPG logs aggregated per (server, tool). |
| mcp_failures | MCPG tool_error / server_error events. |
| jobs | ADO /timeline records filtered to type: Job. |
| firewall_analysis | AWF Squid proxy logs aggregated by domain. |
| policy_analysis | AWF policy artifacts aggregated into allow/deny summaries. |
| missing_tools / missing_data / noops | NDJSON entries from the corresponding SafeOutputs MCP tools. |
| downloaded_files | One entry per file under <output>/build-<id>/. |
| errors / warnings | Run-level error/warning aggregates. |
| tool_usage | High-level tool-usage rollups derived from telemetry. |
| created_items | Successfully executed items with extracted id/url/title. |
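
Because optional sections are omitted when empty, a consumer of the `--json` output should test for key presence rather than index directly. The sketch below assumes the keys in the table are top-level fields of the AuditData object and that the slash-separated rows (`missing_tools / missing_data / noops`, `errors / warnings`) are separate keys; neither assumption is confirmed here, and `present_sections` is a hypothetical name.

```python
import json

# Top-level keys taken from the table above (assumed to be separate fields).
DOCUMENTED_KEYS = [
    "overview", "task_domain", "behavior_fingerprint", "agentic_assessments",
    "metrics", "key_findings", "recommendations", "performance_metrics",
    "engine_config", "safe_output_summary", "safe_output_execution",
    "rejected_safe_outputs", "detection_analysis", "mcp_server_health",
    "mcp_tool_usage", "mcp_failures", "jobs", "firewall_analysis",
    "policy_analysis", "missing_tools", "missing_data", "noops",
    "downloaded_files", "errors", "warnings", "tool_usage", "created_items",
]


def present_sections(audit_json: str) -> dict:
    """Hypothetical helper: report which documented sections were emitted."""
    data = json.loads(audit_json)
    # A missing key means the section was empty and therefore omitted.
    return {key: key in data for key in DOCUMENTED_KEYS}
```

A caller could feed this the stdout of `ado-aw audit <build-id-or-url> --json` and treat any `False` entry as "nothing to report" rather than as an error.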

When threat-analysis.json reports any threat flag, the audit treats the entire SafeOutputs batch as rejected by the aggregate gate and records each proposal with:

  • status: not_processed_due_to_aggregate_gate
  • applies_to_whole_batch: true
  • rejection_reason: the aggregate reasons[] from threat-analysis.json, joined with ;

One severity-high finding is also emitted summarizing the gate decision: which threat flags fired, how many proposals were dropped, and the full aggregate reasons.
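
The aggregate-gate rule can be sketched as below. The layout of threat-analysis.json is not documented in this section, so the sketch assumes a `threats` object mapping flag names to booleans next to the `reasons` list; that shape, the function name `apply_aggregate_gate`, and the exact separator spacing are all assumptions.

```python
def apply_aggregate_gate(proposals, threat_analysis, separator="; "):
    """Hypothetical helper: mark every proposal rejected if any threat flag fired."""
    # ASSUMPTION: flags live in a "threats" mapping of name -> bool.
    fired = [name for name, hit in threat_analysis.get("threats", {}).items() if hit]
    if not fired:
        # No flag fired, so the gate does not apply.
        return []
    reason = separator.join(threat_analysis.get("reasons", []))
    return [
        {
            "proposal": proposal,
            "status": "not_processed_due_to_aggregate_gate",
            "applies_to_whole_batch": True,
            "rejection_reason": reason,
        }
        for proposal in proposals
    ]
```

The point of the sketch is that the gate is all-or-nothing: a single fired flag rejects every proposal in the batch with the same joined reason.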

<output>/build-<id>/run-summary.json is written after each successful run.

| Scenario | Behavior |
| --- | --- |
| Cached ado_aw_version matches current CLI | Report rendered from cache; download/analysis skipped. |
| Cache missing, unparseable, or from a different version | Cache ignored; build reprocessed from scratch. |
| --no-cache passed | Always reprocesses. |

The cache-hit info line is printed only in console mode (not with --json).
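
The cache scenarios above reduce to one decision: use the cached summary only if it parses and its `ado_aw_version` equals the running CLI version. This is a minimal sketch under that reading; `load_cached_summary` and its parameters are hypothetical, and only the `ado_aw_version` field name comes from this page.

```python
import json
from pathlib import Path
from typing import Optional


def load_cached_summary(summary_path: Path, current_version: str,
                        no_cache: bool = False) -> Optional[dict]:
    """Hypothetical helper: return cached AuditData, or None to reprocess."""
    if no_cache:
        # --no-cache always forces reprocessing.
        return None
    try:
        data = json.loads(summary_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing or unparseable cache: ignore it.
        return None
    if not isinstance(data, dict) or data.get("ado_aw_version") != current_version:
        # Cache written by a different CLI version: ignore it.
        return None
    return data
```

A `None` result corresponds to the "reprocessed from scratch" rows of the table.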

  • The initial build-metadata fetch is live ADO only. A 401/403 at this step is fatal.
  • If artifact listing or download returns 401/403 and at least one recognized artifact family exists locally, the audit continues from local cache and records a warning.
  • If artifact listing or download returns 401/403 and no local cache exists, the command emits a structured error pointing at the manual escape hatch:
az pipelines runs artifact download --run-id <id> --path <dir>
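
The 401/403 handling for artifact listing and download can be summarized as a small decision function. This is an illustrative sketch only: `on_artifact_auth_failure`, its return values, and the warning text are hypothetical, while the suggested `az` command is the one quoted above.

```python
MANUAL_DOWNLOAD_HINT = "az pipelines runs artifact download --run-id <id> --path <dir>"


def on_artifact_auth_failure(status: int, local_families: list) -> tuple:
    """Hypothetical helper: decide what to do after a 401/403 on artifacts."""
    if status not in (401, 403):
        raise ValueError("this sketch only covers 401/403 responses")
    if local_families:
        # At least one recognized artifact family exists locally: keep going.
        return ("continue_from_local_cache",
                f"warning: HTTP {status}; using local artifacts")
    # Nothing local to fall back on: surface the manual escape hatch.
    return ("error", MANUAL_DOWNLOAD_HINT)
```

The initial build-metadata fetch is outside this sketch, since a 401/403 at that step is always fatal.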