Pipeline IR

ado-aw no longer compiles pipelines by substituting strings into YAML template files. Every production target builds a typed Azure DevOps pipeline IR, resolves graph-level facts, lowers that IR to serde_yaml::Value, and serializes once with serde_yaml::to_string.

The implementation lives under src/compile/ir/. The canonical 5-job agentic-pipeline shape (Setup → Agent → Detection → SafeOutputs → Teardown) lives in src/compile/agentic_pipeline.rs and is shared by every target. Per-target wrappers handle only the envelope:

  • src/compile/standalone_ir.rs
  • src/compile/onees_ir.rs
  • src/compile/job_ir.rs
  • src/compile/stage_ir.rs

Those wrappers are the only place per-target shape (top-level PipelineShape, template parameters, 1ES extends:) should be assembled. Shared canonical-shape logic belongs in agentic_pipeline.rs. Shared target logic should be typed IR construction helpers, not string fragments.

src/compile/ir/ is split by responsibility:

  • ids.rs — typed StageId, JobId, and StepId newtypes. Constructors validate the ADO identifier grammar (^[A-Za-z_][A-Za-z0-9_]*$), so invalid names are rejected while ado-aw compiles the pipeline rather than when ADO runs it.
  • step.rs — Step and concrete step structs: BashStep, TaskStep, CheckoutStep, DownloadStep, and PublishStep.
  • tasks/ — typed builder structs for 44 built-in ADO task steps spanning toolchain setup, file operations, build/test, artifact publishing, scripting, containers, package auth, Azure integrations, and pipeline control. Each builder exposes new(<required>), typed optional setters, and into_step(). Prefer these over hand-crafted TaskStep::new() calls — see Typed task helpers below.
  • job.rs — Job, Pool, job variables, 1ES templateContext support, and target-job external dependsOn / condition wrapping.
  • stage.rs — Stage plus target-stage external dependsOn / condition wrapping.
  • env.rs — typed environment values (EnvValue) including ADO macros, pipeline variables, secrets, OutputRefs, Coalesce, and macro-form Concat.
  • condition.rs — the Condition / Expr AST and code generation to ADO condition syntax.
  • output.rs — OutputDecl, OutputRef, and the output-reference lowering rules.
  • graph.rs — graph construction, dependsOn derivation, output validation, isOutput=true promotion, and cycle detection.
  • validate pass — there is no separate validate.rs module in the current tree; graph invariants live in graph.rs, shape checks live near the relevant lowering code in lower.rs, and target-specific validation stays in the target builder.
  • lower.rs — converts typed IR to a serde_yaml::Value tree.
  • emit.rs — calls lower::lower() and serde_yaml::to_string() for canonical YAML output.
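
As an illustration of the identifier check that ids.rs is described as performing, here is a minimal sketch. It is not the actual implementation: the InvalidId error type and the as_str accessor are assumptions made for this example, and only the grammar itself comes from the text above.

```rust
/// Hypothetical stand-in for the StepId newtype; the real type may differ.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct StepId(String);

/// Hypothetical error type carrying the rejected name.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidId(pub String);

impl StepId {
    /// Accepts names matching ^[A-Za-z_][A-Za-z0-9_]*$ and rejects everything else.
    pub fn new(name: &str) -> Result<Self, InvalidId> {
        let mut chars = name.chars();
        // First character: ASCII letter or underscore (an empty name fails here).
        let first_ok = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
        // Remaining characters: ASCII letters, digits, or underscore.
        let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
        if first_ok && rest_ok {
            Ok(StepId(name.to_owned()))
        } else {
            Err(InvalidId(name.to_owned()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the only way to obtain a StepId is through the validating constructor, any code that holds one can rely on the name being a legal ADO identifier.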

The root type is Pipeline in src/compile/ir/mod.rs:

pub struct Pipeline {
    pub name: String,
    pub parameters: Vec<Parameter>,
    pub resources: Resources,
    pub triggers: Triggers,
    pub variables: Vec<PipelineVar>,
    pub body: PipelineBody,
    pub shape: PipelineShape,
}

PipelineBody captures whether the emitted document has a top-level jobs: block or a top-level stages: block:

pub enum PipelineBody {
    Jobs(Vec<Job>),
    Stages(Vec<Stage>),
}

PipelineShape captures the wrapping rules that used to be split across template files:

pub enum PipelineShape {
    Standalone,
    OneEs { sdl, top_level_pool, stage_id, stage_display_name },
    JobTemplate { external_params },
    StageTemplate { external_params },
}

Shape is intentionally separate from body. For example, the 1ES target still builds the canonical job graph as PipelineBody::Jobs; the lowering pass wraps those jobs under the 1ES extends.parameters.stages[0].jobs shape.

All generated pipeline steps should use typed variants from src/compile/ir/step.rs:

pub enum Step {
    Bash(BashStep),
    Task(TaskStep),
    Checkout(CheckoutStep),
    Download(DownloadStep),
    Publish(PublishStep),
    RawYaml(String),
}

Use the typed structs whenever the compiler owns the step:

  • Step::Bash for inline bash (BashStep::script is the raw body, not a YAML block).
  • Step::Task for ADO task invocations such as UseNode@1, UsePythonVersion@0, or UseDotNet@2.
  • Step::Checkout for checkout: steps.
  • Step::Download for pipeline-artifact downloads.
  • Step::Publish for pipeline-artifact publishes. Under 1ES, lowering moves publish steps into templateContext.outputs so artifacts are published by the 1ES template machinery exactly once.
  • Step::RawYaml is reserved for user-authored setup/teardown YAML that the IR does not model. Do not use it for compiler-generated steps that need output refs, conditions, env rewriting, or graph-derived dependencies.
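
The 1ES rule for publish steps can be pictured as a partition of a job's steps. The sketch below is only an illustration under simplifying assumptions: the trimmed-down Step enum and the hoist_publish_steps helper are hypothetical, and the real lowering emits the hoisted steps as templateContext.outputs entries rather than returning them.

```rust
/// Hypothetical, trimmed-down step type; the real Step enum has more variants and fields.
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    Bash { script: String },
    Publish { artifact: String, path: String },
}

/// Splits a job's steps into (steps that stay inline, publish steps to hoist).
/// Relative order is preserved within each group.
pub fn hoist_publish_steps(steps: Vec<Step>) -> (Vec<Step>, Vec<Step>) {
    steps
        .into_iter()
        // partition() puts items for which the predicate is true in the first Vec.
        .partition(|step| !matches!(step, Step::Publish { .. }))
}
```

Keeping publishes as typed Step::Publish values is what makes this kind of shape-dependent relocation possible; a publish hidden inside Step::RawYaml could not be found and moved.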

BashStep and TaskStep carry common compiler-owned fields:

  • id: Option<StepId> — emitted as ADO step name:; required when another step consumes an output from this step.
  • display_name: String — emitted as displayName:.
  • env: IndexMap<String, EnvValue> — typed environment values.
  • condition: Option<Condition> — typed ADO condition AST.
  • timeout: Option<Duration> and continue_on_error: bool.
  • outputs: Vec<OutputDecl> on BashStep.

Example:

let synth = Step::Bash(
    BashStep::new("Resolve synthetic PR", script)
        .with_id(StepId::new("synthPr")?)
        .with_output(OutputDecl::new("AW_SYNTHETIC_PR_ID"))
        .with_env("BUILD_REASON", EnvValue::ado_macro("Build.Reason")?),
);

src/compile/ir/tasks/ contains typed builder structs for 44 ADO built-in tasks. Each follows the same pattern:

Builder::new(required_params)   // required inputs as typed positional args
    .optional_setter(value)     // chained typed setters; only emitted when called
    .into_step()                // returns a TaskStep

This eliminates hand-crafted TaskStep::new(…) + raw string inputs at every call site, makes the required/optional boundary explicit, and prevents invalid input combinations for command-dispatch tasks at compile time.
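
To make the pattern concrete, here is a self-contained sketch of a simple builder in the style of UseNode. The simplified TaskStep, the check_latest setter, and the display-name text are assumptions for illustration; only UseNode::new(version), the UseNode@1 identifier, and the into_step() convention come from this page.

```rust
/// Hypothetical, simplified stand-in for the IR's TaskStep.
#[derive(Debug, PartialEq)]
pub struct TaskStep {
    pub task: String,
    pub display_name: String,
    pub inputs: Vec<(String, String)>,
}

/// Sketch of a simple builder: one required input, one optional input.
pub struct UseNode {
    version: String,            // required: positional argument of new()
    check_latest: Option<bool>, // optional: emitted only when set
}

impl UseNode {
    pub fn new(version: &str) -> Self {
        UseNode { version: version.to_owned(), check_latest: None }
    }

    /// Hypothetical optional setter; returns Self so calls can be chained.
    pub fn check_latest(mut self, value: bool) -> Self {
        self.check_latest = Some(value);
        self
    }

    /// Emits the required input always and optional inputs only when they are Some.
    pub fn into_step(self) -> TaskStep {
        let mut inputs = vec![("version".to_owned(), self.version.clone())];
        if let Some(value) = self.check_latest {
            inputs.push(("checkLatest".to_owned(), value.to_string()));
        }
        TaskStep {
            task: "UseNode@1".to_owned(),
            display_name: format!("Use Node.js {}", self.version),
            inputs,
        }
    }
}
```

With this shape, UseNode::new("22.x").into_step() yields a step whose only input is version, while chaining .check_latest(true) adds exactly one more input.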

Command-dispatch tasks (Docker@2, DotNetCoreCLI@2, NuGetCommand@2, Npm@1, PowerShell@2, AzurePowerShell@5, AzureCLI@2, PythonScript@0, VSTest@2, GitHubRelease@1, PublishBuildArtifacts@1, UniversalPackages@1) take a typed command/mode enum rather than a plain string — applying an input to the wrong command variant is unrepresentable.
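
The following sketch shows why a command enum makes wrong-command inputs unrepresentable. It is a reduced illustration, not the contents of docker.rs: the two variants, their fields, and the docker_inputs helper are assumptions chosen for the example.

```rust
/// Hypothetical subset of a command enum for Docker@2; each variant owns its inputs.
pub enum DockerCommand {
    Build { dockerfile: Option<String> },
    Login { container_registry: String },
}

/// Lowers a command to its task inputs. A Login value has no dockerfile field,
/// so there is no way to attach a build-only input to a login command.
pub fn docker_inputs(command: &DockerCommand) -> Vec<(&'static str, String)> {
    match command {
        DockerCommand::Build { dockerfile } => {
            let mut inputs = vec![("command", "build".to_owned())];
            if let Some(path) = dockerfile {
                inputs.push(("Dockerfile", path.clone()));
            }
            inputs
        }
        DockerCommand::Login { container_registry } => vec![
            ("command", "login".to_owned()),
            ("containerRegistry", container_registry.clone()),
        ],
    }
}
```

A string-typed command would accept any combination of inputs and leave the mismatch to be discovered at pipeline run time; the enum moves that check to the Rust compiler.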

| Builder | ADO task | Constructor |
| --- | --- | --- |
| DockerInstaller::new(docker_version) | DockerInstaller@0 | Docker engine version (e.g. "26.1.4") |
| GoTool::new(version) | GoTool@0 | Go version (e.g. "1.21") |
| JavaToolInstaller::new(version_spec, architecture, source) | JavaToolInstaller@0 | JDK version; JdkArchitecture enum; JdkSource enum |
| UseDotNet::new() | UseDotNet@2 | All inputs optional; chain .with_version(s) or .with_global_json() |
| UseNode::new(version) | UseNode@1 | Node.js version spec (e.g. "22.x") |
| UsePythonVersion::new(version_spec) | UsePythonVersion@0 | Python version spec (e.g. "3.x") |
| UseRubyVersion::new(version_spec) | UseRubyVersion@0 | Ruby version range (e.g. ">= 2.4") |
| HelmInstaller::new() | HelmInstaller@1 | All inputs optional; chain .helm_version("3.x") to pin a specific version |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| ArchiveFiles::new(root_folder_or_file, archive_file) | ArchiveFiles@2 | source path; output archive path |
| CopyFiles::new(contents, target_folder) | CopyFiles@2 | glob pattern; destination folder |
| DeleteFiles::new(contents) | DeleteFiles@1 | newline-separated glob patterns |
| DownloadPackage::new(package_type, feed, definition, version, download_path) | DownloadPackage@1 | PackageType enum; feed; definition; version; local path |
| DownloadSecureFile::new(secure_file) | DownloadSecureFile@1 | secure-file name or GUID from ADO Secure Files library |
| ExtractFiles::new(archive_file_patterns, destination_folder) | ExtractFiles@1 | archive glob; destination folder |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| AzureCli::new(azure_subscription, ScriptType, ScriptLocation) | AzureCLI@2 | ARM service connection; ScriptType::Bash \| Ps \| PsCore \| Batch enum; ScriptLocation::Inline(script) \| ScriptPath(path) enum |
| AzurePowerShell::file(connection, script_path) / ::inline(connection, script) | AzurePowerShell@5 | Azure RM service connection; .ps1 path or inline script |
| CmdLine::new(script) | CmdLine@2 | inline script text |
| PowerShell::file(file_path) / ::inline(script) | PowerShell@2 | .ps1 path or inline script body |
| PythonScript::file(script_path) / ::inline(script) | PythonScript@0 | .py path or inline script body |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| DotNetCoreCli::new(DotNetCommand) | DotNetCoreCLI@2 | DotNetCommand::Build \| Test \| Publish \| Restore \| Pack \| Run \| Push \| Custom |
| Gradle::new(gradle_wrapper_file, tasks) | Gradle@3 | path to gradlew; space-separated task names |
| Maven::new(maven_pom_file) | Maven@3 | path to pom.xml |
| PublishCodeCoverageResults::new(summary_file_location) | PublishCodeCoverageResults@2 | glob path to coverage summary file |
| PublishTestResults::new(format, files) | PublishTestResults@2 | TestResultsFormat enum; result files glob |
| VsTest::new(VsTestSelector) | VSTest@2 | VsTestSelector::Assemblies \| Plan \| Run |
| VsBuild::new(solution) | VSBuild@1 | path to .sln or glob; VsVersion, MsBuildArchitecture, LogFileVerbosity enums and all other inputs optional |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| DownloadBuildArtifacts::new(download_path) | DownloadBuildArtifacts@1 | local download directory; BuildType, DownloadType, BuildVersionToDownload enums and all other inputs optional |
| DownloadPipelineArtifact::new(target_path) | DownloadPipelineArtifact@2 | local download path |
| PublishBuildArtifacts::new(path_to_publish, artifact_name, location) | PublishBuildArtifacts@1 | source path; artifact name; PublishLocation enum |
| PublishPipelineArtifact::new(target_path) | PublishPipelineArtifact@1 | path to publish |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| AzureKeyVault::new(connected_service_name, key_vault_name) | AzureKeyVault@2 | Azure RM service connection; vault name |
| AzureWebApp::new(azure_subscription, app_type, app_name, package) | AzureWebApp@1 | Azure RM service connection; AppType enum; app name; package path |
| Docker::new(DockerCommand) | Docker@2 | DockerCommand::BuildAndPush \| Build \| Push \| Login \| Logout |
| GitHubRelease::new(git_hub_connection, repository_name, action) | GitHubRelease@1 | GitHub service connection; "owner/repo"; GitHubReleaseAction enum |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| CargoAuthenticate::new(config_file) | CargoAuthenticate@0 | path to config.toml |
| MavenAuthenticate::new() | MavenAuthenticate@0 | all inputs optional |
| Npm::new(NpmCommand) | Npm@1 | NpmCommand::Install \| Ci \| Publish \| Custom |
| NpmAuthenticate::new(working_file) | npmAuthenticate@0 | path to .npmrc file to authenticate |
| NuGetAuthenticate::new() | NuGetAuthenticate@1 | all inputs optional |
| NuGetCommand::new(NuGetOp) | NuGetCommand@2 | NuGetOp::Restore \| Push \| Pack \| Custom |
| PipAuthenticate::new() | PipAuthenticate@1 | all inputs optional |
| TwineAuthenticate::new() | TwineAuthenticate@1 | all inputs optional |
| UniversalPackages::download(feed, package_name, spec) / ::publish(…, spec) | UniversalPackages@1 | UniversalPackagesDownload / UniversalPackagesPublish carry per-command optionals; .workload_identity_service_connection() for cross-org feeds |

| Builder | ADO task | Constructor |
| --- | --- | --- |
| ManualValidation::new(notify_users) | ManualValidation@1 | comma-separated notification addresses; all approval settings (.approvers(), .instructions(), .on_timeout(OnTimeout::Reject \| Resume)) optional |

use crate::compile::ir::step::Step;
use crate::compile::ir::tasks::{
    dotnet_core_cli::{DotNetBuild, DotNetCoreCli, DotNetTest},
    powershell::PowerShell,
    publish_test_results::{PublishTestResults, TestResultsFormat},
    use_node::UseNode,
};

// Install Node.js: simple builder, one required input
let node = Step::Task(UseNode::new("22.x").into_step());

// Build .NET: command-dispatch with optional inputs on the command variant
let build = Step::Task(
    DotNetCoreCli::build(
        DotNetBuild::new()
            .projects("**/*.csproj")
            .arguments("--configuration Release"),
    )
    .into_step(),
);

// Run tests (separate command variant; cannot accidentally set build-only inputs)
let test = Step::Task(
    DotNetCoreCli::test(DotNetTest::new().projects("**/*Tests.csproj"))
        .into_step(),
);

// Publish test results: required inputs are typed, not raw strings
let publish = Step::Task(
    PublishTestResults::new(TestResultsFormat::VSTest, "**/*.trx").into_step(),
);

// Inline PowerShell with optional pwsh flag
let ps = Step::Task(
    PowerShell::inline("Write-Output 'hello'")
        .pwsh(true)
        .into_step(),
);

When you need an ADO task that doesn’t have a typed builder yet:

  1. Create src/compile/ir/tasks/<snake_name>.rs. For a simple task, follow use_node.rs as the template. For a task with multiple commands or modes, follow docker.rs (canonical command-dispatch template).
  2. Export the new module from src/compile/ir/tasks/mod.rs with pub mod <snake_name>;.
  3. Map required inputs to positional pub fn new(…) parameters; optional inputs become chained setters that return Self.
  4. pub fn into_step(self) -> TaskStep emits only the inputs that are Some.
  5. Add an ADO task reference link in the module doc comment.
  6. Write a unit test asserting the task identifier, display name, and that required inputs are present and optional inputs are absent when unset.

Do not use Step::RawYaml for tasks the IR can model. Typed builders preserve all compiler-owned fields (condition, env, timeout, continue_on_error) and participate correctly in the graph pass.

A producer declares a step output with OutputDecl:

OutputDecl::new("AW_SYNTHETIC_PR_ID")
OutputDecl::secret("MCP_GATEWAY_API_KEY")

A consumer references it with OutputRef:

let r = OutputRef::new(StepId::new("synthPr")?, "AW_SYNTHETIC_PR_ID");
EnvValue::step_output(r)

The consumer does not choose the ADO expression syntax. output.rs::lower_outputref() chooses the correct syntax from the consumer and producer locations:

| Consumer vs. producer | Lowered syntax |
| --- | --- |
| Same job | $(stepName.X) |
| Sibling job in the same stage, or both jobs are stage-less | dependencies.<job>.outputs['stepName.X'] |
| Different stage | stageDependencies.<stage>.<job>.outputs['stepName.X'] |
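
The three rows above amount to a small decision function. The sketch below is an assumption-laden illustration rather than the body of lower_outputref(): the Location struct, the function signature, and the error for mixed staged/stage-less locations are all hypothetical, while the three emitted forms are the ones listed in the table.

```rust
/// Hypothetical location of a step: an optional stage plus the owning job.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Location {
    pub stage: Option<String>,
    pub job: String,
}

/// Chooses the reference form from where the consumer sits relative to the producer.
pub fn lower_output_ref(
    consumer: &Location,
    producer: &Location,
    step: &str,
    var: &str,
) -> Result<String, String> {
    if consumer.stage == producer.stage {
        if consumer.job == producer.job {
            // Same job: plain macro syntax.
            Ok(format!("$({step}.{var})"))
        } else {
            // Sibling job in the same stage, or both jobs stage-less.
            Ok(format!("dependencies.{}.outputs['{step}.{var}']", producer.job))
        }
    } else {
        // Different stage: the producer must live in a named stage.
        match &producer.stage {
            Some(stage) => Ok(format!(
                "stageDependencies.{stage}.{}.outputs['{step}.{var}']",
                producer.job
            )),
            None => Err("cannot mix staged and stage-less jobs".to_owned()),
        }
    }
}
```

The caller supplies only locations and names; it never spells out dependencies or stageDependencies itself, which is the property the IR relies on.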

This rule exists because Azure DevOps output variables are context-sensitive. The historical synthPr failures came from hand-written code using the wrong reference form for the consumer location. The IR centralizes that choice so new compiler code declares what it needs (OutputRef) rather than guessing how ADO will expose it.

graph.rs also sets OutputDecl::auto_is_output = true when any consumer reads the declaration. The producer can then emit ##vso[task.setvariable ...;isOutput=true] only when cross-step visibility is actually needed.
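
As a rough picture of how a producer might turn those flags into a logging command, consider the sketch below. The set_variable_command helper is hypothetical and not taken from the compiler; it only assembles the standard ##vso[task.setvariable ...] syntax from a name, a value, and the two booleans.

```rust
/// Builds an ADO task.setvariable logging command (hypothetical helper).
/// `is_output` would come from OutputDecl::auto_is_output after the graph pass.
pub fn set_variable_command(name: &str, value: &str, is_secret: bool, is_output: bool) -> String {
    let mut props = format!("variable={name}");
    if is_secret {
        props.push_str(";issecret=true");
    }
    if is_output {
        props.push_str(";isOutput=true");
    }
    format!("##vso[task.setvariable {props}]{value}")
}
```

A producer script would echo the returned line to standard output for the agent to pick up.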

graph.rs::resolve() is the all-in-one pass for dependency derivation:

  1. Index every named step and its declared outputs.
  2. Walk every EnvValue::StepOutput, every output nested inside EnvValue::Coalesce / EnvValue::Concat, and every Expr::StepOutput inside conditions.
  3. Validate that each reference names an existing step with a matching OutputDecl.
  4. Lift step-output edges into job-level and stage-level dependencies.
  5. Detect cycles in the derived job and stage graphs.
  6. Merge the derived edges into Job::depends_on and Stage::depends_on while preserving any explicit values a target builder supplied.
  7. Mark producer outputs that need isOutput=true.

Same-job refs do not produce dependsOn entries because ADO orders steps by position. Cross-job refs add Job::depends_on; cross-stage refs add Stage::depends_on. The lowering pass reads those fields and emits canonical dependsOn: blocks.
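
The core of the pass (steps 1 through 5, at job level only) can be sketched in a few dozen lines. This is a simplified model under stated assumptions, not the code in graph.rs: StepDef, JobDef, and this resolve signature are hypothetical, stages are omitted, and merging with explicit depends_on values and the isOutput marking are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, trimmed-down inputs: a step names its outputs and the
/// (producer step, output) pairs it reads through env values or conditions.
pub struct StepDef {
    pub id: &'static str,
    pub outputs: Vec<&'static str>,
    pub reads: Vec<(&'static str, &'static str)>,
}

pub struct JobDef {
    pub id: &'static str,
    pub steps: Vec<StepDef>,
}

/// Returns derived (consumer job, producer job) edges, or an error.
pub fn resolve(jobs: &[JobDef]) -> Result<BTreeSet<(String, String)>, String> {
    // 1. Index every named step with its owning job and declared outputs.
    let mut index: BTreeMap<&str, (&str, &[&str])> = BTreeMap::new();
    for job in jobs {
        for step in &job.steps {
            index.insert(step.id, (job.id, step.outputs.as_slice()));
        }
    }
    let mut edges = BTreeSet::new();
    for job in jobs {
        for step in &job.steps {
            // 2. Walk every reference the step makes.
            for (producer_step, var) in &step.reads {
                // 3. The reference must name a known step and a declared output.
                let Some(&(producer_job, outputs)) = index.get(*producer_step) else {
                    return Err(format!("unknown step '{producer_step}'"));
                };
                if !outputs.contains(var) {
                    return Err(format!("step '{producer_step}' declares no output '{var}'"));
                }
                // 4. Only cross-job references become job-level edges.
                if producer_job != job.id {
                    edges.insert((job.id.to_owned(), producer_job.to_owned()));
                }
            }
        }
    }
    // 5. Reject cyclic job graphs.
    if has_cycle(&edges) {
        return Err("dependency cycle between jobs".to_owned());
    }
    Ok(edges)
}

/// Repeatedly drops edges whose producer depends on nothing; if a round
/// removes no edge while edges remain, the rest form a cycle.
fn has_cycle(edges: &BTreeSet<(String, String)>) -> bool {
    let mut remaining = edges.clone();
    loop {
        let consumers: BTreeSet<String> = remaining.iter().map(|(c, _)| c.clone()).collect();
        let next: BTreeSet<(String, String)> = remaining
            .iter()
            .filter(|(_, p)| consumers.contains(p))
            .cloned()
            .collect();
        if next.is_empty() {
            return false;
        }
        if next.len() == remaining.len() {
            return true;
        }
        remaining = next;
    }
}
```

In this model a Setup job that declares AW_SYNTHETIC_PR_ID and an Agent job that reads it produce the single edge (Agent, Setup), which a lowering pass could then print as Agent's dependsOn entry.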

condition.rs defines a small AST for ADO conditions:

pub enum Condition {
    Succeeded,
    Always,
    Failed,
    SucceededOrFailed,
    And(Vec<Condition>),
    Or(Vec<Condition>),
    Not(Box<Condition>),
    Eq(Expr, Expr),
    Ne(Expr, Expr),
    Custom(String),
}

pub enum Expr {
    Literal(String),
    Variable(String),
    StepOutput(OutputRef),
}

Use constructors such as Condition::and([...]), Condition::or([...]), and Condition::not(...) when composing nested expressions. Codegen flattens nested And / Or nodes and quotes string literals for ADO expression syntax:

Condition::Eq(
    Expr::Variable("Build.Reason".into()),
    Expr::Literal("PullRequest".into()),
)

lowers to:

eq(variables['Build.Reason'], 'PullRequest')

Expr::StepOutput uses the same location-aware output-ref lowering as EnvValue::StepOutput. Condition::Custom is an escape hatch for expressions not yet modeled by the AST; codegen rejects embedded newlines and ADO pipeline-command markers (##vso[, ##[) before emitting it.
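
A reduced model of the code generator helps show the flattening and quoting behaviour. The sketch below covers only a subset of the AST (no Failed, SucceededOrFailed, StepOutput, or Custom lowering), and the helper names flatten, nary, and check_custom are assumptions for this example rather than functions from condition.rs.

```rust
#[derive(Debug, Clone)]
pub enum Expr {
    Literal(String),
    Variable(String),
}

#[derive(Debug, Clone)]
pub enum Condition {
    Succeeded,
    Always,
    And(Vec<Condition>),
    Or(Vec<Condition>),
    Not(Box<Condition>),
    Eq(Expr, Expr),
    Ne(Expr, Expr),
}

fn expr(e: &Expr) -> String {
    match e {
        // ADO expression strings are single-quoted; an embedded quote is doubled.
        Expr::Literal(s) => format!("'{}'", s.replace('\'', "''")),
        Expr::Variable(name) => format!("variables['{name}']"),
    }
}

/// Collects operands, splicing nested nodes of the same kind into the parent.
fn flatten<'a>(items: &'a [Condition], is_and: bool, out: &mut Vec<&'a Condition>) {
    for item in items {
        match (item, is_and) {
            (Condition::And(inner), true) | (Condition::Or(inner), false) => {
                flatten(inner, is_and, out)
            }
            _ => out.push(item),
        }
    }
}

fn nary(name: &str, items: &[Condition], is_and: bool) -> String {
    let mut flat = Vec::new();
    flatten(items, is_and, &mut flat);
    let args: Vec<String> = flat.into_iter().map(codegen).collect();
    format!("{name}({})", args.join(", "))
}

pub fn codegen(c: &Condition) -> String {
    match c {
        Condition::Succeeded => "succeeded()".to_owned(),
        Condition::Always => "always()".to_owned(),
        Condition::And(items) => nary("and", items, true),
        Condition::Or(items) => nary("or", items, false),
        Condition::Not(inner) => format!("not({})", codegen(inner)),
        Condition::Eq(a, b) => format!("eq({}, {})", expr(a), expr(b)),
        Condition::Ne(a, b) => format!("ne({}, {})", expr(a), expr(b)),
    }
}

/// Guard for raw custom expressions, mirroring the rejection rules described above.
pub fn check_custom(raw: &str) -> Result<&str, String> {
    if raw.contains('\n') || raw.contains('\r') {
        return Err("custom condition contains a newline".to_owned());
    }
    if raw.contains("##vso[") || raw.contains("##[") {
        return Err("custom condition contains a pipeline-command marker".to_owned());
    }
    Ok(raw)
}
```

In this model an And nested directly inside another And collapses into a single and(...) call, while an Or nested inside an And is kept as its own or(...) argument.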

The extension trait lives in src/compile/extensions/mod.rs and now has exactly three surface methods:

pub trait CompilerExtension {
    fn name(&self) -> &str;
    fn phase(&self) -> ExtensionPhase;
    fn declarations(&self, ctx: &CompileContext) -> Result<Declarations>;
}

Declarations is the typed aggregate for every signal an extension contributes:

  • agent_prepare_steps: Vec<Step>
  • setup_steps: Vec<Step>
  • agent_finalize_steps: Vec<Step>
  • detection_prepare_steps: Vec<Step>
  • safe_outputs_steps: Vec<Step>
  • network_hosts: Vec<String>
  • bash_commands: Vec<String>
  • prompt_supplement: Option<String>
  • mcpg_servers: Vec<(String, McpgServerConfig)>
  • copilot_allow_tools: Vec<String>
  • pipeline_env: Vec<PipelineEnvMapping>
  • awf_mounts: Vec<AwfMount>
  • awf_path_prepends: Vec<String>
  • agent_env_vars: Vec<(String, String)>
  • warnings: Vec<String>

Extension phases are System, Runtime, and Tool. The compiler sorts extensions by phase before merging declarations, so internal system plumbing lands first, runtime installs land before user tools, and tool extensions can assume requested runtimes are available.
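
The phase ordering can be illustrated with a derived Ord and a stable sort. This is a sketch under assumptions: the merge_order helper and the extension names used with it are hypothetical, and whether the real compiler preserves registration order within a phase is not stated here.

```rust
/// Variant order doubles as merge order: System < Runtime < Tool.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ExtensionPhase {
    System,
    Runtime,
    Tool,
}

/// Returns extension names in the order their declarations would be merged.
/// sort_by_key is stable, so extensions in the same phase keep their input order.
pub fn merge_order<'a>(extensions: &[(&'a str, ExtensionPhase)]) -> Vec<&'a str> {
    let mut sorted = extensions.to_vec();
    sorted.sort_by_key(|(_, phase)| *phase);
    sorted.into_iter().map(|(name, _)| name).collect()
}
```

Given a tool extension registered first, two runtime extensions, and one system extension, the system extension comes out first and the tool extension last, regardless of registration order.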

Always-on extensions are collected in collect_extensions() before user-configured runtimes/tools:

  • AdoAwMarkerExtension
  • GitHubExtension
  • SafeOutputsExtension
  • AdoScriptExtension
  • ExecContextExtension
  • AzureCliExtension

lower.rs::lower() builds and validates a Graph, then converts the typed Pipeline into a serde_yaml::Value tree. The lowerer owns ADO wire shapes and canonical ordering: top-level identity and configuration keys first, then jobs: / stages:, with target-specific wrapping based on PipelineShape.

emit.rs::emit() is intentionally thin:

pub fn emit(pipeline: &Pipeline) -> Result<String> {
    let value = super::lower::lower(pipeline)?;
    serde_yaml::to_string(&value)
}

This gives all targets one serialization path and one canonical YAML style. Target compilers should return a complete typed Pipeline; they should not format YAML directly.

The production target wrappers are:

  • standalone_ir.rs — wraps the canonical shape in a top-level standalone pipeline.
  • onees_ir.rs — wraps the same canonical shape with PipelineShape::OneEs, causing the lowerer to emit the 1ES extends: wrapper and templateContext outputs.
  • job_ir.rs — wraps the canonical shape as a target-job template with external dependsOn / condition template parameters.
  • stage_ir.rs — wraps the canonical shape as a target-stage template with the stage-level external-parameter wrapper.

The canonical 5-job Setup → Agent → Detection → SafeOutputs → Teardown shape itself lives in agentic_pipeline.rs and is reused unchanged by every wrapper above; extensions plug into it via Declarations (steps, env, hosts, MCPG entries, and Agent-job condition clauses — see Declarations::agent_conditions).

When adding a target, follow the same pattern: parse and validate front matter, collect extension Declarations, build typed jobs/stages/steps, set the correct PipelineShape, and call the shared emit path.

The internal IR types (Pipeline, Job, Step, Graph, …) are intentionally tied to the compiler’s lowering needs and are not public API. src/compile/ir/summary.rs defines a parallel summary tree with #[derive(Serialize)] that provides agent-facing tooling with a stable JSON view of a compiled pipeline.

This is the schema consumed by:

  • ado-aw inspect <source> --json — returns a full PipelineSummary
  • ado-aw graph dump <source> --format json — returns a GraphSummary subset
  • ado-aw graph deps / ado-aw graph outputs — focused graph queries
  • ado-aw whatif — static downstream skip analysis built on graph reachability
  • ado-aw audit --json — the pipeline_graph field in AuditData
  • The author MCP server tools (inspect_workflow, graph_summary, graph_dump)

PipelineSummary::schema_version (currently 1) is the public schema version. It is bumped when the JSON shape changes in a backwards-incompatible way — renamed field, removed variant, or changed semantics. Additive changes (new optional fields) do not require a bump. New enum variants do require a bump because the serialized enums have no catch-all Unknown variants.

Internal IR types may change freely without bumping the summary version, as long as the summary.rs lowering keeps the existing field set populated correctly.

{
  "schema_version": 1,
  "name": "<pipeline name>",   // ADO build-number format string
  "shape": "standalone" | "1es" | "job-template" | "stage-template",
  "body": { "kind": "jobs", "jobs": [...] },
       // OR
          { "kind": "stages", "stages": [...] },
  "graph": { ... }             // see GraphSummary below
}

The body discriminant (kind) mirrors PipelineBody: flat pipelines (standalone, job-template) use "jobs"; stage-wrapped pipelines (1es, stage-template) use "stages".

Each entry in body.jobs (or inside a stage’s jobs array):

| Field | Type | Description |
| --- | --- | --- |
| id | string | ADO job identifier (job: in YAML) |
| stage | string or null | Stage id this job belongs to; null for flat (non-stage) pipelines |
| display_name | string | Human-readable displayName: |
| depends_on | string[] | dependsOn: entries, both explicitly declared and graph-derived |
| condition | string or null | Lowered ADO condition expression, e.g. "succeeded()" |
| pool | object | Pool summary (see PoolSummary below) |
| steps | object[] | Ordered list of step summaries (see StepSummary below) |

PoolSummary has one of two shapes depending on pool type:

// Microsoft-hosted
{ "kind": "vm_image", "image": "ubuntu-22.04" }
// Self-hosted or 1ES
{ "kind": "named", "name": "MyPool", "image": null, "os": "linux" }

Each entry in job.steps:

| Field | Type | Description |
| --- | --- | --- |
| id | string or null | ADO step name: (present when other steps read this step’s outputs) |
| kind | string | Step variant: "bash", "task", "checkout", "download", "publish", "raw_yaml" |
| display_name | string or null | Human-readable displayName: |
| task | string or null | For task steps only: ADO task identifier, e.g. "UseNode@1" |
| condition | string or null | Lowered ADO condition expression |
| outputs | object[] | Output variables declared by this step |
| env_refs | object[] | Other steps’ outputs read via this step’s env: block |
| condition_refs | object[] | Other steps’ outputs read via this step’s condition: |

Each entry in outputs:

{
  "name": "AW_SYNTHETIC_PR_ID",  // ##vso[task.setvariable variable=...] name
  "is_secret": false,            // true → value is masked in logs
  "auto_is_output": true         // true → at least one cross-step consumer → emit isOutput=true
}

Each entry in env_refs / condition_refs:

{ "step": "synthPr", "name": "AW_SYNTHETIC_PR_ID" }

The graph field is a JSON-friendly view of the typed Graph built during lowering:

| Field | Type | Description |
| --- | --- | --- |
| step_locations | object[] | Every named step with its job/stage location and declared outputs |
| job_edges | object[] | Derived job-level dependsOn edges |
| stage_edges | object[] | Derived stage-level dependsOn edges |
| outputs_needing_is_output | object[] | Producer steps whose outputs are read cross-step (need isOutput=true) |

Each job_edges / stage_edges entry:

{ "consumer": "Agent", "producer": "Setup" }
// means: Agent dependsOn Setup

Each step_locations entry:

{ "step": "synthPr", "stage": null, "job": "Setup", "outputs": ["AW_SYNTHETIC_PR_ID"] }

Each outputs_needing_is_output entry:

{ "step": "synthPr", "outputs": ["AW_SYNTHETIC_PR_ID"] }

Running ado-aw inspect examples/sample-agent.md --json for a simple PR-triggered pipeline returns a summary like:

{
  "schema_version": 1,
  "name": "sample-agent",
  "shape": "standalone",
  "body": {
    "kind": "jobs",
    "jobs": [
      { "id": "Setup", "stage": null, "depends_on": [], ... },
      { "id": "Agent", "stage": null, "depends_on": ["Setup"], ... },
      { "id": "Detection", "stage": null, "depends_on": ["Agent"], ... },
      { "id": "SafeOutputs", "stage": null, "depends_on": ["Detection"], ... },
      { "id": "Teardown", "stage": null, "depends_on": ["SafeOutputs"], ... }
    ]
  },
  "graph": {
    "job_edges": [
      { "consumer": "Agent", "producer": "Setup" },
      { "consumer": "Detection", "producer": "Agent" },
      { "consumer": "SafeOutputs", "producer": "Detection" },
      { "consumer": "Teardown", "producer": "SafeOutputs" }
    ],
    ...
  }
}

The five jobs reflect the canonical Setup → Agent → Detection → SafeOutputs → Teardown shape shared by every compiled pipeline.
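
A tool consuming job_edges can answer "what sits downstream of this job?" with a plain reachability walk. The sketch below is a hypothetical consumer, not the ado-aw whatif implementation: it takes edges as (consumer, producer) pairs mirroring the JSON above and ignores job conditions, which a real skip analysis would presumably need to consider.

```rust
use std::collections::BTreeSet;

/// Returns every job that transitively depends on `start`.
/// Each edge is (consumer, producer), meaning: consumer dependsOn producer.
pub fn downstream_of(edges: &[(&str, &str)], start: &str) -> BTreeSet<String> {
    let mut affected = BTreeSet::new();
    let mut frontier = vec![start.to_string()];
    while let Some(job) = frontier.pop() {
        for (consumer, producer) in edges {
            // insert() returns false for jobs already seen, so each job is expanded once.
            if *producer == job && affected.insert(consumer.to_string()) {
                frontier.push(consumer.to_string());
            }
        }
    }
    affected
}
```

With the four edges from the sample output, the jobs downstream of Agent are Detection, SafeOutputs, and Teardown, and nothing is downstream of Teardown.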