Compilation Process
This guide documents the internal compilation process that transforms markdown workflow files into executable GitHub Actions YAML. Understanding this process helps when debugging workflows, optimizing performance, or contributing to the project.
Overview
The gh aw compile command transforms a markdown file with YAML frontmatter into a complete GitHub Actions workflow with multiple orchestrated jobs. This process involves:
- Parsing - Extract frontmatter and markdown content
- Validation - Verify configuration against JSON schemas
- Job Building - Create specialized jobs for different workflow stages
- Dependency Management - Establish job execution order
- YAML Generation - Output final .lock.yml file
Compilation Phases
Phase 1: Parsing and Validation
The compilation process reads the markdown file and:
- Extracts YAML frontmatter
- Parses workflow configuration
- Validates against the workflow schema
- Resolves imports from the imports: field
- Validates expression safety (only allowed GitHub Actions expressions)
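The first parsing step, separating frontmatter from markdown content, can be illustrated with a minimal sketch. It is not the gh aw parser: it assumes the frontmatter is delimited by lines containing only `---`, and the function name `split_frontmatter` and the sample workflow are hypothetical.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a markdown document into (frontmatter, body).

    Assumption: frontmatter is delimited by lines containing only '---'.
    Returns an empty frontmatter string when no opening delimiter is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return frontmatter, body
    raise ValueError("frontmatter opened with '---' but never closed")


# Hypothetical workflow file used only for illustration.
example = """---
on: issues
engine: copilot
---
# Triage

Label the new issue.
"""

frontmatter, body = split_frontmatter(example)
print(frontmatter)
print(body)
```

In a real compiler the frontmatter string would then be handed to a YAML parser and validated against the schema; that part is left out here.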
Phase 2: Job Construction
The compilation process builds multiple specialized jobs:
- Pre-activation job (if needed)
- Activation job
- Main agent job
- Safe output jobs
- Safe-jobs
- Custom jobs
Phase 3: Dependency Resolution
The compilation process validates and orders jobs:
- Checks all job dependencies exist
- Detects circular dependencies
- Computes topological execution order
- Generates Mermaid dependency graph
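The first three checks above (missing dependencies, cycles, topological order) can be sketched with Python's standard `graphlib` module. This is an illustration of the technique rather than the compiler's own code; the function name `order_jobs` and the job map are assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def order_jobs(needs: dict[str, list[str]]) -> list[str]:
    """Return job names in a valid execution order.

    `needs` maps each job to the jobs it depends on, mirroring `needs:`.
    Raises ValueError for unknown dependencies or circular dependencies.
    """
    for job, deps in needs.items():
        for dep in deps:
            if dep not in needs:
                raise ValueError(f"job {job!r} depends on unknown job {dep!r}")
    try:
        return list(TopologicalSorter(needs).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err


# Hypothetical job map used only for illustration.
jobs = {
    "activation": [],
    "agent": ["activation"],
    "create_issue": ["agent"],
}
order = order_jobs(jobs)
print(order)
```

Checking for unknown dependencies before sorting keeps the two error cases distinct, which makes the resulting message easier to act on.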
Phase 4: Action Pinning
All GitHub Actions are pinned to commit SHAs for security:
- Check action cache for cached resolution
- Try dynamic resolution via GitHub API
- Fall back to embedded action pins data
- Add version comment (e.g., actions/checkout@sha # v4)
Phase 5: YAML Generation
The compilation process assembles the final workflow:
- Renders workflow header with metadata comments
- Includes job dependency Mermaid graph
- Generates jobs in alphabetical order
- Embeds original prompt as comment
- Writes .lock.yml file
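As a rough illustration of how a dependency graph could be rendered for the header, the sketch below turns a job map into Mermaid `graph LR` text, listing jobs alphabetically. The function name `render_mermaid` and the exact layout are assumptions; the real header format may differ.

```python
def render_mermaid(needs: dict[str, list[str]]) -> str:
    """Render a job dependency map as Mermaid 'graph LR' source text."""
    lines = ["graph LR"]
    # Declare nodes first, in alphabetical order for stable output.
    for job in sorted(needs):
        lines.append(f'  {job}["{job}"]')
    # Then emit one edge per dependency: dependency --> dependent job.
    for job in sorted(needs):
        for dep in sorted(needs[job]):
            lines.append(f"  {dep} --> {job}")
    return "\n".join(lines)


# Hypothetical two-job workflow used only for illustration.
graph = render_mermaid({"agent": ["activation"], "activation": []})
print(graph)
```

Sorting both nodes and edges keeps the output deterministic, so recompiling an unchanged workflow would not produce a spurious diff in the generated file.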
Job Types
The compilation process generates specialized jobs based on workflow configuration:
| Job | Trigger | Purpose | Key Dependencies |
|---|---|---|---|
| pre_activation | Role checks, stop-after deadlines, skip-if-match, or command triggers | Validates permissions, deadlines, and conditions before AI execution | None (runs first) |
| activation | Always | Prepares workflow context, sanitizes event text, validates lock file freshness | pre_activation (if exists) |
| agent | Always | Core job that executes AI agent with configured engine, tools, and MCP servers | activation |
| detection | safe-outputs.threat-detection: configured | Scans agent output for security threats before processing | agent |
| Safe output jobs | Corresponding safe-outputs.*: configured | Process agent output to perform GitHub API operations (create issues/PRs, add comments, upload assets, etc.) | agent, detection (if exists) |
| conclusion | When safe outputs are configured | Aggregates results and generates workflow summary | All safe output jobs |
Agent Job Steps
The agent job orchestrates AI execution through these phases:
- Repository checkout and runtime setup (Node.js, Python, Go)
- Cache restoration for persistent memory
- MCP server container initialization
- Prompt generation from markdown content
- Engine execution (Copilot, Claude, or Codex)
- Output upload as GitHub Actions artifact
- Cache persistence for next run
Environment variables include GH_AW_PROMPT (prompt file), GH_AW_SAFE_OUTPUTS (output JSON), and GITHUB_TOKEN.
Safe Output Jobs
Each safe output type (create issue, add comment, create PR, etc.) follows a consistent pattern: download agent artifact, parse JSON output, execute GitHub API operations with appropriate permissions, and link to related items.
Common safe output jobs:
- create_issue / create_discussion - Create GitHub items with labels and prefixes
- add_comment - Comment on issues/PRs with links to created items
- create_pull_request - Apply git patches, create branch, open PR
- create_pr_review_comment - Add line-specific code review comments
- create_code_scanning_alert - Submit SARIF security findings
- add_labels / assign_milestone - Manage issue metadata
- update_issue / update_release - Modify existing items
- push_to_pr_branch / upload_assets - Handle file operations
- update_project - Sync with project boards
- missing_tool / noop - Report issues or log status
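The "parse JSON output" step of that pattern can be sketched as follows. The JSON layout (an `items` list whose entries carry a `type` field) is an assumption made for illustration, as is the helper name `select_items`; consult a real agent_output.json for the actual schema.

```python
import json

# Hypothetical agent output; the real agent_output.json schema may differ.
raw = json.dumps({
    "items": [
        {"type": "create_issue", "title": "Flaky test in CI"},
        {"type": "add_comment", "body": "Filed a tracking issue."},
        {"type": "create_issue", "title": "Update README"},
    ]
})


def select_items(agent_output: str, output_type: str) -> list[dict]:
    """Return only the entries a given safe output job is responsible for."""
    data = json.loads(agent_output)
    items = data.get("items", [])
    return [item for item in items if item.get("type") == output_type]


issues = select_items(raw, "create_issue")
titles = [item["title"] for item in issues]
print(titles)
```

Each safe output job would filter for its own type in this way and ignore the rest, which is what lets independent jobs process the same artifact concurrently.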
Custom Jobs
Use safe-outputs.jobs: for custom jobs with full GitHub Actions syntax, or jobs: for additional workflow jobs with user-defined dependencies.
Job Dependency Graphs
Jobs execute in topological order based on dependencies. Here’s a comprehensive example:
graph LR
  pre_activation["pre_activation"]
  activation["activation"]
  agent["agent"]
  detection["detection"]
  create_issue["create_issue"]
  add_comment["add_comment"]
  conclusion["conclusion"]
  pre_activation --> activation
  activation --> agent
  agent --> detection
  agent --> create_issue
  agent --> add_comment
  detection --> create_issue
  detection --> add_comment
  create_issue --> add_comment
  create_issue --> conclusion
  add_comment --> conclusion
Execution flow: Pre-activation validates permissions → Activation prepares context → Agent executes AI → Detection scans output → Safe outputs run in parallel → Add comment waits for created items → Conclusion summarizes results.
Safe output jobs without cross-dependencies run concurrently for performance. When threat detection is enabled, safe outputs depend on both agent and detection jobs.
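To make the concurrency rule concrete, the sketch below groups jobs into "waves" that could start together. The dependency map transcribes the example graph and adds one extra add_labels job (an assumption, added only so that two safe output jobs have no cross-dependency). The function name `execution_waves` is hypothetical, and GitHub Actions schedules jobs itself; this only models the ordering.

```python
from graphlib import TopologicalSorter

# job -> jobs it waits for. "add_labels" is an illustrative addition.
needs = {
    "pre_activation": [],
    "activation": ["pre_activation"],
    "agent": ["activation"],
    "detection": ["agent"],
    "create_issue": ["agent", "detection"],
    "add_labels": ["agent", "detection"],
    "add_comment": ["agent", "detection", "create_issue"],
    "conclusion": ["create_issue", "add_labels", "add_comment"],
}


def execution_waves(needs: dict[str, list[str]]) -> list[list[str]]:
    """Group jobs into waves; jobs in the same wave can run concurrently."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unblock successors
    return waves


waves = execution_waves(needs)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Here create_issue and add_labels land in the same wave, while add_comment waits one wave longer because it depends on create_issue.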
Action Pinning
All GitHub Actions are pinned to commit SHAs (e.g., actions/checkout@b4ffde6...11 # v4) to prevent supply chain attacks and ensure reproducibility. A tag can be moved to point at a malicious commit, but a commit SHA always refers to the same content.
Resolution process: Check cache (.github/aw/actions-lock.json) → Query GitHub API for latest SHA → Fall back to embedded pins → Cache result for future compilations. Dynamic resolution fetches current SHAs for tag references and stores them with timestamps.
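The fallback chain can be sketched as below. Everything here is illustrative: `resolve_pin`, `EMBEDDED_PINS`, and the `fetch_sha` callback are hypothetical names, the SHA is a placeholder, and the on-disk format of .github/aw/actions-lock.json (including timestamps) is not modeled.

```python
from typing import Callable, Optional

# Hypothetical stand-in for embedded pin data; the SHA is a placeholder.
EMBEDDED_PINS = {"actions/checkout@v4": "0" * 40}


def resolve_pin(
    ref: str,
    cache: dict[str, str],
    fetch_sha: Callable[[str], Optional[str]],
) -> str:
    """Resolve 'owner/repo@tag' to 'owner/repo@<sha> # tag'."""
    repo, _, tag = ref.partition("@")
    sha = cache.get(ref)              # 1. cached resolution
    if sha is None:
        sha = fetch_sha(ref)          # 2. dynamic resolution (API lookup)
    if sha is None:
        sha = EMBEDDED_PINS.get(ref)  # 3. embedded fallback
    if sha is None:
        raise LookupError(f"no pin available for {ref}")
    cache[ref] = sha                  # 4. remember for the next compilation
    return f"{repo}@{sha} # {tag}"


cache: dict[str, str] = {}
# Simulate an unavailable API by returning None from the lookup.
pinned = resolve_pin("actions/checkout@v4", cache, fetch_sha=lambda ref: None)
print(pinned)
```

Passing the API lookup in as a callback keeps the sketch runnable offline and shows where a network failure would fall through to the embedded data.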
Artifacts Created
Workflows generate several artifacts during execution:
| Artifact | Location | Purpose | Lifecycle |
|---|---|---|---|
| agent_output.json | /tmp/gh-aw/safeoutputs/ | AI agent output with structured safe output data (create_issue, add_comment, etc.) | Uploaded by agent job, downloaded by safe output jobs, auto-deleted after 90 days |
| prompt.txt | /tmp/gh-aw/aw-prompts/ | Generated prompt sent to AI agent (includes markdown instructions, imports, context variables) | Retained for debugging and reproduction |
| firewall-logs/ | /tmp/gh-aw/firewall-logs/ | Network access logs in Squid format (when network.firewall: enabled) | Analyzed by gh aw logs command |
| cache-memory/ | /tmp/gh-aw/cache-memory/ | Persistent agent memory across runs (when tools.cache-memory: configured) | Restored at start, saved at end via GitHub Actions cache |
| patches/, sarif/, metadata/ | Various | Safe output data (git patches, SARIF files, metadata JSON) | Temporary, cleaned after processing |
MCP Server Integration
Model Context Protocol (MCP) servers provide tools to AI agents. Compilation generates mcp-config.json from workflow configuration.
Local MCP servers run in Docker containers with auto-generated Dockerfiles. Secrets are injected via environment variables, and engines connect via stdio.
HTTP MCP servers require no containers. Engines connect directly with configured headers and authentication.
Tool filtering via allowed: restricts agent access to specific MCP tools. Environment variables are injected through Dockerfiles (local) or config references (HTTP).
Agent job integration: MCP containers start after runtime setup → Engine executes with tool access → Containers stop after completion.
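The effect of an allowed: list can be pictured as a simple filter over the tools a server exposes. This is a conceptual sketch only: the function name, the tool names, and the choice to treat a missing list as "no restriction" are all assumptions, not documented gh aw behavior.

```python
from typing import Optional


def filter_tools(available: list[str], allowed: Optional[list[str]]) -> list[str]:
    """Keep only tools named in `allowed`.

    Assumption: None (no list configured) means no restriction.
    """
    if allowed is None:
        return list(available)
    allowed_set = set(allowed)
    return [tool for tool in available if tool in allowed_set]


# Hypothetical tool names used only for illustration.
server_tools = ["get_issue", "create_issue", "delete_repository"]
exposed = filter_tools(server_tools, allowed=["get_issue", "create_issue"])
print(exposed)
```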
Pre-Activation Job
Pre-activation enforces security and operational policies before expensive AI execution. It validates permissions, deadlines, and conditions, setting activated=false to skip downstream jobs when checks fail.
Validation types:
- Role checks (roles:): Verify actor has required permissions (admin, maintainer, write)
- Stop-after (on.stop-after:): Honor time-limited workflows (e.g., +30d, 2024-12-31)
- Skip-if-match (skip-if-match:): Prevent duplicates by searching for existing items matching criteria
- Command position (on.command:): Ensure command appears in first 3 lines to avoid accidental triggers
Pre-activation runs checks sequentially. Any failure sets activated=false, preventing AI execution and saving costs.
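A minimal sketch of this gating logic, covering only the role and command-position checks, might look like the following. The helper names, the `/command` prefix convention, and the shape of the result dictionary are assumptions for illustration; the generated job implements these checks as workflow steps, not as Python.

```python
from typing import Callable

ALLOWED_ROLES = {"admin", "maintainer", "write"}


def command_in_first_lines(text: str, command: str, max_lines: int = 3) -> bool:
    """True when '/command' appears within the first `max_lines` lines.

    Assumption: commands are written with a leading slash.
    """
    head = text.splitlines()[:max_lines]
    return any(f"/{command}" in line for line in head)


def pre_activation(checks: list[tuple[str, Callable[[], bool]]]) -> dict:
    """Run checks in order; any failure sets activated to False."""
    failed = [name for name, check in checks if not check()]
    return {"activated": not failed, "failed": failed}


# The command appears on line 5, so the position check should fail.
comment = "Thanks for the report.\n\n\n\n/triage please"
result = pre_activation([
    ("roles", lambda: "write" in ALLOWED_ROLES),
    ("command_position", lambda: command_in_first_lines(comment, "triage")),
])
print(result)
```

Because the command sits below the third line, the result reports activated as False, which is the signal downstream jobs would use to skip execution.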
Compilation Commands
| Command | Description |
|---|---|
| gh aw compile | Compile all workflows in .github/workflows/ |
| gh aw compile my-workflow | Compile specific workflow |
| gh aw compile --verbose | Enable verbose output |
| gh aw compile --strict | Enhanced security validation |
| gh aw compile --no-emit | Validate without generating files |
| gh aw compile --actionlint --zizmor --poutine | Run security scanners |
| gh aw compile --purge | Remove orphaned .lock.yml files |
| gh aw compile --output /path/to/output | Custom output directory |
Debugging Compilation
Enable verbose logging: DEBUG=workflow:* gh aw compile my-workflow --verbose shows job creation, action pin resolutions, tool configurations, and MCP setups.
Inspect .lock.yml files: Check header comments (imports, dependencies, prompt), job dependency graphs (Mermaid diagrams), job structure (steps, environment, permissions), action SHA pinning, and MCP configurations.
Common issues:
- Circular dependencies: Review needs: clauses in custom jobs
- Missing action pin: Add to action_pins.json or enable dynamic resolution
- Invalid MCP config: Verify command, args, and env syntax
Performance Optimization
Compilation speed: Simple workflows compile in ~100ms, complex workflows with imports in ~500ms, and workflows with dynamic action resolution in ~2s. Optimize by using action cache (.github/aw/actions-lock.json), minimizing import depth, and pre-compiling shared workflows.
Runtime performance: Safe output jobs without dependencies run in parallel. Enable cache: for dependencies, use cache-memory: for persistent agent memory, and cache action resolutions for faster compilation.
Advanced Topics
Custom engine integration: Create engines that return GitHub Actions steps, provide environment variables, and configure tool access. Register with the framework for workflow availability.
Schema extension: Add frontmatter fields by updating the workflow schema, rebuilding (make build), adding parser handling, and updating documentation.
Workflow manifest resolution: Compilation tracks imported files in lock file headers for dependency tracking, update detection, and audit trails.
Best Practices
Security: Always use action pinning (never floating tags), enable threat detection (safe-outputs.threat-detection:), limit tool access with allowed:, review generated .lock.yml files, and run security scanners (--actionlint --zizmor --poutine).
Maintainability: Use imports for shared configuration, document complex workflows with description:, compile frequently during development, and keep lock files and action pins (.github/aw/actions-lock.json) under version control.
Performance: Enable caching (cache: and cache-memory:), minimize imports to essentials, restrict tool configurations with short allowed: lists, and use safe-jobs for custom logic.
Debugging: Enable verbose logging (--verbose), check job dependency graphs in headers, inspect artifacts and firewall logs (gh aw logs), and validate without file generation (--no-emit).
Related Documentation
- Frontmatter Reference - All configuration options
- Tools Reference - Tool configuration guide
- Safe Outputs Reference - Output processing
- Engines Reference - AI engine configuration
- Network Reference - Network permissions