Anthropic Official Claude AI Certification

Claude Certified Architect
Foundations Certification Exam Guide

This Anthropic certification validates practitioners’ ability to make informed tradeoff decisions when building production-grade solutions with Claude AI. This site consolidates the Claude Certified Architect exam domains, scenarios, sample questions, and preparation strategies.

60 Questions
120 Minutes
720 Passing Score
5 Domains
6 Scenarios

What You Need to Know About the Claude Architect Certification

The Claude Certified Architect – Foundations is Anthropic's first architecture-level technical certification. It validates practitioners' ability to make informed design decisions and tradeoffs when building production-grade applications powered by Claude.

📋

Format

60 multiple-choice questions in 120 minutes. Closed-book, no AI assistance. Each question has one correct response and three distractors. 4 of the 6 scenarios are randomly selected.

🎯

Scoring

Scaled score 100–1,000. Passing score: 720. No penalty for guessing — unanswered questions are scored as incorrect. Score report within 2 business days with section breakdowns.

👤

Target Candidate

Solution architects with 6+ months experience building with Claude APIs, Agent SDK, Claude Code, and MCP in production environments.

🔑

Registration

$99 exam fee ($0 for the first 5,000 Claude Partner Network employees). Register through Anthropic Academy on Skilljar.

Core Technologies

Claude Agent SDK, Model Context Protocol (MCP), Claude Code, Claude API, Message Batches API, JSON Schema, Pydantic, CLAUDE.md, Built-in Tools

5 Claude Certified Architect Exam Domains

The exam covers five weighted domains. Each domain tests specific knowledge and skills required for building production Claude applications.

D1
~27%

Agentic Architecture & Orchestration

Design and implement autonomous task execution systems using the Claude Agent SDK, including agentic loop lifecycle management, multi-agent coordinator-subagent patterns, hook interception mechanisms, and session state management.

Task Statements

  • 1.1 Design and implement agentic loops for autonomous task execution
  • 1.2 Orchestrate multi-agent systems with coordinator-subagent patterns
  • 1.3 Configure subagent invocation, context passing, and spawning
  • 1.4 Implement multi-step workflows with enforcement and handoff patterns
  • 1.5 Apply Agent SDK hooks for tool call interception and data normalization
  • 1.6 Design task decomposition strategies for complex workflows
  • 1.7 Manage session state, resumption, and forking
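To make the agentic-loop lifecycle in task statement 1.1 concrete, here is a minimal sketch of the loop's control flow. It is a hand-rolled illustration, not the Agent SDK itself: `call_model` stands in for a real `client.messages.create` call and `run_tool` for your tool dispatcher, but the termination rule (keep looping on `stop_reason == "tool_use"`, exit on `"end_turn"`) matches the pattern the exam tests.

```python
def run_agent(call_model, run_tool, messages, max_turns=10):
    """Loop until the model stops requesting tools.

    call_model(messages) -> {"stop_reason": ..., "content": [...]}  (stand-in
    for a Messages API call); run_tool(name, input) -> result string.
    """
    for _ in range(max_turns):
        response = call_model(messages)
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return response  # "end_turn": the agent considers the task done
        # Execute every requested tool and feed the results back as a user turn.
        results = [
            {"type": "tool_result",
             "tool_use_id": block["id"],
             "content": run_tool(block["name"], block["input"])}
            for block in response["content"] if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not terminate within max_turns")
```

Note the `max_turns` guard: even with correct `stop_reason` handling, production loops should bound iterations to contain runaway costs.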
D2
~18%

Tool Design & MCP

Build effective tool interfaces and integrate with MCP servers, covering description best practices, structured errors, and tool distribution.

Task Statements

  • 2.1 Design effective tool interfaces with clear descriptions
  • 2.2 Implement structured error responses for MCP tools
  • 2.3 Distribute tools across agents and configure tool choice
  • 2.4 Integrate MCP servers into Claude Code and agent workflows
  • 2.5 Select and apply built-in tools effectively
D3
~20%

Claude Code Configuration & Workflows

Configure Claude Code for team development workflows, mastering the CLAUDE.md hierarchy, custom commands and skills, Plan Mode, and CI/CD pipeline integration.

Task Statements

  • 3.1 Configure CLAUDE.md with hierarchy, scoping, and organization
  • 3.2 Create custom slash commands and skills
  • 3.3 Apply path-specific rules for conditional convention loading
  • 3.4 Determine when to use plan mode vs direct execution
  • 3.5 Apply iterative refinement for progressive improvement
  • 3.6 Integrate Claude Code into CI/CD pipelines
D4
~20%

Prompt Engineering & Structured Output

Production-grade prompt engineering, including explicit criteria, few-shot prompting, guaranteed structured output via tool_use, validation-retry loops, and multi-pass review.

Task Statements

  • 4.1 Design prompts with explicit criteria for precision
  • 4.2 Apply few-shot prompting for output consistency
  • 4.3 Enforce structured output using tool use and JSON schemas
  • 4.4 Implement validation, retry, and feedback loops
  • 4.5 Design efficient batch processing strategies
  • 4.6 Design multi-pass review architectures
D5
~15%

Context & Reliability

Manage context in production systems, covering escalation patterns, error propagation, and information provenance.

Task Statements

  • 5.1 Preserve critical information across turns
  • 5.2 Design escalation and ambiguity resolution
  • 5.3 Implement multi-agent error propagation
  • 5.4 Explore large codebases effectively
  • 5.5 Design human review workflows
  • 5.6 Preserve information provenance

6 Production Scenarios in the Claude Certification Exam

The exam randomly presents 4 of these 6 scenarios. Each places you in a realistic production context requiring architectural decisions.

Scenario 1

Customer Support Resolution Agent

Building a customer support agent with Agent SDK and MCP tools (get_customer, lookup_order, process_refund, escalate_to_human). Target: 80%+ first-contact resolution rate.

D1 Agentic Architecture D2 Tool Design D5 Reliability

When should the agentic loop terminate?

✓ Correct Check stop_reason: continue looping on "tool_use", exit on "end_turn"
✗ Anti-pattern Parse the assistant's reply text for natural language signals like "done" or "completed"

How to enforce refund amount limits?

✓ Correct Use a PostToolUse hook to programmatically intercept over-limit refund calls and escalate to a human
✗ Anti-pattern Write "do not process refunds over $500" in the system prompt
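The enforcement logic a PostToolUse hook would apply can be sketched as a plain function. The hook wiring itself is SDK-specific and omitted here; the threshold, field names, and return shape below are illustrative assumptions, not part of any SDK contract.

```python
REFUND_LIMIT = 500.0  # illustrative policy threshold

def refund_guard(tool_name, tool_input):
    """Decision logic a PostToolUse hook could run after the model emits
    a process_refund call: block over-limit refunds deterministically
    and route them to a human, rather than trusting the system prompt."""
    if tool_name != "process_refund":
        return {"action": "allow"}
    if float(tool_input.get("amount", 0)) > REFUND_LIMIT:
        return {"action": "escalate",
                "reason": f"refund {tool_input['amount']} exceeds limit {REFUND_LIMIT}"}
    return {"action": "allow"}
```

The key property is that this check runs in code, outside the model, so compliance is guaranteed rather than probabilistic.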

When should the agent escalate to a human?

✓ Correct Customer explicitly requests a human, policy gaps exist, or the agent reaches capability limits
✗ Anti-pattern Escalate based on customer sentiment analysis or the agent's self-assessed confidence score

How to preserve customer info in long conversations?

✓ Correct Extract key facts (name, account ID, order number, amounts) into an immutable "case facts" block at the top of context
✗ Anti-pattern Rely on progressive summarization, losing critical values and identifiers through multi-turn compression
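One hedged way to implement the "case facts" pattern above: pin the extracted facts in a block at the head of the context and let only the conversational history be trimmed. The function below is a sketch; the block wording and budget policy are assumptions.

```python
def build_context(case_facts, history, budget=20):
    """Place an immutable 'case facts' block at the top of the context,
    followed by only the most recent turns. Older turns are dropped (or
    could be summarized), but the facts themselves are never compressed."""
    facts_block = "CASE FACTS (do not modify):\n" + "\n".join(
        f"- {key}: {value}" for key, value in case_facts.items())
    recent = history[-budget:]
    return [{"role": "user", "content": facts_block}] + recent
```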
Scenario 2

Code Generation with Claude Code

Using Claude Code for code generation, refactoring, debugging, and documentation. Configuring custom commands, CLAUDE.md, and plan mode for a development team.

D3 Claude Code D5 Reliability

Where should team coding standards live?

✓ Correct Project-level .claude/CLAUDE.md (version-controlled, shared across the team)
✗ Anti-pattern ~/.claude/CLAUDE.md (personal only, not shared with teammates)

Plan mode vs direct execution?

✓ Correct Plan mode for multi-file architectural decisions; direct execution for well-scoped simple changes
✗ Anti-pattern Always use plan mode (wastes resources on simple tasks) or never use it (risky for complex changes)

How to achieve context isolation for complex refactoring?

✓ Correct Use skills with context: fork and allowed-tools restrictions
✗ Anti-pattern Use simple commands that run in the main session, polluting the main context with exploratory output
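As a sketch of what such a skill file might look like, using the `context: fork` and `allowed-tools` fields the guide names (the skill name, tool list, and body are illustrative assumptions):

```markdown
---
name: deep-refactor
description: Explore and plan a refactoring in an isolated context
context: fork
allowed-tools: Read, Grep, Glob
---
Explore the affected modules, evaluate candidate designs, and return
only a concise refactoring plan to the main session.
```

Because the skill runs in a forked context, all exploratory output stays out of the main session, and the tool restriction keeps the fork read-only.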

Most effective iterative refinement strategy?

✓ Correct TDD iteration: write failing test → implement → verify → refine while keeping tests green
✗ Anti-pattern Vague instructions like "make it better" without providing specific verification criteria
Scenario 3

Multi-Agent Research System

Building a coordinator-subagent research system: web search, document analysis, synthesis, and report subagents working in parallel to produce comprehensive cited reports.

D1 Agentic Architecture D2 Tool Design D5 Reliability

What architecture for parallel research tasks?

✓ Correct Hub-and-spoke: coordinator manages all inter-subagent communication, subagents have isolated contexts
✗ Anti-pattern Flat architecture where all agents share global state or full conversation history

How to pass context to subagents?

✓ Correct Pass only the context relevant to each subagent's specific task (explicit context passing)
✗ Anti-pattern Share the coordinator's full conversation history with every subagent

How to handle conflicting data from different subagents?

✓ Correct Track information provenance (source, confidence, timestamp), retain both data points with source annotations
✗ Anti-pattern Arbitrarily pick one result or average conflicting values

What happens when a subagent fails?

✓ Correct Return structured error context: failure type, attempted actions, partial results, and alternative suggestions
✗ Anti-pattern Silently return empty results or only report a generic "operation failed" message
Scenario 4

Developer Productivity with Claude

Building an agent to help engineers explore codebases, understand legacy systems, and generate template code. Uses built-in tools and MCP server integration.

D2 Tool Design D3 Claude Code D1 Agentic Architecture

Agent has 18 tools but keeps selecting the wrong one?

✓ Correct Keep 4-5 tools per agent, distribute the rest to specialized subagents
✗ Anti-pattern Lengthen tool descriptions, fine-tune the model, or upgrade to a larger model

Which built-in tool to read a config file?

✓ Correct Read tool (purpose-built for file reading)
✗ Anti-pattern Bash('cat config.json') — never use Bash when a dedicated tool exists

How to configure project-level MCP servers?

✓ Correct Use ${ENV_VAR} environment variables in .mcp.json, commit to version control
✗ Anti-pattern Hardcode API keys in the config file (will be committed to Git)
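A minimal `.mcp.json` sketch showing the `${ENV_VAR}` pattern. The server name and package are hypothetical; only the expansion mechanism is the point:

```json
{
  "mcpServers": {
    "orders": {
      "command": "npx",
      "args": ["-y", "@acme/orders-mcp-server"],
      "env": { "ORDERS_API_KEY": "${ORDERS_API_KEY}" }
    }
  }
}
```

The file can be committed safely: each developer supplies `ORDERS_API_KEY` in their own environment, and no secret ever reaches version control.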

Edit vs Write for modifying existing files?

✓ Correct Edit for targeted modifications (preserves unchanged content)
✗ Anti-pattern Write overwrites the entire file — anything not included is lost
Scenario 5

Claude Code for CI/CD

Integrating Claude Code into CI/CD pipelines for automated code reviews, test generation, and PR feedback. Designing actionable prompts and minimizing false positives.

D3 Claude Code D4 Prompt Engineering

How to run Claude Code in a CI pipeline?

✓ Correct Use the -p flag for non-interactive mode, combined with --output-format json for structured output
✗ Anti-pattern Run in interactive mode or pipe commands via stdin (causes hanging)
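As an illustration of the non-interactive pattern, a hypothetical CI step (GitHub Actions syntax; the step name, prompt, and secret name are assumptions — only the `-p` and `--output-format json` flags come from the guide):

```yaml
- name: Automated PR review
  run: |
    claude -p "Review the changes in this PR against .claude/CLAUDE.md and report issues as JSON" \
      --output-format json > review.json
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```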

How to review Claude's own generated code?

✓ Correct Use a separate, independent session for review (no generation-phase reasoning context, avoids confirmation bias)
✗ Anti-pattern Self-review in the same session — the reviewer retains the generator's reasoning memory

Nightly code audit: synchronous or batch?

✓ Correct Message Batches API for non-urgent tasks (saves 50% cost, completes within 24 hours)
✗ Anti-pattern Use real-time synchronous requests for non-urgent work (double the cost with no benefit)

How to enforce structured review output?

✓ Correct Use the --json-schema flag to force output conforming to a specific schema
✗ Anti-pattern Parse unstructured text output with regular expressions
Scenario 6

Structured Data Extraction

Extracting structured information from unstructured documents with JSON schema validation, handling edge cases, and integrating with downstream systems.

D4 Prompt Engineering D5 Reliability

How to guarantee valid JSON structure in output?

✓ Correct Use tool_use + JSON Schema + tool_choice to force a specific tool
✗ Anti-pattern Ask for "output as JSON" in the prompt (no guarantee) or post-process with regex (fragile)
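A sketch of the request shape this answer implies: define a tool whose `input_schema` is your extraction schema, then force it with `tool_choice`. The function builds the parameters you would pass to `client.messages.create(**params)`; the tool name, schema fields, and model string are illustrative assumptions.

```python
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["vendor", "total", "currency"],
}

def build_extraction_request(document_text, model="claude-sonnet-4-5"):
    """Assemble Messages API parameters for guaranteed-structure extraction."""
    return {
        "model": model,
        "max_tokens": 1024,
        "tools": [{
            "name": "record_invoice",
            "description": "Record the extracted invoice fields.",
            "input_schema": INVOICE_SCHEMA,
        }],
        # Forcing this tool guarantees the reply is a tool_use block whose
        # input conforms to INVOICE_SCHEMA -- no prompt-based JSON begging.
        "tool_choice": {"type": "tool", "name": "record_invoice"},
        "messages": [{"role": "user",
                      "content": f"Extract the invoice fields:\n\n{document_text}"}],
    }
```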

Does tool_use guarantee data correctness?

✓ Correct No — tool_use guarantees structural compliance only. Semantic correctness requires additional business rule validation.
✗ Anti-pattern Assume that schema-matched output is automatically accurate

How to retry when extraction validation fails?

✓ Correct Append specific error details (which field, what error, expected vs actual value) before retrying
✗ Anti-pattern Generic retry: "there was an error, please try again" (model doesn't know what to fix)
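The validation-retry loop above can be sketched as follows. `extract(document, feedback)` stands in for your model call and `validate(result)` for your business-rule checks; both names are placeholders.

```python
def extract_with_retry(extract, validate, document, max_attempts=3):
    """Retry extraction, feeding field-level validation errors back to
    the model instead of a generic 'try again'.

    extract(document, feedback) -> candidate result (feedback is None on
    the first attempt); validate(result) -> list of error strings.
    """
    feedback = None
    for _ in range(max_attempts):
        result = extract(document, feedback)
        errors = validate(result)
        if not errors:
            return result
        # Tell the model exactly which fields failed and why.
        feedback = ("Fix these validation errors and re-extract:\n"
                    + "\n".join(f"- {e}" for e in errors))
    raise ValueError(f"extraction failed after {max_attempts} attempts: {errors}")
```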

How to handle unknown document types?

✓ Correct Include "other" in enums with a supplementary description field; provide 2-4 few-shot examples covering edge cases
✗ Anti-pattern Use a rigid enum without "other" (forces incorrect classification of unknown types)
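A schema fragment illustrating the "other"-plus-description pattern (field names are illustrative):

```json
{
  "type": "object",
  "properties": {
    "document_type": {
      "type": "string",
      "enum": ["invoice", "resume", "contract", "other"]
    },
    "document_type_other": {
      "type": "string",
      "description": "Free-text label, filled in only when document_type is \"other\"."
    }
  },
  "required": ["document_type"]
}
```

The escape hatch lets the model signal "none of the above" honestly instead of being forced into a wrong enum value.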

Official Claude Certified Architect Sample Questions

These questions are drawn from the official exam guide to illustrate format and difficulty. Each includes a detailed explanation.

Scenario: Customer Support Agent Question 1 of 12

Production data shows your agent skips get_customer 12% of the time, calling lookup_order directly with the customer's name, occasionally causing incorrect identity matches and refunds issued against the wrong account. Which change most effectively addresses this reliability issue?

Explanation

When a specific tool calling sequence is essential for critical business logic (like verifying identity before issuing refunds), programmatic enforcement provides deterministic guarantees that prompt-based approaches cannot. Options B and C rely on probabilistic LLM compliance, which is insufficient when errors have financial consequences. Option D addresses tool availability rather than tool call ordering, which isn't the actual problem.

Scenario: Customer Support Agent Question 2 of 12

Logs show the agent frequently calls get_customer when users ask about orders (e.g., "check my order #12345") instead of lookup_order. Both tools have minimal descriptions ("Get customer info" / "Get order details") and accept similar identifier formats. What is the most effective first step to improve tool selection accuracy?

Explanation

Tool descriptions are the primary signal the LLM uses for tool selection. When descriptions are too brief, the model lacks context to distinguish between similar tools. B directly addresses the root cause with low effort and high leverage. Few-shot examples (A) add token overhead without fixing the underlying issue; a routing layer (C) is over-engineered and bypasses the LLM's natural language understanding; merging tools (D) is too heavy as a "first step."

Scenario: Customer Support Agent Question 3 of 12

Your customer support agent achieves only 55% first-contact resolution. Analysis reveals it escalates too aggressively for simple issues but fails to escalate complex cases that require human judgment. How should you calibrate escalation behavior?

Explanation

Explicit criteria combined with few-shot examples directly address both failure modes: over-escalation (by defining what doesn't warrant escalation) and under-escalation (by specifying clear triggers). Option B relies on self-assessed confidence, which is unreliable for calibration. Option C manipulates tool positioning, which doesn't address the decision logic. Option D uses sentiment as a proxy, but frustrated customers may have simple issues while calm customers may have genuinely complex ones.

Scenario: Claude Code Configuration Question 4 of 12

You want to create a custom /review slash command that runs your team's standard code review checklist. All developers should be able to use this command after cloning or pulling the repo. Where should the command file be placed?

Explanation

Project-level custom slash commands belong in the repository's .claude/commands/ directory, which is version-controlled and automatically available to all developers. ~/.claude/commands/ (B) holds personal commands not shared via version control. CLAUDE.md (C) is for project instructions and context, not command definitions. A commands array in .claude/config.json (D) describes a configuration mechanism that doesn't exist in Claude Code.

Scenario: Claude Code Configuration Question 5 of 12

You are tasked with refactoring your team's monolith into microservices. This will involve changes across dozens of files, requiring decisions about service boundaries and module dependencies. Which approach should you take?

Explanation

Plan mode is designed for complex tasks involving large-scale changes, multiple viable approaches, and architectural decisions — exactly what a monolith-to-microservices migration requires. It allows safe codebase exploration and design before committing changes. B risks costly rework when dependencies are discovered; C assumes you already know the correct structure without exploration; D ignores that the complexity is already evident in the requirements, not something that might appear later.

Scenario: Claude Code Configuration Question 6 of 12

Your project uses TypeScript for the frontend, Python for the backend, and Terraform for infrastructure. Each has different coding conventions and linting rules. How should you configure Claude Code to apply different conventions based on file type?

Explanation

The .claude/rules/ directory with glob patterns in YAML frontmatter is specifically designed for conditional convention loading based on file paths and patterns. This approach loads only the relevant rules when working with specific file types, keeping context clean and focused. A single CLAUDE.md (B) loads all conventions regardless of context, wasting tokens. Directory-level CLAUDE.md files (C) work for directory-based separation but don't handle mixed-language directories or cross-cutting conventions. Environment variables (D) describe a mechanism that doesn't exist in Claude Code.

Scenario: Multi-Agent Research System Question 7 of 12

Your multi-agent research system produces a report on "the impact of AI on creative industries," but the final output only covers visual arts, missing music, writing, and film. All subagents completed successfully. What is the most likely root cause?

Explanation

When all subagents complete successfully but the output is incomplete, the problem typically lies in the coordinator's task decomposition. If the coordinator decomposed "creative industries" into only visual arts categories, every subagent would diligently research only those areas. The coordinator is responsible for ensuring comprehensive coverage through proper task breakdown. The other options describe downstream issues that wouldn't explain all subagents converging on the same narrow topic.

Scenario: Multi-Agent Research System Question 8 of 12

The web search subagent times out while researching a complex topic. You need to design how failure information is communicated back to the coordinator. Which error propagation approach best supports intelligent recovery?

Explanation

Structured error context gives the coordinator the information needed to make intelligent recovery decisions — whether to retry with modified queries, try alternative sources, or proceed with partial results. B's generic status hides valuable context from the coordinator; C disguises failure as success, preventing any recovery; D unnecessarily terminates the entire workflow when recovery strategies may still work.

Scenario: Multi-Agent Research System Question 9 of 12

The synthesis subagent needs to verify specific facts from the research findings before including them in the final report. How should you provide this capability?

Explanation

A scoped verification tool follows the principle of giving each subagent exactly the tools it needs for its specific task. The synthesis subagent needs fact-checking, not full web search capabilities. Option B adds unnecessary coordination overhead; C violates tool scoping best practices by giving too many tools; D assumes perfect pre-verification, which may miss claims that only become questionable during synthesis.

Scenario: CI/CD Integration Question 10 of 12

Your team integrates Claude Code into the CI pipeline, but it hangs during execution waiting for user input. What is the correct fix?

Explanation

The -p flag is the official way to run Claude Code in non-interactive (headless) mode for CI/CD pipelines. It accepts a prompt as an argument and runs without requiring user input. Piping stdin (B) is fragile and not the designed interface. Timeout-based restarts (C) don't solve the root cause. --auto-accept (D) addresses permission prompts but doesn't switch to non-interactive mode.

Scenario: CI/CD Integration Question 11 of 12

Your engineering manager proposes using the Message Batches API for both the PR review workflow (triggered on every PR) and the nightly codebase audit. Which recommendation is correct?

Explanation

The Message Batches API provides 50% cost savings with a 24-hour completion window, making it ideal for non-time-sensitive workloads like nightly audits. PR reviews require timely feedback for developers and should use synchronous processing. B sacrifices developer experience for cost savings on a latency-sensitive workflow. C ignores valid cost optimization for the nightly audit. D inverts the correct mapping of urgency to processing mode.

Scenario: CI/CD Integration Question 12 of 12

A PR modifies 14 files in the inventory tracking module. Your single-pass review of all files produces inconsistent results: some files get detailed feedback while others receive superficial comments, obvious bugs are missed, and identical code patterns are flagged in one file but ignored in another. How should you restructure the review process?

Explanation

Splitting into focused multi-pass reviews directly addresses the root cause: attention dilution from processing too many files simultaneously. Per-file analysis ensures consistent depth, while a separate integration pass catches cross-file issues. B shifts the burden to developers without improving the system. C misunderstands that a larger context window doesn't solve attention quality problems. D would actually suppress detection of real bugs that are only caught intermittently by requiring consensus.


Claude Certification Preparation Exercises

Complete these exercises to build practical familiarity with exam topics.

01

Build a Multi-Tool Agent with Escalation Logic

Objective: Practice agentic loop design, tool integration, structured error handling, and escalation patterns.

  1. Create an Agent SDK application with at least 3 MCP tools (e.g., get_customer, lookup_order, process_refund).
  2. Implement a proper agentic loop that checks stop_reason to determine continuation vs termination.
  3. Add a PostToolUse hook that enforces a refund limit and redirects to escalation when exceeded.
  4. Implement structured error responses with isError, errorCategory, and isRetryable fields in each tool.
  5. Define explicit escalation criteria and test edge cases where the agent should vs should not escalate.
Domains reinforced: D1, D2, D5
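For step 4 of this exercise, a hedged sketch of a structured error payload. The `isError` flag mirrors the MCP tool-result convention; `errorCategory`, `isRetryable`, and `partialResults` are a contract you define for your own tools, not fields mandated by the protocol.

```python
def tool_error(category, message, retryable, partial=None):
    """Build a structured error response so the agent (or coordinator)
    can decide between retrying, falling back, and escalating, instead
    of guessing from an opaque failure string."""
    return {
        "isError": True,
        "errorCategory": category,   # e.g. "not_found", "rate_limited", "invalid_input"
        "isRetryable": retryable,
        "message": message,
        "partialResults": partial or [],
    }
```

Example: `tool_error("rate_limited", "upstream API returned 429", retryable=True)` tells the caller a backoff-and-retry is sensible, whereas a `"not_found"` category with `retryable=False` should trigger a different recovery path.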
02

Configure Claude Code for a Team Workflow

Objective: Master CLAUDE.md hierarchies, custom commands, path-specific rules, and MCP integration.

  1. Set up a project with a three-level CLAUDE.md hierarchy: user-level (~/.claude/CLAUDE.md), project-level (.claude/CLAUDE.md), and directory-level files.
  2. Create at least two custom slash commands in .claude/commands/ (e.g., /review and /test).
  3. Configure path-specific rules in .claude/rules/ using YAML frontmatter glob patterns for different file types (e.g., TypeScript vs Python conventions).
  4. Create a skill with context: fork and allowed-tools restrictions for isolated codebase exploration.
  5. Add at least one MCP server in .mcp.json using ${ENV_VAR} for credential management.
Domains reinforced: D3, D2
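For step 3 above, a sketch of what a path-specific rule file in .claude/rules/ might look like. The frontmatter key and the conventions listed are illustrative assumptions:

```markdown
---
globs: ["backend/**/*.py"]
---
# Python conventions
- Use type hints on all public functions.
- Prefer dataclasses over bare dicts for structured records.
- Keep line length at 100 characters.
```

The glob pattern means these conventions load only when Claude works on matching files, keeping context lean when it is editing TypeScript or Terraform instead.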
03

Build a Structured Data Extraction Pipeline

Objective: Practice JSON schemas, tool_use for guaranteed structure, validation-retry loops, and batch processing.

  1. Define a JSON Schema for extracting structured data from unstructured documents (e.g., invoices, resumes, contracts).
  2. Use tool_use with tool_choice to guarantee the output conforms to the schema.
  3. Implement a validation-retry loop that appends specific field-level error details before each retry attempt.
  4. Design the schema with optional/nullable fields and an "other" category for unknown document types.
  5. Process a batch of documents using the Message Batches API and compare cost and latency vs synchronous processing.
Domains reinforced: D4, D5
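For step 5 of this exercise, a sketch of assembling batch requests for a non-urgent document run. Each item pairs a `custom_id` with ordinary Messages API `params`; the resulting list is what you would hand to `client.messages.batches.create(requests=...)`. The model string and prompt are illustrative.

```python
def build_extraction_batch(documents, model="claude-sonnet-4-5"):
    """Turn a {doc_id: text} mapping into Message Batches API requests.
    custom_id lets you match results back to documents when the batch
    completes (within the 24-hour window, at 50% of synchronous cost)."""
    return [{
        "custom_id": f"extract-{doc_id.replace('/', '-')}",
        "params": {
            "model": model,
            "max_tokens": 2048,
            "messages": [{"role": "user",
                          "content": f"Extract the structured fields from:\n\n{text}"}],
        },
    } for doc_id, text in documents.items()]
```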
04

Design and Debug a Multi-Agent Research Pipeline

Objective: Practice subagent orchestration, context passing, error propagation, and information provenance.

  1. Build a hub-and-spoke multi-agent system with a coordinator and at least 3 specialized subagents (e.g., search, analysis, synthesis).
  2. Implement explicit context passing: send only task-relevant information to each subagent, not the full coordinator history.
  3. Add structured error propagation so subagents return failure type, partial results, and recovery suggestions on failure.
  4. Require subagents to include provenance metadata (source, confidence, timestamp) with every finding.
  5. Test with scenarios that produce conflicting data from different subagents and verify provenance is preserved in the final output.
Domains reinforced: D1, D2, D5
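Steps 4 and 5 above can be sketched with a small merge function: when subagents disagree, keep every value with its provenance rather than picking a winner or averaging. The finding shape (`claim`/`value`/`source`/`confidence`/`timestamp`) is an illustrative convention, not an SDK structure.

```python
def merge_findings(findings):
    """Group subagent findings by claim. Conflicting values for the same
    claim are all retained, each annotated with its provenance, so the
    final report can present (and cite) the disagreement honestly."""
    merged = {}
    for f in findings:
        merged.setdefault(f["claim"], []).append(
            {k: f[k] for k in ("value", "source", "confidence", "timestamp")})
    return merged
```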

Claude AI Technologies, Scope & Preparation

Technologies and Concepts

Claude Agent SDK, Model Context Protocol (MCP), Claude Code, Claude API, Message Batches API, JSON Schema, Pydantic, CLAUDE.md, Built-in Tools, Agentic Loops, Hub-and-Spoke Orchestration, Hooks (PreToolUse / PostToolUse), Session Management, Task Decomposition, Tool Descriptions, Structured Error Responses, tool_choice Configuration, Custom Slash Commands, Skills (context: fork), Path-Specific Rules, Plan Mode, CI/CD Integration (-p flag), Explicit Criteria Prompting, Few-Shot Prompting, Validation-Retry Loops, Multi-Pass Review, Progressive Summarization, Escalation Patterns, Error Propagation, Information Provenance, Confidence Calibration

In-Scope Topics

  • Designing agentic loops and multi-agent orchestration with the Agent SDK
  • Implementing hooks for tool call interception and enforcement
  • Writing effective tool descriptions and structured error responses
  • MCP server configuration and integration
  • CLAUDE.md hierarchy, commands, skills, and path-specific rules
  • Plan mode vs direct execution decision-making
  • CI/CD integration with non-interactive mode
  • Prompt engineering with explicit criteria and few-shot examples
  • Structured output via tool_use and JSON Schema
  • Batch processing with Message Batches API
  • Context management and preservation strategies
  • Escalation patterns and human review workflows
  • Error propagation across multi-agent systems
  • Information provenance and confidence calibration
🚫

Out-of-Scope Topics

  • Model fine-tuning or training
  • Internal model architecture or weights
  • Pricing or billing specifics
  • Competitive comparisons with other LLMs
  • General software engineering unrelated to Claude
  • Infrastructure provisioning (AWS, GCP, etc.)
  • Safety research or RLHF methodologies
  • Topics specific to other Anthropic products not listed in the tech stack

Exam Preparation Recommendations

  1. Build a complete agent application using Claude Agent SDK with tool calls, error handling, and session management. Practice spawning subagents and passing context between them.
  2. Configure Claude Code for a real project with CLAUDE.md hierarchies, path-specific rules in .claude/rules/, custom skills with frontmatter options, and at least one MCP server.
  3. Design and test MCP tools with clear descriptions that distinguish similar tools. Implement structured error responses with error categories and retryable flags.
  4. Build a structured data extraction pipeline using tool_use + JSON Schema for guaranteed structure, validation-retry loops, and Message Batches API for batch processing.
  5. Practice prompt engineering techniques: write few-shot examples for ambiguous scenarios, define explicit review criteria, and design multi-instance review architectures.
  6. Study context management patterns: extract structured facts from verbose tool output, implement scratchpad files for long sessions, and design subagent delegation for context capacity management.
  7. Master escalation and human review patterns: understand when to escalate (policy gaps, customer requests, capability limits) vs resolve autonomously. Design confidence-routed human review workflows.
  8. Complete the practice exam available through Anthropic Academy. It covers the same scenarios and question styles, with explanations to reinforce understanding.

Ready to Begin Your Claude Architect Certification Journey?

Become one of the first to earn Anthropic's official technical certification. Demonstrate your expertise in designing production-grade Claude solutions and establish professional credibility in AI architecture.