What does Engineering Level 3 (AI-Native) look like day-to-day?

AI is the execution layer; humans direct and validate. Work concentrates at two boundaries: front (specifications, alignment, clarification dialogues) and back (validation, edge-case sessions, first-user UX testing). The unit of work is a feature or user story handled end-to-end inside one agent loop; cycle time collapses to hours-to-days at the story level and days-to-weeks at the feature level.

What happens when AI-generated code is wrong at Level 3?

Recalibrate before debugging. Rebuild the AI's understanding via brainstorm and re-spec; the spec or context is usually the actual cause.

How do you measure AI impact at Level 3?

Unit economics: cost per merged unit (PR, ticket, transaction), AI gross margin at the team level, agent throughput per dollar of inference, mean time between human intervention. The question "did AI help?" has stopped being meaningful at this level — AI is the execution layer, not assistance.

Skill Progression Map

What Level 1, 2, and 3 look like for your role — concretely.

How to Use This Page

The Reference Framework defines three maturity levels. This page shows what those levels mean in practice for specific role families — not in theory, but in the work you do day to day.

For each role family:

Level 1 (AI-Assisted): You use AI as a tool. Same workflows, faster in spots.
Level 2 (AI-Integrated): AI is embedded in your workflow. Some of your work has been redesigned around what AI can do.
Level 3 (AI-Native): You define specifications and judge results. AI handles execution.

The individual tier scale adds intermediate stages: Tier 0.5 (AI-Curious), Tier 1.5 (AI-Building), and Tier 2.5 (AI-Advanced). If you find yourself between two of the levels described below, you're likely at the .5 stage and actively transitioning. That's progress, not a gap.
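As a rough illustration of the .5 rule, the sketch below places a set of self-assessment answers on the tier scale. The function name `place_on_tier_scale` and the placement rule (answers split across two adjacent levels map to the .5 tier between them) are assumptions made for this example, not something the framework prescribes. The tier names come from this page.

```python
# Tier names as listed on this page.
TIER_NAMES = {
    0.5: "AI-Curious",
    1.0: "AI-Assisted",
    1.5: "AI-Building",
    2.0: "AI-Integrated",
    2.5: "AI-Advanced",
    3.0: "AI-Native",
}


def place_on_tier_scale(answers: list[int]) -> tuple[float, str]:
    """Map per-question level answers (1, 2, or 3) to a tier.

    Assumed rule (illustrative only):
    - all answers at one level -> that level
    - answers split across two adjacent levels -> the .5 tier between them
    """
    if not answers or any(a not in (1, 2, 3) for a in answers):
        raise ValueError("answers must be a non-empty list of 1, 2, or 3")
    low, high = min(answers), max(answers)
    if high - low > 1:
        # Answers at both Level 1 and Level 3 do not fit the "between two levels" case.
        raise ValueError("answers span more than two adjacent levels")
    tier = (low + high) / 2
    return tier, TIER_NAMES[tier]
```

For example, `place_on_tier_scale([2, 3, 2])` returns `(2.5, "AI-Advanced")`. Tier 0.5 sits below Level 1, so it cannot result from answers on the 1 to 3 scale used here.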

Find your role family below. Identify where you are. Then use Transforming Your Role for the transition process and the Recognizing Your Pattern section to understand which structural forces are acting on your role.

A note on depth. The Engineering column has the most operational substance because that's where AI-native maturity is best documented. The Customer Service, Marketing, Sales, and Design columns reflect the framework's current understanding — directionally right, but lighter. Parallel readiness models for the discrete-task domains where this maturity exists most clearly — engineering, customer service, finance operations, legal review, knowledge research — are on the framework's roadmap; until they ship, treat the non-engineering columns as the floor of what your role looks like at each level, not the ceiling. See also the framework's broader note on this asymmetry.

Engineering

Dominant pattern: Elevation — from writing code to specifying what the code should do.

Level 1 — AI-Assisted

You use AI for code completion and quick lookups. Copilot or ChatGPT suggests lines; you accept or reject.

What this looks like:

AI autocompletes code as you type
You paste code into ChatGPT to debug or explain
AI outputs require significant manual review and editing
No shared configurations or prompt templates across the team
Your workflow is fundamentally the same as before AI

The data: 84% of developers use or plan to use AI tools, and 51% use them daily. But trust has fallen to 29%, and 66% report spending more time fixing AI-generated code than they save (Stack Overflow, 2025). This is the Level 1 experience: AI helps in spots, but the net gain is uncertain because the workflow hasn't been redesigned.

Level 2 — AI-Integrated

AI is part of the development workflow, not just a helper. You direct multi-file changes, review AI-generated code at the PR level, and maintain shared context files.

What this looks like:

AI generates code from descriptions; you review and iterate
Shared prompt templates and context files exist for the codebase
AI handles tests, documentation, and boilerplate systematically
You spend more time on architecture and review, less on typing
Removing AI would break your development velocity

The shift: At Level 2, you accept ~30% of AI suggestions but retain 88% of generated characters (GitHub/Accenture, 2024). The skill is knowing what to accept, what to reject, and how to direct the generation.

Level 3 — AI-Native

AI is the execution layer; humans give direction and validate. Work is structured around a recurring operational unit (Context → Clarification → Execution → Validation → Recovery) and your value concentrates at the boundaries. This corresponds to Rungs 4–5 on the engineering scale.
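One way to picture the operational unit is as a small state machine. The sketch below is a minimal illustration. The five phase names come from the text, but the `DONE` state, the transition rules (a failed validation leads to recovery, and recovery restarts from context), and the helper names are assumptions made for this example.

```python
from enum import Enum


class Phase(Enum):
    CONTEXT = "context"
    CLARIFICATION = "clarification"
    EXECUTION = "execution"
    VALIDATION = "validation"
    RECOVERY = "recovery"
    DONE = "done"  # assumed terminal state, not named in the text


def next_phase(phase: Phase, validation_passed: bool = True) -> Phase:
    """Return the phase that follows `phase` under the assumed transitions."""
    if phase is Phase.CONTEXT:
        return Phase.CLARIFICATION
    if phase is Phase.CLARIFICATION:
        return Phase.EXECUTION
    if phase is Phase.EXECUTION:
        return Phase.VALIDATION
    if phase is Phase.VALIDATION:
        return Phase.DONE if validation_passed else Phase.RECOVERY
    if phase is Phase.RECOVERY:
        # Recalibration rebuilds context instead of patching the output.
        return Phase.CONTEXT
    return Phase.DONE


def trace(validation_results: list[bool]) -> list[str]:
    """Walk the loop, using one validation result per visit to VALIDATION.

    If the supplied results run out, validation is treated as passed.
    """
    results = iter(validation_results)
    phase = Phase.CONTEXT
    path: list[str] = []
    while phase is not Phase.DONE:
        path.append(phase.value)
        passed = next(results, True) if phase is Phase.VALIDATION else True
        phase = next_phase(phase, passed)
    return path
```

Calling `trace([False, True])` walks the loop once, enters recovery after the failed validation, and then repeats the four main phases from context.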

Unit of work: the feature or user story, handled end-to-end inside one agent loop (architect → implement → review → merge). Not the line of code. Not the prompt.

Cycle time: stories ship in hours-to-days; features in days-to-weeks. "Behind on a deliverable" loses its old meaning — when a project stalls at L3, the cause is rarely human capacity.

What this looks like:

You define specifications, acceptance criteria, and constraints
AI produces the implementation, runs tests, opens the PR, and resolves review comments
A separate agent reviewer validates the PR; you intervene only on flagged issues or final UX validation
Validation gates are risk-graded — agent-only review for reversible work, human approval for irreversible changes (production deploys, sensitive data, customer-facing communications)
See the AI Lab for the operational unit in detail
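The risk-graded validation gate in the list above can be sketched as a small routing rule. This is an illustrative sketch, not the framework's implementation: the `Change` type, the string labels, and the `required_gate` function are hypothetical. Only the three irreversible categories and the idea of escalating flagged issues come from the text.

```python
from dataclasses import dataclass

# Categories the text names as irreversible; the label strings are hypothetical.
IRREVERSIBLE_KINDS = {
    "production_deploy",
    "sensitive_data",
    "customer_communication",
}


@dataclass(frozen=True)
class Change:
    title: str
    kinds: frozenset[str]               # hypothetical labels attached to a PR
    agent_review_flagged: bool = False  # set when the agent reviewer raises an issue


def required_gate(change: Change) -> str:
    """Return 'human_approval' or 'agent_review' for a change."""
    if change.kinds & IRREVERSIBLE_KINDS:
        # Irreversible work always needs a human sign-off.
        return "human_approval"
    if change.agent_review_flagged:
        # Reversible work escalates only when the agent reviewer flags it.
        return "human_approval"
    return "agent_review"
```

Under these assumptions, a refactor with no flags routes to `agent_review`, while anything labelled `production_deploy` routes to `human_approval` regardless of the agent reviewer's verdict.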

Failure mode: when a deliverable stalls, the cause is usually the AI bottleneck — the agent has hit a structural limit (wrong direction, ambiguous spec, subjective edge case it cannot resolve alone). The recovery is recalibration, not debugging: a brainstorm session that rebuilds the AI's understanding of the problem, often with multiple humans bringing different perspectives. Throwing more humans at execution doesn't help.

Day shape: A typical L3 day concentrates work at the front and back boundaries. Morning: review yesterday's overnight agent output, validate two PRs the agent reviewed, run first-user UX testing on a feature that just shipped. Midday: write specs for two new stories; engage in a clarification dialogue with the agent until no material ambiguity remains. Afternoon: a recalibration session on a stuck story; refine acceptance criteria for next sprint. Almost no time is spent watching the agent execute.

Metrics: throughput (PRs merged per week, stories shipped), quality (defects per story, scenario coverage), cost (token cost per merged PR, AI gross margin) — not "time saved by AI." See AI economics at maturity.
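As a worked example of the arithmetic behind these metrics, the sketch below derives three of them from one week of counts. The `WeekStats` fields, the function name, and the sample numbers are invented for illustration. AI gross margin is left out because this page does not define how it is calculated.

```python
from dataclasses import dataclass


@dataclass
class WeekStats:
    """One week of team-level counts (hypothetical field names)."""
    prs_merged: int
    stories_shipped: int
    defects_found: int
    token_cost_usd: float


def l3_metrics(week: WeekStats) -> dict[str, float]:
    """Compute throughput, quality, and cost figures for one week."""
    if week.prs_merged <= 0 or week.stories_shipped <= 0:
        raise ValueError("need at least one merged PR and one shipped story")
    return {
        # Throughput: PRs merged in the week.
        "prs_merged_per_week": float(week.prs_merged),
        # Quality: defects traced back to shipped stories.
        "defects_per_story": week.defects_found / week.stories_shipped,
        # Cost: inference spend divided by merged PRs.
        "token_cost_per_merged_pr_usd": week.token_cost_usd / week.prs_merged,
    }


# Made-up sample week: 40 PRs, 10 stories, 3 defects, $500 of inference.
sample = l3_metrics(WeekStats(prs_merged=40, stories_shipped=10,
                              defects_found=3, token_cost_usd=500.0))
```

With the sample numbers, the token cost per merged PR is 500 / 40 = 12.5 dollars and the defect rate is 3 / 10 = 0.3 per story.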

The critical warning: Level 1 without progression actively degrades quality. Analysis of 211 million lines of code shows that AI-assisted development without skill progression caused refactoring to drop from 25% to under 10% of changes, while code churn nearly doubled (GitClear, 2025). The tools make it easy to produce code and hard to produce good code. The productivity story is also bimodal. Empirical studies that measure individuals at L1–L2 adding AI to existing workflows find that gains are spurious or negative; case studies of teams restructured around AI execution (AMPECO, Monte Carlo, Every) report 4× gains and 73% PR-rate increases. Level 2 and 3 skills (review judgment, specification quality, test design, process design) are what unlock the second mode and prevent the first.

Self-assessment

Question	Level 1	Level 1.5	Level 2	Level 2.5	Level 3
How do you start a new feature?	Open editor, start coding, use AI for completion	Experiment with AI for parts of the feature, building and testing prompt workflows	Describe the feature to AI, review the output, iterate	Write detailed specs with constraints, moving toward test-verified output	Write a specification with constraints and test cases, let AI implement
What happens when AI code is wrong?	Fix it line by line	Iterate on the prompt, starting to build reusable templates	Improve the prompt/context and regenerate	Improve scenarios and verification systems	Recalibrate (rebuild the AI's understanding via brainstorm + re-spec) before debugging — the spec or context is usually the actual cause
What do you share with teammates?	Nothing AI-specific	Experiments that worked, prompt drafts	Prompt templates, context files	Specification patterns, verification approaches	Specification patterns, scenario libraries

Marketing

Dominant pattern: Specialization — shedding content production, deepening strategic judgment.

Level 1 — AI-Assisted

You use AI for first drafts and idea generation. Every output gets manually edited.

What this looks like:

AI generates blog post drafts, email copy, or social media posts
You edit 80%+ of AI output before publishing
No systematic workflow — AI is used ad hoc
Each team member uses AI differently (or not at all)
Campaigns are still planned and executed the traditional way

The data: 91% of marketing leaders say their teams use AI, with content creation (43%) as the top use case. But 86% edit AI-generated content before publishing (HubSpot, 2025). And 68% receive no formal AI training (Marketing AI Institute, 2025).

Level 2 — AI-Integrated

Campaign workflows are redesigned around AI. AI doesn't just draft — it generates variants, handles research, and produces analysis as a systematic step.

What this looks like:

Shared prompt libraries encode brand voice and positioning
AI generates campaign variants; you select and refine
Research, competitive analysis, and reporting are AI-first workflows
New team members are onboarded into AI-integrated processes
The team produces more with fewer people

The shift: You stop writing content and start directing content systems. Your value moves from production speed to strategic judgment: which angle, which audience, which positioning.

Level 3 — AI-Native

You define strategy, positioning, and constraints. Systems produce campaigns, variants, and reports.

What this looks like:

You specify the campaign: target, positioning, constraints, success metrics
AI produces the creative, copy, and distribution plan
You review, select, and adjust — not produce
The marketing team is significantly smaller but produces significantly more
Your role is strategy and taste, not execution

External validation: The Marketing AI Institute's own maturity survey maps almost directly to these levels: 40% of marketing teams are at Experimentation (Level 1), 26% at Integration (Level 2), 17% at Transformation (Level 3) (Marketing AI Institute, 2025).

Self-assessment

Question	Level 1	Level 1.5	Level 2	Level 2.5	Level 3
How do you create a campaign?	Plan it, then use AI for some drafts	Test AI for specific steps, building prompt libraries	Define the brief, AI generates variants, you curate	Define strategy and constraints, AI produces most deliverables with light editing	Define the strategy and constraints, AI produces the campaign
What's your bottleneck?	Writing and production	Finding which AI workflows stick	Review and strategic decisions	Defining the right constraints for consistent quality	Defining the right problem to solve
How much do you edit AI output?	80%+	50–70% (improving as workflows mature)	30–50%	15–25% (mostly selecting, not rewriting)	10–20% (selecting, not rewriting)

Customer Service

Dominant pattern: Elevation shifting to Convergence — from answering tickets to designing service systems.

Level 1 — AI-Assisted

AI suggests responses. Agents copy, paste, and edit. The workflow is the same, slightly faster.

What this looks like:

AI drafts reply suggestions for agents
Agents handle the same volume and types of interactions
AI handles only the simplest, most scripted inquiries
No role changes — everyone still does the same job
Quality depends on individual agents, not systems

Level 2 — AI-Integrated

AI handles routine inquiries autonomously. Agents shift from answering to training, reviewing, and handling complex cases. New roles emerge.

What this looks like:

AI resolves the majority of routine tickets without human involvement
Agents spend more time training AI systems than doing traditional support
New roles emerge: conversation analysts, knowledge managers, AI operations leads
Escalation logic is designed and documented, not improvised
The team handles significantly more volume with stable or reduced headcount

The data: 82% of support teams feel positive about AI collaboration. 60% say roles are evolving. 40% of teams report agents spend more time training AI systems than doing traditional support (Intercom, 2025). This is Level 2 in action.

Level 3 — AI-Native

Humans define service strategy, escalation logic, and quality standards. AI executes the vast majority of interactions.

What this looks like:

You define: what constitutes good service, when to escalate, what quality looks like
AI handles 80%+ of interactions
Human agents exist for judgment calls, relationship moments, and cases the system can't handle
The team is a fraction of its previous size, but service quality is equal or better
Your role is system design and quality ownership, not ticket resolution

Customer service is often the first function to reach Level 3. It shows the highest actual AI task coverage (Anthropic, 2026) and generates the largest share of AI value (38%, per BCG, 2025).

Self-assessment

Question	Level 1	Level 1.5	Level 2	Level 2.5	Level 3
What do you do most of the day?	Answer tickets	Test AI on ticket categories, build response templates	Train AI, handle escalations, review quality	Design escalation logic, monitor AI quality metrics	Design service strategy and escalation rules
What happens when AI gives a bad answer?	Fix it and move on	Build a better template or knowledge base entry	Update the training data or knowledge base	Redesign the quality criteria or training data	Redesign the escalation logic or quality criteria
How is your performance measured?	Tickets resolved, response time	Template quality, AI adoption rate	AI deflection rate, escalation quality	System design quality, quality at scale	Service quality metrics, system design effectiveness

Sales

Dominant pattern: Specialization — shedding administrative overhead, deepening relationship and deal judgment.

Level 1 — AI-Assisted

AI helps with email drafts and basic research. Sellers still spend most of their time on non-selling tasks.

What this looks like:

AI drafts cold emails and follow-ups
Research is semi-manual with AI assistance
CRM is updated by humans
70% of time goes to non-selling tasks (Salesforce, 2024)
The sales process hasn't changed, just individual tasks

Level 2 — AI-Integrated

AI automates research, outreach sequencing, and CRM enrichment. Sellers focus on relationships and complex deal strategy.

What this looks like:

AI handles prospecting research, outreach sequences, and follow-up timing
CRM is enriched automatically with AI-gathered data
Sellers focus on high-value conversations: qualification, negotiation, closing
AI users are 2.4× less likely to feel overworked
The non-selling overhead drops significantly

The shift: The value moves from activity volume (calls made, emails sent) to deal quality (pipeline accuracy, win rate, deal size). Level 2 sellers don't work harder — they work on the right things.

Level 3 — AI-Native

Humans define qualification logic, deal rules, and escalation thresholds. AI produces outreach, proposals, and pipeline analysis.

What this looks like:

You define: ideal customer profile, qualification criteria, pricing rules, escalation conditions
AI produces: outreach, follow-ups, proposals, competitive analysis
Your time goes to relationship building, strategic accounts, and judgment calls
By 2027, 95% of seller research workflows are predicted to begin with AI (Gartner, 2025)
Sellers who partner with AI are 3.7× more likely to meet quota

Self-assessment

Question	Level 1	Level 1.5	Level 2	Level 2.5	Level 3
How much time do you spend on admin?	70%+	50–60% (actively automating tasks)	30–40%	15–25% (most admin is system-handled)	Under 15%
How do you research a prospect?	Manually, with some AI help	Build AI research workflows, test automation	AI produces the research brief, you review	AI handles end-to-end research, you review and strategize	AI identifies and qualifies prospects, you handle relationships
What's your competitive advantage?	Activity volume	Workflow experimentation	Deal judgment	System design for the sales process	Specification of what "good" looks like

Design

Dominant pattern: Elevation — from pixel production to system direction.

Level 1 — AI-Assisted

AI generates mood boards, initial concepts, or copy. You refine everything manually.

What this looks like:

AI produces inspiration: mood boards, concept variations, style explorations
All production work (layouts, components, assets) is done manually
AI is a starting point, not a workflow participant
The design process is unchanged — AI adds a brainstorming step

Level 2 — AI-Integrated

AI handles production work. You shift to system thinking, brand direction, and quality judgment.

What this looks like:

AI generates layouts, asset variations, and responsive adaptations
You define design systems and brand constraints; AI operates within them
Production time drops dramatically; review and direction time increases
Entry-level production roles contract as AI absorbs that work
71% of UX professionals believe AI will shape the future of UX (UX Design Institute, 2025)

The shift: Your value moves from craft execution to taste and system design. You're not less of a designer — you're more of an architect.

Level 3 — AI-Native

You define systems, constraints, and brand rules. AI produces the artifacts.

What this looks like:

You specify: design system, brand parameters, constraints, quality criteria
AI produces: mockups, components, responsive layouts, asset libraries
You review, curate, and refine — not draw
Outcome-oriented design replaces pixel-level work
"Systems Architects" and "AI Directors" emerge as the high-value design roles (NN/g, 2025)

By Q3 2025, "manual pixel-pushing had effectively ended for commercial production" (UX Design Institute, 2025). The progression from Level 1 to Level 3 is happening faster in design than in most other functions.

Self-assessment

Question	Level 1	Level 1.5	Level 2	Level 2.5	Level 3
What do you produce?	Pixel-perfect deliverables	Mix of manual and AI-assisted deliverables	Design systems and direction, AI produces deliverables	Design systems and constraints, AI produces most assets	Specifications and quality criteria
What's your bottleneck?	Production time	Finding which AI tools work for your workflow	Making the right design decisions	Defining constraints that produce consistent brand quality	Defining the right constraints
What skills are growing?	Tool mastery	AI tool integration, prompt design	System thinking, brand judgment	Constraint specification, quality judgment at scale	Specification engineering, taste at scale

Cross-Cutting: The Skills That Matter at Every Level

Regardless of your role family, certain skills compound across the progression:

Level 1 → Level 2: The critical skill is recognizing which parts of your work are legacy patterns — repeatable execution that AI can absorb. The transition is about seeing the opportunity, not just using the tool.

Level 2 → Level 3: The critical skill is specification engineering — writing clear enough instructions that AI can execute without real-time supervision. This is the Universal Translation Rule applied to your individual work.

At every level: The five irreplaceable functions — Direction, Judgment, Taste, Relationship, Accountability — define what stays human. Your progression isn't about doing less. It's about concentrating on what only you can do.

Workers in AI-exposed roles earn up to 30% salary premiums (PwC, 2025). The market is already pricing the progression.
