The Reference Framework

This document consolidates the maturity model, the operating principle, and the two scales that structure the AI transformation.


The Universal Translation Rule

The operating principle of the entire transformation fits in one sentence:

Replace "the human produces the artifact" with "the human defines the spec → the system produces the artifact."

What This Means by Department

| Department | The human defines... | The system produces... |
|---|---|---|
| Engineering | Architecture, constraints, tests | The implementation |
| Marketing | Strategy, positioning, hypotheses | Campaigns, variants, reports |
| Sales | Qualification logic, deal rules | Outreach, follow-ups, proposals |
| Customer Service | Escalation logic, success criteria | Responses, health scoring, actions |
| Finance | Financial models, rules | Reports, forecasts, anomaly detection |
| HR | Hiring profiles, evaluation grids | Sourcing, screening, summaries |
| Product | Problem, constraints | Specs, test cases, drafts |
| Leadership | Direction, trade-offs | Scenarios, analyses |

The Litmus Test

If this person disappeared, could a system execute 80% of their tasks?

  • If no → the role is still execution-based
  • If yes → the role is AI-native
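
The 80% threshold lends itself to a mechanical check. A minimal sketch, assuming a role is audited as a task list with automatable yes/no flags (the helper name, the task names, and the flags are all illustrative, not a prescribed audit format):

```python
# Hypothetical sketch of the litmus test: given a role's tasks, each
# flagged as "a system could execute this today" or not, classify the role.

def litmus(tasks: dict[str, bool]) -> str:
    """tasks maps task name -> could a system execute it today?"""
    coverage = sum(tasks.values()) / len(tasks)
    # 80% threshold from the litmus test above
    return "AI-native" if coverage >= 0.8 else "execution-based"

# Illustrative sales role: 4 of 5 tasks automatable -> exactly 80%
role = {
    "draft outreach emails": True,
    "qualify inbound leads": True,
    "negotiate enterprise contracts": False,
    "log CRM activity": True,
    "write follow-up proposals": True,
}
print(litmus(role))  # AI-native
```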

This isn't "AI adoption." It's the shift from a labor-based company to a systems-based company.


Organizational Scale — Levels 1 to 3

This scale applies across the entire group — engineering, marketing, sales, finance, customer service.

Level 1 — AI-Assisted

What it looks like:

  • AI is a tool that individuals choose to use
  • Same structures, same processes, same roles
  • If AI disappeared tomorrow, nothing structural would change

Typical behaviors:

  • Using ChatGPT/Claude like Google or a spell checker
  • Isolated prompts, no iteration
  • AI outputs manually pasted into work
  • No shared prompts, no documentation
  • Adoption is uneven and optional

The gap is measurable: in technical roles, AI has 94% theoretical task coverage but only 33% actual usage. Level 1 organizations leave most of AI's capability untouched.

Upside: 10-30% efficiency gains for those who adopt
Risk: competitors at Level 3 get 10x leverage and make Level 1 non-viable

Level 2 — AI-Integrated

What it looks like:

  • AI is integrated into workflows and systems
  • Some processes redesigned around AI capabilities
  • Roles start shifting from "doing" to "directing" (see role evolution patterns)
  • If AI disappeared tomorrow, some workflows would break

Typical behaviors:

  • Saved prompts, templates, prompt libraries
  • AI used across multiple steps of a task, not just one
  • Tools like Copilot, Notion AI, Zapier, n8n in active use
  • Prompts and workflows shared among colleagues
  • AI usage is expected, not optional

Upside: 2-3x output with the same headcount
Risk: half-measures create confusion; uneven adoption limits gains
Level 3 — AI-Native

What it looks like:

  • Organizational design assumes AI as a first-class resource
  • Roles are defined by judgment and direction, not execution
  • Headcount is a fraction of a traditional company at the same output
  • If AI disappeared tomorrow, the company couldn't function

Typical behaviors:

  • The starting question is: "What part should be automated?"
  • Agents, pipelines, and decision systems built (code or no-code)
  • Processes designed so humans handle judgment, AI handles execution
  • AI impact is measured (time saved, costs reduced, quality improved)
  • AI literacy is a condition of employment

Upside: 10x leverage, structural cost advantage, speed that competitors can't match
Risk: requires people who are hard to find; no room for passengers

Engineering Scale — Rungs 0 to 5

Engineering needs finer granularity. This scale, based on Dan Shapiro's framework, describes the progression of AI-assisted software development; the AI Lab documents the scale and how it operates in practice.

| Rung | Human's role | Who writes the code | Who reviews the code |
|---|---|---|---|
| 0 — Assisted coding | Human codes, AI suggests | Human | Human |
| 1 — Scoped delegation | Human assigns scoped tasks | AI | Human (everything) |
| 2 — Supervised generation | Human supervises multi-file changes | AI | Human (everything) |
| 3 — Directed development | Human directs, reviews at feature/PR level | AI | Human (PR) |
| 4 — Spec-driven development | Human writes the spec, verifies results | AI | Nobody (tests verify) |
| 5 — Autonomous production | Spec goes in, software comes out | AI | Nobody (scenarios verify) |
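
At Rungs 4-5 the spec itself does the reviewing: the human writes executable checks, and the generated code is accepted only if they pass. A minimal sketch of the Rung 4 pattern; the function name, its behavior, and the spec cases are invented for illustration:

```python
# Rung 4 sketch: the human-authored spec is a set of executable checks;
# nobody reads the generated code -- the spec verifies it.

# --- Human-written spec: what "normalize_discount" must do ---
SPEC = [
    (("10%",), 0.10),   # percentage string -> fraction
    ((0.25,), 0.25),    # already a fraction -> unchanged
    ((25,), 0.25),      # whole number -> treated as a percent
]

# --- AI-generated implementation (treated as a black box) ---
def normalize_discount(value) -> float:
    if isinstance(value, str) and value.endswith("%"):
        return float(value[:-1]) / 100
    if isinstance(value, float):
        return value
    return value / 100

# --- Verification: the spec, not a human, reviews the code ---
for args, expected in SPEC:
    assert normalize_discount(*args) == expected
print("spec passed")
```

The design point is the ownership split from the table: the human's artifact is `SPEC`; the implementation is disposable and can be regenerated as long as the spec keeps passing.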

Mapping

| Organizational scale | Engineering scale |
|---|---|
| Level 1 — AI-Assisted | Rungs 0-1 |
| Level 2 — AI-Integrated | Rungs 2-3 |
| Level 3 — AI-Native | Rungs 4-5 |

Diagnostic Questions

For the organization

"If AI disappeared tomorrow, what would change?"

  • Nothing structural → Level 1
  • Some workflows break → Level 2
  • The company can't function → Level 3

For leaders

"What would you remove from the org chart if AI were fully utilized?"

  • Can't answer → Tier 1
  • Mentions tasks → Tier 2
  • Mentions roles or processes → Tier 3

For individuals

"Show me something you've built or changed because AI exists."

  • Talks about prompts used → Tier 1
  • Shows workflows or templates → Tier 2
  • Shows systems or process changes → Tier 3

Acceptance Criteria

Level 2 — Achieved when ALL these criteria are met:

  • AI usage is a documented expectation for every role, not optional
  • Every department maintains a structured context file loaded before AI tasks
  • Shared prompt libraries or workflow templates exist and are in use
  • At least 1 workflow per department has been redesigned around AI (before/after documented)
  • KPIs include AI output metrics (not just activity)
  • "How did AI help?" is asked in reviews and retrospectives
  • If AI disappeared tomorrow, at least some workflows would break

Level 3 — Achieved when ALL these criteria are met:

  • Roles are defined by judgment and direction, not execution
  • Agents, pipelines, or decision systems are in production (not prototypes)
  • Non-trivial tasks have written specifications conforming to the execution standards
  • Every AI system in production has an assigned Spec Owner, Context Owner, and Evaluation Owner
  • AI impact is measured by department (time saved, costs reduced, quality improved)
  • Hiring profiles require Tier 2+ minimum
  • If AI disappeared tomorrow, the department couldn't function
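
The all-or-nothing gating above can be expressed mechanically. A sketch, assuming a department's audit is recorded as a set of satisfied criteria; the abbreviated criterion names are invented stand-ins for the bullets, not a prescribed format:

```python
# Sketch of the acceptance gates: a level is achieved only when ALL of
# its criteria are met. Names abbreviate the checklist bullets above.

LEVEL_2 = [
    "ai_usage_is_documented_expectation",
    "context_file_per_department",
    "shared_prompt_libraries_in_use",
    "one_workflow_redesigned_per_department",
    "kpis_include_ai_output_metrics",
    "ai_asked_about_in_reviews",
    "some_workflows_break_without_ai",
]

LEVEL_3 = [
    "roles_defined_by_judgment_not_execution",
    "agent_systems_in_production",
    "specs_for_nontrivial_tasks",
    "owners_assigned_per_ai_system",
    "ai_impact_measured_per_department",
    "hiring_requires_tier_2_plus",
    "department_cannot_function_without_ai",
]

def achieved_level(met: set[str]) -> int:
    """Return the highest level whose full checklist is satisfied."""
    if all(c in met for c in LEVEL_3):
        return 3
    if all(c in met for c in LEVEL_2):
        return 2
    return 1

print(achieved_level(set(LEVEL_2)))  # 2: every Level 2 box ticked, Level 3 not yet
```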

The Transformation Path

Level 1 → Level 2

Prerequisites:

  • Leadership commits to AI as an operational standard, not optional
  • Investment in shared AI infrastructure (tools, templates, training)
  • Processes audited and redesigned for AI integration
  • KPIs updated to measure AI output
  • "How did AI help?" becomes a standard question

Timeline: 3-6 months with committed leadership

Level 2 → Level 3

Level 2 is the operational floor: every department must reach it. Level 3 is the organizational target. Non-engineering departments aim for Level 2 as their first milestone; engineering aims directly for Level 3 via the AI Lab.

Prerequisites:

  • Leadership is willing to eliminate roles, not just tasks (see the Role Decision Matrix)
  • Hiring profiles change to require Tier 2+ minimum
  • Product/service is redesigned assuming AI execution
  • Organizational structure flattens significantly

Timeline: 6-12 months

For engineering, the AI Lab lifecycle defines the specific phase sequence from Rung 3 to Rung 5.


Leadership Tiers

The company can't exceed the tier of its leadership. Leadership is the ceiling.

Tier 1: Publicly endorses AI. Uses it personally. Doesn't push adoption.

Tier 2: Sets expectations by role. Asks "how did AI help?". Funds automation before hiring.

Tier 3: Redesigns the organizational structure. Rewrites roles and KPIs. Makes AI literacy a condition of leadership.


Individual Tiers

Tier 1: "AI helps me do my job faster."

Tier 2: "AI helps us do this task better and more systematically."

Tier 3: "This role should exist differently because AI exists."

The difference between tiers is operational, not attitudinal.

  • Tier 1: uses AI tools but has no current sense of where the human-agent boundary sits for their domain; they calibrated once (or never) and haven't updated.
  • Tier 2: designs clean handoffs between human and agent work, maintains an accurate model of how agents fail for their specific tasks, and restructures workflows as capabilities shift.
  • Tier 3: does all of this, plus forecasts where the boundary will move next and allocates attention where it creates the most value, treating human attention as the scarcest resource in an agent-rich environment.

