The Agentic Engineering Hierarchy: A Roadmap from Tools to Autonomous Partners
An in-depth analysis of Bassim El-Edath's framework that is redefining how we build, measure, and collaborate with AI systems.
Key Takeaways
- A New Taxonomy for AI: Bassim El-Edath's "Levels of Agentic Engineering" provides a crucial 6-level framework to classify AI systems based on their autonomy and collaborative capability, moving beyond simplistic "AI vs. not AI" distinctions.
- The Journey from Tool to Teammate: The framework illustrates a clear progression from deterministic, user-driven tools (Level 0) to fully autonomous agents that can set their own goals and operate in open-ended environments (Level 5).
- Practical Development Implications: Each level imposes distinct engineering challenges, requiring advancements in planning, memory, tool use, and safety. Most current "AI agents" reside at Levels 1-3.
- Ethical & Safety Imperatives: As we ascend the hierarchy, questions of control, alignment, and responsibility become paramount. The framework serves as a warning system for the societal impacts of increasingly agentic AI.
- A Mirror for Human-AI Collaboration: The levels fundamentally redefine the user's role from micro-manager to strategic overseer, forcing us to reconsider the future of work and creativity.
Top Questions & Answers Regarding Agentic Engineering
What is the difference between a "tool" AI and an "agentic" AI?
A tool AI (Level 0-1) executes a specific, user-initiated command and then stops. Think of a calculator or a simple chatbot that answers a single query. An agentic AI (Level 2+) takes a higher-level goal, breaks it down into sub-tasks, uses tools (like web search or code execution), maintains context across steps, and works autonomously until the goal is met or it needs clarification. The agent is proactive in its problem-solving approach.
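The contrast can be made concrete in a few lines of code. This is a minimal, hypothetical sketch, not any real framework's API: a tool runs one command and stops, while an agent decomposes a goal, calls tools repeatedly, and keeps context across steps.

```python
def run_tool(command: str) -> str:
    """Level 0-1: execute one user-initiated command, then stop."""
    return f"result of {command!r}"

class Agent:
    """Level 2+: take a goal, decompose it, act step by step with context."""

    def __init__(self, goal: str):
        self.goal = goal
        self.context: list[str] = []   # short-term memory across steps

    def decompose(self) -> list[str]:
        # A real agent would use an LLM planner here; we hard-code sub-tasks.
        return [f"{self.goal}: step {i}" for i in (1, 2, 3)]

    def run(self) -> list[str]:
        for task in self.decompose():
            self.context.append(run_tool(task))  # tools reused as plan steps
        return self.context

print(run_tool("add 2+2"))          # one shot, no memory
print(Agent("build report").run())  # multi-step, context carried forward
```

The structural difference is the loop: the agent, not the user, decides when to invoke the tool next.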
What level are today's AI agents actually at?
Most exist in the Level 2 (Task Automation) to Level 3 (Conditional Agency) range. They can decompose a high-level prompt (e.g., "build a website") into a sequence of actions (write HTML, fetch styles, deploy). They have basic memory and can use tools. However, they frequently fail at complex, multi-step tasks without human intervention, lack robust long-term planning (Level 4), and cannot independently form their own goals in a novel environment (Level 5). They are sophisticated assistants, not independent actors.
How does this framework differ from the SAE levels for autonomous vehicles?
The SAE levels are domain-specific (driving) and focus on a replacement dynamic: the car takes over from the human. El-Edath's framework is domain-agnostic and emphasizes a collaboration spectrum. It better captures the varied ways AI integrates into creative, analytical, and strategic work. It also explicitly addresses the AI's internal reasoning and planning capabilities, which are more relevant to general intelligence than the control-transfer model of autonomous vehicles.
What is required to move from Level 3 to Level 4?
The leap requires solving "chronic context loss" and achieving robust hierarchical planning. A Level 4 agent must hold a complex, evolving world model in memory over extended periods (days, weeks), dynamically re-plan when faced with unforeseen obstacles without losing sight of the primary objective, and manage competing sub-goals. This demands breakthroughs in long-term memory architectures, reliable self-critique, and perhaps a form of conceptual understanding that current LLMs still lack.
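The re-planning requirement can be sketched as a data structure: a plan whose steps may be swapped out on failure while the root objective and accumulated memory survive. Everything here (the `Plan` class, the failure injection, the string-based "memory") is an invented toy standing in for far harder real machinery.

```python
class Plan:
    """Toy hierarchical plan: re-plan a failed step, never lose the objective."""

    def __init__(self, objective, steps):
        self.objective = objective     # the primary objective is never dropped
        self.steps = steps
        self.memory = []               # stand-in for a long-lived world model

    def execute(self, fails_on=None) -> bool:
        i = 0
        while i < len(self.steps):
            step = self.steps[i]
            if step == fails_on:
                self.steps[i] = f"alternative to {step}"  # re-plan in place
                continue                                  # retry the same slot
            self.memory.append(f"done: {step}")
            i += 1
        return True

plan = Plan("ship feature", ["design", "blocked step", "deploy"])
plan.execute(fails_on="blocked step")
print(plan.objective, plan.memory)
```

The toy makes the invariant explicit: an obstacle mutates the sub-plan, not the objective, and nothing already learned is discarded.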
Decoding the Six Levels: From Scripts to Sentience
Bassim El-Edath's framework, detailed on his blog, provides a much-needed scaffold for an industry drowning in the hype-filled term "AI agent." By drawing a parallel to the established SAE levels for autonomous vehicles, it grounds a nebulous concept in a familiar progression of capability and autonomy. Our analysis expands on this core to explore the technical, philosophical, and practical implications of each rung on the agentic ladder.
Level 0: No Agency
The Deterministic Tool. This is classic software: a compiler, a spreadsheet formula. It performs a fixed function with zero adaptability. Input X always yields output Y. The "user" is actually an operator, specifying every step.
Level 1: Prompt-Driven Action
The Stateless Assistant. Modern LLM chat interfaces epitomize this level. A user provides a prompt ("write a poem"), the AI generates a completion, and the interaction resets. It has no memory of past exchanges and cannot chain actions without explicit, repeated user instruction.
Level 2: Task Automation
The Script Runner. Here, agency emerges. The user gives a goal ("analyze this dataset and create a report"), and the AI plans and executes a sequence of steps (load data, clean, run analysis, format, save). It can use external tools (Python, APIs) and has short-term memory for the task. Most current "agent" frameworks (LangChain, etc.) target this level.
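The plan-and-execute pattern at this level can be illustrated with a tiny pipeline: a fixed step sequence, each step dispatched to a registered "tool", with the intermediate result serving as the task's short-term memory. The tool registry and steps below are invented for illustration; real frameworks such as LangChain are far more elaborate.

```python
# Each "tool" transforms the running state; a real agent would select these
# dynamically rather than from a hand-written table.
TOOLS = {
    "load":    lambda data: [1, 2, None, 4],                    # fake dataset
    "clean":   lambda data: [x for x in data if x is not None],
    "analyze": lambda data: sum(data) / len(data),
    "report":  lambda data: f"mean = {data}",
}

def run_task(plan):
    state = None                    # short-term memory scoped to this task
    for step in plan:
        state = TOOLS[step](state)  # execute the next step in the sequence
    return state

print(run_task(["load", "clean", "analyze", "report"]))
```

Note what is missing: no error recovery, no clarifying questions, no memory beyond the single run; those are exactly what separates Level 2 from Level 3.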
Level 3: Conditional Agency
The Context-Aware Collaborator. This is a significant leap. The agent can handle ambiguity and conditional logic. It can ask clarifying questions, pivot strategies upon failure, and maintain context across multiple, related tasks over a longer session. It begins to resemble a junior team member who needs oversight but can work independently on a defined project.
Level 4: Strategic Agency
The Project Manager. The agent operates over extended timelines, managing high-level objectives with numerous sub-goals. It proactively acquires new information, re-evaluates plans, and balances trade-offs. The user shifts to a strategic overseer, providing high-level direction rather than task-level input. This level remains largely aspirational, a frontier of current research.
Level 5: Full Agency
The Autonomous Entity. The AI sets its own goals within broad constraints, operates in open-ended environments, and demonstrates continuous learning and adaptation. This is the realm of science fiction and profound philosophical debate. Achieving this safely is the central challenge of AI alignment and raises existential questions about control and coexistence.
Beyond the Ladder: The Unspoken Implications
While the levels themselves are descriptive, their true power lies in the analytical angles they unlock.
1. The Illusion of Agency & The "Hollow Level" Problem
Many systems today are marketed at Level 3 but functionally operate at Level 1 or 2. They have the superficial trappings of agency (a conversational interface, a list of executed steps) but lack the robust planning and error recovery that defines true conditional agency. This creates a hazardous expectation gap. Users may over-trust a system that appears more capable than it is, leading to failures in critical applications. The framework, therefore, acts as a necessary calibration tool for both developers and consumers.
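One way to turn the framework into a working calibration tool is to probe for the behaviors that define each level rather than trusting the marketing. The checklist below is an illustrative heuristic, not a standard benchmark; a real evaluation would exercise the system against recorded transcripts and injected failures.

```python
def estimate_level(system: dict) -> int:
    """Map observed capabilities to the highest level they actually support."""
    checks = [
        (1, system.get("responds_to_prompts", False)),
        (2, system.get("chains_tool_calls", False)),
        (3, system.get("recovers_from_errors", False)
             and system.get("asks_clarifying_questions", False)),
    ]
    level = 0
    for lvl, passed in checks:
        if not passed:
            break                  # capabilities must stack; no skipping levels
        level = lvl
    return level

# A product marketed as "Level 3" that never recovers from errors is Level 2.
marketed_as_3 = {"responds_to_prompts": True, "chains_tool_calls": True,
                 "recovers_from_errors": False}
print(estimate_level(marketed_as_3))
```

The point of the cumulative check is the "hollow level" problem itself: a conversational veneer satisfies the Level 1 probe without touching the error-recovery probes that define Level 3.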
2. The Inversion of the User Interface
At Level 0, the UI is a command line: precise, technical, and unforgiving. By Level 3, the UI becomes a goal-oriented dialogue ("Increase our website conversion rate"). At Level 4+, the interface may shift to strategy reviews and constraint setting, more akin to a board meeting than a software dashboard. This evolution forces a complete rethink of human-computer interaction (HCI), moving from imperative programming to declarative leadership.
3. The New Security Perimeter: Agentic Supply Chains
An agent at Level 3+ doesn't just execute code; it selects and sequences tools from an ecosystem. This creates a dynamic, AI-driven "supply chain" of software components. The security model shifts from protecting a static application to monitoring and governing an agent's tool choices and execution paths in real-time. A malicious tool or a manipulated piece of context could steer an otherwise benign agent toward harmful outcomes, a threat vector traditional cybersecurity is unprepared for.
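A first line of defense for this new perimeter is a runtime gate: every tool invocation the agent proposes passes through an allowlist and an argument policy before anything executes. This is a deliberately simplistic sketch; the tool names, blocked patterns, and policy shape are invented, and production governance would also need provenance checks on the tools themselves.

```python
ALLOWED_TOOLS = {"web_search", "read_file"}        # the vetted tool ecosystem
BLOCKED_ARGS = ("/etc/passwd", "rm -rf")           # crude argument policy

def govern(tool: str, arg: str) -> str:
    """Gate an agent-proposed tool call before it is allowed to run."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool!r} is not on the allowlist")
    if any(bad in arg for bad in BLOCKED_ARGS):
        raise PermissionError(f"argument rejected by policy: {arg!r}")
    return f"ran {tool}({arg!r})"                  # only now does the call proceed

print(govern("web_search", "agentic engineering levels"))
try:
    govern("shell", "rm -rf /")                    # off-allowlist tool is refused
except PermissionError as e:
    print("blocked:", e)
```

The key design choice is that the gate sits between the agent's decision and the execution path, so a manipulated context can steer the agent's choices but not bypass the policy.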
Historical Context & The Road Ahead
The quest for agency is not new. It echoes through cybernetics, symbolic AI, and robotics. What's transformative today is the foundation of large language models, which provide a universal, adaptable "reasoning" substrate that previous approaches lacked. El-Edath's framework crystallizes this moment, showing us not just where we are, but the distinct valleys we must cross to reach the next peak.
The immediate industry focus is solidifying Level 3 (Conditional Agency), making agents reliable enough for real-world business processes. The next decade's grand challenge is the jump to Level 4 (Strategic Agency), which will require foundational advancements in AI memory, causal reasoning, and world modeling.
As we climb, the framework serves as an essential lighthouse. It reminds us that each increase in capability demands a commensurate increase in safety, oversight, and ethical foresight. The levels of agentic engineering aren't just a measure of what AI can do; they are a measure of our own maturity in building a future with powerful, autonomous partners.