Part 8: The fuller prototype

The notebook shows how the feature works under the hood. The prototype/ folder turns the same ideas into importable modules and tests, which shows where each notebook shortcut belongs once the code leaves the notebook.

In the prototype you keep using ToyAIKit, but you wrap setup in create_agent() and split skill, command, and tool logic into separate files.

Install and run the prototype

Clone the public workshop repo with a sparse checkout and enter the prototype folder:

git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/alexeygrigorev/workshops.git workshops-reference
cd workshops-reference
git sparse-checkout set agent-skills/prototype
cd agent-skills/prototype
uv sync

Create .env with the provider keys you need:

OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ZAI_API_KEY=your_zai_api_key_here

The prototype pyproject.toml depends on:

  • python-dotenv
  • python-frontmatter
  • pyyaml
  • toyaikit
  • jupyter and pytest for development

The prototype pyproject.toml requires Python >=3.13, while the notebook path works with Python 3.10 or newer. Use Python 3.13 or newer if you run the prototype as-is.

Agent factory

prototype/src/main.py creates the configured agent:

def create_agent(
    project_dir: Path | str = None,
    provider: str = "openai",
    model: str = "gpt-4o",
) -> RealAgent:
    assert project_dir is not None
    project_dir = Path(project_dir)

    loader = SkillLoader()
    skill_tool = SkillToolsWrapper(loader=loader)
    project_tools = AgentTools(project_dir)

    return RealAgent(
        agent_name="toai",
        project_tools=project_tools,
        skill_tool=skill_tool,
        provider=provider,
        model=model,
    )

The factory packages the notebook steps:

  • Create the skill loader.
  • Wrap it as a skill tool.
  • Create project file and shell tools.
  • Create the real agent with a provider and model.

Provider configuration

prototype/src/agent.py supports multiple OpenAI-compatible provider setups:

PROVIDERS = {
    "openai": {
        "api_key_env": "OPENAI_API_KEY",
        "model": "gpt-4o-mini",
    },
    "zai": {
        "api_key_env": "ZAI_API_KEY",
        "base_url": "https://api.z.ai/api/paas/v4/",
        "model": "glm-4.7",
    },
    "anthropic": {
        "api_key_env": "ANTHROPIC_API_KEY",
        "model": "claude-sonnet-4-20250514",
    },
}

The agent loads the configured API key from the environment. It creates an OpenAI-compatible client, registers tools, and starts a ToyAIKit runner.
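The lookup described above can be sketched as a small function. This is a minimal illustration, not the code in prototype/src/agent.py: resolve_provider and its env parameter are hypothetical names, and the error handling is an assumption.

```python
import os

# Two entries copied from the PROVIDERS table above.
PROVIDERS = {
    "openai": {"api_key_env": "OPENAI_API_KEY", "model": "gpt-4o-mini"},
    "zai": {
        "api_key_env": "ZAI_API_KEY",
        "base_url": "https://api.z.ai/api/paas/v4/",
        "model": "glm-4.7",
    },
}


def resolve_provider(name, model=None, env=None):
    """Hypothetical helper: turn a PROVIDERS entry into client settings."""
    env = os.environ if env is None else env
    if name not in PROVIDERS:
        raise ValueError(f"Unknown provider: {name}")
    config = PROVIDERS[name]

    # Read the key from the variable the provider entry names.
    api_key = env.get(config["api_key_env"])
    if not api_key:
        raise RuntimeError(f"Set {config['api_key_env']} in .env or the environment")

    # An explicit model wins over the provider default.
    settings = {"api_key": api_key, "model": model or config["model"]}
    # Only providers with a custom endpoint carry a base_url.
    if "base_url" in config:
        settings["base_url"] = config["base_url"]
    return settings
```

The returned dictionary holds the values an OpenAI-compatible client typically needs: an API key, an optional base URL, and a model name.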

Prompt building with skills

The prototype builds the system prompt dynamically:

def _build_prompt(self, base_prompt: str) -> str:
    if not self.skill_tool:
        return base_prompt

    skills = self.skill_tool._loader.list()
    if not skills:
        return base_prompt

    return base_prompt + "\n\n" + self.skill_tool._loader.description

This is the same approach as the notebook: the prompt receives the compact skill list, not the full skill bodies.
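To see what "compact skill list" means in practice, here is a self-contained sketch. StubLoader is a hypothetical stand-in for SkillLoader, and the exact wording of its description text is an assumption; only the shape of the logic follows _build_prompt.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    content: str  # full skill body, never added to the system prompt here


class StubLoader:
    """Hypothetical stand-in for SkillLoader, reduced to two members."""

    def __init__(self, skills):
        self._skills = skills

    def list(self):
        return self._skills

    @property
    def description(self):
        # Assumed format: one line per skill with its name and summary.
        lines = [f"- {s.name}: {s.description}" for s in self._skills]
        return "Available skills:\n" + "\n".join(lines)


def build_prompt(base_prompt, loader):
    # Without a loader or without skills, the base prompt is left untouched.
    if loader is None or not loader.list():
        return base_prompt
    return base_prompt + "\n\n" + loader.description
```

With one skill named hello and the description "Say hello", build_prompt("You are a coding agent.", loader) ends with the line "- hello: Say hello", and the skill's content field does not appear anywhere in the prompt.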

Run slash commands before the model sees them

The prototype's RealAgent.chat() implements the command approach that the notebook deferred:

def chat(self, user_message: str) -> str:
    if user_message.startswith("/"):
        parts = user_message.split(" ", 1)
        command_name = parts[0][1:]
        arguments = parts[1] if len(parts) > 1 else ""

        prompt = execute_command(command_name, arguments)
        if prompt is None:
            return f"Command not found: /{command_name}"
        user_message = prompt

    result = self.runner.loop(prompt=user_message)
    self._last_result = result
    return result.last_message

Here the model never sees /review src/agent.py. It sees the rendered command prompt instead, which is closer to the way real coding agents usually handle commands.
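The parsing step inside chat() can be pulled out and tried on its own. The sketch below reuses the same split logic; parse_slash_command is a hypothetical name, not a function from the prototype.

```python
def parse_slash_command(user_message):
    """Return (command_name, arguments), or None for ordinary messages."""
    if not user_message.startswith("/"):
        return None
    # Split once: the first token is the command, the rest is the raw argument string.
    parts = user_message.split(" ", 1)
    command_name = parts[0][1:]  # drop the leading slash
    arguments = parts[1] if len(parts) > 1 else ""
    return command_name, arguments
```

For example, parse_slash_command("/review src/agent.py") returns ("review", "src/agent.py"), and parse_slash_command("/test") returns ("test", "").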

Template processing

prototype/src/commands.py implements $1, $2, and $ARGUMENTS substitution in _process_template(template, arguments). You import this code rather than typing it in. The full source is in the code repo at agent-skills/prototype/src/commands.py.

The function splits arguments into positional tokens, then fills the placeholders in the template:

  • $1, $2, and so on take one positional argument each.
  • The last numbered placeholder soaks up every remaining argument.
  • $ARGUMENTS is replaced with the full raw argument string.

So with the review template body Review the code at $1 for:, the input /review src/agent.py renders to a prompt that starts Review the code at src/agent.py for:. The model sees that rendered prompt, not the raw /review line.

This turns a command file into a practical prompt template.
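The substitution rules listed above can be sketched in a few lines. This is not the prototype's _process_template: render_template is a hypothetical name, and two details are assumptions, namely that arguments are split on whitespace and that a placeholder with no matching argument renders as an empty string.

```python
import re


def render_template(template, arguments):
    """Fill $1, $2, ... and $ARGUMENTS in a command template (sketch)."""
    tokens = arguments.split()  # assumption: whitespace-separated arguments
    numbers = [int(n) for n in re.findall(r"\$(\d+)", template)]
    last = max(numbers, default=0)  # highest numbered placeholder in the template

    def fill(match):
        name = match.group(1)
        if name == "ARGUMENTS":
            return arguments  # the full raw argument string
        index = int(name)
        if index < 1 or index > len(tokens):
            return ""  # assumption: missing arguments render as empty
        if index == last:
            # The last numbered placeholder takes every remaining argument.
            return " ".join(tokens[index - 1:])
        return tokens[index - 1]

    return re.sub(r"\$(ARGUMENTS|\d+)", fill, template)
```

With this sketch, render_template("Compare $1 with $2", "a.py b.py c.py") returns "Compare a.py with b.py c.py": $1 takes one token and $2, as the last numbered placeholder, takes the rest.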

Example prototype commands

The prototype includes commands/review.md:

---
description: Review code for quality and suggest improvements
---

Review the code at $1 for:
1. Code quality and readability
2. Potential bugs or issues
3. Performance considerations
4. Best practices violations

Provide specific, actionable suggestions.

It also includes commands/test.md:

---
description: Run tests with coverage
---

Run the full test suite with coverage report. Focus on any failing tests and suggest fixes.

Both are smaller than the /kid and /parent demo commands, and they make better test subjects because the expected behavior is easy to assert.
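Both command files share one layout: a front matter block between --- lines, followed by the prompt template. The prototype lists python-frontmatter as a dependency, which presumably handles this parsing. The sketch below is a stdlib-only stand-in that understands only flat key: value lines, and split_front_matter is a hypothetical name.

```python
def split_front_matter(text):
    """Split a command file into (metadata, body).

    Stand-in for python-frontmatter; handles only flat `key: value` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter: the whole file is the body

    # Find the closing --- line after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text

    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body
```

Applied to commands/test.md, this returns {"description": "Run tests with coverage"} as the metadata and the single instruction line as the body.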

Unit tests for skills

prototype/tests/test_skills.py checks deterministic loader behavior.

It verifies these behaviors:

  • The loader discovers hello, joke, counter, and deploy_app.
  • Each skill has a name, description, and content.
  • Missing skills return None from get().
  • The tool description includes the skill list.
  • deploy_app contains file references to scripts and templates.

Run the skill tests:

uv run pytest tests/test_skills.py -v

These tests don't call the LLM, so they should be stable.

Unit tests for commands

prototype/tests/test_commands.py verifies command discovery and template rendering.

It checks these behaviors:

  • review and test are discovered.
  • Missing commands return None.
  • execute_command("review", "src/agent.py") returns a prompt containing src/agent.py.
  • $1 is replaced.

Run the command tests:

uv run pytest tests/test_commands.py -v

The file also contains an LLM-backed full-flow test that creates an agent and feeds a rendered review prompt into it. Treat that as an integration test.

Integration tests for skills and commands

The LLM-backed tests are in:

  • tests/test_agent_run.py
  • tests/test_agent_with_skills.py
  • tests/test_agent_with_commands.py

They assert behavior such as:

  • The agent calls read_file when asked to read README.md.
  • The agent loads deploy_app when asked how to deploy an application.
  • The agent loads coding_standards and reads function_template.md.
  • The agent runs check_standards.sh when asked to check coding standards.
  • /review src/agent.py produces a code-review response.
  • /test src/skills.py produces a testing-oriented response.
  • Missing commands return Command not found.

Run an integration file when you have an API key configured:

uv run pytest tests/test_agent_with_skills.py -v -s

These tests call an LLM, so they're slower and can be less deterministic than the loader tests. They're still useful because the behavior under test isn't only parsing files: the model has to decide to call the right tool.

Continue with Q&A: skills and commands for side questions from the live session.
