Part 8: The fuller prototype

The notebook shows how the feature works under the hood. The prototype/ folder turns the same ideas into importable modules and tests, and shows where the notebook's shortcuts should live once the code leaves the notebook.

The prototype uses ToyAIKit too, but it wraps setup in create_agent() and keeps skill, command, and tool logic in separate files.

Install and run the prototype

Clone the public workshop repo with a sparse checkout and enter the prototype folder:

git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/alexeygrigorev/workshops.git workshops-reference
cd workshops-reference
git sparse-checkout set agent-skills/prototype
cd agent-skills/prototype
uv sync

Create .env with the provider keys you need:

OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ZAI_API_KEY=your_zai_api_key_here

The prototype pyproject.toml depends on:

  • python-dotenv
  • python-frontmatter
  • pyyaml
  • toyaikit
  • jupyter and pytest for development

The prototype pyproject.toml requires Python >=3.13, while the notebook path works with Python 3.10 or newer. Use the stricter requirement if you run the prototype as-is.

Agent factory

prototype/src/main.py creates the configured agent:

def create_agent(
    project_dir: Path | str | None = None,
    provider: str = "openai",
    model: str = "gpt-4o",
) -> RealAgent:
    assert project_dir is not None
    project_dir = Path(project_dir)

    loader = SkillLoader()
    skill_tool = SkillToolsWrapper(loader=loader)
    project_tools = AgentTools(project_dir)

    return RealAgent(
        agent_name="toai",
        project_tools=project_tools,
        skill_tool=skill_tool,
        provider=provider,
        model=model,
    )

The factory packages the notebook steps:

  • Create the skill loader.
  • Wrap it as a skill tool.
  • Create project file and shell tools.
  • Create the real agent with a provider and model.
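
A quick usage sketch (run from the prototype folder with a key in .env; the import path and the prompt are illustrative, assuming src/main.py is importable as main):

from main import create_agent  # assumed import path

agent = create_agent(project_dir=".", provider="openai", model="gpt-4o-mini")
print(agent.chat("List the files in this project"))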

Provider configuration

prototype/src/agent.py supports multiple OpenAI-compatible provider setups:

PROVIDERS = {
    "openai": {
        "api_key_env": "OPENAI_API_KEY",
        "model": "gpt-4o-mini",
    },
    "zai": {
        "api_key_env": "ZAI_API_KEY",
        "base_url": "https://api.z.ai/api/paas/v4/",
        "model": "glm-4.7",
    },
    "anthropic": {
        "api_key_env": "ANTHROPIC_API_KEY",
        "model": "claude-sonnet-4-20250514",
    },
}

The agent loads the configured API key from the environment, creates an OpenAI-compatible client, registers tools, and starts a ToyAIKit runner.
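
A minimal sketch of that wiring for the OpenAI-compatible entries, assuming the PROVIDERS table above (the _make_client helper name is illustrative, not taken from the repo):

import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # pull the keys from .env, as configured earlier

def _make_client(provider: str) -> OpenAI:
    config = PROVIDERS[provider]
    api_key = os.environ[config["api_key_env"]]  # raises KeyError if the key is missing
    # Only non-default providers such as zai set base_url; None means the standard endpoint.
    return OpenAI(api_key=api_key, base_url=config.get("base_url"))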

Prompt building with skills

The prototype builds the system prompt dynamically:

def _build_prompt(self, base_prompt: str) -> str:
    if not self.skill_tool:
        return base_prompt

    skills = self.skill_tool._loader.list()
    if not skills:
        return base_prompt

    return base_prompt + "\n\n" + self.skill_tool._loader.description

This is the same approach as the notebook: the prompt receives the compact skill list, not the full skill bodies.

Run slash commands before the model sees them

The prototype's RealAgent.chat() implements the command approach that the notebook deferred:

def chat(self, user_message: str) -> str:
    if user_message.startswith("/"):
        parts = user_message.split(" ", 1)
        command_name = parts[0][1:]
        arguments = parts[1] if len(parts) > 1 else ""

        prompt = execute_command(command_name, arguments)
        if prompt is None:
            return f"Command not found: /{command_name}"
        user_message = prompt

    result = self.runner.loop(prompt=user_message)
    self._last_result = result
    return result.last_message

Here the model never sees /review src/agent.py. It sees the rendered command prompt. This is closer to the way real coding agents usually handle commands.
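
An illustrative session (the paths are examples; the agent comes from create_agent()):

agent = create_agent(project_dir=".")
agent.chat("/review src/agent.py")   # rewritten to the rendered review prompt first
agent.chat("/nope arg")              # "Command not found: /nope", no model call
agent.chat("Summarize README.md")    # passed to the runner unchanged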

Template processing

prototype/src/commands.py implements $1, $2, and $ARGUMENTS:

import re
import shlex

def _process_template(template: str, arguments: str) -> str:
    # Shell-style splitting keeps quoted arguments together; fall back
    # to plain whitespace splitting when the quoting is unbalanced.
    try:
        args = shlex.split(arguments) if arguments else []
    except ValueError:
        args = arguments.split() if arguments else []

    # Collect the positional placeholders ($1, $2, ...) and note the
    # number of the one that appears last in the template.
    placeholder_regex = re.compile(r"\$(\d+)")
    placeholders = placeholder_regex.findall(template)
    last = int(placeholders[-1]) if placeholders else 0

The replacement function gives the last positional placeholder the remaining arguments:

    def replace_placeholder(match):
        index = int(match.group(1)) - 1
        # Not enough arguments for this placeholder: drop it.
        if index >= len(args):
            return ""
        # The last positional placeholder absorbs all remaining arguments.
        if match.group(1) == str(last):
            return " ".join(args[index:])
        return args[index]

    template = placeholder_regex.sub(replace_placeholder, template)
    # $ARGUMENTS always receives the raw, unsplit argument string.
    template = template.replace("$ARGUMENTS", arguments)

    return template

This turns a command file into a practical prompt template.
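
Two hand-traced examples of those rules (written for this article, not taken from the repo's tests):

_process_template("Compare $1 against $2", "src/agent.py src/skills.py plus notes")
# -> "Compare src/agent.py against src/skills.py plus notes"
# $2 is the last positional placeholder, so it absorbs the leftover arguments.

_process_template("Summarize: $ARGUMENTS", "the last three commits")
# -> "Summarize: the last three commits" ($ARGUMENTS gets the raw string)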

Example prototype commands

The prototype includes commands/review.md:

---
description: Review code for quality and suggest improvements
---

Review the code at $1 for:
1. Code quality and readability
2. Potential bugs or issues
3. Performance considerations
4. Best practices violations

Provide specific, actionable suggestions.

It also includes commands/test.md:

---
description: Run tests with coverage
---

Run the full test suite with coverage report. Focus on any failing tests and suggest fixes.

These are smaller than the /kid and /parent demo commands, but they make better test subjects because the expected behavior is easy to assert.

Unit tests for skills

prototype/tests/test_skills.py checks deterministic loader behavior. It verifies that:

  • The loader discovers hello, joke, counter, and deploy_app.
  • Each skill has a name, description, and content.
  • Missing skills return None from get().
  • The tool description includes the skill list.
  • deploy_app contains file references to scripts and templates.
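
In pytest terms, the first and third checks look roughly like this (the import path and the skill attributes follow the loader API shown earlier; they are assumptions, not a copy of the test file):

from skills import SkillLoader  # assumed import path within the prototype

def test_loader_discovers_expected_skills():
    loader = SkillLoader()
    names = {skill.name for skill in loader.list()}
    assert {"hello", "joke", "counter", "deploy_app"} <= names

def test_missing_skill_returns_none():
    assert SkillLoader().get("no_such_skill") is None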

Run the skill tests:

uv run pytest tests/test_skills.py -v

These tests do not call the LLM, so they should be stable.

Unit tests for commands

prototype/tests/test_commands.py verifies command discovery and template rendering. It checks that:

  • review and test are discovered.
  • Missing commands return None.
  • execute_command("review", "src/agent.py") returns a prompt containing src/agent.py.
  • $1 is replaced.
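
A sketch of those assertions (import path assumed, test names written for this article):

from commands import execute_command  # assumed import path within the prototype

def test_review_substitutes_the_path():
    prompt = execute_command("review", "src/agent.py")
    assert prompt is not None and "src/agent.py" in prompt
    assert "$1" not in prompt  # the placeholder was replaced, not echoed

def test_missing_command_returns_none():
    assert execute_command("no_such_command", "") is None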

Run them:

uv run pytest tests/test_commands.py -v

The file also contains an LLM-backed full-flow test that creates an agent and feeds a rendered review prompt into it. Treat that as an integration test.

Integration tests for skills and commands

The LLM-backed tests are in:

  • tests/test_agent_run.py
  • tests/test_agent_with_skills.py
  • tests/test_agent_with_commands.py

They assert behavior such as:

  • The agent calls read_file when asked to read README.md.
  • The agent loads deploy_app when asked how to deploy an application.
  • The agent loads coding_standards and reads function_template.md.
  • The agent runs check_standards.sh when asked to check coding standards.
  • /review src/agent.py produces a code-review response.
  • /test src/skills.py produces a testing-oriented response.
  • Missing commands return Command not found.

Run an integration file when you have an API key configured:

uv run pytest tests/test_agent_with_skills.py -v -s

These tests call an LLM, so they are slower and less deterministic than the loader tests. They are still worth running because the behavior under test is not just file parsing: the model must decide to call the right tool.
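
One way to keep such tests runnable without a missing key failing every run is a skip marker plus a loose assertion; a sketch (the marker and the keyword check are assumptions for illustration, not the repo's exact tests):

import os

import pytest

from main import create_agent  # assumed import path

requires_key = pytest.mark.skipif(
    "OPENAI_API_KEY" not in os.environ,
    reason="integration tests need a configured provider key",
)

@requires_key
def test_review_command_produces_a_review():
    agent = create_agent(project_dir=".")
    answer = agent.chat("/review src/agent.py")
    # LLM output varies, so assert on intent rather than exact wording.
    assert any(word in answer.lower() for word in ("review", "improve", "suggest"))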

Continue with Q&A: skills and commands for side questions from the live session.
