Part 8: The fuller prototype
The notebook shows how the feature works under the hood. The prototype/
folder turns the same ideas into importable modules and tests. It shows where
the notebook shortcuts belong once the code leaves the notebook.
The prototype uses ToyAIKit too, but it wraps setup in create_agent() and
keeps skill, command, and tool logic in separate files.
Install and run the prototype
Clone the public workshop repo with a sparse checkout and enter the prototype folder:
git clone --depth 1 --filter=blob:none --sparse \
https://github.com/alexeygrigorev/workshops.git workshops-reference
cd workshops-reference
git sparse-checkout set agent-skills/prototype
cd agent-skills/prototype
uv sync
Create .env with the provider keys you need:
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ZAI_API_KEY=your_zai_api_key_here
The prototype pyproject.toml depends on:
- python-dotenv
- python-frontmatter
- pyyaml
- toyaikit
- jupyter and pytest for development
The prototype pyproject.toml pins Python >=3.13, while the notebook path
works with Python 3.10 or newer. Use the prototype pin if you run the prototype
as-is.
Agent factory
prototype/src/main.py creates the configured agent:
def create_agent(
project_dir: Path | str = None,
provider: str = "openai",
model: str = "gpt-4o",
) -> RealAgent:
assert project_dir is not None
project_dir = Path(project_dir)
loader = SkillLoader()
skill_tool = SkillToolsWrapper(loader=loader)
project_tools = AgentTools(project_dir)
return RealAgent(
agent_name="toai",
project_tools=project_tools,
skill_tool=skill_tool,
provider=provider,
model=model,
)
The factory packages the notebook steps:
- Create the skill loader.
- Wrap it as a skill tool.
- Create project file and shell tools.
- Create the real agent with a provider and model.
Provider configuration
prototype/src/agent.py supports multiple OpenAI-compatible provider setups:
PROVIDERS = {
"openai": {
"api_key_env": "OPENAI_API_KEY",
"model": "gpt-4o-mini",
},
"zai": {
"api_key_env": "ZAI_API_KEY",
"base_url": "https://api.z.ai/api/paas/v4/",
"model": "glm-4.7",
},
"anthropic": {
"api_key_env": "ANTHROPIC_API_KEY",
"model": "claude-sonnet-4-20250514",
},
}
The agent loads the configured API key from the environment, creates an OpenAI-compatible client, registers tools, and starts a ToyAIKit runner.
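Under those assumptions, provider resolution might look like this minimal sketch. `resolve_provider` is a hypothetical helper name for illustration, not the prototype's actual code:

```python
import os

# Mirror of the provider table above (trimmed to two entries).
PROVIDERS = {
    "openai": {"api_key_env": "OPENAI_API_KEY", "model": "gpt-4o-mini"},
    "zai": {
        "api_key_env": "ZAI_API_KEY",
        "base_url": "https://api.z.ai/api/paas/v4/",
        "model": "glm-4.7",
    },
}

def resolve_provider(name: str) -> dict:
    # Hypothetical helper: look up the table, then read the key from the env.
    config = PROVIDERS[name]
    api_key = os.environ.get(config["api_key_env"])
    if not api_key:
        raise RuntimeError(f"Set {config['api_key_env']} in your .env")
    return {
        "api_key": api_key,
        "base_url": config.get("base_url"),  # None -> provider default endpoint
        "model": config["model"],
    }

os.environ.setdefault("OPENAI_API_KEY", "sk-demo")  # stand-in key for the demo
print(resolve_provider("openai")["model"])  # gpt-4o-mini
```

The table plus one lookup function is the whole mechanism: adding a provider means adding a dictionary entry, not new code paths.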
Prompt building with skills
The prototype builds the system prompt dynamically:
def _build_prompt(self, base_prompt: str) -> str:
if not self.skill_tool:
return base_prompt
skills = self.skill_tool._loader.list()
if not skills:
return base_prompt
return base_prompt + "\n\n" + self.skill_tool._loader.description
This is the same approach as the notebook: the prompt receives the compact skill list, not the full skill bodies.
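The assembly can be exercised without ToyAIKit at all; `FakeLoader` below is a stand-in for the prototype's `SkillLoader`, not the real class:

```python
class FakeLoader:
    """Stand-in for SkillLoader: a fixed skill list plus a compact description."""
    def __init__(self, skills, description):
        self._skills = skills
        self.description = description

    def list(self):
        return self._skills

def build_prompt(base_prompt: str, loader) -> str:
    # Same shape as _build_prompt: append only the compact skill list.
    skills = loader.list()
    if not skills:
        return base_prompt
    return base_prompt + "\n\n" + loader.description

loader = FakeLoader(
    ["hello", "deploy_app"],
    "Available skills:\n- hello\n- deploy_app",
)
print(build_prompt("You are a coding agent.", loader))
```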
Run slash commands before the model sees them
The prototype's RealAgent.chat() implements the command approach that the
notebook deferred:
def chat(self, user_message: str) -> str:
if user_message.startswith("/"):
parts = user_message.split(" ", 1)
command_name = parts[0][1:]
arguments = parts[1] if len(parts) > 1 else ""
prompt = execute_command(command_name, arguments)
if prompt is None:
return f"Command not found: /{command_name}"
user_message = prompt
result = self.runner.loop(prompt=user_message)
self._last_result = result
return result.last_message
Here the model never sees /review src/agent.py. It sees the rendered command
prompt. This is closer to the way real coding agents usually handle commands.
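The parsing step is small enough to check in isolation. This sketch repeats the same split outside the class so the edge cases are visible:

```python
def parse_slash_command(user_message: str):
    """Return (command_name, arguments) for a slash command, else None."""
    if not user_message.startswith("/"):
        return None
    parts = user_message.split(" ", 1)
    command_name = parts[0][1:]          # strip the leading "/"
    arguments = parts[1] if len(parts) > 1 else ""
    return command_name, arguments

print(parse_slash_command("/review src/agent.py"))  # ('review', 'src/agent.py')
print(parse_slash_command("/test"))                 # ('test', '')
print(parse_slash_command("plain question"))        # None
```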
Template processing
prototype/src/commands.py implements $1, $2, and $ARGUMENTS:
def _process_template(template: str, arguments: str) -> str:
try:
args = shlex.split(arguments) if arguments else []
except ValueError:
args = arguments.split() if arguments else []
placeholder_regex = re.compile(r"\$(\d+)")
placeholders = placeholder_regex.findall(template)
last = int(placeholders[-1]) if placeholders else 0
The replacement function gives the last positional placeholder the remaining arguments:
def replace_placeholder(match):
index = int(match.group(1)) - 1
if index >= len(args):
return ""
if match.group(1) == str(last):
return " ".join(args[index:])
return args[index]
template = placeholder_regex.sub(replace_placeholder, template)
template = template.replace("$ARGUMENTS", arguments)
return template
This turns a command file into a practical prompt template.
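Putting the two halves of the function together gives a self-contained version you can run directly (renamed `process_template` here to drop the private-name convention):

```python
import re
import shlex

def process_template(template: str, arguments: str) -> str:
    # Split arguments shell-style; fall back to whitespace on bad quoting.
    try:
        args = shlex.split(arguments) if arguments else []
    except ValueError:
        args = arguments.split() if arguments else []

    placeholder_regex = re.compile(r"\$(\d+)")
    placeholders = placeholder_regex.findall(template)
    # Note: "last" is the last placeholder occurring in the template,
    # which in a typical template is also the highest-numbered one.
    last = int(placeholders[-1]) if placeholders else 0

    def replace_placeholder(match):
        index = int(match.group(1)) - 1
        if index >= len(args):
            return ""                      # missing argument -> empty string
        if match.group(1) == str(last):
            return " ".join(args[index:])  # last slot absorbs the rest
        return args[index]

    template = placeholder_regex.sub(replace_placeholder, template)
    template = template.replace("$ARGUMENTS", arguments)
    return template

print(process_template("Review $1 in $2 mode", "src/agent.py strict extra"))
# Review src/agent.py in strict extra mode
print(process_template("Run tests: $ARGUMENTS", "-k skills"))
# Run tests: -k skills
```

The first call shows the "last placeholder absorbs the rest" rule: `$2` receives `strict extra`, not just `strict`.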
Example prototype commands
The prototype includes commands/review.md:
---
description: Review code for quality and suggest improvements
---
Review the code at $1 for:
1. Code quality and readability
2. Potential bugs or issues
3. Performance considerations
4. Best practices violations
Provide specific, actionable suggestions.
It also includes commands/test.md:
---
description: Run tests with coverage
---
Run the full test suite with coverage report. Focus on any failing tests and suggest fixes.
These are smaller than the /kid and /parent demo commands, but they are
better tests because the expected behavior is easy to assert.
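The prototype reads these files with python-frontmatter; a dependency-free sketch of the same split, assuming the usual `---` delimiters, looks like this:

```python
def split_command_file(text: str):
    """Split a command file into (frontmatter, body), both stripped."""
    if text.startswith("---"):
        # "" before the first ---, metadata between the pair, body after.
        _, meta, body = text.split("---", 2)
        return meta.strip(), body.strip()
    return "", text.strip()

raw = """---
description: Review code for quality and suggest improvements
---
Review the code at $1 for:
1. Code quality and readability
"""

meta, body = split_command_file(raw)
print(meta)                    # description: Review code for quality ...
print(body.splitlines()[0])    # Review the code at $1 for:
```

python-frontmatter additionally parses the metadata as YAML, which is what makes the `description` field available for the command list.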
Unit tests for skills
prototype/tests/test_skills.py checks deterministic loader behavior. It
verifies that:
- The loader discovers hello, joke, counter, and deploy_app.
- Each skill has a name, description, and content.
- Missing skills return None from get().
- The tool description includes the skill list.
- deploy_app contains file references to scripts and templates.
Run the skill tests:
uv run pytest tests/test_skills.py -v
These tests do not call the LLM, so they should be stable.
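A deterministic test in this spirit needs nothing beyond the loader itself. The `SkillLoader` below is an illustrative stub with the same surface, not the prototype class:

```python
class SkillLoader:
    """Illustrative stub: the surface the deterministic tests rely on."""
    def __init__(self):
        self._skills = {
            "hello": "Say hello to the user",
            "deploy_app": "Deploy an application step by step",
        }

    def get(self, name):
        return self._skills.get(name)  # missing skills return None

    def list(self):
        return sorted(self._skills)

def test_discovers_known_skills():
    assert "deploy_app" in SkillLoader().list()

def test_missing_skill_returns_none():
    assert SkillLoader().get("no_such_skill") is None

# pytest would collect these by name; call them directly here.
test_discovers_known_skills()
test_missing_skill_returns_none()
print("ok")  # ok
```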
Unit tests for commands
prototype/tests/test_commands.py verifies command discovery and template
rendering. It checks that:
- review and test are discovered.
- Missing commands return None.
- execute_command("review", "src/agent.py") returns a prompt containing src/agent.py.
- $1 is replaced.
Run them:
uv run pytest tests/test_commands.py -v
The file also contains an LLM-backed full-flow test that creates an agent and feeds a rendered review prompt into it. Treat that as an integration test.
Integration tests for skills and commands
The LLM-backed tests are in:
- tests/test_agent_run.py
- tests/test_agent_with_skills.py
- tests/test_agent_with_commands.py
They assert behavior such as:
- The agent calls read_file when asked to read README.md.
- The agent loads deploy_app when asked how to deploy an application.
- The agent loads coding_standards and reads function_template.md.
- The agent runs check_standards.sh when asked to check coding standards.
- /review src/agent.py produces a code-review response.
- /test src/skills.py produces a testing-oriented response.
- Missing commands return Command not found.
Run an integration file when you have an API key configured:
uv run pytest tests/test_agent_with_skills.py -v -s
These tests call an LLM, so they are slower and less deterministic than the loader tests. They are still useful because the behavior under test is more than parsing files: the model must decide to call the right tool.
Continue with Q&A: skills and commands for side questions from the live session.