Q&A: skills and commands
These questions came up during the workshop. They sit outside the main flow, but the answers are useful to keep.
Skill selection
Q: How do you decide what belongs in a skill instead of relying on the model's existing knowledge?
Most general knowledge is already inside the model. A skill is useful when you have a preferred way to do something and don't want to steer the agent through that path every time.
In practice, you work with the agent, correct its approach, and arrive at something you like. Then you ask it to save those steps as a skill. Next time, the agent can load the saved instructions instead of rediscovering the process.
Course management is a concrete example. By default the model doesn't know your
internal API for creating course projects and homework. A skill can document how
to call it, what JSON structure to send, and which curl commands to use.
Skill loading
Q: Where does the skill instruction run?
It runs locally in the runner, exactly like a normal tool call. The
model sees the skill list in the prompt, decides that hello is relevant, and
asks to call the skill tool with name="hello".
Then Python runs SkillsTool.skill("hello"), loads the file from disk, and
adds the tool result to the message history. The next model call sees the skill
content and answers according to it.
Skills versus commands
Q: Where do you draw the line between skills and commands?
Skills are implicit from your point of view. You can say "hello" or "I need to deploy my app" without knowing which skill exists, and the agent decides whether a skill matches.
Commands are explicit, since you type /kid, /review, or /test because you
want that exact reusable prompt. The system never has to infer which command to
use.
Skills versus tools
Q: Are skills normal tool calling?
Yes, in this workshop a skill is loaded through normal tool calling. Lazy loading makes it useful: the prompt includes the skill names and descriptions, not every full skill body.
This keeps context smaller, because the full markdown is added only when the model chooses a matching skill.
Model support
Q: Which models support skills?
Any model that supports tool calls can support this style of skills. We use
gpt-4o-mini in the notebook to keep cost low and make how it works visible.
A stronger model may follow the instructions more reliably, but the
tool-calling flow is the same.
Q: Are Claude models fine-tuned for skills?
Claude Code and the Claude models are closed, so treat their internal training as unknown. What you can observe is tool calling. If a model is good at tool calling, it can use this skill pattern.
Reliability
Q: Have you seen 100 percent conformance to skill steps being executed?
We don't include a benchmark here. In practice, when a skill should apply, the agent usually loads it. Treat that as experience, not a measured guarantee.
If skill adherence matters, test it like the prototype does. Look at the
recorded tool calls and assert that the expected skill call happened.
Command discovery
Q: Do we send an explicit list of commands in the system prompt?
In the notebook you don't send a list. You give the model generic
instructions: when you see /command, call execute_command with the command
name. If the command doesn't exist, the tool returns Command not found.
For skills, the list matters because the agent has to infer when a skill is useful. For commands, you already selected the command by typing the slash syntax.
Hot-loading commands
Q: Can commands be hot-loaded?
In the Claude Code demo, restart the session with claude -c after fetching
the command files so Claude can discover them.
For your own agent, hot loading is an implementation choice. If the command loader reads from disk on every command run, newly added files can be available immediately. If you cache command lists at startup, you need a reload step.