Beyond the Prompt
A system prompt is only one part of an agent's configuration. Temperature, tool access, and model selection work together with the prompt to define how an agent behaves. Getting the prompt right but the configuration wrong will produce an agent that feels broken.
Temperature
Temperature controls how much randomness the model uses when it samples each next token. Lower values make output more deterministic; higher values introduce more variation.
| Temperature | Behavior | Best For |
|---|---|---|
| 0.0 – 0.3 | Highly deterministic | Factual lookup, classification, structured output |
| 0.4 – 0.6 | Mostly deterministic with slight variation | Code generation, technical writing, analysis |
| 0.7 – 0.8 | Balanced — reliable but with personality | Conversational agents, knowledge bases, support |
| 0.9 – 1.0 | High variation, surprising word choices | Creative writing, brainstorming, storytelling |
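To make the effect concrete, here is a minimal sketch of temperature-scaled softmax, the standard way sampling temperature reshapes token probabilities. The logits are made-up scores for three candidate tokens, and treating a temperature of 0 as greedy selection is an assumption of this sketch, not a statement about how any particular model runtime behaves.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding, all mass on the top token.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.0)
```

With these scores, the top token receives roughly 99% of the probability at temperature 0.2 and roughly 63% at 1.0, which is why low settings feel repeatable and high settings feel varied.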
How Temperature Interacts with Prompts
A prompt saying “be creative and expressive” at temperature 0.2 will produce bland output.
Low temperature works against the prompt's intent by heavily favoring safe, common tokens.
A prompt saying “be precise and accurate” at temperature 1.0 will produce unreliable output.
High temperature introduces randomness that undermines precision.
Rule of thumb: Your temperature should match the freedom your prompt gives the agent.
The Three Built-in Agents
- @guide: Warm, conversational, but grounded in real features
- @writer: Maximum creative variation
- @coder: Deterministic, accurate technical output
Tool Access
Tools give agents the ability to take actions: check the weather, search the web, access calendars. But enabling more tools isn't always better.
The Principle
Every enabled tool is a classification candidate. When the model receives a user message, it decides whether to call a tool or respond with text. More enabled tools means more potential for misclassification.
| Agent | Tools | Reasoning |
|---|---|---|
| @guide | All enabled | Needs to demonstrate every feature |
| @writer | All disabled | Should generate language, never trigger actions |
| @coder | Web search only | Needs current docs, nothing else |
Enable a tool when:
- ✓ The agent's core job requires it
- ✓ Users would naturally expect the capability
- ✓ It enhances responses rather than distracting
Disable a tool when:
- ✗ It has no relation to the agent's purpose
- ✗ It could cause misclassification of core queries
- ✗ The agent should focus purely on language generation
Common Mistakes
- Enabling too many tools is the most common configuration error. An agent with 11 tools enabled will misclassify more often than one with 2.
- Disabling a tool the agent needs is the opposite mistake. A research agent without web search, or a scheduling agent without calendar access, will frustrate users.
- Enabling tools that conflict with the agent's purpose invites misfires. A writing agent with weather tools might interpret “Write about a stormy night” as a weather query.
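The stormy-night misfire can be illustrated with a toy router. This is only a sketch: a real model decides whether to call a tool from learned patterns, not keyword lists, and the tool names and trigger words below are hypothetical. The point it shows is that a tool can only be picked by mistake if it is enabled.

```python
# Hypothetical trigger words per tool. A real model does not use keyword
# lists; this stand-in just makes the candidate-set effect visible.
TOOL_TRIGGERS = {
    "weather": {"weather", "stormy", "rain", "forecast"},
    "calendar": {"schedule", "meeting", "calendar"},
    "web_search": {"search", "latest", "docs"},
}

def route(message, enabled_tools):
    """Return the first enabled tool whose trigger words appear, else 'text'."""
    words = set(message.lower().replace(".", " ").split())
    for tool in enabled_tools:
        if words & TOOL_TRIGGERS[tool]:
            return tool
    return "text"

prompt = "Write about a stormy night"
with_all_tools = route(prompt, ["weather", "calendar", "web_search"])
with_no_tools = route(prompt, [])
```

Here `with_all_tools` is `"weather"` while `with_no_tools` is `"text"`: the same writing request is answered with prose once the unrelated tool is removed from the candidate set.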
Model Selection
Different models have different strengths. The model tier determines the quality ceiling for your agent.
| Model | Details | Best For |
|---|---|---|
| Apple Intelligence | Foundation Models | Highest quality, best reasoning, nuanced output |
| Gemma 3 4B | 2.5 GB on-device | Strong general purpose, good for most agents |
| Gemma 3n E4B | 2.7 GB on-device | Efficient text generation, good balance |
| Gemma 3n E2B | 1.5 GB on-device | Lightweight, fast responses, simpler tasks |
| Qwen 2.5 0.5B | Smallest on-device | Quick answers, limited reasoning depth |
How Model Choice Affects Prompts
Smaller models have shorter context windows and less reasoning depth. This means:
- Shorter prompts work better on smaller models: a 200-word system prompt might consume too much context
- Simpler directives are more reliably followed; complex conditionals may be lost
- Explicit examples help more on smaller models than abstract instructions
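One way to reason about the first point is to estimate how much of the context window a system prompt occupies. The sketch below is a rough aid only: the 1.3 tokens-per-word ratio is an assumed heuristic rather than a real tokenizer, and the two context window sizes are hypothetical values chosen for illustration, not figures for any model listed above.

```python
def estimate_tokens(text, tokens_per_word=1.3):
    """Rough token estimate; the ratio is an assumed heuristic, not a tokenizer."""
    return round(len(text.split()) * tokens_per_word)

def prompt_share(system_prompt, context_window):
    """Fraction of the context window the system prompt would occupy."""
    return estimate_tokens(system_prompt) / context_window

# A stand-in 200-word system prompt.
prompt = " ".join(["word"] * 200)

# Hypothetical context window sizes, for comparison only.
share_small = prompt_share(prompt, 2048)
share_large = prompt_share(prompt, 32768)
```

The same prompt takes a much larger share of the smaller window, leaving less room for the conversation itself. For a real budget, use the tokenizer and context limit of the model you are targeting.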
Putting It All Together
The best agents have a coherent configuration: every setting reinforces the same intent.
Coherent Configuration (Good)
- Prompt: Creative writing assistant
- Temp: 1.0 (supports creativity)
- Tools: None (pure language generation)
- Model: Apple Intelligence (highest quality)
Every setting says “creative freedom.”
Incoherent Configuration (Bad)
- Prompt: Creative writing assistant
- Temp: 0.2 (suppresses creativity)
- Tools: All enabled (may trigger on writing prompts)
- Model: Qwen 2.5 0.5B (too small for creative work)
The prompt says creativity, but every other setting fights it.
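The comparison above can be turned into a small coherence check. This is a sketch under stated assumptions: `AgentConfig`, the `prompt_style` labels, and the thresholds (0.9 for creative work, 0.6 for precise work) are hypothetical, loosely derived from the temperature table earlier in this guide rather than taken from any real API.

```python
from dataclasses import dataclass, field

@dataclass
class AgentConfig:
    prompt_style: str                 # hypothetical labels: "creative" or "precise"
    temperature: float
    tools: list = field(default_factory=list)
    model: str = "Apple Intelligence"

def coherence_warnings(cfg):
    """Return a list of settings that fight the prompt's intent."""
    warnings = []
    # Thresholds are assumptions read off the temperature table.
    if cfg.prompt_style == "creative" and cfg.temperature < 0.9:
        warnings.append("low temperature suppresses a creative prompt")
    if cfg.prompt_style == "precise" and cfg.temperature > 0.6:
        warnings.append("high temperature undermines a precise prompt")
    if cfg.prompt_style == "creative" and cfg.tools:
        warnings.append("enabled tools may trigger on writing prompts")
    if cfg.prompt_style == "creative" and cfg.model == "Qwen 2.5 0.5B":
        warnings.append("model may be too small for creative work")
    return warnings

good = AgentConfig("creative", 1.0, [], "Apple Intelligence")
bad = AgentConfig("creative", 0.2, ["weather", "calendar", "web_search"],
                  "Qwen 2.5 0.5B")
```

Calling `coherence_warnings(good)` returns an empty list, while `coherence_warnings(bad)` returns three warnings, one for each setting that contradicts the creative prompt.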