Beyond the Prompt

A system prompt is only one part of an agent's configuration. Temperature, tool access, and model selection work together with the prompt to define how an agent behaves. Getting the prompt right but the configuration wrong will produce an agent that feels broken.

Temperature

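Temperature controls how much randomness the model uses when it selects each token. Lower values concentrate probability on the most likely tokens, making output more deterministic; higher values spread probability across more candidates, introducing more variation.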
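As a rough illustration of the mechanism, temperature is commonly applied by dividing the model's raw token scores by the temperature before converting them to probabilities. The sketch below assumes that standard formulation; the scores are made up, and the runtime behind these agents may implement sampling differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the top-scoring token.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.0)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At 0.2 nearly all of the probability lands on the top token, while at 1.0 the other two candidates keep a meaningful share.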

Temperature	Behavior	Best For
0.0 – 0.3	Highly deterministic	Factual lookup, classification, structured output
0.4 – 0.6	Mostly deterministic with slight variation	Code generation, technical writing, analysis
0.7 – 0.8	Balanced — reliable but with personality	Conversational agents, knowledge bases, support
0.9 – 1.0	High variation, surprising word choices	Creative writing, brainstorming, storytelling

How Temperature Interacts with Prompts

Mismatch (Bad)

A prompt saying “be creative and expressive” at temperature 0.2 will produce bland output.

Low temperature works against the prompt's intent by strongly favoring safe, common tokens.

Mismatch (Bad)

A prompt saying “be precise and accurate” at temperature 1.0 will produce unreliable output.

High temperature introduces randomness that undermines precision.

Rule of thumb: Your temperature should match the freedom your prompt gives the agent.
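The rule of thumb can be turned into a simple pre-flight check. This is a minimal sketch, not part of any real tool: the keyword lists and the 0.7 and 0.6 thresholds are assumptions loosely based on the temperature table, and a real check would need a better reading of the prompt than keyword matching.

```python
import re

# Hypothetical keyword lists; extend them to match your own prompts.
CREATIVE_WORDS = {"creative", "expressive", "imaginative", "playful"}
PRECISE_WORDS = {"precise", "accurate", "exact", "factual"}

def temperature_warnings(prompt, temperature):
    """Return warnings when the prompt's wording and the temperature disagree."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    warnings = []
    # Assumed threshold: creative prompts want roughly 0.7 or higher.
    if words & CREATIVE_WORDS and temperature < 0.7:
        warnings.append("Prompt asks for creativity but temperature is low.")
    # Assumed threshold: precision-focused prompts want roughly 0.6 or lower.
    if words & PRECISE_WORDS and temperature > 0.6:
        warnings.append("Prompt asks for precision but temperature is high.")
    return warnings

print(temperature_warnings("Be creative and expressive.", 0.2))
print(temperature_warnings("Be precise and accurate.", 1.0))
print(temperature_warnings("Be creative and expressive.", 1.0))
```

The first two calls reproduce the two mismatches described above; the third returns no warnings.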

The Three Built-in Agents

Agent	Temperature	Behavior
@guide	0.8	Warm, conversational, but grounded in real features
@writer	1.0	Maximum creative variation
@coder	0.4	Deterministic, accurate technical output

Tool Access

Tools let agents take actions, such as checking the weather, searching the web, or accessing calendars. But more tools aren't always better.

The Principle

Every enabled tool is a classification candidate. When the model receives a user message, it decides whether to call a tool or respond with text. More enabled tools means more potential for misclassification.
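One rough way to see why the count matters: suppose each enabled but irrelevant tool has a small, independent chance of being picked by mistake. Both the independence and the 2% rate are assumptions made for illustration, not measurements of any model.

```python
def false_trigger_probability(irrelevant_tools, per_tool_rate):
    """Chance that at least one irrelevant tool fires on a single message.

    Assumes each tool misfires independently at the same rate, which is a
    simplification rather than a measured property of any model.
    """
    return 1.0 - (1.0 - per_tool_rate) ** irrelevant_tools

# Illustrative 2% per-tool misfire rate (assumed).
for count in (2, 11):
    print(count, round(false_trigger_probability(count, 0.02), 3))
```

Under these assumed numbers, two irrelevant tools give roughly a 4% chance of a false trigger per message, and eleven give roughly 20%.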

Agent	Tools	Reasoning
@guide	All enabled	Needs to demonstrate every feature
@writer	All disabled	Should generate language, never trigger actions
@coder	Web search only	Needs current docs, nothing else
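The built-in agents' settings could be written down as data along these lines. This is a sketch only: the `AgentConfig` class and the tool identifiers are hypothetical, and the three tools shown stand in for whatever the full tool list is. The temperatures and tool policies come from the text above.

```python
from dataclasses import dataclass

# Hypothetical tool identifiers; the guide names weather, web search, and
# calendars only as examples, so this is not the complete tool list.
ALL_TOOLS = frozenset({"weather", "web_search", "calendar"})

@dataclass(frozen=True)
class AgentConfig:
    name: str
    temperature: float
    tools: frozenset

BUILT_IN_AGENTS = {
    "@guide": AgentConfig("@guide", 0.8, ALL_TOOLS),                  # all enabled
    "@writer": AgentConfig("@writer", 1.0, frozenset()),              # all disabled
    "@coder": AgentConfig("@coder", 0.4, frozenset({"web_search"})),  # web search only
}

def allowed_tools(agent_name):
    """Return the sorted tool names an agent may call."""
    return sorted(BUILT_IN_AGENTS[agent_name].tools)

print(allowed_tools("@writer"))
print(allowed_tools("@coder"))
```

Keeping temperature and tool access in one record makes it easier to review both settings together.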

Enable a tool when:

✓ The agent's core job requires it
✓ Users would naturally expect the capability
✓ It enhances responses rather than distracting

Disable a tool when:

✗ It has no relation to the agent's purpose
✗ It could cause misclassification of core queries
✗ The agent should focus purely on language generation

Common Mistakes

Leaving all tools enabled “just in case”

This is the most common configuration error. All else being equal, an agent with 11 tools enabled is likely to misclassify more often than one with 2.

Disabling tools the agent needs

A research agent without web search, or a scheduling agent without calendar access, will frustrate users.

Enabling tools that create ambiguity

A writing agent with weather tools might interpret “Write about a stormy night” as a weather query.
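The ambiguity problem can be shown with a toy router. A real model does not classify by keyword, so treat this purely as a stand-in: the trigger words and the `route` function are hypothetical.

```python
import re

# Hypothetical trigger words per tool; a real model classifies far less literally.
TOOL_KEYWORDS = {
    "weather": {"weather", "forecast", "stormy", "rain"},
    "calendar": {"schedule", "meeting", "calendar"},
}

def route(message, enabled_tools):
    """Return the first enabled tool whose keywords match, else 'text'."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for tool in sorted(enabled_tools):
        if words & TOOL_KEYWORDS.get(tool, set()):
            return tool
    return "text"

message = "Write about a stormy night"
print(route(message, {"weather", "calendar"}))
print(route(message, set()))
```

With the weather tool enabled, the toy router picks `weather` for a writing request; with no tools enabled, it falls back to `text`.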

Model Selection

Different models have different strengths. The model tier determines the quality ceiling for your agent.

Tier	Model / Size	Best For
Apple Intelligence	Foundation Models	Highest quality, best reasoning, nuanced output
Gemma 3 4B	2.5 GB on-device	Strong general purpose, good for most agents
Gemma 3n E4B	2.7 GB on-device	Efficient text generation, good balance
Gemma 3n E2B	1.5 GB on-device	Lightweight, fast responses, simpler tasks
Qwen 2.5 0.5B	Smallest on-device	Quick answers, limited reasoning depth

How Model Choice Affects Prompts

Smaller models have shorter context windows and less reasoning depth. This means:

• Shorter prompts work better on smaller models: a 200-word system prompt might consume too much context
• Simpler directives are more reliably followed; complex conditionals may be lost
• Explicit examples help more on smaller models than abstract instructions
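The first point can be checked with a back-of-the-envelope budget. Everything numeric here is an assumption: the context sizes are placeholders rather than the real limits of any listed model, 1.3 tokens per word is a rough heuristic for English text, and the 10% budget is arbitrary.

```python
# Hypothetical context sizes in tokens; check the real limits for your runtime.
ASSUMED_CONTEXT_TOKENS = {"small": 2048, "large": 8192}
TOKENS_PER_WORD = 1.3  # rough English-text heuristic, not a tokenizer

def prompt_share(prompt, context_tokens):
    """Estimate the fraction of the context window a system prompt uses."""
    estimated_tokens = len(prompt.split()) * TOKENS_PER_WORD
    return estimated_tokens / context_tokens

def prompt_fits(prompt, tier, max_share=0.1):
    """True if the prompt stays within the assumed share of the context."""
    return prompt_share(prompt, ASSUMED_CONTEXT_TOKENS[tier]) <= max_share

long_prompt = " ".join(["word"] * 200)  # stand-in for a 200-word system prompt
print(prompt_fits(long_prompt, "small"))
print(prompt_fits(long_prompt, "large"))
```

Under these assumptions, a 200-word prompt is about 260 tokens: roughly 13% of the smaller window, which exceeds the budget, and roughly 3% of the larger one.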

Putting It All Together

The best agents have coherent configuration — every setting reinforces the same intent.

Coherent Configuration (Good)

Prompt:  Creative writing assistant
Temp:    1.0 — supports creativity
Tools:   None — pure language generation
Model:   Apple Intelligence — highest quality

Every setting says “creative freedom.”

Incoherent Configuration (Bad)

Prompt:  Creative writing assistant
Temp:    0.2 — suppresses creativity
Tools:   All enabled — will trigger on prompts
Model:   Qwen 0.5B — too small for creative work

The prompt says creativity, but every other setting fights it.

Configuration Checklist

Temperature: Does it match the freedom level in my prompt?

Tools: Does every enabled tool serve this agent's purpose?

Model: Is it capable enough for what my prompt asks?

Prompt length: Is it appropriate for the model tier?

Consistency: Do all settings reinforce the same intent?
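The checklist can also be sketched as a function that returns one pass/fail result per question. The `Config` fields, the intent labels, and the temperature thresholds are all hypothetical; judgments such as whether a model is capable enough are passed in rather than computed.

```python
from dataclasses import dataclass

@dataclass
class Config:
    intent: str            # "creative" or "precise" (hypothetical labels)
    temperature: float
    enabled_tools: set
    needed_tools: set      # tools the agent's purpose actually requires
    model_capable: bool    # your own judgment of the chosen model tier
    prompt_words: int
    max_prompt_words: int  # budget you consider appropriate for the tier

def checklist(cfg):
    """Answer each checklist question with True (pass) or False (fail)."""
    if cfg.intent == "creative":
        temp_ok = cfg.temperature >= 0.7  # assumed threshold
    else:
        temp_ok = cfg.temperature <= 0.6  # assumed threshold
    results = {
        "temperature": temp_ok,
        "tools": cfg.enabled_tools <= cfg.needed_tools,  # no unneeded tools
        "model": cfg.model_capable,
        "prompt_length": cfg.prompt_words <= cfg.max_prompt_words,
    }
    results["consistency"] = all(results.values())
    return results

coherent = Config("creative", 1.0, set(), set(), True, 80, 200)
incoherent = Config("creative", 0.2, {"weather", "calendar"}, set(), False, 80, 200)
print(checklist(coherent))
print(checklist(incoherent))
```

The two sample configurations mirror the coherent and incoherent creative-writing examples: the first passes every check, while the second fails on temperature, tools, and model.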