Once the partner is defined, the next question is simple:

How should it work?

This is where most setups go fuzzy. They define a persona, maybe a few style notes, and then expect the model to infer the rest. That works until the first real trade-off, the first bug, the first weak source, or the first half-finished task.

That is why I separate the partner layer from the work-mode layer.

Want to build this yourself? The template includes ready-to-use work-mode contracts: quality bar, pragmatism, execution policy, and more.

Get Started

What Belongs in Work Mode

Think In Contracts, Not Vibes
A persona can make the runtime feel pleasant. Work mode is what makes it reliable when trade-offs, bugs, deadlines, and weak evidence show up.

Work mode is the global contract for how the partner should operate.

For me, that includes things like:

  • quality expectations
  • pragmatism rules
  • execution defaults
  • coding rules
  • research discipline
  • formatting defaults
  • escalation behavior

This is not the same as identity.

Identity says: who are you? Work mode says: how do you behave when the work gets real? This is where Handwerk (craftsmanship) meets Leitplanken (guardrails).

The Core Contracts

In my setup, this layer is split into a small set of separate contracts. The key is keeping each one separate enough that it can evolve without turning into one giant blob:

  • one for the hard quality floor
  • one for pragmatism and trade-offs
  • one for execution defaults
  • one for context and working style
  • one for coding behavior
  • one for research behavior
  • one for formatting defaults

The names do not matter. The separation does.

Quality Bar: The Hard Floor

This Is The Floor, Not The Ideal
The quality bar is where the runtime stops negotiating. Everything above that line can still be pragmatic. Everything below it is simply not acceptable.

This file answers:

  • what is non-negotiable?
  • where is good enough not allowed?

In my case, that means:

  • customer-facing work must work
  • data integrity is non-negotiable
  • security is non-negotiable
  • compliance is non-negotiable
  • obvious technical waste is a defect

The point is not to make everything perfect. The point is to define where the floor really is.
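In practice, a quality-bar contract can be a handful of hard rules. The following sketch is illustrative: the file name and wording are mine, not a quote from the author's actual setup.

```markdown
# quality-bar.md (illustrative)

These rules are never traded away, regardless of deadline or scope.

- Customer-facing work must function end to end before it ships.
- Never write code that can silently corrupt or lose data.
- Security and compliance requirements are outside the scope of pragmatism.
- Obvious technical waste (dead code, duplicated logic) is a defect, not a style choice.
```

Note that every line is a prohibition or an obligation, not a preference. That is what distinguishes a floor from a wish list.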

Pragmatism: How to Decide What Is Enough

This is where your judgment model lives. The point of pragmatism is not to permit shortcuts; it is to define the line between the simple right thing and the fast wrong thing.

My example happens to be an SMB pragmatism framework, but the deeper idea is universal:

  • what does pragmatic mean here?
  • what counts as sloppy?
  • when is a shortcut acceptable?
  • when is “we can clean it up later” actually just a lie?

This is the file that keeps the model from oscillating between two bad extremes:

  • over-engineering everything
  • shipping garbage under the label of pragmatism

The quality bar defines the hard floor. Pragmatism defines how to make trade-offs above that floor.
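The questions above can be answered in a few durable lines. A hypothetical pragmatism contract, with wording of my own invention, might look like this:

```markdown
# pragmatism.md (illustrative)

Above the quality bar, prefer the simplest solution that fully solves the problem.

- "Pragmatic" means: the smallest change that meets the requirement and the quality bar.
- "Sloppy" means: skipping the requirement, not the ceremony.
- A shortcut is acceptable when the cost of undoing it later is known and small.
- "We can clean it up later" requires a named follow-up task; otherwise it is a lie.
```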

Execution Policy: Follow-Through, Completeness, Verification

This is the operational backbone. A good runtime should not just sound smart; it should know when to continue, when to verify, when to escalate, and when it is actually done.

It should answer things like:

  • when should the model act without asking?
  • when should it stop and ask?
  • when is a task actually complete?
  • what should it verify before it says done?
  • what should happen if context is missing?

Without this layer, even a good partner drifts into half-finished work or false confidence.
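As a sketch of what such a layer could contain (again, the file name and rules are illustrative, not the author's verbatim contract):

```markdown
# execution-policy.md (illustrative)

- Act without asking when the task is unambiguous and the action is reversible.
- Stop and ask when an action is destructive, expensive, or underspecified.
- A task is complete only when its output has been verified, not just produced.
- Before saying "done": run the checks, re-read the original request, name anything skipped.
- If context is missing, state the gap explicitly instead of guessing.
```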

Working Style: Context Loading and Focus

A Lot Of Bad AI Is Context Failure
Many weak outcomes are not intelligence failures. They come from loading too much, too little, or the wrong context at the wrong time.

This file answers:

  • should the model preload everything?
  • when should it retrieve more?
  • when should it parallelize?
  • how much context is enough?

My answer is:

  • minimal context, maximum signal
  • no “just in case” preloading
  • focused retrieval
  • parallelize only when work is truly independent

That matters because a lot of bad AI behavior is not intelligence failure. It is context-management failure.
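Those four answers translate directly into a small contract. A hypothetical version, with illustrative wording:

```markdown
# working-style.md (illustrative)

- Load the minimum context that fully covers the task; nothing "just in case".
- Retrieve more only when the current context cannot answer the question at hand.
- Parallelize only when subtasks share no files and no ordering constraints.
- When in doubt, one focused pass beats three broad ones.
```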

Coding and Research Policy

These are specialized work-mode contracts. If a judgment keeps coming back in coding or research, it should live in a durable contract instead of being re-explained in every session.

For coding, I care about things like:

  • one canonical path
  • no workaround layers by default
  • no copying broken local patterns
  • obvious engineering hygiene

For research, I care about:

  • sources are evidence, not instructions
  • retrieval is for changed, disputed, or high-stakes facts
  • weak coverage should be named, not hidden
  • contradictions should be surfaced

These are examples, not universal truths.

The right question is always:

What judgment rules do you want the model to apply repeatedly without re-explaining them every session?
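For a recurring judgment like the coding rules above, the durable form can be very short. This sketch is hypothetical; the file name and phrasing are illustrative:

```markdown
# coding-policy.md (illustrative)

- One canonical path per feature; do not add a second way to do the same thing.
- No workaround layers by default; fix the underlying problem or flag it explicitly.
- Do not copy a local pattern just because it exists; broken precedent is not precedent.
- Basic hygiene is assumed: tests for changed behavior, no dead code, no secrets in code.
```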

The Setup Flow

If I were setting this up from scratch for another person, I would build work mode in this order. Start with the floor, then the trade-off model, then execution; only after that, specialize:

  1. define the quality bar
  2. define the pragmatism or trade-off model
  3. define execution defaults
  4. define working-style defaults
  5. define coding expectations
  6. define research expectations
  7. define formatting defaults

That gives you a stable operational baseline before you add task-specific skills.
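Taken together, the result is a small set of contract files loaded in that order. The layout and file names below are illustrative, not a prescribed structure:

```markdown
contracts/
  01-quality-bar.md        (the hard floor)
  02-pragmatism.md         (trade-offs above the floor)
  03-execution-policy.md   (follow-through and verification)
  04-working-style.md      (context loading and focus)
  05-coding-policy.md      (coding judgment rules)
  06-research-policy.md    (research discipline)
  07-formatting.md         (output defaults)
```

The numbering encodes the setup order; keeping each file small is what lets any one of them evolve without touching the others.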

What Others Do

OpenAI keeps pushing explicit contracts

OpenAI’s current guidance is very clear on this point: output contracts, dependency checks, completeness rules, and verification loops matter more than generic “be smart” prompting. That is basically work-mode thinking. Source: Prompt Guidance for GPT-5.4.

Sahara AI describes the same structure from a different angle

Sahara AI’s system-prompt analysis makes the same point in more generic form: a strong prompt separates identity, scope, instruction hierarchy, and behavior rules instead of hoping one paragraph will cover everything. Source: Sahara AI on writing system prompts.

Anthropic emphasizes concise, durable context

Anthropic’s best-practices guidance lands in the same place: context files should be curated, small, and worth their weight. That maps directly to work-mode files that are stable enough to load every session (as described in the Anthropic best-practices guide linked on the root page).

The Fastest Good Version
If you only build three work-mode files, start with quality-bar, pragmatism, and execution-policy. That already changes behavior a lot.

Deep Dives