Northstars

The foundational layer of the quality system. Every audit, every test, and every behavioral evaluation runs against northstars.

What are northstars?

Northstars are explicit, auditable rules that define what correct agent behavior looks like. They cover two dimensions:

Behavioral northstars govern how an agent communicates and operates - communication style, information handling, tool usage sequencing, hard contradictions, and multi-step workflow requirements. For example: always confirm the waybill before disclosing shipment data; never read credentials aloud; always verify identity before account actions.

Business northstars are operational outcomes the agent is accountable for - resolution on first contact, containment rate above target, correct intent classification, and adherence to tool invocation requirements.

Both are defined per agent, versioned alongside the workflows they govern, and evaluated automatically, at scale, against every sampled production session.
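
As a rough illustration of how a business northstar such as "containment rate above target" reduces to a pass/fail check over sampled sessions, here is a minimal Python sketch. The `Session` record, its `escalated_to_human` field, and the 0.70 target are assumptions made for this example, not the platform's actual schema or defaults. Containment rate is taken here to mean the share of sessions handled without escalation to a human.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical record of one sampled production session."""
    session_id: str
    escalated_to_human: bool


def containment_rate(sessions: list[Session]) -> float:
    """Share of sessions the agent handled without escalating to a human."""
    if not sessions:
        raise ValueError("no sessions sampled")
    contained = sum(1 for s in sessions if not s.escalated_to_human)
    return contained / len(sessions)


def meets_target(sessions: list[Session], target: float) -> bool:
    """Pass/fail: is the containment rate strictly above the target?"""
    return containment_rate(sessions) > target


# Illustrative data only: three contained sessions, one escalation.
sample = [
    Session("s1", escalated_to_human=False),
    Session("s2", escalated_to_human=False),
    Session("s3", escalated_to_human=True),
    Session("s4", escalated_to_human=False),
]
print(containment_rate(sample))    # 0.75
print(meets_target(sample, 0.70))  # True
```

The point of the sketch is that the outcome is measured the same way for every sampled session, so the result can be tracked per agent and per workflow version.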

The HappyRobot platform generates northstars for you

Automatic northstar extraction

When you build an agent, the behavioral rules that should govern it are already embedded in your prompt and workflow structure - the steps the agent must follow, the tools it must invoke, the things it must never do, the sequence that matters. Our proprietary extraction system automatically surfaces these as suggested northstar rules, categorized and ready to review. Each extracted northstar maps back to the specific prompt node or workflow component it came from, so there's full traceability between a behavioral rule and the agent logic that defines it. You review, refine, and activate, and the system does the work of translating your operating procedures into machine-checkable standards. No more manually assembled lists that miss edge cases or fall out of sync when the workflow changes.

Northstar categories

Each northstar belongs to one of the following categories:

Notes

Information the agent must retain and apply throughout the conversation. Use this for anything the agent should remember and act on, such as confirming a customer's name before proceeding, noting a preference expressed earlier in the call, or carrying context across turns.

Style

Communication tone, register, and persona. Use this for how the agent should sound - not what it does, but how it does it. Professional but empathetic. Direct without being abrupt. Never dismissive of frustration.

Tool

Conditions under which specific tools must be invoked. Use this when the agent should never rely on its own reasoning for a particular action - always run the lookup, always verify through the system, never quote from memory.

Sequential

Required order of operations. Use this when sequence matters - verifying identity before any account action, confirming a waybill number before disclosing shipment data, offering a human handover only after a second failed resolution attempt.
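
To make the idea of a machine-checkable Sequential rule concrete, the following Python sketch checks that a prerequisite step appears before any gated step in a session. It assumes a session can be flattened into an ordered list of event names. The event names and the function `first_order_violation` are hypothetical and do not reflect how the platform's evaluator works internally.

```python
def first_order_violation(events, prerequisite, gated):
    """Return the index of the first gated event that occurs before the
    prerequisite has been seen, or None if the order is respected."""
    prerequisite_seen = False
    for index, event in enumerate(events):
        if event == prerequisite:
            prerequisite_seen = True
        elif event in gated and not prerequisite_seen:
            return index
    return None


# Hypothetical account actions that require identity verification first.
GATED = {"update_address", "issue_refund"}

ok_session = ["greet", "verify_identity", "update_address"]
bad_session = ["greet", "issue_refund", "verify_identity"]

print(first_order_violation(ok_session, "verify_identity", GATED))   # None
print(first_order_violation(bad_session, "verify_identity", GATED))  # 1
```

Returning the position of the violation, rather than a bare True/False, makes it easier to point a reviewer at the exact turn where the sequence broke.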

What makes an effective northstar, if you choose to write your own

Writing effective northstars

Be precise and checkable

A well-written northstar has a clear pass/fail condition. "Always confirm the waybill number before disclosing any shipment status" is checkable. "Handle shipment queries well" is not.

Match the category to the intent

A sequential rule written as a style rule will be evaluated incorrectly. If the rule is about order of operations, it belongs in Sequential, not Notes or Contradiction.

Set priority based on real business impact

Priority determines where your team focuses when issues surface. Reserve high priority for rules where a violation carries genuine operational, compliance, or reputational risk.

Add examples early

Don't wait for audit failures to calibrate. Add at least one positive and one negative example when creating each northstar. The evaluator uses them from the first audit onward.
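
The guidance above can be summarized as a small data structure. This is a sketch under stated assumptions: the `Northstar` class, its field names, the priority values, and the `is_calibrated` check are all hypothetical and only illustrate a rule that carries a category, a priority, and at least one positive and one negative example. The enum covers the four categories described earlier.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    NOTES = "notes"
    STYLE = "style"
    TOOL = "tool"
    SEQUENTIAL = "sequential"


@dataclass
class Northstar:
    """Hypothetical shape of a hand-written northstar."""
    rule: str
    category: Category
    priority: str  # assumed values: "high", "medium", "low"
    positive_examples: list[str] = field(default_factory=list)
    negative_examples: list[str] = field(default_factory=list)

    def is_calibrated(self) -> bool:
        """True when at least one positive and one negative example exist."""
        return bool(self.positive_examples) and bool(self.negative_examples)


waybill = Northstar(
    rule="Always confirm the waybill number before disclosing any shipment status.",
    category=Category.SEQUENTIAL,
    priority="high",
    positive_examples=["Agent asks for the waybill number, then shares the status."],
    negative_examples=["Agent shares the status without asking for the waybill number."],
)
print(waybill.is_calibrated())  # True
```

A check like `is_calibrated` could run when a northstar is created, so rules without examples are flagged before the first audit rather than after a failure.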

All pre-deployment testing and post-deployment auditing measure agent behavior against northstars.