The foundational layer of the quality system. Every audit, every test, and every behavioral evaluation runs against northstars.
Lorem ipsum dolor sit amet consectetur. Tempor gravida ultricies ut iaculis eget lacus non. Sagittis elementum aliquam ultricies in.

Northstars are explicit, auditable rules that define what correct agent behavior looks like. They cover two dimensions:
Behavioral northstars govern how an agent communicates and operates - communication style, information handling, tool usage sequencing, hard contradictions, and multi-step workflow requirements. Always confirm the waybill before disclosing shipment data. Never read credentials aloud. Always verify identity before account actions.
Business northstars are operational outcomes the agent is accountable for - resolution on first contact, containment rate above target, correct intent classification, and adherence to tool invocation requirements.
Both are defined per agent, versioned alongside the workflows they govern, and evaluated automatically, at scale, against every sampled production session.

When you build an agent, the behavioral rules that should govern it are already embedded in your prompt and workflow structure - the steps the agent must follow, the tools it must invoke, the things it must never do, the sequence that matters. Our proprietary extraction system automatically surfaces these as suggested northstar rules, categorized and ready to review. Each extracted northstar maps back to the specific prompt node or workflow component it came from, so there's full traceability between a behavioral rule and the agent logic that defines it. You review, refine, and activate and the system does the work of translating your operating procedures into machine-checkable standards. No more manually assembled lists that may miss edge cases or fall out of sync when the workflow changes.

Information the agent must retain and apply throughout the conversation. Use this for anything the agent should remember and act on, such as confirming a customer's name before proceeding, noting a preference expressed earlier in the call, carrying context across turns.
Communication tone, register, and persona. Use this for how the agent should sound - not what it does, but how it does it. Professional but empathetic. Direct without being abrupt. Never dismissive of frustration.
Conditions under which specific tools must be invoked. Use this when the agent should never rely on its own reasoning for a particular actions - always run the lookup, always verify through the system, never quote from memory.
Required order of operations. Use this when sequence matters - verifying identity before any account action, confirming a waybill number before disclosing shipment data, offering a human handover only after a second failed resolution attempt.
A well-written northstar has a clear pass/fail condition. "Always confirm the waybill number before disclosing any shipment status" is checkable. "Handle shipment queries well" is not.
A sequential rule written as a style rule will be evaluated incorrectly. If the rule is about order of operations, it belongs in Sequential, not Notes or Contradiction.
Priority determines where your team focuses when issues surface. Reserve high priority for rules where a violation carries genuine operational, compliance, or reputational risk.
Don't wait for audit failures to calibrate. Add at least one positive and one negative example when creating each northstar. The evaluator uses them from the first audit onward.
All pre-deployment testing and post-deployment auditing is done to measure agent behavior against northstars. Click below to learn more about all the governance-related features of the platform.