Behaviorally Anchored Rating Scales (BARS): The Complete Manager’s Guide


Most performance rating scales fail before the review conversation even begins. A five-point scale that moves from “Below Expectations” to “Exceeds Expectations” looks rigorous on paper, but leaves every manager to define those labels independently. One manager’s “Meets Expectations” is another’s “Needs Improvement.” The result is a system that feels objective while producing wildly inconsistent outcomes — and employees who feel judged by standards they were never shown. Behaviorally Anchored Rating Scales (BARS) solve this problem by replacing abstract labels with specific, observable examples of what performance actually looks like at each level. They are harder to build than a standard rating scale, but they produce reviews that are fairer, more defensible, and far more useful for development.

What Are Behaviorally Anchored Rating Scales (BARS)?

Behaviorally Anchored Rating Scales (BARS) are a performance appraisal method that combines quantitative ratings with qualitative behavioral descriptions. Each point on the rating scale is “anchored” to a specific behavioral example — a concrete description of what an employee actually does (or fails to do) that places them at that level. BARS were first introduced by Smith and Kendall in 1963 and have been used in structured performance management ever since. Unlike graphic rating scales or purely narrative reviews, BARS give both managers and employees a shared vocabulary for describing performance levels.

Why Standard Rating Scales Break Down

Before understanding why BARS work, it helps to understand why conventional rating scales fail. The most common format — a numbered scale from 1 to 5 with labels like “Unsatisfactory,” “Needs Improvement,” “Meets Expectations,” “Exceeds Expectations,” and “Outstanding” — has three fundamental problems.

First, the labels are undefined. What does it mean to “exceed expectations” as a software engineer? As a customer success manager? As a finance analyst? Without behavioral anchors, these labels mean whatever the individual manager thinks they mean — and research consistently shows that managers differ significantly in their interpretation of performance language.

Second, they invite central tendency bias. When ratings are undefined, managers default to the middle. They give most employees a 3 because it feels safe, defensible, and conflict-avoiding. The result is a compressed rating distribution that tells employees almost nothing about where they actually stand.

Third, they cannot drive development. Telling an employee they scored a “3 out of 5” on “Communication” gives them no actionable information. They cannot improve a number. They can improve a behavior. BARS make the feedback developmental by design — because the anchor itself describes the behavior that needs to change.

How BARS Work: The Core Structure

A BARS instrument is built around performance dimensions — the behavioral areas most relevant to a role. For each dimension, the scale runs from low performance to high performance, with each rating point anchored to a behavioral example written in observable terms.

Here is a BARS example for the dimension Client Communication in a consulting role:

5 — Exceptional: Proactively identifies when client expectations are drifting from project scope and initiates a structured conversation to re-align before issues compound. Clients consistently report feeling informed and confident throughout engagements.

4 — Strong: Communicates project status clearly and on schedule. Addresses client concerns directly and follows up in writing within 24 hours. Rarely requires escalation for communication issues.

3 — Adequate: Communicates project status when asked and responds to client questions within agreed timelines. Occasionally misses proactive outreach on complex issues.

2 — Developing: Communication is reactive rather than proactive. Clients sometimes report feeling uninformed about project status. Requires manager coaching to improve follow-through.

1 — Unsatisfactory: Client communication failures have resulted in escalations, damaged relationships, or project delays. Manager has intervened directly on multiple occasions.

Notice what this scale does that a generic five-point scale cannot: it describes behavior, not impressions. A manager and an employee can look at these anchors together, discuss specific incidents, and reach a shared understanding of where the employee sits — and what moving to the next level actually requires.
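For teams that run reviews in software, a scale like this maps naturally onto plain data. Here is a minimal sketch in Python; the structure and field names are illustrative, not taken from any particular HR system, and the anchor text is abbreviated from the Client Communication example above.

```python
# Illustrative representation of one BARS dimension as plain data.
# Anchor text is abbreviated from the Client Communication example above.
client_communication = {
    "dimension": "Client Communication",
    "anchors": {
        5: "Proactively re-aligns client expectations before issues compound.",
        4: "Communicates status clearly and on schedule; follows up in writing.",
        3: "Communicates when asked; occasionally misses proactive outreach.",
        2: "Reactive communication; requires coaching on follow-through.",
        1: "Communication failures have caused escalations or delays.",
    },
}

def anchor_for(scale: dict, rating: int) -> str:
    """Return the behavioral anchor a manager should cite for a given rating."""
    return scale["anchors"][rating]

print(anchor_for(client_communication, 4))
```

Storing anchors as data rather than free text makes it easy to show the relevant anchor next to each rating field in a review form, which keeps the behavioral language in front of the manager at the moment of rating.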


How to Build a BARS Instrument: Step-by-Step

Step 1: Identify the Performance Dimensions

Start by identifying the 5–8 behavioral dimensions that most significantly drive performance in the role. These should be observable, meaningful, and distinct from each other. Common dimensions for knowledge workers include: problem-solving, communication, collaboration, ownership and follow-through, technical expertise, client or stakeholder management, and leadership (for people managers). Involve current high performers and experienced managers in this step — they know which behaviors actually separate strong from weak performance in practice, a distinction that competency frameworks written far from the work often miss.

Step 2: Generate Critical Incidents

For each dimension, collect “critical incidents” — specific examples of effective and ineffective behavior that have actually been observed on the job. The classic technique is to ask experienced managers: “Think about the best and worst performers you have managed in this role. Without naming them, describe a specific thing you observed them doing that illustrates their performance level.” These incidents become the raw material for your behavioral anchors.

Step 3: Sort and Cluster the Incidents

Have a separate group of raters — different from those who generated the incidents — sort the incidents into dimensions. Incidents that are consistently placed in the same dimension by independent raters are retained. Incidents that raters disagree on are refined or discarded. This step builds the validity of your scale: if different people cannot agree which dimension an incident belongs to, the dimension boundaries are unclear.

Step 4: Scale the Anchors

Have another group of raters assign each incident to a performance level on your rating scale. Ask them: “If a typical employee in this role exhibited this behavior, where would you rate their performance?” Average the ratings across raters and use the average to place each incident on the scale. Incidents with high rater agreement about their level make the strongest anchors.
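The scaling arithmetic is simple enough to sketch. In this hypothetical example, each incident has been placed on the 1–5 scale by several raters; the incident's level is the rounded mean, and incidents with a wide spread of ratings are flagged for discard. The incidents, scores, and spread cutoff are all illustrative.

```python
from statistics import mean, stdev

# Hypothetical scaling-session data: each incident was independently
# placed on the 1-5 scale by several raters.
ratings = {
    "sends written follow-up after client meetings": [4, 4, 5, 4],
    "misses proactive outreach on complex issues":   [3, 2, 3, 3],
    "gives status updates only when asked":          [3, 3, 2, 4, 1],
}

MAX_SPREAD = 1.0  # illustrative cutoff; tune to your own scale

for incident, scores in ratings.items():
    level = round(mean(scores))          # where the incident lands on the scale
    keep = stdev(scores) <= MAX_SPREAD   # high rater agreement -> strong anchor
    print(f"{incident!r}: level {level}, {'keep' if keep else 'discard'}")
```

In this made-up data the third incident is discarded: its ratings range from 1 to 4, which means raters could not agree what performance level it represents, so it would make a weak anchor.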

Step 5: Finalize the Instrument and Pilot It

Select the 1–3 best anchors for each rating point on each dimension. Write them in clear, observable, third-person behavioral language: “The employee does X” rather than “Shows commitment” or “Has good skills.” Pilot the instrument with a small group of managers before full rollout — ask them to rate the same employees independently using the new BARS and compare results. High inter-rater agreement is the sign that your anchors are working.
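The pilot comparison can also be sketched in a few lines. Here two hypothetical managers rate the same five employees on one dimension using the new BARS, and two simple agreement checks are computed: the share of exact matches and the mean absolute gap between ratings. All numbers are made up for illustration.

```python
from statistics import mean

# Hypothetical pilot: two managers independently rate the same five
# employees on one dimension using the new BARS instrument.
manager_a = [4, 3, 5, 2, 3]
manager_b = [4, 3, 4, 2, 3]

# Share of employees where both managers chose the same level.
exact = sum(a == b for a, b in zip(manager_a, manager_b)) / len(manager_a)
# Average size of the disagreement where they differ.
gap = mean(abs(a - b) for a, b in zip(manager_a, manager_b))

print(f"exact agreement: {exact:.0%}, mean gap: {gap:.1f}")
```

High exact agreement with a small mean gap suggests the anchors are being interpreted consistently; large or frequent gaps point to specific anchors that need rewriting before rollout. Teams that want a more formal statistic can compute Cohen's kappa instead, but for a small pilot these two numbers are usually enough to spot problem anchors.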

BARS vs. Other Performance Appraisal Methods

BARS do not exist in isolation. Understanding where they fit relative to other methods helps you decide when to use them and when a simpler approach might serve just as well.

BARS vs. Graphic Rating Scales: Graphic rating scales (the standard numbered scale with labeled points) are faster to build and administer but far more susceptible to rater bias. BARS take longer to build but produce more consistent, defensible ratings. For roles where performance variation has significant consequences — senior technical roles, client-facing positions, people management — BARS justify the investment.

BARS vs. Management by Objectives (MBO): MBO focuses on outcomes: did the employee hit their targets? BARS focus on behaviors: how did the employee approach their work? Both matter. A high-outcome, low-behavior employee may be hitting numbers through approaches that are unsustainable or damaging to team culture. A high-behavior, low-outcome employee may be working effectively in circumstances outside their control. The most complete performance picture combines both, which is why BARS work well alongside OKR-based goal tracking.

BARS vs. 360-Degree Feedback: 360-degree feedback collects behavioral input from multiple rater sources — peers, direct reports, clients. BARS define how to score that input consistently. They are complementary: 360 data is more useful when the behaviors being rated are defined by BARS anchors rather than abstract competency labels.

When BARS Are Worth the Investment

Building a proper BARS instrument takes time — typically 20–40 hours of facilitated work across the critical incidents generation, sorting, and scaling phases. That investment is not always justified. BARS make the most sense when:

  • The role is complex enough that performance varies significantly across behavioral dimensions
  • Multiple managers rate the same type of employee, making consistency across raters a real concern
  • The ratings have significant downstream consequences — promotion decisions, compensation adjustments, performance improvement plans
  • You have experienced high-performers and managers willing to contribute to the development process
  • The role is stable enough that the behavioral anchors will remain relevant for 2–3 years

BARS are less justified for highly variable, project-based roles where the relevant behaviors shift constantly, for roles with very small populations (fewer than 10–15 employees), or for organizations that are changing their performance management approach frequently.

Common Mistakes When Building BARS

Writing Trait-Based Rather Than Behavioral Anchors

The most common mistake is writing anchors that describe traits or dispositions rather than behaviors. “Is a strong communicator” is a trait. “Sends written follow-up summaries after client meetings without being asked, and reviews them with the client to ensure alignment” is a behavior. Behavioral anchors describe what someone does, not what they are. If your anchors cannot be directly observed by a manager watching an employee work, they are too abstract.

Building Anchors Without Practitioner Input

BARS built solely by HR or by consultants who have never worked in the role produce anchors that feel generic and unrecognizable to the people who will use them. The critical incidents must come from people who have observed real performance variation in the real role. Without this grounding, your anchors will describe theoretical behaviors rather than the actual performance differences that matter.

Using BARS Without Training Managers

Even with well-written anchors, managers need to understand how to use BARS — how to match observations to anchors, how to handle cases where an employee’s behavior falls between two levels, and how to use the anchors to structure the development conversation rather than just assign a number. Build BARS training into your broader performance review preparation process.

Neglecting to Update the Instrument

Jobs change. Behaviors that defined excellent performance in 2020 may be baseline expectations in 2026. Schedule a formal BARS review every 2–3 years — or whenever the role changes significantly — to ensure the anchors still reflect the behaviors that actually drive performance in the current environment.

Integrating BARS Into Your Review Process

BARS work best when integrated into a structured review process rather than treated as a standalone form. Consider these integration points:

Pre-review: Share the BARS instrument with employees before the review cycle begins. Let them know which dimensions they will be rated on and what the behavioral anchors look like. This transforms the review from a judgment event into a developmental tool — employees can self-assess against the anchors throughout the year and request feedback on specific behaviors.

During the review: Use the BARS anchors to structure the conversation, not just to assign the final rating. Walk through each dimension with the employee: “Here are the anchors for this dimension. Based on what I have observed this year, I think you are consistently at this level. Let me share the specific examples that led me there.” Invite the employee to share their own observations and any evidence that might shift the rating.

Post-review: Use the behavioral anchors as the foundation for the development plan. If an employee is rated at level 3 on a dimension and the goal is to reach level 4, the anchor for level 4 defines exactly what needs to change. This makes goal-setting for development far more specific and actionable than vague improvement goals.

BARS and Legal Defensibility

One underappreciated advantage of BARS is their legal defensibility. When a performance rating decision is challenged — particularly in the context of a termination, demotion, or performance improvement plan — BARS provide documented evidence that the rating criteria were defined in behavioral terms, communicated in advance, applied consistently across employees, and based on observable evidence rather than subjective impressions. Organizations that have invested in BARS are significantly better positioned to defend performance-based employment decisions than those using generic rating scales. This matters particularly in industries subject to employment litigation or in organizations with a history of inconsistent performance management.

Frequently Asked Questions About BARS

How many dimensions should a BARS instrument cover?

Most practitioners recommend 5–8 dimensions per role. Fewer than 5 may miss important performance areas; more than 8 creates rating fatigue and reduces the accuracy of individual ratings. For most knowledge worker roles, the most meaningful dimensions include: technical/functional expertise, communication, problem-solving, collaboration, ownership and accountability, and (for managers) people leadership. Additional dimensions should be added only if they represent genuinely distinct behavioral areas that the existing dimensions do not capture.

Can BARS be used for all roles, or only certain types?

BARS work best for roles where performance is driven by consistent, observable behaviors that can be described in advance. They are highly effective for client-facing roles, managerial roles, technical individual contributor roles, and any role where inter-rater consistency matters. They are less effective for highly creative or entrepreneurial roles where the most valued behaviors are novel and unpredictable, and for roles that change so rapidly that behavioral anchors become outdated quickly.

How long does it take to build a BARS instrument?

A properly developed BARS instrument typically requires 20–40 hours of work, spread across 3–4 sessions involving subject matter experts (high performers in the role), experienced managers, and an HR facilitator. The critical incidents generation session typically takes 2–3 hours. Sorting and scaling sessions take another 3–4 hours each. Final writing and piloting add another 4–8 hours. Shortcuts — particularly skipping the sorting and scaling validation steps — significantly reduce the quality and inter-rater reliability of the final instrument.

What is the difference between BARS and behavioral observation scales (BOS)?

Both BARS and Behavioral Observation Scales (BOS) use behavioral descriptions of performance. The key difference is in how they work: BARS use behavioral descriptions as anchor points at specific rating levels (the anchor defines what a 3 vs. a 4 looks like). BOS present a list of behavioral statements and ask managers to rate the frequency with which they observe each behavior (from “almost never” to “almost always”). BOS can be easier to administer and may produce more granular data, but BARS remain more widely used for formal performance appraisals because they map more cleanly to a single overall rating per dimension.

Bottom Line

Behaviorally Anchored Rating Scales replace vague performance labels with specific, observable behavioral descriptions — giving managers and employees a shared language for performance discussions. They reduce rater bias, improve inter-rater consistency, make feedback more actionable for development, and provide stronger legal defensibility than generic rating scales. Building them requires real investment: facilitated sessions with practitioners, sorting and scaling validation, and manager training. But for complex roles where performance ratings carry real consequences, BARS are one of the most useful tools available for making performance management fair, consistent, and genuinely useful. Pair them with structured calibration meetings and a continuous feedback culture to get the full benefit.
