Rater Reliability

How to Ensure We Have the Most Accurate Data Possible


What is Rater Reliability?

Rater reliability is the idea that an assessment should produce the same accurate score no matter who does the rating. If you were evaluating one of your Team Members, a sign of high rater reliability would be another Manager giving the same scores in an evaluation of that same Team Member. Essentially, if you had the same information in front of you each time, you would give the same score. We want high rater reliability because it means that we have accurate data and a clear picture of employee performance.


Example of high rater reliability:

  • John Manager and Jane Manager both gave their Team Member Bobbi Blast from Sales an A, indicating that they objectively saw Bobbi as a leading contributor.
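One rough way to put a number on this idea is to compare the scores two Managers gave the same Team Members. The sketch below is only an illustration and is not a Truvelop feature: the letter grades are made up, and the function names `percent_agreement` and `cohens_kappa` are hypothetical. It computes the share of matching scores and Cohen's kappa, a standard statistic that adjusts that share for agreement expected by chance.

```python
from collections import Counter


def percent_agreement(scores_a, scores_b):
    """Share of Team Members who received the same score from both raters."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("Both raters must score the same, non-empty set of people.")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(scores_a)
    observed = percent_agreement(scores_a, scores_b)
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    # Chance agreement: probability both raters pick the same grade at random,
    # given how often each rater used each grade.
    expected = sum(
        counts_a[grade] * counts_b[grade]
        for grade in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1.0:
        # Both raters used one identical grade for everyone; kappa is undefined,
        # so this sketch simply reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical grades for the same five Team Members (illustrative data only).
john_manager = ["A", "B", "A", "C", "B"]
jane_manager = ["A", "B", "B", "C", "B"]

print(percent_agreement(john_manager, jane_manager))  # 0.8
print(round(cohens_kappa(john_manager, jane_manager), 2))  # about 0.69
```

Values near 1 suggest the two Managers are rating consistently; values near 0 suggest their agreement is no better than chance.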

What is Rater Bias?

So, what happens when rater reliability is absent? When objectivity is removed, rater bias takes over. This means that the rater gives a score that is not completely accurate because they allowed emotions, fatigue, low focus, or similar factors to influence their evaluation. If rater bias goes unchecked during the evaluation process, the data is skewed and does not present an accurate picture of employee performance.

Examples of rater bias:

  • John Manager just discovered that he won the lottery. He is so overjoyed that he is giving each of his Team Members perfect scores in their evaluations.
  • Jane Manager is exhausted and hungry after working overtime. She can barely focus on who she is doing the evaluations for and finds herself just going through the motions and filling in the bubbles closest to her hand.

What are the types of Rater Bias?

Everyone is guilty of practicing bias at one point or another. However, we can try to mitigate the presence of bias in our evaluations as much as possible. Below are several different types of Rater Bias to watch out for:

Halo bias

This is when a Manager gives a Team Member more favorable ratings than would be accurate.

John Manager is rating Bobbi Blast as if he were perfect. While Bobbi did put together a great presentation for a client, he also slacked on his other projects, so John Manager's rating is not accurate.

Horns bias

This is when a Manager gives a Team Member more unfavorable ratings than would be accurate.

Jane Manager is rating Sarah Slick as if she can't do anything right. While Sarah did struggle with her marketing strategy pitch, she was also very proactive with her other projects and even got a client shoutout for her quick problem solving, so Jane Manager's rating is not accurate.

Primacy bias

This is when a Manager gives a Team Member a rating that only reflects their work at the start of a project or period, even if their work changed later.

Cathy Cooke recently finished a work project. At the beginning of her work, Cathy was a bit all over the place with her time management, but she pulled it together in the end and delivered a great final piece. John Manager is rating Cathy unfavorably, completely ignoring her later progress, and is therefore giving an inaccurate rating.

Recency bias

This is when a Manager gives a Team Member a rating that only reflects the work at the end of a project or period, even if their work at the beginning was different.

Donny Dunn is a very consistent high performer. However, he recently flopped on a presentation. Jane Manager is rating Donny pretty harshly, even though this presentation is not representative of Donny's overall work, so the rating is inaccurate.

Normative bias

This is when a Manager rates all of their Team Members the same, not taking into consideration any individual differences.

Donny, Cathy, and Bobbi are working on a project together. Each person has a different role and a different amount of contributions expected. John Manager is rating each Team Member the same, even though Donny, Cathy, and Bobbi performed at different levels.

Comparative bias

This is when a Manager compares their Team Members against each other rather than rating the individual's own performance.

Sarah and Leon are working on a project together. Each person has a different role and a different level of contribution expected. Jane Manager is rating new employee Sarah more critically because she did not contribute as much as Leon, a veteran employee who led the project, even though Sarah exceeded expectations in her own tasks. Sarah's rating is lower than it would have been if Jane Manager had rated Leon and Sarah each on their individual performance.

Affinity bias

This is when a Manager rates a Team Member more favorably if the Manager believes they have much in common with the Team Member.

John Manager really likes Leon Lowe since they both like their local football team and frequently bond by talking about the most recent game. Since they get along so well, John Manager tends to rate Leon more favorably than would be accurate.

Alienation bias

This is when a Manager rates a Team Member more unfavorably if the Manager believes they have little in common with the Team Member.

Jane Manager and Heather Hanks have very different personalities. Jane Manager is very chatty and social, whereas Heather is quieter and more of a loner. They rarely have anything outside of work to talk about. Since they don't get along particularly well, Jane Manager finds herself being more critical in her evaluations of Heather, giving her inaccurate ratings.

Situational bias

This is when a Manager rates a Team Member more favorably or unfavorably based on an event that was out of the Team Member's control.

One of Sarah's clients went under, so she lost their business. Jane Manager is evaluating Sarah as if she lost the client through negligence, even though it was simply an unfortunate circumstance outside her control.

If we become aware of the different types of bias that may be affecting our evaluations, then we can actively check our thinking as we rate individuals. Instead of just going through the motions and clicking the circle that feels right, we ask ourselves why we believe a rating is an accurate reflection of the employee.

How can we practice Rater Reliability in Truvelop?

As you can tell from the examples, completing evaluations in a neutral state is really important if you want accurate data. Truvelop is only a tool; it's what you do with it as a Manager that will impact your results. With that in mind, the Truvelop team would like to share some questions to ask yourself so that you can get the most out of the tool and the evaluation process every time.

Before Starting an Evaluation, Ask Yourself:

  • What is my current mood?
    • If you are in a more emotional state, try taking a couple of deep breaths until you feel calm. We don't want anger to produce a horns bias in our rating, or good news to create a halo bias.
  • What is my current energy level?
    • If you notice you are tired or are overly energized, this may not be the time to fill out the observation! Maybe you need that morning coffee to wake up or to go for a walk to get some of those jitters out before you start the evaluation.
  • Am I able to give this evaluation my full attention right now?
    • If there is something distracting you, let's address that first. Inattentiveness can drastically affect the accuracy of an evaluation. When we aren't paying attention, bias can sneak into our thoughts without challenge.
  • Am I able to fill out this evaluation objectively?
    • As we learned in "What are the types of Rater Bias?", sometimes we need to separate the actions from the person so that we can give an accurate and objective assessment. Ask yourself: if another employee had shown the same behaviors, would I give the same rating? Go back to your comments and check that they align with the overall rating.
  • Can I think of an example that would exemplify this rating?
    • We want these ratings to be as objective as possible. If we are able to think of concrete documented behaviors, actions, instances, etc., where the employee earned this rating, then we will likely have more accurate data. If we can provide evidence for a rating, then we are mitigating rater bias.


One last piece to keep in mind: the more data we collect, the more accurate our picture will be. When we frequently complete objective evaluations, we get a clearer view of each individual employee.
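A small arithmetic sketch shows why frequency helps. It rests on assumptions that are not from Truvelop: each evaluation is treated as an independent, unbiased reading of performance with the same amount of random noise, and the function name `standard_error` is hypothetical. Under those assumptions, the uncertainty in an employee's average score shrinks with the square root of the number of evaluations.

```python
import math


def standard_error(score_sd, n_evaluations):
    """Uncertainty of the average score after n independent evaluations.

    score_sd is the assumed random spread of a single evaluation.
    """
    if n_evaluations < 1:
        raise ValueError("At least one evaluation is required.")
    return score_sd / math.sqrt(n_evaluations)


# Assumed spread of 1.0 rating point per evaluation (illustrative only).
for n in (1, 4, 16):
    print(n, standard_error(1.0, n))
# 1 1.0
# 4 0.5
# 16 0.25
```

Note that this only applies to random noise. A consistent bias, such as always rating one Team Member too favorably, does not average out with more evaluations, which is why the objectivity checks above still matter.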