In my last blog post, I offered a look at the process HireVue follows when developing a pre-hire assessment model for each type of job. HireVue has always been committed to doing good science that creates a level playing field for all candidates and helps companies consider a larger and more diverse set of candidates than ever before. We created HireVue Assessments in part to eliminate the issues associated with unconscious human bias in screening interviews -- and it worked. Our customers have shared stories of how they’ve increased the diversity of their hires as a result. But what about algorithmic bias?
How HireVue Works to Prevent Algorithmic Bias
If you judged AI/machine-learning algorithms solely by the headlines these days, you might think that all algorithms are as biased as we humans are - or worse. Unless someone deliberately works to eliminate bias that may reside in an algorithm’s training data or in the data scientists who create it, algorithms are absolutely at risk of inheriting the biases of humans. From the earliest days of designing HireVue Assessments, the team has been deeply committed to an ethical, rigorous, and ongoing process of testing for and mitigating bias in our HireVue Assessments models (also referred to as algorithms).
When we create a job-specific algorithm or model, a primary focus of our process is finding and eliminating factors that cause bias. For every model built, the HireVue team follows these steps:
- Auditing the performance data that is being used to train the model. We actively look for any bias in the input that can lead to a biased output.
- Carefully testing for the presence of adverse impact in the predictions of the model.
- Removing any factors that are proven to be contributing to biased results.
- Retraining the model.
- Iterating until adverse impact is addressed.
This process is standard operating procedure for every model HireVue builds, one per job type.
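To make the "testing for adverse impact" step concrete, here is a minimal sketch of one common check. It assumes the "four-fifths" convention, under which a group whose selection rate falls below 80% of the highest group's rate is flagged. This post does not say which statistical test HireVue uses, so the function names, the sample data, and the 0.8 threshold are all illustrative assumptions.

```python
from collections import Counter


def selection_rates(groups, advanced):
    """Return the share of candidates in each group who advanced."""
    totals = Counter(groups)
    passed = Counter(g for g, a in zip(groups, advanced) if a)
    return {g: passed[g] / totals[g] for g in totals}


def adverse_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical screening results: 10 candidates in each of two groups.
groups = ["A"] * 10 + ["B"] * 10
advanced = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 7

rates = selection_rates(groups, advanced)  # A advances 5 of 10, B 3 of 10
ratio = adverse_impact_ratio(rates)        # roughly 0.6
flagged = ratio < 0.8                      # assumed four-fifths threshold
```

In this made-up data, group B advances at about 60% of group A's rate, so the model would be flagged for another round of feature removal and retraining.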
A number of misconceptions about these assessment models and other AI products appear in the media every week. Let’s talk about some of the most common ones.
Misconception: A machine is deciding who gets hired and eliminating person-to-person interviews.
This is simply not accurate. First of all, a HireVue Assessments model/algorithm is not a robot, but a form of AI/machine learning that has a single, specific, early-stage evaluation to perform. Its only focus is determining which subset of candidates within a given pool are most likely to be successful when compared to people already performing the job. That information is then provided to human recruiters as decision support.
Those top candidates then move on from the screening stage to the person-to-person interviewing stages. Skilled recruiting professionals continue to decide which candidate gets the job after the completion of multiple stages in the hiring process.
Misconception: Algorithms simply replicate human bias.
In the case of HireVue Assessments models, the algorithm is paying attention only to those factors of the interview that research has proven to be predictive of success in the job.
On the other hand, human interviewers are often distracted by many other factors in an interview, factors that are unrelated to job success for that particular job. In addition, humans tend to have weak definitions of success in job roles, and all too frequently revert to “gut instinct” that can often be driven by unconscious bias.
Here’s an example: The model might notice that most of a company’s top technical support representatives tend to speak more slowly than the rest. It may also happen to be the case that speaking slowly is more common in men than women, and this might skew the results so that the model rates men more highly than women. If we find this during testing, we can “shut off” the feature that measures the speed of spoken communication in order to prevent men from being given higher scores than women based on this feature. We then retest the model to ensure we’ve addressed the adverse impact.
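The "shut off and retest" loop in this example can be sketched in a few lines. This is a toy illustration, not HireVue's implementation. The feature names, weights, and candidate data are invented, the 0.8 threshold is an assumed convention, and rescoring with a feature switched off stands in for actually retraining a model.

```python
def impact_ratio(groups, selected):
    """Lowest group selection rate divided by the highest."""
    rates = {}
    for g in set(groups):
        flags = [s for grp, s in zip(groups, selected) if grp == g]
        rates[g] = sum(flags) / len(flags)
    top = max(rates.values())
    return min(rates.values()) / top if top > 0 else 1.0


def select_top(scores, fraction=0.5):
    """Mark candidates scoring at or above the top-fraction cutoff."""
    k = max(1, int(len(scores) * fraction))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


def ratio_for(candidates, groups, weights, active):
    """Score with only the active features, then measure adverse impact."""
    scores = [sum(weights[f] * c[f] for f in active) for c in candidates]
    return impact_ratio(groups, select_top(scores))


def mitigate(candidates, groups, weights, threshold=0.8):
    """Shut off one feature at a time until the ratio meets the threshold."""
    active = set(weights)
    while True:
        ratio = ratio_for(candidates, groups, weights, active)
        if ratio >= threshold or len(active) <= 1:
            return active, ratio
        # Shut off the feature whose removal improves the ratio most.
        worst = max(
            sorted(active),
            key=lambda f: ratio_for(candidates, groups, weights, active - {f}),
        )
        active.remove(worst)


# Hypothetical data: in this toy sample, only the men speak slowly.
weights = {"slow_speech": 1.0, "answer_quality": 1.0}
qualities = [0.9, 0.2, 0.8, 0.1, 0.95, 0.3, 0.85, 0.15]
groups = ["M"] * 4 + ["W"] * 4
candidates = [
    {"slow_speech": 1.0 if g == "M" else 0.0, "answer_quality": q}
    for g, q in zip(groups, qualities)
]

active, ratio = mitigate(candidates, groups, weights)
```

With both features on, only men land in the top half of this sample. Once `slow_speech` is shut off, two men and two women are selected and the ratio returns to 1.0. A production process would retrain the model after each removal rather than simply rescoring.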
Misconception: Algorithms are inherently unfair.
Decades of research have shown that traditional interviews are full of implicit and explicit bias, and tremendous inconsistency. The HireVue approach has been proven to be measurably more accurate at predicting performance than human evaluators and is audited, tested, retrained, and audited again to ensure that there is no adverse impact.
Misconception: HireVue Assessments are using facial recognition technology in scoring candidates.
HireVue does not use facial recognition technology or track facial features for identity recognition purposes. HireVue Assessments does use expression recognition technology to study facial expressions, which represent just one category of characteristics reviewed by the model in order to predict whether a person is likely to be successful in a job. Tens of thousands of factors - including audio and language content - come from a candidate’s interview and are available for consideration in a given model, but only those scientifically validated as being predictive of job performance are included in the algorithm or model for that specific job role.
Misconception: Someone who isn’t charismatic on camera won’t score well.
This is even less true with an algorithm performing the evaluation than in human interviewing, because the models have been built and trained to “notice” and evaluate only the characteristics that are significant to job success. Charisma may matter in some jobs, but most of the time it will not be important, and therefore would not be considered by the model.
Misconception: If you make the wrong facial expression or blink too much, you won’t get hired.
Once again, the algorithms are looking at tens of thousands of factors and the training and auditing processes reveal which factors are not predictive of job success and can be removed. One little expression or individual factor makes very little difference in the overall score as to how likely a candidate is to be successful in a job.
Misconception: People who get nervous interviewing won’t do well.
As with the other misconceptions above, human interviewers are far more likely to negatively judge a candidate based on nervousness than a machine-learning model is. The truth is that most of us are nervous doing any kind of interview; in fact, most people don’t even like interviewing. Interviews, assessments, and resumes represent a variety of ways that companies get to know people. The difference here is that the HireVue Assessments model can overlook the parts of your performance that really don’t make a difference, whereas human recruiters may not be able to avoid noticing them.
These misconceptions raise a critical question: Which should we prefer? A world where hiring is influenced by a human with an unclear definition of job success, asking inconsistent questions (and who may or may not be paying attention to the answers the candidate is giving) and evaluating on unknown criteria, or a data-driven method that’s fairer, consistent, auditable, improvable, and inclusive?
I know that I speak for everyone on the HireVue product development and data science team when I say that we are committed to actively working to reduce bias in the hiring process and to open up more opportunities for a wider variety of well-qualified people. Look for more blog posts on our process, our commitment to scientific standards, and our work on AI ethics in the coming months.