MEASURING UNCONSCIOUS BIAS
by KIERAN SNYDER
Kieran Snyder holds a PhD in linguistics and has held product and design leadership roles at Microsoft and Amazon. She has authored several studies on language, technology, and bias in documents.
Most recently, Kieran built a multifunctional team in analytics, program management, and design for Amazon’s advertising organization. In her prior product leadership roles at Microsoft, Kieran created a linguistic services platform for developers, introducing new language detection, spell-checking, and other natural language processing capabilities to Windows developers for the first time. She also led a cross-company engineering effort for the native integration of Bing into Windows search. In her time at the company, Kieran was involved with search and natural language projects across several Microsoft products, including Windows, Bing, Office, and Visual Studio.
Kieran earned her doctorate in linguistics and cognitive science from the University of Pennsylvania and has published original research on gender bias in performance reviews and conversational interruptions in the workplace over the last year. She participates actively in Seattle-based STEM education initiatives and women in technology advocacy groups.
Male: Next up, we have Kieran Snyder. She is the CEO of Textio, a machine learning company that analyzes job listings to make them more effective. She holds a Ph.D. in Linguistics from the University of Pennsylvania, and has published widely on text analytics and technology. She participates actively in Seattle-based STEM education initiatives and women in technology advocacy groups. Glad to have you, Kieran.
Kieran: Thanks so much for having me today. I'm really excited to talk to you about unconscious bias and, in particular, how we measure it. Unconscious bias is a hot topic in technology these days, and in particular in technology hiring. And it's something that everybody has. This talk is going to cover how we quantify it and how we get past it.
I started out working on this topic a few years back when my daughter, who was in preschool at the time, came home from school and told me very proudly that she was the best girl at math in her class. And my initial instinct, hearing the story, was to say, "That's awesome. I have a daughter who loves math." But as I thought about it a little bit more deeply, I realized there's something interesting in the fact that she feels the need to highlight that she's the best girl at math, not the best kid at math, or not that she loves math. And I started thinking, gosh, even at age four, which she was at the time, this level of unconscious bias, these assumptions that we make about what's typical behavior and ways that it's demographically marked, shows up really early.
So I'm going to go ahead and begin with some discussion of what unconscious bias is and how it differs from conscious bias and what I call semi-conscious bias. I'm going to use age as an example here, because in contrast to things like gender and race, there are many places in hiring where people are explicitly biased on the basis of age and don't necessarily believe that it's a mistake to do that. So think about, on the one hand, highly conscious bias in the age space. You might see a job listing with text like, "This job is perfect for new grads," or "Are you a student? We have the best after-school job for you. We're interested in recent graduates." This language is very deliberate. It selects, generally, for young applicants. The person writing the listing knows that what they want to do is hire someone who is just out of school, which is typically a fairly young person. And it's conscious.
If you go a little bit further along the spectrum, you get what I call semi-conscious bias. This is not overt, but it includes language that communicates pretty clearly that what we're trying to do here is hire a young person. So, "This job requires a minimum 3.5 GPA," or "meet these SAT scores," or maybe things like, "You love the startup lifestyle," which is stereotypically associated with lots and lots of work, all hours of the day, all days of the week. This language doesn't say overtly, "Only apply if you're a new graduate," but it communicates it nonetheless. I've had a career in the industry for a decade and a half; my GPA is not super relevant to hiring me at this point. So I would see this and say, "Okay, this isn't the job for me." It selects for young applicants.
Then you go a little bit further along the spectrum and you have what we call unconscious bias. This is language that ends up selecting for young applicants, but it might be very covert in how it works. So things like, "For this job, you need mad skills," or "We want people who are going to bring the hustle." It's a way of communicating, a way of talking, a vocabulary, that tends to select for younger people. I use age as an example, as I said, just because it's a place where in job listings today we see all kinds of patterns represented, from conscious to unconscious.
This talk is going to focus primarily on unconscious bias. The work that we do at Textio, and the tools that we build to help optimize job listings, focus strongly on the unconscious bias space, where the person who is doing the hiring may have biases but is actually trying to get past them. And so that's what we're going to talk about here today.
A lot of the work that we do at Textio, and the work that I've published, focuses on written text. There are a few reasons we focus on written text. The first really big one is simply that it's written down. In contrast to behavioral studies, or potentially speech studies, where people are trying to find patterns of unconscious bias, text is written down: it's very clear what was being communicated, there are no encoding challenges, and it's really easy to search and find patterns.
So a few years ago, when I got deeply involved in the bias research space, my background in linguistics made text a really natural place for me to start. The second reason text is helpful is not just that the medium is written down, but that it's pervasive. Over the last 20 years, text has become available all over the Internet in a way that it wasn't before. When I was a student doing empirical linguistics work, I still worked with text, and I still tried to measure the differences between kinds of text and to understand documents deeply and empirically. But it was a lot harder to do, because text of all kinds wasn't so freely available. Now text is abundant and democratically available, which means anybody can do really interesting bias research with text, because the barrier to entry is no longer that you need to be an academic. You really just need a browser and the right data science and empirical skills.
And the third really important piece for us when we think about text is that it provides a record of how language changes over time. What works today in recruiting and marketing content may be very different from what worked a year ago, six months ago, certainly five years ago. My favorite technology example is the phrase "big data." It turns out that if you included the phrase "big data" in a technology or engineering job listing 18 months ago, it made the job extremely popular. You were likely to get many more applicants; it was seen as a very favorable term. If you include it today, in August of 2015, it's a totally neutral term, because it has become so saturated in job listings that it no longer stands out as a special word.
A more extreme example of this is the word "synergy." This isn't technology specific, but synergy is what we call a gateway term at Textio: a term that usually indicates there are a lot of other jargony terms in your job listing. Five, six, ten years ago, synergy was seen as a hot term in business and consulting. It got oversaturated, passed through a period of neutrality, and now it's so saturated that it's a cliche. So including language like "synergy" tends to drive down the effectiveness of your job listing. This is all to say that written text provides a record that makes bias research really fruitful.
So I'm going to look at a few particular examples. At Textio, we focus heavily on documents about people, and a lot of my research over the last few years, even prior to Textio, has focused here. So let's talk a little bit about the hallmark document about people that lots and lots of companies use: performance reviews. Performance reviews are documents that companies deploy a little differently from one another, and some companies are now making headlines for doing away with them or changing the process. But historically, a performance review is a document in which an employee gets a written narrative summarizing their contributions for the year, and it's usually attached to compensation in some way.
So a couple of years back, I had a hunch. I was at lunch with an engineering manager friend of mine, and he was talking through promotions that he wanted to give his team. He had two people on his team, one a man and one a woman. He thought they were both extremely strong and was hoping to give them both promotions during the cycle. As he described the man, he said, "You know, he's going to be a pretty easy case. Technical skills are pretty good. He's produced a lot of great work. He gets a little impatient sometimes, but you know, every talented person does, and he's easy."
And then he described the woman on his team and said, "You know, she's really talented. She's my most talented on the team technically, but she rubs a lot of people the wrong way. She's a bit abrasive. I'm going to have a much harder time getting her promotion through." And it raised the question for me: what's the difference between someone who's impatient some of the time and someone who's abrasive? Maybe the difference is simply in the description.
And so, as a woman in the industry who has certainly been called abrasive more than once, and who also got fairly strong reviews when I was in the corporate world, I wondered whether this was really a quantifiable pattern. So I put out a broad call for submissions on social media, asking people if they would be willing to share their performance reviews for a study. I didn't reveal that it was a study about gender; at the time, I had not published as much as I have now about gender. I just wanted to see who would submit their reviews. I said: if you've worked in technology and you're willing to share your reviews, please do.
I bet that most of the submitted reviews would end up being strong reviews, on the idea that people would be more willing to share high-performing reviews, and that turned out to be true. The majority of reviews that I received in the study were positive, and many resulted in promotions. That was true for both the men and the women who submitted reviews. However, when I did even very basic linguistic analysis on the submitted review text, clear patterns emerged. These were reviews from over 25 different companies that varied quite a bit in size, and the patterns didn't vary.
It turned out that the men were significantly less likely to receive critical feedback of any kind than the women were. It also turned out that where the men received critical feedback at all, it was much more likely to be constructive feedback. This was feedback like, "You've done a great job this year. For your next step, we'd love to see you continue to develop your people management skills." The women's reviews, by contrast, had more critical feedback, and much more of it was negative personality feedback. Terms like abrasive, aggressive, and bossy turned up much more frequently in women's reviews, many of them exclusively in women's reviews, and multiple times. My favorite term here is aggressive, which turned up several times in women's reviews, and only a couple of times in the men's reviews, where it came with an exhortation to show more of it. So whereas the women were dinged for this characteristic, the men were encouraged to show it more.
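The kind of basic term-frequency analysis described here can be sketched in a few lines. This is a minimal illustration, not the study's actual method: the term list and the toy review texts below are made up, since the exact terms and data from the study aren't published.

```python
from collections import Counter
import re

# Illustrative personality-feedback terms drawn from the talk's examples;
# the study's actual term list is an assumption here.
PERSONALITY_TERMS = {"abrasive", "aggressive", "bossy", "strident", "emotional"}

def term_counts(reviews):
    """Count occurrences of flagged personality terms across review texts."""
    counts = Counter()
    for text in reviews:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in PERSONALITY_TERMS:
                counts[word] += 1
    return counts

# Toy examples standing in for the submitted reviews.
women_reviews = ["She is talented, but she can be abrasive in meetings.",
                 "Sometimes comes across as bossy and aggressive."]
men_reviews = ["He gets impatient sometimes; he should be more aggressive."]

print(term_counts(women_reviews))  # Counter({'abrasive': 1, 'bossy': 1, 'aggressive': 1})
print(term_counts(men_reviews))    # Counter({'aggressive': 1})
```

Comparing the two counters by group is the whole analysis in miniature: the interesting result in the study was how skewed those counts were between men's and women's reviews.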
And so it was really eye-opening to me, because it put data, quantitative text data, behind the pattern, and showed there's some real gender bias in the industry. As we got going on Textio, with the performance review study in place, we were working heavily on job listings, which I'll talk about in a moment. But we've also taken a look at resumes and gendered language within the context of resumes. Just like the performance review study, this was a broad call for submissions on social media, and it was very simple: "Do you work in the technology industry? Are you willing to share your resume for a study?" I collected about 1,100 resumes in all; people are generally pretty willing to give their resumes for a study. It's not secret material.
In contrast to the performance review study, where I did very rudimentary analysis, for the resume study I used a lot of our core Textio technology and looked at some syntactic characteristics as well. What I was trying to understand is: when men and women have similar jobs, similar backgrounds and qualifications, do they present themselves differently in resumes? And the answer was abundantly yes.
So men's resumes tended to be significantly shorter: just over 400 words, compared to nearly 750 for the women's resumes. But on the other hand, they tended to contain much more explicit detail about the achievements they had on the job, and this was often reflected in bulleted list content, which I thought was really interesting. So the men's resumes were overall shorter, but they were punchier with their details. In a man's resume, you would have language like, "I designed a platform for e-commerce and implemented it over a six-month period," or "My work drove click-through rate up by 4% last quarter." Very concrete, very specific achievements. Many fewer personal attributes: the men were significantly less likely to include a personal interests section, and when they did, it tended to be one line and not longer. So you get a sense of the men's resumes. They conform to what many of us are taught in school makes a good resume: be short, keep it to one page, show lots of bullets.
The women's resumes, even for people who in many cases had the very same job and very similar job history, looked very different. They told their story very differently. They tended to be longer. They told a story with prose instead of bullets; they created a narrative. There was often less specific detail about their own achievements and more recognition of the team's vision and accomplishments. So instead of saying, "I designed and implemented this platform," the women's resumes were more likely to say, "Contributed to a new e-commerce platform that increased customer satisfaction 50%," or "Helped craft four patent applications." They were much broader, much more team-centric, with much less specific detail, except in the area of personal attributes. The women's resumes were much more likely to contain summaries at the top: the executive summary that says what kind of job you're looking for and who you are.
And this was really interesting to us, because the resume is really your billboard as a candidate when you are applying for a job. In many ways it's an anachronistic document; there aren't too many places where we rely on a static representation of who we are in the real world. But resumes still matter at a lot of companies. And so we wondered if this was going to influence the pipeline. We talk about the pipeline challenge. Well, what if there are plenty of women in the pipeline, but even qualified women have resumes that don't get selected in our screening process because of how they're written?
And so there are really interesting ways to interpret the study. You might say women should update how they write their resumes. But I think I'd rather say that hiring managers ought to look more broadly at a range of skills, because both resume styles actually show skills that are pretty important for teams. You want people who can execute with a lot of precision on the details, and you also want people who can tell the narrative stories that make great product experiences. So this was pretty interesting to us.
But we really focus heavily on evaluating the favorability and bias of your job listing text. And in fact, a lot of what we saw with resumes is mirrored in what we see with job listings. One of my favorite facts about strong job listings: generally they have about a third bulleted list content, not more and not less. It probably has something to do with the visual silhouette of the listing; people look at your listing and decide very quickly whether it's worth engaging with.
Well, the interesting thing here is that when you go above that amount of bulleted content, you tend to drive down the proportion of women who will apply for a role. And when you go below that amount of bulleted content, you tend to drive down the proportion of men who will apply for a role. So we see some of the same bias indicators at work on both sides of the table. So what happens for somebody writing a job listing? You're going to appeal to different populations differently, depending on how you structure the listing. And the same bias is implied in how resumes get interpreted. So pretty interesting stuff here.
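That "about a third bulleted content" measurement can be approximated with a simple line-based proxy. This is my own rough sketch of how one might compute it; Textio's actual metric isn't described in the talk, and the sample listing is invented.

```python
def bullet_proportion(listing: str) -> float:
    """Fraction of non-empty lines that are bullet items: a rough proxy
    for how much of a job listing is bulleted content."""
    lines = [ln.strip() for ln in listing.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    bullets = sum(1 for ln in lines if ln.startswith(("-", "*", "•")))
    return bullets / len(lines)

# A made-up listing: 2 bullet lines out of 6 non-empty lines.
listing = """We are hiring a platform engineer.
You will own our e-commerce checkout flow.
- 5+ years of backend experience
- Experience with distributed systems
You will join a collaborative, supportive team.
Benefits include flexible hours and parental leave."""

print(f"{bullet_proportion(listing):.0%} bulleted")  # roughly the sweet spot described above
```

A real system would weight by words or visual area rather than raw lines, but even this crude ratio lets you flag listings that sit well above or below the one-third mark.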
In general, there are two broad approaches to combating bias in this kind of document, and I break them down into checklists versus software. Checklists and trainings are things like unconscious bias trainings: maybe a group of 20 employees sits together and is led through guided, interactive exercises showing that they have bias. In the best case, these are great conversation starters. But inherently they're limited in scope. When you see checklists published that say avoid the phrases "ninja" and "rock star" if you want to attract women to your team, inherently any checklist is either too small to cover the problem or too big to apply; it's hard to execute on a 20,000-word checklist. So these are very limited in scope, and they can give a false sense of security that once you've gone through the bias training, your problems are solved. Because they are so limited in scope, even for people with good intentions, the fact that the bias is unconscious means that it hasn't really gone away.
I'm a huge believer in software solutions to address bias. It's what we do at Textio: as you're writing your listing, we're giving you feedback to show the patterns of bias in the language you're using right now. It's like spell check, but powered by machine learning on data. In the best case, software is great because it's a real bias interrupter. If, as you are writing your document, you are getting feedback and suggestions, it has the potential to be really systemic, adaptive, and habit-changing, because in the process of producing your work, your habits and behavior are being disrupted in a way that points out bias patterns to you.
It's worth noting that software can't fully solve your problem either. Software helps well-intentioned people, but it can also create a false sense of security that problems are really solved. But it's worth thinking about how these two approaches relate. Being on the highly quantitative, empirical side myself, I often look to the qualitative research to help me understand places where real corpus analysis is going to be fruitful. So I think the two approaches can work in tandem.
So we'll talk briefly about Textio specifically. Textio focuses on improving your job listings. A job listing is really just marketing content for your employment brand and for your company, and there are lots of things you want it to represent. You want it to represent that your company is a vibrant and enjoyable place to work. You want it to represent career opportunity. You increasingly want it to represent inclusivity. And so, like the work with resumes and performance reviews, but at a much bigger scale, the approach that we take is very quantitative. It's quantitative because we think that when you're talking about distinctions as subtle as you see with unconscious bias, you really need to measure to understand patterns of behavior.
So we've mined data from over 10,000 companies. We take a job listing, and we have some information about how that listing performed. Was it popular? Did lots of people apply? Were the people who applied good enough to screen? Were they qualified? What was the demographic mix of people who came through the door? Maybe you got lots of qualified people, but they're all from one particular demographic group. And because we use core machine learning and computational linguistics techniques, you start extracting patterns as you go through this full data set.
Just like with the resumes, we can see that men and women tend to write listings differently, just as they write resumes differently. We can see that companies tend to write job listings differently when they're attracting men, or attracting women, or attracting a mix. So, a few interesting patterns to highlight here. I talked about the bullet data before, which is a really important example showing there's a sweet spot that works for all demographic groups, but at the boundaries the behavior becomes very gender specific. The specific words that you choose, of course, matter quite a bit. We've detected over 40,000 distinct phrases that change the favorability or bias of a listing overall, and some of them are actually quite subtle. So we do confirm a lot of the qualitative research that's out there: you want to avoid your egregiously gendered language, the ninjas and the rock stars.
But there are a lot of other categories that emerge. Among the frequent mistakes people make when thinking about gender bias in their listings: cliched sports or military metaphors show up quite a bit in corporate listings, and they end up pretty male-gendered. These are things like "leave it all on the field," or "take it to the max," or "mission critical." You can change the language in very small and subtle ways and change the proportion of men and women who are likely to apply for your job. So instead of saying this job prepares mission-critical presentations for the CEO of our company, you can think of other phrases you might use to describe that. We've measured the difference between "mission critical" and "high profile," or "mission critical" and "essential," or "mission critical" and "business critical," which is a really small difference, but it changes the mix of applicants that you get.
And so for us, the quantitative approach, where we're really measuring things at scale, is how we're able to show, as you're typing, how you might make changes that change the effectiveness of your listing. Some other very common examples here: the difference between "manage a team" and "develop a team." They do mean subtly different things, but having managed many teams, I can tell you that I thought my job was both to manage the team and to develop the team. Well, if you describe the function as developing the team, you're significantly more likely to attract women to apply for the position. And if you describe the function as managing the team, you're significantly more likely to attract men. So some of these are highly nuanced in their differences. I definitely encourage you to check it out if you're interested.
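The as-you-type phrase substitutions described above can be pictured as a simple lookup over flagged phrases. This is only an illustrative sketch: the phrase table below is assembled from the examples in the talk, whereas the real system scores tens of thousands of phrases against outcome data rather than using a hand-written list.

```python
# Illustrative phrase -> alternatives table built from the talk's examples;
# this is an assumption, not Textio's actual model or data.
SUGGESTIONS = {
    "mission critical": ["high profile", "essential", "business critical"],
    "manage a team": ["develop a team"],
    "leave it all on the field": ["do your best work"],
}

def flag_phrases(text):
    """Return (phrase, alternatives) pairs for each flagged phrase found in text."""
    lowered = text.lower()
    return [(phrase, alts) for phrase, alts in SUGGESTIONS.items()
            if phrase in lowered]

job_ad = "You will manage a team preparing mission critical presentations."
for phrase, alts in flag_phrases(job_ad):
    print(f"Consider replacing '{phrase}' with: {', '.join(alts)}")
```

Running this as the writer types, and re-scoring on every keystroke, is what turns a static checklist into the "bias interrupter" behavior described earlier.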
I wanted to wrap up by sharing a few other pieces of research and a reading list that I thought might be interesting. All of these are worth reading. Some of it is my own work, but I really want to highlight the other material too. I talked about performance reviews and resumes. I've also done some work, though it's not a text-based medium, on conversational interruptions in the workplace. This is really interesting: I sat in on many corporate technology meetings of mixed composition, with different levels and different genders represented in the room. It has largely been conjectured that women get interrupted more often than men. That turns out to be overwhelmingly true. It also turns out that the biggest interrupters of women are other women, which echoes what we saw in the performance review study, where women got dinged more for personality feedback, and that was true even if they had a woman as their manager.
So there's definitely some real, deep behavioral bias that shows up here, and I encourage you to check some of this out. I've also included some survey research on why women leave technology, which draws on hundreds of interviews where you can hear women's stories. The headline is that they're not leaving to have babies, which is the stereotype; they're leaving because they can't figure out how to work in the environment and still have other things in their life. And then I've also tried to tell a more positive story, of women like me, mothers who have had long-term careers in technology, and what made them stay. So I would love for you to check that out.
There are a few other pieces that I really encourage you to look at. Joan C. Williams has done some phenomenal work, following on the original performance review study that I mentioned, showing real intersectionality. She has essentially shown that gender bias exists in science and technology fields, and that it is significantly worse for women of color than for any other group, which is disappointing and echoes a lot of people's experience on the ground. But it's a great study, again very empirical and really data-driven, so it's worth checking out.
Sue Gardner, last year, wrote a much more comprehensive study than mine about why women are leaving the tech industry. She interviewed many, many women and mirrored some of the same findings. And then there's Joelle Emerson's work, which I love. She has a great company called Paradigm that does diversity consulting. When I talked about the way qualitative and quantitative approaches can work together, hers is the style of consulting we love working with. She has a bunch of really practical suggestions for removing bias from your hiring pipeline, not just in your job listing but in your practical operations and interviewing, that are definitely worth checking out.
So thank you so much for attending the talk, and please contact me if you have any questions. Give Textio a try. We'd love to work with you. And if you have comments on the talk, go ahead and hashtag them on Twitter @Techtalentsummit. Thank you very much.