Bias in AI Explained

This text was translated from Dutch with the help of AI. "Tell me honestly, what's your biggest fear when it comes to algorithms and AI?" When I ask people this question, the response is often: "AI systems discriminate!" And of course, discrimination is far from desirable. But how do we ensure that algorithms don't discriminate?

First and foremost, it's important to understand that algorithms themselves do not discriminate. However, algorithms can be developed in ways that lead to undesired distinctions in how groups are represented in the results. In technical terms, this is commonly referred to as "bias".

Essentially, if you have control over the bias within the algorithm, you also minimize discrimination. In this insight, I will explain how to achieve this. Let's start by answering the question: what is bias?


What is Bias?

Algorithm designers use the term "bias" to refer to discrimination exhibited by an algorithm. But bias simply signifies a systematic difference in representation. The proportions between groups (such as male/female, high/middle/low, or rich/poor) are not equally divided: not every subgroup represents an equal portion of your data (or sample).

In practice, it's rare, for example, that all residents of a town or all respondents of a survey show the same male/female ratio as the national or global average. Unequal divisions between these groups are thus inevitable in the real world.

Therefore, bias (in the context of "inequality") is not inherently strange or directly discriminatory. It's a factual observation that women, not men, give birth. Is a midwife discriminating by assisting mostly women in childbirth? Similarly, it's a factual observation that men generally don't wear dresses. That's not to say they cannot, but am I discriminating as a fashion consultant if I present most men in the latest fashion catalogue in an outfit featuring pants?

Even if a company strives for the highest-quality candidate to fill a vacancy and has no explicit preference for a male candidate, only a preference for the candidate with the best expected sector knowledge, the male/female ratio among applicants with that specific knowledge won't be equal. Imagine a sector where women are underrepresented. This doesn't mean the company is intentionally discriminating; often, the imbalance reflects the underlying history of choices made by the potential labor force, for example in education and previous applications.

Numerous examples in hiring demonstrate unequal male/female ratios. But when does this inequality become discrimination? When do we no longer tolerate this 'discrimination'? And how do we teach an algorithm which disparities are acceptable and which are considered 'discrimination'?

Why is bias a risk in AI?

AI systems are mostly developed to identify groups, a process often referred to as classification. These algorithms aim to recognize patterns in data that indicate differences between classes and similarities within a class.

Imagine we create an algorithm to identify whether someone is male or female. Differences between males and females might be found in characteristics like beard growth, sperm production, interest in dresses, and educational choices. These patterns allow us to predict the class to which an individual belongs. For instance, if we know that Person XX has a beard frequency of 0 and no sperm production, we can infer from the aforementioned patterns that XX is most likely female. If Person XY frequently purchases dresses in an online shop, does our algorithm predict that XY is male or female? And are two classes sufficient to represent all persons in my data?
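
As a minimal sketch of what such a classification boils down to (the feature names and the decision rule below are purely illustrative assumptions, not taken from any real system), consider this toy Python rule:

```python
# Toy "classifier" built on the stereotypical patterns mentioned above.
# Feature names and rules are illustrative assumptions, not a real model.

def predict_class(has_beard: bool, buys_dresses: bool) -> str:
    """Naive decision rule derived from stereotypical patterns."""
    if has_beard:
        return "male"
    if buys_dresses:
        return "female"
    return "female"  # default when no 'male' signal is present

# Person XX: no beard, no dress purchases -> predicted "female"
print(predict_class(has_beard=False, buys_dresses=False))

# Person XY: no beard, frequently buys dresses -> also predicted "female",
# showing how pattern-based rules generalize and can misclassify individuals.
print(predict_class(has_beard=False, buys_dresses=True))
```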

The examples above directly show the suspect factors that can cause generalization errors and thus gender discrimination. However, we use algorithms specifically to identify complex patterns within vast amounts of data. These patterns are based on class differences, making algorithms inherently designed to distinguish, even among groups where differentiation isn't desired.

In her book 'Weapons of Math Destruction,' Cathy O'Neil[1] discusses the far-reaching consequences of algorithm use in university admissions processes. Prospective students are admitted based on their likelihood of success. The odds of success are lower for children of non-educated or low-educated parents. As a result, these young individuals are selected less frequently than students from highly educated families. The admissions algorithm subsequently observes more instances of success among the new admissions, most of whom are students from highly educated families. Consequently, the algorithm selects fewer and fewer candidates from non-educated or low-educated families. The gap between these two worlds widens.

How can I use AI responsibly?

From this point forward, we'll discuss three types of bias to get to the root of the bias problem:

  1. Disproportional Data Bias: Bias in the data
  2. Algorithmic Bias: Bias amplified by the model
  3. Confirmation Bias: Bias stemming from incompleteness and subjectivity

I will illustrate the difference between these three types using the hiring process of the fictional company ABC Engineering as an example. Even though this company strives for the highest-quality candidate to fill its electrical engineering vacancy, the unequal ratio between male and female electrical engineering graduates suggests that the male/female ratio won't be equal among candidates applying for a single position.

The company employs an algorithm to support its selection process. On average, for every 20 vacancies, ABC Engineering selects one female candidate, while the other 19 positions are filled by males. This doesn't necessarily mean the company discriminates: due to underlying educational choices, women are underrepresented in this profession. The bias in this situation is caused by the uneven distribution between male and female electrical engineering candidates (the disproportional data bias). The algorithm doesn't exacerbate the inequality, so there is no algorithmic bias.

In this example, alongside the disproportional data bias, we also have a confirmation bias. ABC Engineering selects candidates based on obtained diplomas. The company opts for an incomplete perspective: ABC Engineering deems candidates without prior education unsuitable for the role of electrical engineer, thereby continually overlooking potential candidates without a formal education who could still be qualified for the position.

Do you believe ABC Engineering should take measures to strive for a more equal male/female ratio? Would you still want to be advised by an algorithm if discrimination might be a concern? And what bias do you accept?

How do I measure bias?

To determine whether to accept bias, you must first be able to measure it. Next, you need to consider whether the bias is appropriate within the context in which the algorithm is used. The political sphere raises questions about high-impact algorithms, in line with the AI Act. And rightfully so, as the outcomes of these algorithms often impact citizens' lives, affecting their sense of self-worth or their wallets.

Ignorance of the intrinsic bias within the data and a lack of mitigating measures while using algorithms can fuel discrimination. If the algorithm doesn't consider the entire population but instead relies on a subset or sample: how random was the selection process? Did each group in your population have sufficient representation in the chosen sample? The complexity lies in translating numbers and choices into meaningful implications and effects for those involved. How do you interpret the calculated bias in the examples below?

Data bias is the percentage difference between a group's actual proportion in the data and its expected proportional representation. In the case of ABC Engineering, the actual proportion of female applicants is 1 in 20, while we would expect a 1 in 2 ratio (50-50) for two groups (men and women). This results in a bias of -90%: there are 90% fewer female applicants than there would be under a 50-50 male/female ratio.
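
As a minimal sketch (the function and variable names are my own, not part of any standard library), this calculation could look as follows in Python:

```python
def data_bias(actual_share: float, expected_share: float) -> float:
    """Percentage difference between a group's actual share in the data and its expected share."""
    return (actual_share - expected_share) / expected_share * 100

# ABC Engineering: 1 in 20 applicants is female, while a proportional 1 in 2 is expected.
print(data_bias(actual_share=1 / 20, expected_share=1 / 2))  # -90.0
```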

Algorithm bias is the percentage difference between a group's actual proportion and its proportion within each predicted class. In the original ABC Engineering scenario, the algorithm bias is 0%: the algorithm selects female candidates in the same 1-in-20 proportion in which they apply. Now let's say the ABC Engineering algorithm instead suggests filling 1 in 10 of its vacancies with a female candidate, while the actual proportion of female applicants remains 1 in 20. The proportion of women in the 'fill vacancy' class is now 1 in 10, making the algorithm bias +100% between the actual 1 in 20 and the suggested 1 in 10: ABC Engineering's algorithm selects female candidates twice as often as they are available among the market's talented women.
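
Again as a minimal sketch with self-chosen names, the same idea in Python:

```python
def algorithm_bias(actual_share: float, predicted_share: float) -> float:
    """Percentage difference between a group's actual share and its share in the predicted class."""
    return (predicted_share - actual_share) / actual_share * 100

# Original scenario: women are selected in the same 1-in-20 proportion in which they apply.
print(algorithm_bias(actual_share=1 / 20, predicted_share=1 / 20))  # 0.0

# Alternative scenario: the algorithm suggests 1 in 10 vacancies for a female candidate.
print(algorithm_bias(actual_share=1 / 20, predicted_share=1 / 10))  # 100.0
```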

In these examples, we're dealing with a clearly measurable bias. If you want to check whether the algorithm bias generated by your algorithm indicates a significant difference, you can make a good estimate using some statistics, such as Student's t-test. Look up the formula or use the corresponding function in your favorite programming language.
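
A minimal sketch of such a significance check, assuming you have binary group-membership indicators for the applicants and for the candidates the algorithm suggests (the sample sizes below are illustrative, and a two-proportion z-test would be a common alternative to the t-test):

```python
import numpy as np
from scipy import stats

# Illustrative samples: 1 = female, 0 = male.
# 400 applicants with a 1-in-20 share of women; 40 suggested hires with a 1-in-10 share.
applicants = np.array([1] * 20 + [0] * 380)
suggested = np.array([1] * 4 + [0] * 36)

# Independent-samples t-test on the group indicators: does the share of women among
# the algorithm's suggestions differ significantly from the share among applicants?
result = stats.ttest_ind(suggested, applicants, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```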

Although it is technically possible to measure bias, not all personal data, especially sensitive personal data, are available within an organization or allowed to be used for this purpose. In their paper, van Bekkum and Zuiderveen Borgesius[2] examined whether the GDPR needs a new exception to the ban on using special categories of data, so that organizations can mitigate discrimination by artificial intelligence.

Organizations are not guaranteed permission to collect special-category data and use it for the purpose of measuring bias. Yet without knowing what bias is in the data and what bias the AI system produces, organizations cannot assess or debias their algorithms.

In the end, how to balance the different laws and regulations is a political decision. Technical solutions, like multi-party computation, might offer a way to reconcile the necessity of bias testing with the right to privacy.

What impact does bias have within the context of your algorithm? Is this in line with what your organization stands for? Have you taken measures to ensure that your algorithm doesn't discriminate?

How do I limit the impact of bias?

To mitigate the impact of bias, it's crucial to first understand the bias in the data and the amplification factor of the algorithm. This can be achieved, for example, by measuring different types of bias as an integral part of quality controls during algorithm development.

Has the algorithm resulted in more bias than desired? The suggestion from ABC Engineering's algorithm to hire women for 1 out of 10 positions is only suitable if it can be qualitatively established that these candidates possess better or more fitting qualifications, experience, and competencies for the role. When algorithmic bias is undesired, you need to return to the drawing board: the input.

Below, I provide six concrete actions you can start with today to improve this input.

  1. Ensure the sensitive variable is not included in the algorithm. To limit discrimination based on gender, the algorithm should first be unable to make direct distinctions based on the variable ‘gender’ when determining candidate suitability (see the sketch after this list).
  2. Transform the data in a way that focuses on behavioral elements. This can involve using input such as obtained certificates, motivation and extracurricular activities.

    While behaviors also differ between men and women, this translation ensures that men who exhibit more 'female' behavior and women who exhibit more 'male' behavior are assessed equally by your algorithm.
  3. Create the most complete picture to limit confirmation bias. Avoid relying on variables that are merely correlated with the algorithm's outcome but lack a causal relationship with it.

    For instance, the number of ice creams sold daily at the beach indirectly correlates with the number of tickets sold for the pool. Rather than using the ice cream sales to predict pool ticket sales, focus on the direct factor: the weather.

    The recruiter at ABC Engineering indicates that they never consider extracurricular activities to determine if a candidate possesses relevant competencies to be invited for an interview. This is primarily because it has proven most effective to only schedule interviews with candidates having the appropriate educational background and/or similar work experience.

    However, by incorporating information about candidates' extracurricular activities, algorithm developers give the algorithm an opportunity to recommend candidates based on a more complete understanding of their behavioral elements. A tip: don't get lost in overly detailed information. While my grocery purchases from the past year may offer insights into who I am, even if it were allowed, should data on my grocery shopping behavior truly influence ABC Engineering's selection algorithm?
  4. Explore random outcomes that don't emerge from the algorithm to minimize further confirmation bias. In addition to the selection algorithm, ABC Engineering could periodically evaluate candidates with lower matching scores (blindly) and invite them for an interview.
  5. Discuss the necessary bias-mitigation choices within your organization and ensure they are documented. By documenting these choices, you have the chance to continuously reassess and adjust decisions based on evolving insights or views.
  6. Take the initiative to reduce bias in your population – bias in the data. ABC Engineering has opted for an algorithm that doesn't intrinsically discriminate. Additionally, the organization can play a role in reducing the overrepresentation of men among applicants in the future, for example, by motivating young women to pursue the necessary educational credentials and encouraging career changes.

    And what about educational institutions that use their selection algorithms, like those O’Neil described, not to determine whom to admit, but to find ways to support young individuals in increasing their chances of obtaining a diploma, selecting candidates based on their actual potential?
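
For actions 1 and 2, here is a minimal sketch in Python (the column names and example data are hypothetical, and pandas is only one possible tool):

```python
import pandas as pd

# Hypothetical candidate data; column names are illustrative only.
candidates = pd.DataFrame({
    "gender": ["F", "M", "M"],
    "certificates": [3, 2, 5],
    "extracurricular_activities": [2, 0, 1],
    "years_experience": [4, 6, 3],
})

# Action 1: drop the sensitive variable so the model cannot distinguish on it directly.
# Action 2: keep behavior-oriented inputs such as certificates and extracurricular activities.
features = candidates.drop(columns=["gender"])
print(features.columns.tolist())  # ['certificates', 'extracurricular_activities', 'years_experience']
```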

Now that you know what bias is, why it is a risk in AI, and how to take responsibility for overcoming undesired bias, I wish you all the best in developing, using, and/or deploying your sensible-by-design AI systems!


Want to know more?

Read further on what we do in the field of Responsible AI.

References

[1] Book: Weapons of Math Destruction by Cathy O'Neil

[2] Using sensitive data to prevent discrimination by artificial intelligence: Does the GDPR need a new exception? - Marvin van Bekkum, Frederik Zuiderveen Borgesius; Computer Law & Security Review 48 (2023) 105770
