Recent research has revealed the persistence of racial and gender bias in advanced artificial intelligence (AI) systems such as ChatGPT. According to a study reported exclusively by MIT Technology Review, large language models (LLMs) exhibit patterns of discrimination when generating responses conditioned on the user's name. These biases, present in approximately 0.1% of interactions, reflect a wider systemic problem: the reliance on biased data during AI training.
The study points out that biases in AI systems such as ChatGPT are largely rooted in the data used to train them. During training, language models ingest millions of texts, including books, websites and social media posts. The tendencies embedded in those texts are internalized by the models, and although significant efforts have been made to filter out inappropriate content, completely removing these subtle tendencies remains a complex challenge.
In particular, names with specific ethnic connotations have been observed to elicit less accurate or more stereotype-laden responses. For example, names associated with certain ethnicities tend to trigger more negative or less detailed responses than names perceived as culturally “neutral” or belonging to majority groups. Similarly, names read as feminine have been found to elicit a different, often condescending, tone compared to names read as masculine.
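As a rough illustration of how such name-conditioned differences can be probed, the sketch below sends the same request with only the user's name varied and collects the responses for comparison. It assumes the openai Python client (v1.x interface); the prompt template, name list and model name are illustrative placeholders, not the methodology of the study.

```python
# Minimal sketch of a name-substitution probe: the identical prompt is sent
# with only the user's name swapped, and the responses are stored for later
# comparison. Names, template and model are illustrative assumptions.
from openai import OpenAI  # assumes the openai Python package, v1.x interface

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_TEMPLATE = "My name is {name}. Can you suggest some career paths for me?"
NAMES = ["Emily", "Lakisha", "Jamal", "Greg"]  # example names, not the study's list

def collect_responses(names, template, model="gpt-4o-mini"):
    """Send the same prompt for each name and return the responses."""
    responses = {}
    for name in names:
        completion = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": template.format(name=name)}],
            temperature=0,  # reduce sampling noise so differences are easier to attribute
        )
        responses[name] = completion.choices[0].message.content
    return responses

if __name__ == "__main__":
    for name, reply in collect_responses(NAMES, PROMPT_TEMPLATE).items():
        print(f"--- {name} ---\n{reply}\n")
```

In practice, differences between the collected responses would then have to be quantified over many prompts and names before drawing any conclusion, since individual completions vary.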
Bias in AI
The presence of bias in AI systems has profound implications at both the social and technological level. The use of language models that embed discriminatory biases not only perpetuates existing inequalities but can also amplify them. This is particularly worrying given the growing use of AI in automated decision-making processes such as recruitment, credit assessment and the provision of public services.
In a commercial context, where companies rely on models such as ChatGPT to interact with users, racial and gender biases can lead to unequal experiences. In recruitment, for example, a system that discriminates on the basis of a candidate's name could limit that person's job opportunities. Similarly, companies that use AI for customer support may be delivering unequal quality of service based on users' perceived identity, which can result in discriminatory treatment and lower customer satisfaction.
OpenAI, the developer of ChatGPT, has taken several steps to reduce the biases present in its language models. According to the MIT Technology Review article, ongoing efforts are being made to adjust training data and apply more sophisticated moderation mechanisms to identify and correct discriminatory responses. However, experts recognize that completely eradicating these biases is extremely difficult due to the complex nature of both the data and the models themselves.
A key strategy for mitigating bias has been human intervention during model fine-tuning. Through a technique known as Reinforcement Learning from Human Feedback (RLHF), developers train the model to avoid biased responses by having human evaluators rate or rank its outputs. However, this approach has its limitations: the evaluators themselves may hold unconscious biases, making absolute neutrality in the system difficult to achieve.
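To make the idea more concrete, here is a toy sketch of the preference-learning step that underlies RLHF: a small reward model is trained so that responses preferred by human evaluators score higher than rejected ones, using a pairwise (Bradley-Terry style) loss. The tiny bag-of-words encoder and random token data are stand-ins for a real language model and real evaluator data; this is a conceptual illustration, not OpenAI's production pipeline.

```python
# Toy reward-model training on pairwise human preferences (the core of RLHF's
# preference-learning step). All data here is random placeholder data.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000

class TinyRewardModel(nn.Module):
    def __init__(self, vocab_size=VOCAB_SIZE, dim=32):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # bag-of-words stand-in for an LLM encoder
        self.score = nn.Linear(dim, 1)                 # scalar reward per response

    def forward(self, token_ids):
        return self.score(self.embed(token_ids)).squeeze(-1)

def preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry style objective: push the preferred response's reward
    # above the rejected response's reward.
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy batch of tokenized (chosen, rejected) response pairs from human evaluators.
chosen = torch.randint(0, VOCAB_SIZE, (8, 20))
rejected = torch.randint(0, VOCAB_SIZE, (8, 20))

model = TinyRewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    loss = preference_loss(model(chosen), model(rejected))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# In a full RLHF pipeline, the trained reward model would then guide further
# fine-tuning of the language model, e.g. with a policy-gradient method such as PPO.
```

If the evaluators who produced the (chosen, rejected) pairs share a blind spot, the reward model learns it too, which is exactly the limitation described above.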
The problem of bias in AI is not limited to ChatGPT; it represents a systemic challenge for the technology sector as a whole. As AI systems become more widespread in society, the debate on how to ensure they are fair, equitable and non-discriminatory is intensifying. Technology companies are under increasing pressure to develop AI models that are not only accurate and efficient but also respect the principles of equality and fairness.
To move towards more inclusive AI, developers must adopt a conscious, ethical approach during model design and training. This includes diversifying development teams, improving the quality of training data and implementing independent audits that assess models for potential biases.
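As a hypothetical example of one check such an independent audit might run, the sketch below compares a crude per-group statistic (mean response length, as a proxy for level of detail) across groups of names and reports the largest gap. The group labels and responses are invented placeholders; a real audit would use richer metrics such as sentiment, refusal rates or stereotype classifiers, together with significance testing.

```python
# Minimal audit sketch: compare a simple statistic across name groups and
# surface the largest pairwise disparity. Data below is a made-up placeholder.
from statistics import mean
from itertools import combinations

def mean_response_length(responses):
    """Average word count, used here as a rough proxy for level of detail."""
    return mean(len(r.split()) for r in responses)

def audit_disparity(responses_by_group):
    """Return the per-group statistic and the pair of groups with the widest gap."""
    stats = {group: mean_response_length(rs) for group, rs in responses_by_group.items()}
    gaps = {(a, b): abs(stats[a] - stats[b]) for a, b in combinations(stats, 2)}
    worst_pair = max(gaps, key=gaps.get)
    return stats, worst_pair, gaps[worst_pair]

# Hypothetical audit data: model responses bucketed by the name group used in the prompt.
responses_by_group = {
    "group_A_names": ["You could explore engineering, medicine, law or research careers.",
                      "Here are several detailed options tailored to your interests."],
    "group_B_names": ["Maybe try something simple.",
                      "You could look for a job nearby."],
}

stats, worst_pair, gap = audit_disparity(responses_by_group)
print("Per-group statistic:", stats)
print("Largest gap:", worst_pair, gap)
```

Even a simple report like this makes disparities visible and auditable over time, which is the point of subjecting models to independent review rather than relying solely on internal testing.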