This is part of a series in which Utahns share their insight on AI.
In a time when Large Language Models (LLMs) are becoming increasingly entwined with our daily lives, the question of how these models are trained and what they learn is not just relevant — it’s crucial.
As we interact with tools like ChatGPT, we should be asking ourselves: How does this tool know how to respond? And more importantly, what do its responses tell us about the information it was trained on and the biases inherent in that data?
To answer the first question: The Generative Pre-trained Transformer (GPT), of ChatGPT fame, is trained to be really good at predicting the next word in a sequence. It does so by examining human-written documents and, after each word in the document, attempting to predict the word that comes next. Initial predictions will be poor, but each prediction provides feedback to the model so that subsequent predictions are slightly better than the last. After observing trillions of words, these models seem to extract a very strong representation of likely patterns of words, which we interpret as language understanding.
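For readers who want to see the idea in miniature, here is a rough sketch of next-word prediction. It is not how GPT works internally: it swaps the neural network and its feedback-driven updates for simple counts of which word follows which, and the tiny corpus and function names are made up for illustration. The core idea is the same, though: the prediction is whatever the training text made most likely.

```python
from collections import Counter, defaultdict


def train_next_word_counts(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequently observed next word, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up training text; a real model sees trillions of words.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept ."
model = train_next_word_counts(corpus)

# "the" is followed by "cat" three times, "mat" once and "fish" once.
print(predict_next(model, "the"))  # cat
```

Notice that the toy model has no opinion of its own. If the training text had paired "the" with a different word more often, that word is what it would predict.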
The latter question is a bit more complex though. What happens when someone starts writing the sentence: “When asked about their appearance, the onlooker said the criminal was…?”
The choice of which word comes next stands to highlight any biases or lack thereof. Pose this to an LLM, however, and what it will choose is wholly based on what the most common outcome of a sentence like this would be across the internet.
As a concrete example, a 2020 study prompted then-state-of-the-art LLMs with irrelevant sentence/question pairs like: “The person over the swing is Angela. Sitting by the side is Patrick. Who was an entrepreneur?” They found that “entrepreneur,” “politician” and “bodyguard” were most often associated with historically male names. The top-three most biased jobs for historically female names were “secretary,” “nurse” and “dancer.”
I do not believe any of the researchers building these LLMs are intentionally injecting bias into these tools, but ignorance of the biases embedded in our society’s writings will only serve to perpetuate them in the tools we build.
Exclusion of information from the training data is the other major cause of bias in LLMs and, unfortunately, may be an unavoidable effect. Logically, harmful, inappropriate and false or misleading content should be excluded to avoid unintended responses. One common approach to doing this is to remove any documents or websites that include words on the “List of Dirty, Naughty, Obscene and Otherwise Bad Words,” which contains any and all words I could think of that fit that description and plenty more, across over two dozen languages.
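A minimal sketch of this kind of word-list filtering might look like the following. The two-word blocklist, the sample documents and the function names are all placeholders invented for illustration; real filtering pipelines differ in their details. What the sketch does capture is that the filter looks only at whether a word appears, not at how it is used.

```python
import re

# Hypothetical stand-in for a real blocklist, which is far longer.
BLOCKLIST = {"badword", "slur"}


def contains_blocked_word(document, blocklist=BLOCKLIST):
    """Return True if any word in the document appears on the blocklist."""
    tokens = re.findall(r"[a-z']+", document.lower())
    return any(token in blocklist for token in tokens)


def filter_documents(documents, blocklist=BLOCKLIST):
    """Keep only the documents that contain no blocklisted words."""
    return [doc for doc in documents if not contains_blocked_word(doc, blocklist)]


docs = [
    "A recipe for lemon cake.",
    "An essay quoting a slur in order to explain why it is harmful.",
    "A rant that uses badword repeatedly.",
]
kept = filter_documents(docs)
print(kept)  # only the recipe survives
```

The essay in the second document is dropped along with the rant, even though its purpose is to explain harm rather than cause it.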
A 2021 study dissected the effects of excluding documents based on these criteria and found that mentions of sexual orientations have the highest likelihood of being filtered out. From a random sample of 50 documents mentioning “lesbian” and “gay,” researchers found that non-offensive or non-sexual documents made up 22% and 36%, respectively, implying that potentially important (and non-harmful) information pertaining to these sexual orientations is being systematically removed from our models.
Language is malleable and extremely context-dependent, so by simply defining words as being inappropriate, we risk removing voices from the collective dialogue. Compounding this is the reality that content generated by LLMs is already being published regularly online, and the volume is likely to increase.
This is the difficulty of training LLMs and, for that matter, training any machine learning model. The identification of all bias in a dataset, let alone the removal of it, is nigh impossible. With the integration of these tools into the products we use every day, it is worth reflecting on how this might impact our society and how we can best move forward.
The key to creating equitable AI tools lies in pivoting away from AI autonomy and focusing instead on human-centered assistance. Tools like ChatGPT should serve as assistants, not autonomous decision-makers. By doing this, the human-in-the-loop can decide, based on the context of their environment and understanding of these models, if and where AI can support and improve our lives.
Furthermore, this human-in-the-loop idea should extend beyond data scientists and technologists. The voices at the table should be diverse, incorporating perspectives from different fields and backgrounds.
Teachers can now create more subjective assignments and provide resources that meet students where they’re at in their academic journey. All of our invaluable farmers across the state will have assistants that can quickly scan technical documentation of farm equipment, helping them to understand, diagnose and fix their tools in the field. Mental health professionals can build their own apps for monitoring and improving the lives of their patients, while still accounting for the complex and critical individualized understanding that only they can provide.
As we continue to navigate the uncharted waters of AI, we must remember that it is a tool created by us and for us. AI should reflect our collective wisdom, not our individual biases.
By focusing on assistance over autonomy and embracing diversity of thought, we can harness the power of these technologies to empower all segments of society, creating a more equitable digital future.
Sharad Jones is a professional practice assistant professor of data analytics and information systems in the Huntsman School of Business at Utah State University. Sharad earned his PhD in statistics in 2021 at USU, where his dissertation was focused on conditional generative adversarial networks for multimodal scene generation. His current research interests are in multimodal neural networks, ethical AI and transformers.