Mozilla adviser's views on AI industry influence
In the AI sector, 'scaling up' has become a popular term as technology firms race to enhance their AI capabilities by training on ever-larger swaths of internet data.
Mozilla's Abeba Birhane, an AI expert, has long criticized the values and practices of her field, warning that its influence carries global consequences.
Her recent research indicates that the expansion of online data for training popular AI image-generation tools disproportionately produces racist outputs, particularly targeting Black men.
Birhane serves as a senior adviser on AI accountability at the Mozilla Foundation, the nonprofit organization behind the Firefox web browser. Born in Ethiopia and based in Ireland, she is also an adjunct assistant professor at Trinity College Dublin.
Her interview with The Associated Press has been edited for length and clarity.
Q: Could you share how your journey in the AI field began?
A: My academic background is in cognitive science, which doesn't typically have its own dedicated department. At my institution, cognitive science was housed within the computer science department. I found myself in a lab surrounded by machine learning researchers. While they were conducting impressive work, I noticed a lack of focus on the data itself. This struck me as both amusing and fascinating because I believe data is critical to the success of any model. I found it odd that there was little attention paid to questions like, 'What comprises my dataset?' This curiosity led me to this field, and I eventually began conducting audits of large-scale datasets.
Q: Could you elaborate on your research concerning the ethical principles underlying artificial intelligence?
A: There are diverse perspectives on the nature of machine learning. While AI practitioners often view it as a purely mathematical and neutral field, social scientists argue that, like any technology, it reflects the values of its developers. To investigate this, we systematically analyzed the hundred most influential machine learning papers to identify the field's core concerns.
Q: One of the core values identified was the emphasis on scaling up.
A: Scale is often regarded as the ultimate benchmark for success, with prominent researchers at leading firms such as DeepMind, Google and Meta asserting that increasing scale averages out noise and smooths over inconsistencies. The premise is that as datasets expand, they should stabilize, approximating a normal distribution or converging toward ground truth.
Q: Your research has investigated the negative implications of scaling up. What are some of the adverse effects identified?
A: Our research indicates that as datasets are scaled up, the prevalence of harmful content such as hate speech and toxicity grows with them. Specifically, we observed a significant increase in hateful content when a dataset was expanded from 400 million to 2 billion entries. This finding challenges the notion that scaling datasets simply balances out biases and inaccuracies. Our analysis also revealed that suspicious or criminal labels were disproportionately assigned to darker-skinned individuals, particularly women.
Q: How hopeful are you about the AI field embracing the modifications you have recommended?
A: The impact of these outputs is not confined to mathematics and engineering; they also play a crucial role in shaping societal norms. Our recommendations advocate for incorporating values like justice, fairness and privacy into AI systems. Nevertheless, I am deeply skeptical that the industry will embrace these suggestions. Its track record shows a tendency to overlook societal concerns unless compelled by legal requirements or public pressure substantial enough to damage its reputation.