
How Multimodal Agents Design AI Experiments Iteratively?

Multimodal Agents Design
With the growing prevalence of artificial intelligence models across sectors such as healthcare, finance, education, transportation and entertainment, comprehending their underlying mechanisms is essential. This understanding facilitates the auditing of AI models for safety and biases, potentially enhancing our grasp of the principles underlying intelligence.

Consider the possibility of directly probing the human brain by manipulating each neuron to understand its role in object perception. While this experiment would be too invasive for the human brain, it is more feasible in artificial neural networks. However, like the human brain, artificial models with millions of neurons are too complex to study manually, posing significant challenges for large-scale interpretability.

To tackle this issue, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have adopted an automated approach for interpreting artificial vision models that assess various image properties. They developed 'MAIA' (Multimodal Automated Interpretability Agent), a system that automates multiple neural network interpretability tasks using a vision-language model backbone equipped with tools for experimenting on other AI systems.

This research has been released on the arXiv preprint archive.

"We aim to develop an AI researcher capable of autonomously conducting interpretability experiments. Current automated interpretability methods are limited to labeling or visualizing data in a single step. In contrast, MAIA can formulate hypotheses, design experiments to test them, and enhance its understanding through iterative analysis," explains Tamar Rott Shaham, a postdoctoral researcher in electrical engineering and computer science (EECS) at MIT's CSAIL and co-author of a new research paper.

The multimodal system leverages a pre-trained vision-language model alongside a library of interpretability tools to respond to user inquiries by conducting tailored experiments on particular models, iteratively refining its approach until it delivers a thorough answer.

The automated agent demonstrates proficiency in three critical tasks: labeling distinct elements within vision models and elucidating the visual concepts that activate them, refining image classifiers by excising irrelevant features to improve robustness in new scenarios, and identifying concealed biases in AI systems to expose potential fairness concerns in their outputs.

"A key strength of MAIA is its versatility," states Sarah Schwettmann, Ph.D., a research scientist at CSAIL and co-lead of the research. "We have demonstrated MAIA's utility in specific tasks, but since it is built on a foundation model with broad reasoning abilities, it can respond to a variety of interpretability questions from users and design experiments in real-time to investigate them."

Neural Element by Neural Element

In a sample task, a human user requests that MAIA elucidate the concepts a specific neuron within a vision model detects. To address this inquiry, MAIA initially employs a tool to retrieve 'dataset exemplars' from the ImageNet dataset that most strongly activate the neuron. For this neuron, the images depict individuals in formal attire and close-ups of their chins and necks. MAIA generates various hypotheses regarding the neuron's activity drivers: facial expressions, chins, or neckties. Subsequently, MAIA utilizes its tools to design experiments to test each hypothesis individually by creating and modifying synthetic images. In one experiment, adding a bow tie to an image of a human face enhances the neuron's response.
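To make the flow of one such experiment concrete, the following is a minimal sketch, not MAIA's actual code: a forward hook records a chosen unit's activation in a pretrained ResNet, and the unit's response to an original face image is compared against an edited copy with a bow tie added. The layer, unit index, and image paths are hypothetical placeholders.

```python
import torch
from PIL import Image
from torchvision import models

# Pretrained classifier and its matching preprocessing pipeline.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()

UNIT_INDEX = 123  # hypothetical unit (channel) under investigation
activation = {}

def record_unit(module, inputs, output):
    # Store the mean activation of the chosen channel in layer4.
    activation["value"] = output[0, UNIT_INDEX].mean().item()

model.layer4.register_forward_hook(record_unit)

def unit_response(path):
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        model(img)
    return activation["value"]

base = unit_response("face.png")           # hypothetical original image
edited = unit_response("face_bowtie.png")  # hypothetical edited image
print(f"activation change after adding a bow tie: {edited - base:+.3f}")
```

A real run of this loop would repeat such edits across many hypotheses and images; that bookkeeping is exactly what the agent automates.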

"Using this approach, we can identify the exact driver of the neuron's activity, much like performing a real scientific investigation," Says Rott Shaham.

The evaluation of MAIA's explanations of neuron behaviors is conducted through two primary approaches. First, synthetic systems with known ground-truth behaviors are utilized to validate the accuracy of MAIA's interpretations. Second, for genuine neurons in trained AI systems without ground-truth descriptions, the authors develop an automated evaluation protocol that measures how accurately MAIA's descriptions predict neuron behavior on unseen data.
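In spirit, such a predictive evaluation resembles the toy sketch below (not the paper's exact metric): a description is good if held-out images it predicts to be activating actually elicit higher responses than images it predicts to be neutral. The image IDs and activation values are invented placeholders.

```python
from statistics import mean

# Hypothetical held-out images the candidate description predicts should
# and should not activate the neuron, with placeholder measured activations.
predicted_active = {"necktie_01": 4.2, "bow_tie_02": 3.8, "formal_suit_03": 2.9}
predicted_neutral = {"beach_01": 0.4, "forest_02": 0.2, "kitchen_03": 0.7}

# A description that truly captures the neuron's behavior should yield a
# clear gap between the two groups.
gap = mean(predicted_active.values()) - mean(predicted_neutral.values())
print(f"mean activation gap for this description: {gap:.2f}")
```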

The CSAIL-led approach excelled beyond baseline methods in describing individual neurons within diverse vision models such as ResNet, CLIP, and the vision transformer DINO. Additionally, MAIA showed impressive results on a new dataset of synthetic neurons with known ground-truth descriptions. In both real and synthetic systems, the generated descriptions were often equivalent to those written by human experts.

What is the utility of describing AI system components like individual neurons?

"Comprehending and pinpointing behaviors within large AI systems is crucial for auditing these systems for safety prior to deployment---in certain experiments, we demonstrate how MAIA can identify and eliminate neurons with undesirable behaviors," states Schwettmann. "Our goal is to develop a more robust AI ecosystem where tools for understanding and monitoring AI systems evolve in tandem with system scaling, allowing us to investigate and hopefully comprehend unforeseen challenges posed by new models."

Exploring the internal structure of neural networks

The emerging discipline of interpretability is evolving into a distinct research domain as 'black box' machine learning models proliferate. How can researchers deconstruct these models to gain insights into their operations?

Current techniques for examining the inner workings of AI models are often constrained by either their scale or the precision of their explanations. Additionally, these methods are typically tailored to specific models and tasks. Consequently, researchers posed the question: How can we develop a universal system that assists users in addressing interpretability queries about AI models, integrating the flexibility of human experimentation with the scalability of automated methods?

A crucial aspect they aimed for this system to address was bias. To ascertain whether image classifiers exhibited bias against specific image subcategories, the team analyzed the final layer of the classification stream (in a system designed to categorize or label items, similar to a machine that determines if a photo depicts a dog, cat, or bird) and the probability scores of input images (the confidence levels assigned by the machine to its predictions).

To identify potential biases in image classification, MAIA was tasked with isolating a subset of images within specific categories (such as 'labrador retriever') that were prone to incorrect labeling by the system. In this instance, MAIA discovered that images of black labradors were more likely to be misclassified, indicating a bias in the model towards yellow-furred retrievers.
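The underlying computation can be illustrated with a small sketch, assuming a hypothetical list of per-image records rather than MAIA's actual tooling: group the classifier's predictions for one class by a subgroup tag and compare misclassification rates.

```python
from collections import defaultdict

# Hypothetical records: (subgroup tag, predicted label, true label).
records = [
    ("black_lab", "rottweiler", "labrador retriever"),
    ("black_lab", "labrador retriever", "labrador retriever"),
    ("yellow_lab", "labrador retriever", "labrador retriever"),
    ("yellow_lab", "labrador retriever", "labrador retriever"),
]

errors, totals = defaultdict(int), defaultdict(int)
for subgroup, predicted, true in records:
    totals[subgroup] += 1
    errors[subgroup] += int(predicted != true)

for subgroup in totals:
    print(f"{subgroup}: error rate {errors[subgroup] / totals[subgroup]:.0%}")
```

A subgroup with a markedly higher error rate, such as black labradors here, is a candidate bias for the agent to investigate further.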

Since MAIA relies on external tools for experiment design, its performance is dependent on the quality of these tools. As the quality of tools like image synthesis models improves, so will MAIA's effectiveness. MAIA also sometimes exhibits confirmation bias, incorrectly validating its initial hypotheses. To mitigate this, the researchers developed an image-to-text tool that uses a different instance of the language model to summarize experiment results. Another failure mode is overfitting to particular experiments, where the model may draw premature conclusions from limited evidence.
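One way to picture that mitigation, purely as an assumption-laden sketch and not the authors' implementation, is to have an off-the-shelf captioning model that never sees the agent's hypothesis describe each experimental image, and only afterwards compare the caption against the hypothesis. The model choice and image path below are assumptions.

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Independent captioner: it is given only the image, never the hypothesis.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base")

image = Image.open("experiment_result.png").convert("RGB")  # hypothetical image
inputs = processor(images=image, return_tensors="pt")
caption_ids = captioner.generate(**inputs, max_new_tokens=30)
print(processor.decode(caption_ids[0], skip_special_tokens=True))
```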

"I believe the logical progression for our lab is to extend our experiments beyond artificial systems and apply them to human perception," remarks Rott Shaham. "Traditionally, this type of testing necessitated the manual design and testing of stimuli, a labor-intensive process. Our agent, however, enables us to scale up to scale up this process, allowing for the simultaneous design and testing of numerous stimuli. This capability may also facilitate comparisons between human visual perception and artificial systems."

"The complexity of neural networks, with their hundreds of thousands of neurons each displaying sophisticated behavior, makes them difficult for humans to understand. MAIA facilitates this by developing AI agents that can autonomously analyze these neurons and present their findings in a clear and concise manner," explains Jacob Steinhardt, Assistant Professor at the University of California, Berkeley, who was not involved in the study. "Scaling these techniques could be vital for enhancing our comprehension and oversight of AI systems."

Further details: Tamar Rott Shaham and colleagues, 'A Multimodal Automated Interpretability Agent,' arXiv, 2024.
