OpenAI stole massive amounts of personal data to train ChatGPT?


  • Legal Action Unveiled: Lawsuit Accuses OpenAI of Unauthorized Acquisition of Personal Data from Millions of Americans for Training ChatGPT.
  • Legal Allegations Surface: Lawsuit Claims OpenAI Engaged in Web Crawling to Gather Vast Quantities of Data Without Consent.
  • Alarming Allegations: Lawsuit Accuses OpenAI of Collecting and Storing User Chat-Log Data, Even from Third-Party Platforms such as Snapchat and Spotify.
Explosive Lawsuit Claims OpenAI Acquired "Massive Amounts of Personal Data" Illegally for Training ChatGPT.

According to the proposed class-action suit, Sam Altman's company engaged in undisclosed data harvesting practices to train its large language models, allowing its chatbot to simulate human language.

In the 157-page lawsuit, filed on Wednesday in the US District Court for the Northern District of California, the plaintiffs' legal team states that the defendants bypassed established protocols for acquiring and using personal information and instead resorted to theft.

The lawsuit asserts that OpenAI conducted web crawling activities to accumulate extensive datasets, which encompassed substantial quantities of information sourced from social media platforms. According to the claims, OpenAI's proprietary AI corpus, known as WebText2, acquired significant data by scraping Reddit posts and the associated websites they referenced.

According to the lawsuit, the accessed data encompassed a wide range of sensitive information, including private conversations, medical data, details pertaining to children, and essentially every piece of data exchanged on the internet. The lawsuit alleges that this data was obtained without providing notice to the data owners or users, let alone obtaining their permission.

The lawsuit alleges that the actions described amount to the negligent and potentially illegal misappropriation of personal data belonging to millions of Americans who do not utilize AI tools.

OpenAI has yet to respond to Insider's request for comment, which was made outside of regular working hours.

The lawsuit alleges that in addition to scraping the "digital footprints" of the general public, OpenAI also retains and discloses users' personal information, including data provided during the creation of OpenAI accounts, chat logs, and social media information.

The lawsuit alleges that in addition to individuals who directly use ChatGPT, the data also includes information from users of various integrated applications such as Snapchat, Stripe, Spotify, Microsoft Teams, and Slack. As of now, the companies mentioned have not responded to Insider's request for comment.

The lawsuit requests a temporary freeze on commercial access to, and further development of, OpenAI's products until stronger regulations and safeguards are in place. The requested measures include letting individuals opt out of data collection and preventing OpenAI's products from exceeding human intelligence and causing harm. The lawsuit also seeks financial compensation for individuals whose data was used to train the chatbots.

In addition to OpenAI, the lawsuit includes Microsoft, a significant backer of the company, as a defendant.

The plaintiffs are identified only by their initials, occupations, and states of residence. Their legal representatives said this was a deliberate choice to protect them from intrusive scrutiny and potential backlash.

The popularity of generative AI, which can produce text, audio, images, and video, has surged since the release of OpenAI's ChatGPT in November 2022. People use generative AI for personal, professional, and academic work, although concerns persist about the data it accesses.

In March, Italy implemented a temporary ban on accessing ChatGPT due to privacy concerns. The ban was enacted on the basis that there was no legal justification for the "mass collection and storage of personal data" used in training the algorithms powering ChatGPT. Several companies, such as Amazon and Microsoft, have instructed their employees to refrain from entering confidential information into the chatbot. Additionally, Samsung has imposed a ban on the use of generative AI tools by its staff.

According to the lawsuit filed on Wednesday, AI platforms, while acknowledged for their significant potential to bring about positive change, also pose a "potentially catastrophic risk to humanity."

In addition to concerns regarding its potential to significantly disrupt job markets, AI has been associated with the dissemination of false information and has been exploited for malicious purposes. The creators of OpenAI have expressed the belief that AI could surpass human proficiency in most domains within the next decade, while some critics express apprehension over the technology's existential risks.

The lawsuit contends that we are currently exposed to imminent and unwarranted risks that could undermine the very foundations of our society. These risks are attributed to profit-driven, multibillion-dollar corporations.

The lawsuit alleges that dominant corporations, possessing significant and concentrated technological capabilities, have pursued the rapid deployment of AI technology with little regard for the potentially devastating consequences to humanity. This pursuit is justified in the name of 'technological advancement.'
