Data broker
A data broker is an individual or company that specializes in collecting personal data (such as income, ethnicity, political beliefs, or geolocation data) or data about people, mostly from public records but sometimes sourced privately, and selling or licensing such information to third parties for a variety of uses. Sources, usually Internet-based since the 1990s, may include census and electoral roll records, social networking sites, court reports and purchase histories. The information from data brokers may be used in background checks for employment and housing.
Regulations limiting the collection of information on individuals vary around the world. In the United States there is no federal regulation protecting consumers from data brokers, although some states have begun enacting their own laws. In the European Union, the GDPR regulates data brokers' operations. Some data brokers report holding data on large numbers of people or large numbers of "data attributes". Acxiom purports to have data on 2.5 billion people.
Overview
Information broker is sometimes abbreviated to IB, and other terms used for information brokers include data brokers, independent information specialists,[1] information or data agents,[2] data providers, data suppliers,[3] information resellers, data vendors,[4] syndicated data brokers, or information product companies.[5] Information consultants, freelance librarians, and information specialists are also sometimes termed information brokers.[6][7]
Credit scores were first used in the 1950s,[8] and information brokering emerged as a career for individuals during that decade.[1] However the business of information brokering did not become widely known or specifically regulated until the 1990s.[8] During the 1970s, "information brokers" often had a library science degree; however, towards the end of the 20th century, people with degrees in science, law, business, medicine, or other disciplines entered the profession, and the line between the terms information professional and information broker became more blurred.[9] In 1977, Kelly Warnken published the first fee-based information directory, followed by the Journal of Fee-Based Information Services in 1979[9] and the book The Information Brokers: How to Start and Operate Your Own Fee-based Service in 1981.[10]
Beginning in the late twentieth century, technological developments such as the growth of the Internet, increasing computer processing power, and declining costs of data storage made it much easier for companies to collect, analyze, store and transfer large amounts of data about individuals. This gave rise to the information broker or data broker industry.[11] As of 2021, there is no required academic qualification for the job of information broker; some people may have a bachelor's degree in business or marketing,[12] while others may have a background in library science, or may have worked for a database provider.[1]
Services
Information brokering has been described as the "business of buying and selling information as a commodity".[13] Information brokers have been defined by the (US) Federal Trade Commission as "companies that collect information, including personal information about consumers, from a wide variety of sources for the purpose of reselling such information to their customers for various purposes, including verifying an individual's identity, differentiating records, marketing products, and preventing financial fraud".[4] Gartner defines an information broker as "a business that aggregates information from a variety of sources; processes it to enrich, cleanse or analyze it; and licenses it to other organizations". It states that data is "licensed for particular or limited uses" rather than sold to a client.[5]
Information brokers (IBs) collect and collate data concerning myriad topics, ranging from the daily communications of an individual to more specialized data such as product registrations,[14] patents and copyright data,[15] mostly from publicly available sources, usually obtained from online databases. They may also provide various other services, such as analysing the data and writing reports on them; creating databases for clients; or updating clients whenever new information on a specific topic or person becomes available. Clients use data brokers to save themselves time and money, as the brokers are trained in the skills needed to retrieve such information effectively and efficiently.[1] Information brokers are secondary researchers, who find information on a variety of subjects, including companies (often competitors[2]), markets, people, and products. Their role includes analysis and synthesis of the data they find.[16] Brokers may find everything else they can about an individual on the Internet, and aggregate that data with information from a variety of other sources.[3]
Information brokers sometimes specialise in a specific area, such as market research, statistics, or scientific data.[2]
Clients of information brokers come from a wide range of industries and professions, including manufacturing, financial institutions, political parties, government agencies and historians.[17] Non-profit organizations might benefit from information which helps them to apply for grant funding, and real estate agents often use IBs to undertake land title searches.[2][18] Advertising, fraud detection and risk mitigation are three common reasons for using data brokers,[3] and these are the three broad categories defined by the Federal Trade Commission.[4] Information brokers need to screen their clients carefully to avoid criminals obtaining data on individuals for nefarious purposes: US broking companies Lexis-Nexis and ChoicePoint have both been duped by phoney clients, leading in one case to identity theft on a large scale.[4]
Data may be harvested from various sources, including census, change of address, motor vehicle-related records, user-contributed material and social networking sites,[19] media and court reports, voter registration lists, consumer purchase histories, most-wanted lists and terrorist watch lists, bank card transaction records, health care authorities, and Web browsing histories.[8] IBs may also purchase information from other companies (such as a credit card company).[3] The information collected may include name, address, social security number, driver's licence number and other such identifying information, as well as occupation, property ownership, income, etc. Advertising companies are most often only interested in profiles and categories rather than personal information about an individual.[3]
Information from property records, tax filings, etc. may also be available via "people-search" whitepage sites, either for a small fee or no cost. These websites can thereby have implications for stalking, harassment, and domestic violence.[20]
The data are aggregated to create individual profiles, often made up of thousands of pieces of information, such as a person's age, race, gender, height, weight, marital status, religious affiliation, political affiliation, occupation, household income, net worth, home ownership status, investment habits, product preferences and health-related interests.[21] Brokers then sell the profiles to other organizations that use them mainly to target advertising and marketing towards specific groups,[17] or to verify a person's identity, including for purposes of fraud detection, and to individuals and organizations so they can research people for various reasons.[21] Some datasets also include geolocation data, which features in marketing resources from Acxiom. Experian and Oracle also advertise location-based marketing services.[20]
Many brokers work independently, while others are employees of large companies such as LexisNexis or ProQuest.[17]
In the United States
Data brokers in the United States include Acxiom, Experian, Epsilon, CoreLogic, Datalogix, Intelius, PeekYou, Exactis, and Recorded Future.[21][22] In 2012, Acxiom claimed to have files on about 500 million active consumers worldwide, with about 1,500 data points per person,[23] and in 2023 Acxiom (renamed LiveRamp) claimed to have files on 2.5 billion people and over 3,000 data points per person.[24][25] The company Oracle has publicly noted it has connections with 80 data broker companies. The US Department of Homeland Security has purchased cell phone location data and home utility data from data brokers to facilitate deportations. The Federal Bureau of Investigation (FBI) has purchased personal data from the company Venntel. In both cases, a warrant is not required to acquire this data, because it is "open source" or "commercially obtained".[20] Use of the data also has implications for background checks (used in rent/housing and job applications).[26]
In 2012, Spokeo, a people search website, settled with the US Federal Trade Commission for $800,000 over violations of the Fair Credit Reporting Act.[27]
In 2017, Cambridge Analytica claimed that it had psychological profiles of 220 million United States citizens, based on 5,000 separate data sets,[28] with another source reporting 230 million.[29] A scandal emerged after it was found that, although only 270,000 Facebook users had consented to sharing their data, data had been scraped from about 50 million profiles on the social media platform. This was seen as a breach of trust by Facebook.[30]
In 2018, American companies spent $19 billion acquiring and analyzing consumer data, according to the Interactive Advertising Bureau.[27]
In 2021, The Pillar outed a Catholic priest by purchasing data from a data broker including data usage from Grindr.[20]
Privacy issues and regulation
Information privacy laws are not as strict in the United States as in the European Union, where data brokers work hard to get around the General Data Protection Regulation (GDPR), which came into operation in 2018. Under GDPR, data can only be collected for re-use on one of six legal bases. The rather vague term "legitimate interest" is often abused or misinterpreted.[3] Explicit consent from users is required for information storage. In addition, processing of data relating to political opinion and religious belief is prohibited unless the data subject consents.[31]
In the US, individuals generally cannot find out what data a broker holds on them, how a broker got it, or how it is used.[32] There is no federal law that permits or enables consumers to see, make corrections to, or opt out of data compiled by brokers.[4]
Files on individuals are generally sold in lists; examples cited in testimony to the U.S. Congress include lists of rape survivors, seniors with dementia, financially vulnerable people, people with HIV, police officers (by home address),[8][22] alcoholics, and people with erectile dysfunction.[3][33]
Calls for regulation in the US
A 2007 University of California study, after requesting and analyzing information-sharing practices at 86 companies, found many operating under an opt-out model that it described as inconsistent with consumer expectations, and recommended that the California state legislature require companies to disclose their information-sharing policies using clear, unambiguous language, and consider creating a centralized, user-friendly method for consumers to opt out of information-sharing.[34]
The proposed US Data Accountability and Trust Act (introduced in 2009)[35] contained a number of requirements for auditing and verification of accuracy of data held by information brokers, and additional measures in the case of a security breach. The bill also gave identified individuals the means and opportunity to review and correct the data held that related to them. It passed through the United States House of Representatives in the 111th United States Congress, but failed to pass the United States Senate. It was revived by the 112th United States Congress in 2011 as H.R. 1707,[36] but died after being referred to committee. The bill was first introduced by Rep. Bobby Rush [D-IL1] on 30 April 2009 as H.R. 2221.[37]
In 2009, the U.S. Federal Trade Commission had recommended the United States Congress develop legislation enabling consumers to see the information that data brokers hold about them, a recommendation it renewed in subsequent reports in 2012 and 2014. In 2013, the U.S. Government Accountability Office also called for Congress to consider legislation.[21][38]
In October 2019, California Governor Gavin Newsom signed AB 1202 into law. This bill "would require data brokers to register with, and provide certain information to, the Attorney General. The bill would define a data broker as a business that knowingly collects and sells to third parties the personal information of a consumer with whom the business does not have a direct relationship, subject to specified exceptions".[39] The law was created to remove the "cloak of invisibility" under which unregistered, unregulated and untracked data brokers had previously operated. It was also meant to regulate purchases of data by commercial third-party buyers and to track data brokers' information trades.[40]
Amid interest in federal regulation, data broker firms spent $29 million on lobbying in 2020.[26]
In 2025, Zhou Shuai and Yin Kecheng, two China-based data brokers wanted by the FBI,[41][42] were accused by the United States Department of State of selling sensitive American data to China’s Ministry of Public Security (MPS).[43]
Criticisms, consumer rights and breaches
A United States Senate Committee in 2013 published A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes.[22] It states that "Today, a wide range of companies known as 'data brokers' collect and maintain data on hundreds of millions of consumers, which they analyze, package, and sell generally without consumer permission or input." Its main findings were that:
- Data brokers collect a huge volume of detailed information on hundreds of millions of consumers.
- Data brokers sell products that identify financially vulnerable consumers.
- Data broker products provide information about consumer offline behavior to tailor online outreach by marketers.
- Data brokers operate behind a veil of secrecy.
The information produced by data brokers has been criticized for enabling discrimination in pricing, services and opportunities. For example, a May 2014 White House report found that web searches that included black-seeming first names such as Jermaine were more likely to result in ads being displayed that include the word "arrest," compared with web searches including white-seeming first names such as Geoffrey.[11]
An Online Information Broker FAQ[44] is published by Privacy Rights Clearinghouse (PRC), a nonprofit consumer organization in the United States. PRC also maintains a list of information brokers, with links to their privacy policies, terms of service, and opt-out provisions.[45]
Data brokers have also faced legal charges for security breaches due to poor data security practices.[46]
Professional associations
The Association of Independent Information Professionals (AIIP) is a professional association based in Baton Rouge, Louisiana, with members from 20 countries worldwide,[47] representing both primary and secondary researchers.[16]
Fiction
Examples of information brokers in contemporary fiction include the Shadow Broker in the video game series Mass Effect;[48] Nicholas Wayne, Rachel, Elean Duga, Gustav St. Germain, Carol, and the President of the Daily Days newspaper company in Baccano!; and Izaya Orihara in the light novel series Durarara!!.[49] A few of the characters in Neal Stephenson's novel Snow Crash find work selling data as "stringers" for the Central Intelligence Corporation. Information broker characters play a prominent role in stories published by DC Comics. The character trope is prominently used by the superhero Oracle, as well as Calculator, Proxy, Chloe Sullivan, and Felicity Smoak.
See also
- Background check
- Competitive intelligence
- Information consultant
- Information professional
- Information warfare
- List broker
- Microtargeting
- Narrowcasting
- Whitepages (company), a US people-search website
- PeekYou, a US people-search website
- Privacy laws of the United States
- Psychological warfare
- Price discrimination
- Spokeo, a US people-search website
- Amazon, an ecommerce platform
- Facebook, a social media company
References
- ^ a b c d "Information Brokers". Inc.com. 6 February 2020. Retrieved 11 March 2021.
- ^ a b c d Luenendonk, Martin (18 September 2019). "How to become an information broker". Cleverism. Retrieved 11 March 2021.
- ^ a b c d e f g "What Is a Data Broker and How Does It Work?". Clearcode. 4 February 2019. Retrieved 12 March 2021.
- ^ a b c d e "Information Brokers". The Social Engineering Framework. 4 December 2020. Retrieved 12 March 2021.
- ^ a b "Definition of Data Broker". Gartner. Retrieved 12 March 2021.
- ^ Wormell, Irene; Olesen, Annie Joan; Mikulás, Gábor (2011). "What is information consulting?". Information Consulting. pp. 1–11. doi:10.1016/B978-1-84334-662-3.50001-3. ISBN 978-1-84334-662-3.
Definitions of who or what information consultants actually are remain varied, as does the terminology to describe them, e.g. information broker, freelance librarian, service, etc. Once one has waded through the various definitions and found that no single term is totally accurate and satisfactory to indicate the type of work carried out, in most cases it is up to the individual to decide which is the most appropriate.
- ^ Broughton, Diane; Blackburn, Lissa; Vickers, Lesley (June 1991). "Information brokers and information consultants". Library Management. 12 (6): 4–16. doi:10.1108/EUM0000000000838.
- ^ a b c d "Congressional Testimony: What Information Do Data Brokers Have on Consumers?". 18 December 2013.
- ^ a b Sabroski, Suzanne J., ed. (2000). The independent information professional (PDF) (Report). AIIP.
- ^ Warnken, K. (1981). The Information Brokers: How to Start and Operate Your Own Fee-based Service. Information management series. Bowker. ISBN 978-0-8352-1287-8. Retrieved 11 March 2021.
- ^ a b "Big data: seizing opportunities, preserving values" (PDF). Executive Office of the President. May 2014. Archived from the original (PDF) on 20 January 2017. Retrieved 17 August 2014.
- ^ "Information broker job description, career as an information broker, salary, employment". StateUniversity.com. Retrieved 12 March 2021.
- ^ Bressan, Stephane; Lee, Thomas (June 1997). Information Brokering on the World Wide Web (PDF) (Report). Sloan School of Management, Massachusetts Institute of Technology.
Accepted at the WebNet 97 World Conference.
- ^ Kitchin, Rob (2014). The Data Revolution. Sage Publications Ltd. (UK).
- ^ Campana, Natalia (6 February 2020). "What does an Information Broker do?". Freelancer Blog. Retrieved 12 March 2021.
- ^ a b "Research Specialists". The Association of Independent Information Professionals (AIIP). Archived from the original on 14 April 2021. Retrieved 11 March 2021.
- ^ a b c "All About information brokers: what they do and how to become one". Fairygodboss. Retrieved 11 March 2021.
- ^ "Information Brokers". NSW Land Registry Services. Retrieved 12 March 2021.
- ^ Beckett, Lois (9 November 2012). "Yes, Companies Are Harvesting – and Selling – Your Facebook Profile". Retrieved 17 August 2014.
- ^ a b c d Sherman, Justin. "Data Brokers Know Where You Are—and Want to Sell That Intel". Wired. ISSN 1059-1028. Retrieved 15 April 2022.
- ^ a b c d "Data Brokers: A Call for Transparency and Accountability" (PDF). Federal Trade Commission. Government of the United States. May 2014. Retrieved 13 August 2014.
- ^ a b c "A Review of the Data Broker Industry" (PDF). educationnewyork.com. 18 December 2013. Archived from the original (PDF) on 15 March 2018. Retrieved 19 August 2025.
- ^ Singer, Natasha (16 June 2012). "Acxiom, the Quiet Giant of Consumer Database Marketing". The New York Times. Retrieved 22 March 2019.
- ^ "Acxiom sales brochure" (PDF). Acxiom. Retrieved 3 August 2023.
- ^ "Acxiom Launches Marketplace to Drive Smarter Campaigns". Acxiom. Retrieved 3 August 2023.
- ^ a b Sherman, Justin. "Data Brokers Are a Threat to Democracy". Wired. ISSN 1059-1028. Retrieved 25 January 2022.
- ^ a b Matsakis, Louise. "The Wired Guide to Your Personal Data (and Who Is Using It)". Wired. ISSN 1059-1028. Retrieved 15 April 2022.
- ^ Govind Krishnan V. (3 June 2017). "Aadhaar in the hand of spies Big Data, global surveillance state and the identity project". Fountain Ink Magazine. Archived from the original on 26 August 2017. Retrieved 27 August 2017.
- ^ Cadwalladr, Carole (18 March 2018). "'I made Steve Bannon's psychological warfare tool': meet the data war whistleblower". The Guardian. Retrieved 22 March 2019.
- ^ Glum, Julia (22 March 2018). "Was Your Facebook Data Actually 'Breached'? Depends On Who You Ask". Money. Archived from the original on 28 February 2021. Retrieved 12 March 2021.
- ^ "Answer to Question No E-000054/19". www.europarl.europa.eu. Retrieved 1 December 2020.
- ^ "Online Information Broker FAQ". Privacy Rights Clearinghouse. privacyrights.org. 4 October 2010. Archived from the original on 19 August 2016. Retrieved 6 May 2014.
- ^ Hill, Kashmir (19 December 2013). "Data broker was selling lists of rape victims, alcoholics, and 'erectile dysfunction sufferers'". Forbes. Retrieved 12 March 2021.
- ^ Hoofnagle, Chris Jay; King, Jennifer (17 December 2007). "Consumer Information Sharing: Where the Sun Still Don't Shine". (Working paper). SSRN 1137990.
- ^ "Data Accountability and Trust Act: Federal Breach Notification, Data Security Policies and File Access Addressed". Privacy Compliance & Data Security. 7 May 2009. Retrieved 12 March 2021.
- ^ Data Accountability and Trust Act (2011; 112th Congress H.R. 1707). GovTrack.us. Retrieved on 12 November 2013.
- ^ "Data Accountability and Trust Act (2009; 111th Congress H.R. 2221)". govtrack.us. Retrieved 12 November 2013.
- ^ "FTC Recommends Congress Require the Data Broker Industry to be More Transparent and Give Consumers Greater Control Over Their Personal Information". Federal Trade Commission. 27 May 2014. Retrieved 31 May 2014.
- ^ "Assembly Bill No. 1202". 2019.
- ^ Lazarus, David (5 November 2019). "Column: Shadowy data brokers make the most of their invisibility cloak". Los Angeles Times.
- ^ "Zhou Shuai".
- ^ "Yin Kecheng".
- ^ "Sanctions on China-Based Hacker and Data Broker".
- ^ "Online Information Broker FAQ". Archived from the original on 19 August 2016. Retrieved 20 April 2012.
- ^ "Privacy Rights Clearinghouse". Archived from the original on 11 September 2016. Retrieved 20 April 2012.
- ^ "Agency Announces Settlement of Separate Actions Against Retailer TJX, and Data Brokers Reed Elsevier and Seisint for Failing to Provide Adequate Security for Consumers Data". 27 March 2008.
- ^ "Contact Us". The Association of Independent Information Professionals (AIIP). Retrieved 11 March 2021.
- ^ Finley, Brittni (20 May 2021). "Mass Effect 1: Should You Give Cerberus Info to Shadow Broker". Game Rant. Retrieved 2 May 2025.
- ^ Hayward, Adam (17 November 2021). "Durarara!! The 10 Best Characters". ScreenRant. Retrieved 2 May 2025.
Further reading
- Beckett, Lois (13 June 2014). "Everything we know about what data brokers know about you". ProPublica.
External links
- Association of Independent Information Professionals "AIIP members are owners of diverse, information-centric businesses located around the world."
Data broker
Definition and Scope
Core Activities and Functions
Data brokers engage in the systematic collection of personal information from multiple sources, including public records such as property deeds and voter registrations, commercial data from purchases and loyalty programs, online tracking via cookies and device identifiers, and third-party providers like financial institutions or data aggregators.[2][11] This raw data encompasses identifiers like names, addresses, and phone numbers, alongside behavioral indicators such as browsing history and transaction records.[2][12] Once collected, data brokers aggregate disparate datasets to construct detailed individual profiles, often linking identifiers across sources to infer attributes not directly observed, such as income levels or lifestyle preferences.[2][13] These profiles integrate hundreds of data points per consumer, enabling the creation of segmented databases for specific use cases.[2] Aggregation relies on matching algorithms to resolve duplicates and enhance accuracy, though errors in linkage can occur due to common names or outdated information.[14] Analysis follows aggregation, where brokers apply statistical models and machine learning to derive actionable insights, including predictive scoring for behaviors like purchase likelihood or default risk.[2][14] This processing generates derived variables, such as demographic categories (e.g., age, ethnicity inferred from surnames) or propensity scores for interests like travel or health conditions.[2] Services derived from this include risk assessment tools for insurers evaluating underwriting, targeted advertising datasets for marketers selecting audiences, and demographic profiling for audience segmentation in campaigns.[2][15] The resulting products are sold to clients such as marketing firms, financial institutions, insurers, and government entities seeking operational efficiencies through data-driven decisions.[2][16] Sales occur via APIs, bulk datasets, or customized reports, monetizing the value extracted 
from aggregated insights rather than raw inputs alone.[11] At scale, the industry processes datasets covering hundreds of millions of individuals, with some brokers maintaining trillions of data points to support real-time querying and efficient information exchange in commercial markets.[2][17]
Distinctions from Related Industries
Data brokers differ from consumer reporting agencies (CRAs), which are primarily regulated under the Fair Credit Reporting Act (FCRA) of 1970 for furnishing information used in credit, employment, insurance, or other eligibility decisions, requiring verifiable accuracy, consumer dispute rights, and permissible purpose restrictions.[18] In contrast, data brokers aggregate and sell broader consumer profiles for purposes such as marketing, advertising, and risk assessment beyond FCRA-defined uses, often without equivalent consumer protections or notice, leading to historical arguments that they do not produce "consumer reports."[19] This distinction persisted until regulatory scrutiny intensified, though data brokers maintain specialization in non-credit data commoditization rather than CRA-style verification for eligibility.[20] Unlike major technology platforms such as Google and Meta (formerly Facebook), which primarily collect first-party data directly from users through interactions on their owned services—like searches, ads, and social feeds—for internal advertising and personalization, data brokers focus on acquiring, aggregating, and reselling third-party data without establishing direct consumer relationships or interfaces.[21] Tech platforms leverage proprietary ecosystems for data generation and retention, often under user agreements implying consent, whereas data brokers operate as intermediaries compiling disparate sources into marketable profiles sold to diverse buyers, emphasizing scale over platform-specific engagement.[22] Data brokers also diverge from data analytics firms, which typically provide customized processing, modeling, or insights derived from client-supplied datasets rather than standardized, off-the-shelf data products.[23] While analytics firms emphasize bespoke services like predictive modeling for specific business needs, data brokers prioritize the commoditized aggregation and direct sale of raw or derived consumer data profiles, 
enabling broad market access without tailored analysis.[24] Hybrid entities exist where firms blend brokerage with analytics, but data brokers' core niche remains third-party data intermediation detached from end-user services or custom consulting.[25]
Historical Development
Early Origins in Credit and Consumer Reporting
The practice of systematic credit reporting originated in the early 19th century with commercial agencies focused on business creditworthiness, such as the Mercantile Agency founded in 1841, which collected data on merchants to mitigate risks in trade transactions.[26] Consumer-oriented reporting emerged later, particularly after the American Civil War, as retail credit expanded; agencies began compiling personal financial histories, including subjective assessments of character, to inform lending decisions by retailers and insurers.[27] A pivotal early example was the Retail Credit Company, established in 1899 in Atlanta, Georgia, which initially provided localized assessments of individuals' credit reliability for merchants and later evolved into Equifax.[28] These manual operations relied on networks of investigators and paper records, laying the groundwork for data aggregation practices that extended beyond pure credit evaluation to include rudimentary consumer profiles for risk assessment.[29] The post-World War II economic expansion amplified demand for such reporting, as surging consumer spending—fueled by rising incomes, suburbanization, and installment buying—led to widespread use of credit for automobiles, appliances, and housing, necessitating centralized data to evaluate borrowers' repayment capacity.[30] Consumer credit outstanding reached record levels by the late 1940s, exceeding $11 billion by September 1949, prompting credit bureaus to consolidate fragmented local records into more comprehensive national repositories to support the lending boom.[31] This era marked the shift toward viewing aggregated personal data as a commodity for financial institutions, with bureaus like early predecessors of TransUnion and Experian emerging to handle the volume of inquiries from banks and retailers.[32] The Fair Credit Reporting Act (FCRA) of 1970 formalized these practices by regulating consumer reporting agencies, requiring accuracy, fairness, and privacy 
protections in data handling to address inaccuracies and misuse in manual files.[33] This legislation spurred standardization amid growing scrutiny, as it mandated verification processes and consumer access rights, influencing bureaus to professionalize operations.[34] Concurrently, technological advancements transitioned records from paper ledgers to computerized databases by the 1970s, enabling faster aggregation and reducing errors; by the decade's end, major agencies had digitized vast datasets, paving the way for scalable reporting in the 1980s.[35] This digitization concentrated the industry into a few dominant players, enhancing efficiency for credit and early consumer marketing applications without yet incorporating internet-scale data flows.[29]
Expansion in the Digital Era
The data broker industry experienced significant expansion in the 1990s and 2000s, driven by the internet's proliferation, which enabled the collection of digital behavioral data through online tracking technologies such as cookies and web logs.[36] This period coincided with the dot-com boom, where rapid investments in internet infrastructure from 1995 to 2000 increased online user activity, generating traceable consumer interactions that brokers could aggregate from public and commercial sources.[37] E-commerce platforms, emerging in the mid-1990s and scaling post-2000 with improved broadband access, supplied transactional records including purchase histories and browsing patterns, causally linking platform growth to brokers' access to granular, real-time datasets.[38]
By the early 2000s, established brokers digitized legacy operations to handle surging volumes; for instance, Acxiom, operational since 1969, shifted toward digital processing around 2000, capitalizing on enhanced computing capabilities to integrate internet-sourced data with traditional records.[36] The number of online-operating brokers proliferated as internet users grew from approximately 248 million globally in 2000 to over 1 billion by 2005, providing exponential inputs for profiling.[39][40]
The 2010s marked further acceleration, as mobile devices and apps (with smartphone adoption rising from 35% of U.S. adults in 2011 to 81% by 2019) yielded location, usage, and sensor data streams for brokers to acquire via partnerships and APIs.[41] Internet of Things (IoT) deployments, expanding from fewer than 10 billion connected devices in 2010 to over 20 billion by 2019, contributed real-time environmental and behavioral metrics, broadening data diversity.[42] Social media platforms' APIs facilitated extraction of interaction graphs and preferences, while AI advancements in predictive modeling, enabled by scalable cloud computing, allowed brokers to derive probabilistic insights from petabyte-scale aggregations, enhancing commercial utility.[43][44]
A pivotal milestone occurred in 2018 with the Cambridge Analytica revelations, where data harvested from up to 87 million Facebook profiles via app integrations demonstrated brokers' role in scaling psychological profiling for targeted applications, spurring refinements in sourcing transparency amid heightened ecosystem interconnectedness.[45] This event highlighted causal dependencies on platform APIs but did not halt growth, as brokers adapted by diversifying inputs beyond single networks.[46]
Business Model and Operations
Data Acquisition and Sources
Data brokers acquire consumer information primarily through a combination of public records, commercial transactions, and digital tracking mechanisms, ensuring compliance with applicable laws governing access to such data. Public sources form a foundational input, including government-maintained records such as property deeds, voter registrations, court documents, and business filings, which are accessible via statutory provisions allowing public inspection without individual consent.[2][47] These records provide demographic details like addresses, marital status, and legal histories, often aggregated through automated scraping or licensed feeds from official repositories.[2]
Commercial sources contribute transactional and behavioral data derived from voluntary consumer interactions, such as loyalty programs offered by retailers, where participants exchange personal details for discounts or rewards, and product warranty registrations that include purchase histories and contact information.[2][11] Financial institutions and catalog companies also supply aggregated purchase data under data-sharing agreements, reflecting consumer spending patterns without direct broker-consumer relationships.[47] These streams emphasize opt-in mechanisms inherent to the services, where disclosure occurs via terms of participation.
Digital sources encompass online activities captured through cookies, device identifiers, and application data, often with user consent embedded in privacy policies or terms of service for websites and apps.[2] Data brokers license feeds from third-party trackers monitoring browsing, search queries, and mobile app usage, as well as social media profiles and e-commerce transactions, yielding behavioral insights like interests and preferences.[48] Across these methods, the industry amasses billions of data elements (one broker reported 700 billion elements from 1.4 billion transactions as of 2015), drawn from diverse, legally permissible channels rather than covert means.[41][47]
Processing, Aggregation, and Analytics
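The duplicate matching described in this section, such as linking "Jane Dae" to "Jane Doe", can be illustrated with a minimal sketch. It is not any broker's actual method: it uses the similarity ratio from Python's standard-library `difflib.SequenceMatcher` as a stand-in for fuzzy matching, and the function names, sample records, and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def deduplicate(names, threshold=0.85):
    """Greedy grouping: each name joins the first group whose first member
    is at least `threshold` similar; otherwise it starts a new group.
    The threshold is a hypothetical value, not an industry standard."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical sample records with a misspelling and a formatting variant.
records = ["Jane Doe", "Jane Dae", "jane doe ", "John Smith"]
groups = deduplicate(records)
for group in groups:
    print(group)
```

Comparing names alone would over-merge distinct people in practice; a record-linkage system would typically compare several fields (for example address and birth date) before merging. The single-field comparison here only shows the thresholding idea.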
Data brokers initiate processing by cleaning and deduplicating raw datasets, employing automated matching algorithms such as fuzzy logic to identify and merge duplicate records despite variations in spelling, formatting, or incomplete entries, like linking "Jane Dae" to "Jane Doe."[2] This deduplication compares data against verified benchmarks, including internal known truths like employee birthdates, to detect and resolve inconsistencies, thereby minimizing errors inherent in manual compilation methods.[2]
Aggregation integrates data from diverse sources (commercial transactions, public records, and inter-broker exchanges) through record linkage techniques that connect identifiers across datasets to build unified profiles encompassing demographics, financial history, and behavioral indicators.[2] Enrichment enhances these aggregates by appending derived attributes, such as inferring brand loyalty from purchase patterns or recreational interests from licensing data like boating permits.[2]
Analytics apply algorithmic models to infer latent traits and behaviors, analyzing hundreds to thousands of data elements to generate predictive scores, such as likelihood of seeking chargebacks or interest in specific purchases.[2] These models, increasingly incorporating machine learning for pattern recognition, enable segmentation into categories like "Soccer Moms" (women aged 21-45 with children and recent sporting goods purchases) or "Financially Challenged" households, producing scalable outputs in the form of anonymized or de-identified profiles and audience segments that surpass the precision of traditional rule-based systems.[2][25] Real-time algorithmic reconciliation against multiple sources further boosts accuracy by resolving conflicts, such as age discrepancies, through weighted evaluations.[2]
Sales and Revenue Mechanisms
Data brokers primarily monetize through business-to-business (B2B) sales models, including subscriptions for ongoing database access, pay-per-use arrangements such as per-record queries or searches, and custom datasets tailored to client specifications.[5][47] Subscription models dominated revenue in 2024, enabling clients to access real-time, aggregated data streams for persistent analytics needs, while pay-per-use options accommodate episodic demands like targeted lookups.[5] Hybrid approaches, combining fixed subscriptions with usage-based fees via APIs, further support scalable delivery.[5]
These mechanisms target B2B sectors, with marketing and advertising comprising over 36% of the market in 2024, driven by demand for consumer profiles in targeted campaigns; financial services (BFSI) represent the largest end-use segment for risk assessment and credit modeling; and government agencies increasingly purchase datasets for operational intelligence.[49][5] Pricing structures hinge on data granularity (the depth of attributes such as demographics, behaviors, or purchase history) and exclusivity, where unique or non-redundant datasets fetch premiums over commoditized alternatives.[5][50]
Industry revenue, estimated at USD 277.97 billion in 2024, benefits from integration with advertising technology platforms, where brokers supply real-time consumer insights for programmatic bidding and personalized ad delivery, exemplified by partnerships like Acxiom's collaboration with LoopMe in June 2025.[5] Projections indicate growth to USD 294.27 billion in 2025, fueled by these ad tech synergies that enhance data liquidity and buyer efficiency.[51] By aggregating disparate information sources into verifiable packages, brokers function as intermediaries that diminish buyers' acquisition and validation costs, enabling more precise market transactions without direct sourcing.[5]
Market Landscape and Key Players
Major Companies and Their Roles
Acxiom specializes in marketing-oriented data brokering, aggregating consumer profiles from public records, purchase histories, and online behaviors to enable predictive analytics and audience segmentation for advertisers. With operations spanning over 60 countries and data on approximately 2.5 billion individuals, it supports sectors like retail and finance by delivering third-party data for personalized campaigns and omnichannel strategies.[52][53]
Experian functions as a hybrid credit bureau and data broker, leveraging its vast repository of financial and demographic data to offer solutions beyond traditional credit scoring, including marketing datasets for customer acquisition and risk modeling. It provides third-party data enriched with transactional insights to businesses in insurance, telecommunications, and e-commerce, facilitating targeted outreach while integrating alternative data sources like digital footprints.[54][55]
Oracle Data Cloud historically integrated data brokering with enterprise technology platforms, supplying aggregated consumer data for advertising targeting and analytics until its advertising division ceased operations in July 2024 amid shifting privacy regulations and market dynamics. Prior to shutdown, it focused on tech-driven sectors like digital marketing, combining behavioral data with cloud infrastructure for scalable audience insights.[56][57]
These leading entities, alongside firms like Equifax and LexisNexis, demonstrate specialization (Acxiom in consumer marketing, Experian in credit-adjacent applications, and former players like Oracle in tech ecosystems), which fosters competition through differentiated offerings in data granularity and sector-specific applications.
Post-2020 consolidations, such as strategic acquisitions enhancing dataset synergies, have enabled scale amid regulatory scrutiny, though specific deals remain selective to bolster core competencies without overextending into saturated areas.[53][55]
Industry Scale, Growth, and Economic Contributions
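The growth rates quoted in this section follow the standard compound annual growth rate (CAGR) formula, which can be used to cross-check the projections. As a worked example, take the Mordor Intelligence figures of USD 294.27 billion in 2025 and USD 419.72 billion in 2030, assuming five annual compounding periods between them. The symbols V_start, V_end, and n are notation introduced here for illustration:

```latex
\[
\mathrm{CAGR}
= \left(\frac{V_{\text{end}}}{V_{\text{start}}}\right)^{1/n} - 1
= \left(\frac{419.72}{294.27}\right)^{1/5} - 1
\approx 1.0736 - 1
= 0.0736 \quad (7.36\%)
\]
```

The result matches the reported 7.36%. Applying the same formula to the USD 277.97 billion (2024) and USD 512.45 billion (2033) estimates over nine years gives roughly 7.0%, slightly below the reported 7.3%; the reported rate may be measured over a different base period.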
The global data broker market was estimated at USD 277.97 billion in 2024.[5] Independent analyses place the figure at approximately USD 270 billion for the same year.[58] These valuations reflect the aggregation and monetization of consumer, behavioral, and transactional data across sectors including marketing, finance, and risk assessment.
Projections indicate sustained expansion, with the market anticipated to reach USD 512.45 billion by 2033 at a compound annual growth rate (CAGR) of 7.3%.[5] Mordor Intelligence forecasts a 2025 value of USD 294.27 billion, growing to USD 419.72 billion by 2030 with a CAGR of 7.36%.[51] This trajectory stems from rising demand for data-driven decision-making amid digital transformation, though growth rates vary slightly across reports due to differing methodologies in scope and regional weighting.
Economically, data brokers enhance resource allocation by supplying aggregated insights to small and medium-sized enterprises (SMEs), which lack the infrastructure for independent data acquisition, thereby lowering barriers to market entry and operational efficiency. The sector bolsters adjacent industries like digital advertising, where brokered data enables targeted allocation of expenditures exceeding hundreds of billions annually, indirectly amplifying productivity and GDP contributions through optimized consumer matching. Employment impacts include roles in data curation, analytics, and compliance, feeding into the expansion of the data science workforce, though precise job figures attributable solely to brokers remain aggregated within broader tech employment trends.[5][51]
Benefits and Innovations
Economic Efficiency and Commercial Advantages
Data brokers mitigate information asymmetries in commercial transactions by aggregating and disseminating consumer data, enabling businesses to make informed decisions without extensive independent collection efforts. This intermediary role streamlines data markets, unlocking economic value from otherwise underutilized information and fostering more efficient resource allocation across industries.[59][60]
In advertising, data brokers support targeted campaigns that reduce expenditure on ineffective outreach, allowing firms to prioritize high-engagement audiences. For instance, by providing demographic and behavioral insights, brokers help advertisers avoid broad-spectrum blasts, cutting waste in a sector where global spending exceeded $1 trillion in 2024. This precision enhances return on investment, as evidenced by improved marketing efficiency through data-driven segmentation.[61][62]
Beyond advertising, data brokers facilitate refined risk pricing in insurance by supplying aggregated datasets for actuarial analysis, permitting premiums that better reflect individual risk factors rather than population averages. Accurate underwriting enabled by such data minimizes cross-subsidization, potentially lowering costs for lower-risk policyholders while maintaining solvency for providers.[63]
In a market-oriented system, these voluntary data exchanges promote competition, as firms leverage broker services to innovate offerings and consumers benefit from opt-out mechanisms or direct data monetization opportunities.[64]
Applications in Fraud Detection and Personalization
Data brokers supply financial institutions with aggregated consumer data, including behavioral patterns, transaction histories, and identity verification details, enabling real-time fraud profiling and anomaly detection. In the banking sector, this integration allows for cross-referencing of live transactions against broker-provided risk scores, helping to flag synthetic identity fraud or account takeovers before completion. For example, institutions rely on such data to prevent unauthorized access, with the American Bankers Association noting that limiting access to data brokers would undermine banks' fraud prevention capabilities by reducing the granularity of available consumer insights.[65]
In insurance, data brokers facilitate fraud detection by aggregating public records and lifestyle data to assess claim validity, such as identifying inconsistencies in reported injuries or vehicle usage patterns. This application supports proactive interventions and contributes to reductions in fraudulent payouts. Industry-wide statistics, however, attribute fraud prevention savings in banking and insurance to advanced analytics more broadly, of which broker data is one input, against annual global fraud losses in the tens of billions.[66]
For personalization, data brokers aggregate disparate data sources to enrich customer profiles, enabling e-commerce platforms to deliver tailored recommendations, dynamic pricing, and targeted advertising based on inferred preferences and purchase histories. This enhances marketing efficacy by improving ad relevance and conversion rates, with broker-supplied datasets driving precise segmentation that supports automated personalization engines.
Analyses of the data broker market highlight that rising demand for such personalized data fuels e-commerce expansion, as businesses leverage aggregated insights to optimize customer journeys without relying solely on first-party data.[61][67]
In healthcare, anonymized data from brokers aids predictive analytics for treatment personalization, such as forecasting patient responses to therapies using population-level trends in demographics and behaviors. This allows providers to customize care plans, improving outcomes through targeted interventions, though applications remain constrained by regulatory requirements for de-identification. Studies on related analytics frameworks report a doubling of return on investment in predictive personalization efforts, underscoring the efficiency gains from broker-enabled data enrichment in resource allocation.[68]
Broader Societal and Technological Impacts
Data brokers enhance technological progress by supplying aggregated datasets that underpin advancements in artificial intelligence and machine learning, where access to diverse, large-scale data is essential for effective model training and validation. An OECD report highlights that data marketplaces and brokers function as key intermediaries, providing third-party data to support AI development ecosystems, thereby enabling broader experimentation and refinement of algorithms without requiring entities to independently amass equivalent volumes of information.[69] This role has contributed to the data broker market's projected growth, with AI-driven analytics facilitating the extraction of insights from petabyte-scale repositories, as noted in industry analyses projecting a compound annual growth rate of approximately 8% through the late 2020s.[49]
Beyond commercial applications, broker-sourced datasets bolster public goods such as epidemiological modeling, where anonymized aggregates of mobility and behavioral data aid in simulating disease propagation and informing policy responses. For example, during public health crises, commercial data intermediaries have supplied location-derived insights to researchers, complementing government datasets and enhancing predictive accuracy in real-time outbreak tracking, as demonstrated in studies on digital epidemiology leveraging big data sources.[70] This aggregation capability extends causal understanding of population dynamics, allowing for more robust causal inference in health modeling without sole reliance on resource-intensive primary surveys.
As innovation catalysts, data brokers lower entry barriers for startups in data-intensive sectors like fintech and adtech by offering purchasable datasets that circumvent the need for proprietary data moats, thereby promoting competitive dynamism and rapid prototyping.
Economic analyses of data markets underscore parallels to historical commodity trading ecosystems, where decentralized data dissemination fostered liquidity and efficient resource allocation, driving overall market thickness without prohibitive regulatory overlays. These principles similarly apply to modern data brokerage, yielding net positive externalities through voluntary exchange and price discovery.[71] Such mechanisms have empirically supported scalable innovation, as evidenced by the integration of broker data into agentic AI frameworks that unlock portable datasets for entrepreneurial applications.[72]
Risks, Criticisms, and Challenges
Privacy and Surveillance Concerns
Data brokers aggregate personal information from public records, commercial databases, and online sources to construct detailed consumer profiles, often without individuals' explicit knowledge or consent, enabling pervasive surveillance through inferred behaviors, preferences, and risks.[2] This profiling process, which includes deriving sensitive attributes such as political affiliations, health inferences, or financial vulnerabilities from disparate data points, raises concerns about unauthorized monitoring akin to commercial surveillance, as brokers sell these dossiers to marketers, insurers, and government entities for decision-making.[2] Empirical analysis of nine major brokers by the Federal Trade Commission in 2014 revealed that such practices occur largely invisibly to consumers, with limited opportunities for access or correction, amplifying risks of inaccurate or harmful characterizations.[2]
The potential for doxxing emerges when broker-sold data, including real-time location histories, court records, and social affiliations, facilitates targeted exposure of private details, as seen in cases where aggregated profiles enabled harassment or outing of individuals' personal lives, such as the 2021 identification of a priest's private activities via commercially available mobility data.[73] Discrimination risks arise from profiling's use in algorithmic assessments, where inferred traits lead to adverse outcomes like housing denials; investigations have documented instances where consumers were rejected based on broker-derived "risk scores" incorporating unverified or biased inferences, exacerbating inequalities without recourse.[74]
Consumer surveys and regulatory findings underscore awareness gaps, with the FTC noting that individuals typically remain unaware of brokers' existence and the extent of data aggregation, which hinders informed participation in data ecosystems. Consumers can submit opt-out requests to suppress their data from certain products, but these processes are often opaque, take several weeks to implement, and offer incomplete control because aggregated or matched data persists. Requests must also be repeated periodically as information re-aggregates from public sources such as voter records and real estate sites; paid services can automate submissions across brokers, though their effectiveness varies.[2]
Pro-privacy advocates contend that aggregation transforms consented or public disclosures into comprehensive surveillance tools, necessitating explicit opt-in mechanisms to mitigate harms like stalking or identity-based targeting, as broker data has been linked to enabling domestic abusers and scammers through sensitive sales.[15] In contrast, industry perspectives argue for implied consent derived from original data sources, such as public records or terms accepted during online interactions, positing that resale of non-sensitive aggregates fosters efficiency without overriding reasonable privacy expectations, though critics counter that such claims overlook the novel risks of recontextualized profiles.[14] These debates highlight tensions between data utility and individual autonomy, with empirical evidence from broker practices indicating that transparency deficits perpetuate unbalanced power dynamics in information flows.[2]
While legitimate data brokers emphasize regulatory compliance and the use of verifiable sources, a parallel illicit ecosystem thrives on dark web marketplaces and underground forums. Here, stolen or unverifiable personal data is traded anonymously, typically sourced from data breaches, phishing attacks, or other criminal methods. It is often sold in packages known as "fullz" containing full names, Social Security numbers, addresses, dates of birth, and financial details. Prices generally range from $1 to $100 per record, depending on factors such as freshness, completeness, and data quality.
This shadow market directly fuels identity theft, financial fraud, and other criminal activities. It also poses risks to the legitimate industry, as poorly vetted data from illicit sources can potentially enter mainstream aggregation chains through inadequate supplier due diligence or secondary resales.
Data Security Vulnerabilities and Breaches
Data brokers face significant security vulnerabilities stemming from unpatched software and outdated systems, which enable exploitation by cybercriminals. Unpatched vulnerabilities account for approximately 60% of cyber compromises across industries, including those handling consumer data profiles.[75] In data brokerage operations, the aggregation of vast personal datasets (often including names, addresses, Social Security numbers, and financial histories) amplifies risks when legacy infrastructure lacks timely updates, as attackers target known flaws rather than developing novel exploits.[76][77]
A prominent example is the 2017 Equifax breach, where hackers exploited a vulnerability in the Apache Struts web application framework that had been publicly disclosed in March 2017 but remained unpatched on Equifax's systems.[78] This incident compromised sensitive information of 147 million individuals, including Social Security numbers, birth dates, and addresses, leading to widespread unauthorized access.[79][80] The breach's scale was exacerbated by poor segmentation and detection mechanisms, allowing lateral movement within Equifax's network after initial entry.[81]
More recent incidents highlight persistent challenges. In 2024, a major consumer data broker suffered its largest breach due to an accidental insider action exposing back-end database passwords, potentially affecting millions of records.[82] Such events underscore how human error combined with inadequate access controls can rival technical flaws in causing exposures.
The average global cost of data breaches reached $4.88 million in 2024, with financial services firms, which overlap with data brokerage, facing costs up to $5.9 million per incident due to regulatory fines, remediation, and lost business.[83][84]
Post-breach responses have driven industry adaptations, including accelerated patching protocols and broader encryption deployment to render stolen data unusable.[85] For instance, following high-profile incidents like Equifax, affected entities implemented enhanced encryption for data at rest and in transit, alongside improved monitoring, reducing the effective impact of subsequent compromises.[81] These measures reflect causal links between vulnerabilities and outcomes, with empirical evidence showing faster containment correlating to 10% lower costs when breaches are detected within days.[83] Despite handling trillions of data points annually, reported breaches remain a fraction of total operations, indicating that targeted hardening mitigates systemic risks without eliminating them.[86]

| Breach Incident | Date | Affected Individuals | Primary Cause |
|---|---|---|---|
| Equifax | May-July 2017 | 147 million | Unpatched Apache Struts vulnerability[78] |
| Unnamed major consumer data broker | 2024 | Millions (exact figure undisclosed) | Accidental exposure of back-end database passwords[82] |