Protege Raises $30 Million Led by a16z to Unlock Access to Data for AI Development
7:00 AM on Thursday, January 8
The Associated Press
NEW YORK--(BUSINESS WIRE)--Jan 8, 2026--
Protege, an AI data platform unlocking access to trusted, real-world data at scale, today announced a $30 million Series A round led by Andreessen Horowitz (a16z). The financing expands the company’s $25 million Series A from August 2025 and brings total funding to $65 million since the company’s founding in 2024. Returning investors include Footwork, CRV, Bloomberg Beta, Flex Capital, Shaper Capital, and more.
“Across industries, we’re seeing demand for real-world data grow faster than the market’s ability to supply it responsibly,” said Bobby Samuels, CEO and co-founder of Protege. “At the same time, data is highly fragmented, and neither data holders nor AI builders are set up to operationalize it at scale. Protege serves as a trusted source of curated and AI-ready data while unlocking new revenue streams for data providers. Partnering with Andreessen Horowitz allows us to scale this model and deliver high-quality, use-case-specific data that AI research teams can trust.”
Protege supports streamlined access to real-world datasets, including private and proprietary data across multiple domains and formats, such as media content, audio recordings, de-identified health records, and medical imaging. The company aggregates data sources from trusted data providers via licensing agreements while also providing technical expertise for curating, creating, and optimizing datasets for AI. Protege works with AI companies and institutions worldwide—including the majority of the “Magnificent Seven”—to support training and evaluation workflows for next-generation AI systems.
“Access to data is the biggest bottleneck to the advancement of AI,” said Travis May, Chairman and co-founder of Protege, and previous CEO of Datavant and LiveRamp. “The next phase of AI will be driven by real-world, proprietary data generated through everyday human activity. Protege is pioneering ways to safely access this information across data sources and compensate data owners to unlock AI’s potential.”
In 2025, Protege expanded its data partner network to hundreds of organizations to provide aggregated access to new data sources and formats. Protege curates datasets from across its partner network to meet AI development needs and provides revenue share payouts to data partners with each use.
“The next era of AI will be shaped by who can responsibly unlock access to the world’s most valuable data,” said Daisy Wolf, Partner at Andreessen Horowitz. “Protege has built a platform that respects the complexity of real-world data across industries while making it usable for modern AI development. Their momentum reflects a broader shift in the market, and we’re proud to support the team as they scale this critical layer of the AI ecosystem.”
The new capital will be used to accelerate product development, significantly expand Protege’s data network into new domains and data formats, deepen partnerships with leading institutions, and scale the team and infrastructure required to deliver AI-ready and rights-protected access to real-world data.
About Protege:
Protege is an AI data platform designed to unlock real-world data at scale. By enabling high-quality, cross-vertical data networks, Protege helps AI teams overcome the most critical bottleneck in AI development and deploy more capable, reliable models across industries such as healthcare, media, audio, and beyond. Learn more at www.withprotege.ai.
View source version on businesswire.com: https://www.businesswire.com/news/home/20260108257753/en/
CONTACT: Inkhouse for Protege
KEYWORD: NEW YORK UNITED STATES NORTH AMERICA
INDUSTRY KEYWORD: DATA MANAGEMENT HEALTH TECHNOLOGY HEALTH TECHNOLOGY AUDIO/VIDEO SOFTWARE ARTIFICIAL INTELLIGENCE
SOURCE: Protege
Copyright Business Wire 2026.
PUB: 01/08/2026 07:00 AM/DISC: 01/08/2026 07:00 AM