
Protege Raises $30M in Series A Funding Led By a16z


Protege has raised $30 million in a Series A extension round led by Andreessen Horowitz (a16z), with participation from returning investors Footwork, CRV, Bloomberg Beta, Flex Capital, and Shaper Capital. The round brings the company’s total funding to $65 million since its founding in 2024.

Protege operates an AI data platform that provides access to trusted, real-world datasets at scale. Founded in 2024 by Bobby Samuels (CEO), Travis May (Chairman, with prior experience at Datavant and LiveRamp), Richard Ho (CTO), and Engy Ziedan (Chief Scientific Officer), the company emphasizes ethical sourcing of multimodal data, including de-identified health records, medical imaging, audio recordings, and media content. It curates datasets for AI training and evaluation, partnering with data providers through licensing agreements and offering revenue-sharing models. By 2025, Protege had expanded its network to hundreds of organizations and was supporting workflows for leading AI institutions worldwide.

This latest round builds on a $25 million Series A in August 2025 led by Footwork and a $10 million seed round led by CRV. The extension underscores rapid adoption in industries facing data shortages for AI development.

a16z Partner Daisy Wolf noted that Protege’s approach respects data complexities while enabling modern AI use, highlighting a market shift toward responsible data unlocking. CEO Bobby Samuels emphasized the platform’s role in supplying curated, AI-ready data amid fragmented sources, while Chairman Travis May pointed to proprietary data as the driver of AI’s next phase.

Protege’s $30 million Series A extension round, led by Andreessen Horowitz (a16z) and announced on January 8, 2026, brings the company’s total funding to $65 million since its inception in 2024. The round builds on Protege’s prior raises, a $10 million seed round led by CRV and a $25 million Series A in August 2025 led by Footwork, and reflects investor confidence in its mission to address one of AI’s most pressing challenges: access to high-quality, real-world data.

At its core, Protege functions as a governed marketplace and data infrastructure platform that connects data holders (such as organizations in healthcare, media, audio, and motion capture) with AI developers seeking proprietary, multimodal datasets for training, fine-tuning, and evaluation. Founded by a team that includes CEO Bobby Samuels, Chairman Travis May (former CEO of Datavant and LiveRamp), CTO Richard Ho, and Chief Scientific Officer Engy Ziedan, the company had scaled its partner network to hundreds of organizations by 2025, emphasizing ethical licensing, data curation, anonymization, and revenue-sharing models that compensate providers based on usage. This approach sets Protege apart from platforms that rely on public or synthetic data: it focuses on real-world sources that capture authentic human and system behaviors across domains like video, imaging, gaming, manufacturing, life sciences, real estate, finance, and education.

The funding round arrives amid a broader industry shift in which AI progress is increasingly constrained by data availability rather than compute or model architecture. Public datasets have been largely exhausted, and the internet’s scrapable content has reached its limits, pushing developers toward fragmented, proprietary sources that are often inaccessible due to privacy, intellectual property, and operational hurdles. Protege addresses these hurdles by streamlining the discovery, filtering, and combination of datasets with built-in compliance and transparency, shortening delivery timelines from years to months in sectors like healthcare. As Travis May put it, “Access to data is the biggest bottleneck to the advancement of AI. The next phase of AI will be driven by real-world, proprietary data generated through everyday human activity.” Similarly, Bobby Samuels highlighted the imbalance between demand and supply: “We’re seeing demand for real-world data grow faster than the market’s ability to supply it responsibly.”

Investor enthusiasm, particularly from a16z, stems from Protege’s product-market fit, as evidenced by its collaborations with foundational model builders, including the majority of the Magnificent Seven tech giants. Daisy Wolf of a16z remarked, “The next era of AI will be shaped by who can responsibly unlock access to the world’s most valuable data,” underscoring the platform’s role in navigating complex data landscapes. The capital will fund specific initiatives: accelerating product features for data cleaning and formatting, broadening coverage into new verticals, deepening partnerships, and expanding the team with data scientists, engineers, and operations staff.


In the competitive landscape, Protege stands out by concentrating on data aggregation and ethical exchange, unlike broader AI tooling providers. Key competitors include Scale AI, which offers data labeling and annotation services; Snorkel AI, focused on programmatic data labeling for machine learning; and Labelbox, which provides tools for data labeling and management. These players address adjacent needs but lack Protege’s emphasis on proprietary, real-world data marketplaces with revenue sharing and compliance layers. Looking ahead, this funding positions Protege to influence AI’s evolution, potentially shaping standards for data valuation, licensing, and ethical AI development, as seen in its participation in events like CES 2026 panels on AI copyright and data valuation. Reactions from the tech community, including congratulatory notes on LinkedIn and X, indicate strong support for Protege’s vision amid predictions that media licensing for AI could reach hundreds of millions in revenue in 2026 alone.

Funding History Table

Round | Amount Raised | Lead Investor | Date | Participating Investors | Cumulative Total
Seed | $10 million | CRV | 2024 (exact date not specified) | Not detailed | $10 million
Series A | $25 million | Footwork | August 2025 | CRV, Bloomberg Beta, Flex Capital, Shaper Capital | $35 million
Series A Extension | $30 million | Andreessen Horowitz (a16z) | January 8, 2026 | Footwork, CRV, Bloomberg Beta, Flex Capital, Shaper Capital | $65 million

Key Investors Table

Investor | Role in Latest Round | Notable Background/Contributions
Andreessen Horowitz (a16z) | Lead | Focus on AI infrastructure; Partner Daisy Wolf emphasizes data’s role in AI’s future.
Footwork | Returning Participant | Led previous Series A; Co-founder Nikhil Basu Trivedi supports data-focused ventures.
CRV | Returning Participant | Led seed round; Partner Saar Gur invests in enterprise tech and AI.
Bloomberg Beta | Returning Participant | Early-stage AI and data investments aligned with media and tech.
Flex Capital | Returning Participant | Focus on scalable tech platforms.
Shaper Capital | Returning Participant | Supports innovative data and AI startups.

Competitors Comparison Table

 

Company | Primary Focus | Key Differentiation from Protege | Funding/Scale Highlights
Scale AI | Data labeling and annotation for AI models | Broader tooling including human-in-the-loop labeling; less emphasis on proprietary data marketplaces. | Raised over $1 billion; serves major AI firms.
Snorkel AI | Programmatic data labeling and management | Focuses on weak supervision and custom labeling pipelines; not centered on real-world data aggregation. | $135 million in funding; enterprise-oriented.
Labelbox | Data labeling platform with collaboration tools | Emphasizes workflow for labeling teams; lacks revenue sharing for data providers. | $188 million raised; integrates with ML frameworks.

This round not only validates Protege’s model but also signals a maturing AI data ecosystem in which ethical, scalable access could unlock trillions in economic value, though ongoing debates around data privacy and ownership will shape its long-term impact.
