At AVP, we chose to invest in Deepgram for three reasons: they are delivering the world’s most accurate real-time Voice AI platform, human interaction with AI is shifting from text to voice, and we believe ‘Powered by Deepgram’ is to Voice AI what ‘Powered by Stripe’ and ‘Intel Inside’ are to their respective industries.
In personal computing, ‘Intel Inside’ tells you that your laptop has a capable CPU, regardless of who manufactured it. In digital commerce, ‘Powered by Stripe’ signals that you can expect secure and seamless payments, regardless of where you’re shopping. Both companies became the enabling infrastructure of their era’s platform shift. Their innovations created category-defining businesses and serve as the foundational layer other products are built on.
Deepgram is building the equivalent infrastructure layer for Voice AI. It delivers the foundation models, API platform, and infrastructure that underpin real-time, accurate, and reliable speech understanding, speech generation, and fully autonomous voice agents for over 1,300 organizations.
Voice AI Market
The enterprise Voice AI market is accelerating faster than expected: it picked up steam in late 2024 and saw a step-change in adoption and investment in 2025. Industry forecasts expect the global Voice AI agent market to grow at a ~35% CAGR over the next decade, from ~$2.4 billion in 2024 to ~$47.5 billion in 2034. Voice is also the hurdle that must be cleared to unlock true multimodal (text + voice) AI use cases, which are forecast to reach tens of billions of dollars in incremental spend over the medium term.
Text-based interactions with technology introduce friction and cannot deliver a real-time experience. Voice opens the door to a long and expanding list of use cases that aren’t possible with text alone, including:
- Ordering food from a drive-thru system that autonomously takes an order, answers questions, and syncs with the restaurant’s point-of-sale (POS) system
- Calling into a pharmacy and confirming a prescription is filled and ready for pickup without a pharmacy technician on the line
- Reducing wait times on a customer support line from hours to moments
Shifting our interactions from text to voice removes the final point of friction between humans and software, enabling real-time automation and knowledge work in real-life settings. We expect our day-to-day interactions with AI to grow and to become intuitive in ways that currently seem unthinkable, and we believe Deepgram’s platform will be a driving force that enables the industry to clear the hurdle.
Deepgram’s Position in Voice AI
Leadership in Voice AI is measured along two dimensions: delivering best-in-class foundation models (low-latency and highly accurate) and delivering an easy-to-use enterprise layer (deployment options, implementation, tooling, reliability, and security). Deepgram excels at both, offering best-in-class models and a developer experience that customers love.
Deepgram’s flagship models are industry-leading:
- Nova-3, a real-time enterprise speech-to-text model recognized for industry-leading accuracy and reliability
- Flux, the world’s first real-time Conversational Speech Recognition model built specifically for voice agents and capable of handling natural interruptions and turn-taking
- Aura-2, a professional-grade text-to-speech system focused on clarity, realism, and ultra-low latency
Deepgram’s models are packaged into products that are ready for real-life applications:
- Models can be trained on domain-specific and company-specific terminology, improving accuracy and contextualizing vocabularies for customers’ workflows
- Models can be deployed via managed cloud APIs, in a private environment (VPC), or on-premises
- Deepgram publishes a full SDK library and complete documentation, which simplify development cycles and accelerate production timelines
- Deepgram’s products are designed with compliance, security, and integration flexibility in mind
Deepgram’s excellence on both dimensions is why companies including Cloudflare, Twilio, Decagon, Sierra, and Cresta are building on Deepgram’s platform. We again see parallels to Stripe’s rise in the payments industry: a developer-first product (simple API, great docs, production reliability) created an adoption flywheel, and we believe the same pattern will define the Voice AI infrastructure layer.
What’s Next
We’re incredibly excited to lead Deepgram’s $130m Series C round, joined by several existing and new investors including Alkeon, Alumni Ventures, funds & accounts managed by BlackRock, Citi Ventures, Columbia University, In-Q-Tel, Madrona, Princeville Capital, Tiger, University of Michigan, SAP, ServiceNow Ventures, Stanford University, Twilio, Wing, and Y Combinator.
Scott and the entire Deepgram team have proven that their unique vision for the Voice AI market is resonating with customers. Going into our initial conversations with Scott, we already had a strong thesis on Voice AI infrastructure, and his pitch reinforced our conviction in the need for an enterprise-ready, optimized, real-time Voice AI layer. We rarely see companies so well situated for a massive market opportunity: explosive growth in Voice AI, a highly capable and innovative frontier research team, a pragmatic and skilled product team, and a go-to-market team that earns customer trust.