Sarvam AI
Not OpenAI, not Google — This Indian AI model is turning heads
A homegrown Indian AI company is drawing significant attention for building foundation AI models from scratch within the country.
Sarvam AI’s latest offerings, Sarvam Vision and Bulbul V3, are generating buzz across the tech ecosystem.
According to data shared by co-founder Pratyush Kumar, Sarvam AI’s models have outperformed leading global AI models, including ChatGPT, Google Gemini, and Anthropic’s Claude, on select optical character recognition (OCR) benchmarks.
In a post on X, Kumar revealed that Sarvam Vision achieved an accuracy score of 84.3% on the olmOCR-Bench, surpassing Gemini 3 Pro and newer OCR systems such as DeepSeek OCR v2. ChatGPT’s performance, he noted, was significantly lower on the same benchmark.
Kumar further stated: "On OmniDocBench v1.5 (English-only subset), Sarvam Vision achieves 93.28% overall score, excelling in complex formulas and layout parsing and being within touching distance of the current state of the art."
Tech commentator Deedy Das also praised the company’s progress. Reflecting on his earlier skepticism, he wrote on X on February 7, 2026: "I was wrong about Sarvam. When I wrote about them a year ago, I felt like the direction to train small 'Indic' language models was wrong. But boy, have they turned it around. They have the best text-to-speech, speech-to-text, and OCR models for Indic languages, and that's actually really valuable."
He also highlighted the company’s pricing and user experience: "The website is not only beautifully designed but dirt easy to use. They're filling a well-needed gap in the ecosystem and doing things big labs will probably never focus on to the fullest extent — at least in the short term."
Das added that while he was not familiar with the company’s business fundamentals, he was impressed with its technological achievements: "I can't remember the last time I felt this way about software products coming out of India. Well done."
In addition to Sarvam Vision, the company recently launched Bulbul V3, a new AI voice model. In a blog post, Sarvam described Bulbul V3 as its most advanced text-to-speech system, designed to deliver natural, expressive, and production-ready voices across Indian languages.
Bulbul V3 was evaluated through an independent third-party blind A/B human listening study across 11 languages.
Pratik Desai, founder of KissanAI, praised the model in a post on X on February 7, 2026: "We use Bulbul as our go-to TTS model for our Indic use cases, and they have just gotten better with each release. Meanwhile, ElevenLabs’ cost never made sense for Indic or any other languages."
Sarvam has also announced strategic partnerships with the governments of Odisha and Tamil Nadu.
According to the company, the collaborations aim to drive large-scale AI transformation by building compute infrastructure, sovereign models, and institutional capacity to accelerate AI adoption.
