Podcastle Launches a Text-to-Speech Model with More than 450 AI Voices

Podcastle, a podcast recording and editing platform, has launched Asyncflow v1.0, its new AI text-to-speech model. The release introduces more than 450 AI-generated voices that let users turn text into high-quality audio narration. Podcastle also offers an API that lets developers integrate text-to-speech features into their own applications.

Podcastle founder Arto Yeritsyan said that building a text-to-speech model had long been an aim for the company, but high costs and extensive data requirements originally made the project difficult.

Recent advances in large language models allowed Podcastle to make significant breakthroughs and build high-quality voice models without requiring large data sets. Last year, the company raised $13.5 million in Series A funding, which provided critical resources for this work.

Podcast Voices.
Credit: Podcastle

One of Podcastle's major competitive advantages is price. The platform charges $40 for 500 minutes of text-to-speech conversion, significantly less than ElevenLabs, a top competitor that charges $99 for the same number of minutes. The lower price stems from Podcastle's efficient training and inference algorithms, which keep operating costs low.
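To put the two prices on a per-minute basis, the comparison can be worked through in a few lines of Python. This is only an illustrative sketch: the prices and minute allowances are the figures quoted above, while the names `PLANS` and `price_per_minute` are made up for this example and do not come from either company.

```python
# Prices (USD) and minute allowances as quoted in the article.
# The dictionary and function names are illustrative assumptions.
PLANS = {
    "Podcastle": (40.00, 500),
    "ElevenLabs": (99.00, 500),
}


def price_per_minute(price_usd: float, minutes: int) -> float:
    """Return the cost of one minute of text-to-speech conversion."""
    return price_usd / minutes


rates = {name: price_per_minute(price, minutes)
         for name, (price, minutes) in PLANS.items()}

# Fraction saved per minute by choosing Podcastle over ElevenLabs.
savings = 1 - rates["Podcastle"] / rates["ElevenLabs"]

for name, rate in rates.items():
    print(f"{name}: ${rate:.3f} per minute")
print(f"Podcastle is about {savings:.0%} cheaper per minute")
```

On these figures, Podcastle works out to $0.08 per minute against roughly $0.20 for ElevenLabs, a difference of about 60 percent.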

Aside from text-to-speech, Podcastle has also improved its voice cloning capability. Previously, users had to read approximately 70 sentences to train the AI to replicate their voice. Now a few seconds of recorded audio is enough to create a highly personalized AI-generated version of a user's voice. The feature is powered by Podcastle's Magic Dust AI technology, which was released last year to improve audio recording quality.

Magic Dust AI.
Credit: Podcastle

Initial testing of Podcastle's new AI voice model shows that, while the generated voice retains some robotic elements, it effectively replicates the speaker's tone. Podcastle says the technology will continue to improve, resulting in more natural-sounding output.

Yeritsyan emphasized that, in addition to cost-effectiveness, Podcastle's competitive advantage stems from its full suite of tools for podcasting, audio, video, and AI-powered narration, all accessible through a new, user-friendly interface. While most users currently focus on audio content, demand for video production is growing steadily, positioning Podcastle as a versatile player in the digital content space.