Amazon Polly vs Google Cloud Text-to-Speech comparison

Comparison Buyer's Guide
Executive Summary
Updated on Mar 6, 2024

We compared Amazon Polly and Google Cloud Text-to-Speech across several parameters, based on our users' reviews.

Amazon Polly is praised for its natural-sounding voices, accuracy, and ease of integration, while Google Cloud Text-to-Speech is appreciated for its exceptional accuracy, wide range of languages, and real-time synthesis ability. Amazon Polly users value its customer service and pricing, while Google Cloud Text-to-Speech users appreciate its service quality and flexibility.

Features: Amazon Polly is praised for its natural-sounding voices, pronunciation accuracy, parameter adjustment flexibility, and integration ease. Meanwhile, Google Cloud Text-to-Speech offers exceptional accuracy, a wide range of languages, customizable speech delivery, real-time synthesis, and easy integration with Google services.

Pricing and ROI: User feedback does not specifically mention Amazon Polly's setup cost, which suggests a seamless process. Google Cloud Text-to-Speech offers competitive pricing that eliminates the need for substantial setup costs. Both products have straightforward and convenient licensing processes. Amazon Polly has been praised as a cost-effective solution with high-quality voice output, resulting in a favorable return on investment, while Google Cloud Text-to-Speech has received positive feedback for its efficiency, cost-effectiveness, and enhanced project quality.

Room for Improvement: Amazon Polly could improve its pronunciation accuracy, address audio distortion issues, and expand the variety of voices and languages available. It should also streamline the setup process and improve its handling of text formatting and its user interface. Google Cloud Text-to-Speech should provide more voice options and languages, particularly for non-English speakers. Users suggest improving accuracy and clarity in speech synthesis, offering better pricing options, and increasing flexibility in customization and control over synthesized voices.

Deployment and customer support: Based on user reviews, deployment and setup times vary for both products. One Amazon Polly user took three months for deployment and an additional week for setup. Some Google Cloud Text-to-Speech users likewise took three months for deployment and a week for setup, while others needed only a week for both. The context of usage is critical for accurate evaluation. Customers have expressed overall satisfaction with Amazon Polly's customer service and support, praising the promptness and effectiveness of assistance. User reviews for Google Cloud Text-to-Speech highlight exceptional customer service and a responsive, helpful support team.

The summary above is based on 2 interviews we conducted recently with Amazon Polly and Google Cloud Text-to-Speech users. To access the full review transcripts, download our report.

Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"Amazon Polly is useful because it's helpful to hear the words on top of it when I can't take in information in a general way. Sometimes, it's very taxing if I'm trying to read cases. They have the neural voices, and they're so realistic. You don't even know that a person is not reading to you, making things much better. I know that they do have the ability to provide you with your own lexicon that's personal to you. I like that you can adjust the pitch and the speed of the voice because some people talk way too fast. Or if you're reading, I read slowly, so that's always helpful. One of the functions that I find helpful is that when reading material on the web, it's like it has its own browser. You go to the URL, and you don't have to read the whole thing, and you can stick the cursor on the place where you want it to start. Then if you want it to skip over something, you put it somewhere else, and that's ideal for reading case law because you skip around a lot. You don't really read it from start to finish. It helps if someone's going to read all those citations because they definitely want to be able to skip that."


"Precision is the most valuable feature of Google Cloud Text-to-Speech because the text is perfectly voiced."

"It's not complex to set up."


Cons
"The price could be better. I wish it weren't so expensive to do because it's really cool. I would love to see them have lexicon packages of them like, this is for lawyers, this is for accountants, and it's going to have a lot of things in it. I also think they could do a better job at showing use cases other than telemarketing or contact center stuff like bots that are very commercial. I know that's where the money is, but it's such a huge hole that's missing for people with disabilities that are even worse than mine. Some people cannot see or hear at all, but they're not just cognitively impaired."


"We had some problems with Dialogflow."

"Google Cloud Text-to-Speech has just one female voice and one male voice in Brazil, while it has a lot of voices in other countries."


Pricing and Cost Advice
  • "The price could be better. Neural voices are so realistic, and I want to say that they have it so that you can try to tell where the voice is coming from or something like that. But if I have more than one, it's so expensive to have to listen to a bunch of cases on my phone and have the neural voice read to me. It really wouldn't be worth it. It'd be paying probably more than what I make in the case. Right now, I'm on the free tier, and I think the number of minutes that you get is reasonable as long as you're not doing this all the time and you're using it judiciously. I have some credits that I think I can use, but I don't know how fast they'll go through."

  • "I rate Google Cloud Text-to-Speech three out of ten for pricing."

    Ranking
    Amazon Polly: Views 4,124 | Comparisons 2,832 | Reviews 0 | Average Words per Review 0 | Rating N/A
    Google Cloud Text-to-Speech: Views 3,175 | Comparisons 2,548 | Reviews 2 | Average Words per Review 396 | Rating 8.5
    Overview

    Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Polly's Text-to-Speech (TTS) service uses advanced deep learning technologies to synthesize natural sounding human speech. With dozens of lifelike voices across a broad set of languages, you can build speech-enabled applications that work in many different countries.

    In addition to Standard TTS voices, Amazon Polly offers Neural Text-to-Speech (NTTS) voices that deliver advanced improvements in speech quality through a new machine learning approach. Polly’s Neural TTS technology also supports two speaking styles that allow you to better match the delivery style of the speaker to the application: a Newscaster reading style that is tailored to news narration use cases, and a Conversational speaking style that is ideal for two-way communication like telephony applications.
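As a rough illustration of how the two speaking styles described above can be requested, the sketch below wraps text in Polly's SSML `amazon:domain` tag and assembles the arguments for boto3's `synthesize_speech` call. The helper names (`build_style_ssml`, `build_polly_request`) and the default voice `Joanna` are assumptions made for this example; which voices support which style varies, so check availability before relying on it.

```python
from xml.sax.saxutils import escape

# Maps the style names used in the overview to the SSML domain names.
STYLE_DOMAINS = {"newscaster": "news", "conversational": "conversational"}


def build_style_ssml(text: str, style: str) -> str:
    """Wrap plain text in SSML that requests an NTTS speaking style."""
    domain = STYLE_DOMAINS[style]  # raises KeyError for an unknown style
    return (
        f'<speak><amazon:domain name="{domain}">'
        f"{escape(text)}"  # escape &, <, > so the SSML stays well-formed
        "</amazon:domain></speak>"
    )


def build_polly_request(text: str, style: str, voice_id: str = "Joanna") -> dict:
    """Assemble keyword arguments for a Polly synthesize_speech call.

    The default voice is an assumption for illustration only.
    """
    return {
        "Engine": "neural",   # speaking styles require the neural engine
        "OutputFormat": "mp3",
        "TextType": "ssml",
        "Text": build_style_ssml(text, style),
        "VoiceId": voice_id,
    }
```

With AWS credentials configured, the request could then be sent with `boto3.client("polly").synthesize_speech(**build_polly_request("Good evening.", "newscaster"))`; the audio is returned in the response's `AudioStream` field.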

    Finally, Amazon Polly Brand Voice can create a custom voice for your organization. This is a custom engagement where you will work with the Amazon Polly team to build an NTTS voice for the exclusive use of your organization.

    Google Cloud Text-to-Speech converts text into human-like speech in more than 180 voices across 30+ languages and variants. It applies groundbreaking research in speech synthesis (WaveNet) and Google's powerful neural networks to deliver high-fidelity audio. With this easy-to-use API, you can create lifelike interactions with your users that transform customer service, device interaction, and other applications.
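For comparison, here is a minimal sketch of the JSON body that the Google Cloud Text-to-Speech REST method `text:synthesize` accepts. The helper name `build_synthesize_body` and the example voice name are assumptions for illustration; authentication and the HTTP call itself are left out.

```python
import json

# REST endpoint of the v1 synthesize method.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_body(text, language_code="en-US", voice_name=None,
                          speaking_rate=1.0, pitch=0.0):
    """Build the request body for a text:synthesize call."""
    voice = {"languageCode": language_code}
    if voice_name:
        # Optional: pin a specific voice, e.g. a WaveNet voice.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {
            "audioEncoding": "MP3",
            "speakingRate": speaking_rate,  # 1.0 is the normal speed
            "pitch": pitch,                 # 0.0 leaves the pitch unchanged
        },
    }


if __name__ == "__main__":
    # "en-US-Wavenet-D" is used here only as an example voice name.
    body = build_synthesize_body("Hello, world.", voice_name="en-US-Wavenet-D")
    print(json.dumps(body, indent=2))
```

Posting this body to `SYNTHESIZE_URL` with valid credentials returns JSON whose `audioContent` field holds the base64-encoded audio.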

    Sample Customers
    Amazon Polly: GoAnimate, Duolingo, Bandwidth
    Google Cloud Text-to-Speech: Home Depot, PayPal, Target, HSBC, McKesson
    Top Industries (visitors reading reviews)
    Amazon Polly: Computer Software Company 17%, Financial Services Firm 9%, University 8%, Manufacturing Company 7%
    Google Cloud Text-to-Speech: Computer Software Company 13%, University 10%, Manufacturing Company 9%, Financial Services Firm 9%
    Company Size (visitors reading reviews)
    Amazon Polly: Small Business 30%, Midsize Enterprise 16%, Large Enterprise 54%
    Google Cloud Text-to-Speech: Small Business 26%, Midsize Enterprise 16%, Large Enterprise 58%

    Amazon Polly is ranked 2nd in Text-To-Speech Services, while Google Cloud Text-to-Speech is ranked 1st with 2 reviews. Amazon Polly is rated 7.0, while Google Cloud Text-to-Speech is rated 8.6. The top reviewer of Amazon Polly writes "A text to spoken audio solution with a realistic neural voice feature, but the price could be better". The top reviewer of Google Cloud Text-to-Speech writes "A stable solution that is used to vocalize the text written on the API to the client". Amazon Polly is most compared with Microsoft Azure Speech Service and IBM Watson Text To Speech, whereas Google Cloud Text-to-Speech is most compared with Microsoft Azure Speech Service, IBM Watson Text To Speech, and ElevenLabs.


    We monitor all Text-To-Speech Services reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.