AssemblyAI
Accurate transcripts with clear grammar have supported reliable speaker-based dialogue analysis
What is our primary use case?
I use AssemblyAI only with audio files, not for real-time transcription. I mainly use US English and have not tried other languages. I upload audio files through the AssemblyAI API, and it returns the transcript with speaker identification and the dialogue.
What is most valuable?
The main features I appreciate in AssemblyAI are its better accuracy compared to other transcription services and its clean output: clear grammar with no spelling or grammatical mistakes.
The primary benefit I receive from the product is much more accurate transcription. Overall, I see three main benefits: it is a very affordable service, its accuracy is much better than that of other services such as Deepgram or AWS transcription services, and its speaker identification is better.
What needs improvement?
One drawback I observed is in speaker identification. In some videos where names appear as text on the video frames, AssemblyAI does not identify the actual speaker name and instead provides generic labels such as Speaker A, Speaker B, Speaker C, or Speaker X, Y, Z.
The same happens with some audio files: the real speaker is not identified, and only Speaker A, Speaker B, or Speaker C is returned.
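Since the labels come back as generic letters, one workaround is to map them to real names after the transcript arrives. The following is a minimal sketch, not part of the reviewer's described setup: the utterance shape (a list of dicts with `speaker` and `text` keys) and the `rename_speakers` helper are assumptions made for illustration, and the name mapping has to be supplied by hand.

```python
from typing import Dict, List


def rename_speakers(utterances: List[dict], names: Dict[str, str]) -> List[dict]:
    """Return copies of the utterances with generic labels replaced by names.

    Labels missing from `names` are kept in the generic "Speaker <label>" form.
    """
    renamed = []
    for utt in utterances:
        label = utt["speaker"]
        renamed.append({**utt, "speaker": names.get(label, f"Speaker {label}")})
    return renamed


# Hypothetical diarized output; the field names are assumptions.
utterances = [
    {"speaker": "A", "text": "Welcome to the show."},
    {"speaker": "B", "text": "Thanks for having me."},
    {"speaker": "C", "text": "Hello."},
]
# Names supplied manually, for example read off the video frames.
names = {"A": "Host", "B": "Guest"}

for utt in rename_speakers(utterances, names):
    print(f'{utt["speaker"]}: {utt["text"]}')
```

Unmapped labels fall back to the generic form, so a partially filled mapping still produces a readable transcript.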
AssemblyAI does not provide a cloud service; I simply upload the audio file to the API, and they store it somewhere internally to send me the transcription text.
As for additional functionality, the API does not provide video uploading, so I need to convert video to audio before uploading it to AssemblyAI.
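The video-to-audio conversion step can be scripted. The sketch below assumes the `ffmpeg` command-line tool is installed and on the PATH; the review does not say which tool is actually used, and the file names, mono channel, and 16 kHz sample rate are illustrative assumptions rather than requirements stated by AssemblyAI.

```python
import subprocess
from typing import List


def build_extract_command(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg command that drops the video stream and keeps the audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",           # leave the video stream out of the output
        "-ac", "1",      # mono (an assumption; usually fine for speech)
        "-ar", "16000",  # 16 kHz sample rate (an assumption)
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


# Show the command without running it (hypothetical file names).
print(" ".join(build_extract_command("interview.mp4", "interview.wav")))
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect or adjust before any conversion happens.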
For how long have I used the solution?
I have been working with AssemblyAI for approximately one year.
How are customer service and support?
AssemblyAI support should respond more quickly; when I post a ticket, they take too long to reply.
Which solution did I use previously and why did I switch?
I tried Deepgram but did not continue working with it because it does not provide accurate transcription. I then started using AssemblyAI and have not used Deepgram again.
How was the initial setup?
I only need to create an account on AssemblyAI. They provide some initial credits for transcription, which is enough to get started. If usage increases, I can purchase a subscription from there.
What's my experience with pricing, setup cost, and licensing?
I would rate the pricing of the product a seven.
Which other solutions did I evaluate?
I can compare AssemblyAI with Deepgram, and between the two I would choose AssemblyAI. The main reason is that it is far better than Deepgram at speaker identification, clean verbatim transcription, and time-stamping, providing accurate timestamps along with the dialogue.
If I compare AssemblyAI with other services such as Gameloop, ChatAI, and Deepgram, its accuracy is far better; it consistently maintains the grammar and provides good, accurate text for audio or video files.
What other advice do I have?
AssemblyAI has a noise filtering feature, but I have not used it. I use the standard API: I upload the audio to AssemblyAI and then, over the next few seconds or minutes, repeatedly check whether the transcription is done. Once it is done, I write the transcription text to a file and generate an SRT file, a text file, and a doc file.
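The poll-then-export workflow described above could look roughly like the following. This is a sketch under stated assumptions, not the reviewer's actual script: the status values (`queued`, `processing`, `completed`, `error`), the utterance fields (`speaker`, `text`, and `start`/`end` in milliseconds), and the helper names are all assumptions, and the real status request is replaced by a stand-in so the example runs offline.

```python
import time
from typing import Callable, List


def poll_until_done(fetch: Callable[[], dict], interval: float = 3.0,
                    timeout: float = 600.0) -> dict:
    """Call `fetch` repeatedly until the job reports completion or failure."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job["status"] == "completed":
            return job
        if job["status"] == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription did not finish in time")
        time.sleep(interval)


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as the SRT timestamp HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def utterances_to_srt(utterances: List[dict]) -> str:
    """Render one numbered SRT cue per utterance, prefixed with the speaker."""
    cues = []
    for index, utt in enumerate(utterances, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(utt['start'])} --> {srt_timestamp(utt['end'])}\n"
            f"Speaker {utt['speaker']}: {utt['text']}\n"
        )
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)


# Stand-in for the real status request, so the sketch runs without a network.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "utterances": [
        {"speaker": "A", "start": 0, "end": 1500, "text": "Hello."},
        {"speaker": "B", "start": 1500, "end": 4200, "text": "Hi there."},
    ]},
])
job = poll_until_done(lambda: next(responses), interval=0.01)
print(utterances_to_srt(job["utterances"]))
```

Passing the status request in as a callable keeps the polling logic independent of any particular HTTP client, and the timeout prevents the loop from running forever if a job stalls.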
It works fine with different accents.
I rate this product an overall 8 out of 10.