Manual transcription still beats AI: A comparative study on transcription services (2024)

A research team from Empirical Research Support (ERS) at the CISPA Helmholtz Center for Information Security has conducted a systematic comparison of the most popular transcription services. The comparison involved eleven providers of manual and AI-based transcription.

The comparison shows that AI-based services, despite generally good quality, still have problems with speaker attribution and produce discrepancies between recording and transcript that distort meaning. Among the AI providers, Whisper from OpenAI delivered the best results.

Interviews are a popular method for collecting scientific data. There is a basic distinction between quantitative and qualitative interviews. While the former is designed to obtain statistically usable information from a large number of participants with the help of standardized questionnaires, the latter is aimed at obtaining interview data that allow for interpretation by the researchers.

A special type is the guided interview, which follows a prepared list of questions but allows the interviewer to deviate from it during the conversation. "In cybersecurity research, these interviews are utilized when exploring the patterns of action and interpretation of actors who operate through digital means," explains sociologist Dr. Rafael Mrowczynski from CISPA's Empirical Research Support (ERS) team. The ERS team advises the Center's researchers on methodological issues.

Converting an audio file into text

Transcription is a crucial step in qualitative data analysis. "The standard procedure is to convert the audio recordings of the interviews into text. It is important for the quality of the data that the transcriptions are adequate," Mrowczynski explains. Depending on the scientific field, there are different standards for transcription.

"In cybersecurity research, we usually work with transcripts that precisely reproduce the content of the conversation," says Mrowczynski. An adequate transcript, therefore, only contains the relevant spoken words. The researchers can obtain the transcript in two ways: Either it is created by the research team itself, or the task is outsourced to third-party providers.

Among the third-party providers, besides manual transcription, there has recently been real hype about automated, AI-based transcription. This is due to the exponential leaps in development and quality that AI applications have experienced in many areas over the last two years.

The researchers from CISPA's ERS team wanted to know which provider on the market achieves the best results and how automated, AI-based transcription performs in comparison with manual transcription. The goal was to be able to provide the researchers at CISPA and the cybersecurity community with a recommendation for working with qualitative interviews.

The approach of the ERS team

For their research project, Mrowczynski and his colleagues Dr. Maria Hellenthal, Dr. Rudolf Siegel, and Dr. Michael Schilling created a test dataset. This consisted of individual interviews lasting about ten minutes and group discussions with CISPA researchers in German and English. The content focused on the research field of cybersecurity.

"It was important that technical terms from the community were included so that the precision of the transcription could be assessed," Mrowczynski explains. Some of the interviews were additionally enhanced with background noise in order to reflect real settings in everyday research better.

The data were sent to eleven providers in December 2022. Among those were the transcription services Amberscript, GoTranscript, QualTranscribe, Rev, and Scribbl, as well as the AI-based transcription providers Amazon Transcribe, AssemblyAI,, Google Cloud, Microsoft Azure, and Whisper by OpenAI.

For the assessment of the obtained transcripts, Mrowczynski and his colleagues created a reference transcript that served as the basis for the comparative analysis. The analysis itself then focused on two central criteria. First, the researchers assessed the word error rate, which indicates by how many words a transcript differs from the reference transcript. Second, the qualitative deviation from the reference transcript was coded manually.
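
To illustrate the first criterion, the word error rate can be sketched in a few lines of Python. This is a minimal sketch of the commonly used definition (word-level edit distance divided by the number of words in the reference), not necessarily the exact procedure the ERS team applied. The function name, the lowercasing step, and the sample sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: compare case-insensitively and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing in hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences: one substituted word out of eight reference words.
reference = "the server stores salted hashes of all passwords"
hypothesis = "the server stores salted ashes of all passwords"
print(word_error_rate(reference, hypothesis))  # 0.125
```

A single substituted word yields a low error rate even when it changes the meaning of the sentence, which is one reason a purely numerical score benefits from the second, manually coded criterion.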

Manual transcription services beat AI

In their paper, Mrowczynski and his colleagues conclude that, in general, "most of the manual transcription services achieve a commendable level of performance, while AI-based services often show meaning-distorting discrepancies between recording and transcription."

The distortion of meaning can be clearly seen in technical terms. Mrowczynski explains: "In the transcript, for example, the term 'hashes' became 'ashes.' That is how we came up with the title of the paper."

OpenAI's Whisper achieved the best results among the AI-based providers. Most providers handled English better than German. Three providers did not offer transcription for German at all. Background noise generally had a negative effect on the result. The AI-based providers had particular problems with speaker attribution.

In addition, the transcripts created by an AI had to be reformatted before it was possible to further process them in software for qualitative data analysis. However, the researchers point out that their analysis reflects the state of the art as of December 2022 and that current developments could not be taken into account.

The research was presented at the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS).

More information: Rudolf Siegel et al., Poster: From Hashes to Ashes - A Comparison of Transcription Services, Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (2023). DOI: 10.1145/3576915.3624380

Provided by CISPA Helmholtz Center for Information Security

Citation: Manual transcription still beats AI: A comparative study on transcription services (2024, April 5) retrieved 5 April 2024 from
