Medical Notes Using Speech Recognition: Less Than Perfect

Interview with:

Li Zhou, MD, PhD, FACMI
Associate Professor of Medicine
Division of General Internal Medicine and Primary Care
Brigham and Women’s Hospital, Harvard Medical School
Somerville, MA 02145

What is the background for this study? What are the main findings?

Response: Documentation is one of the most time-consuming and costly aspects of electronic health record (EHR) use.

Speech recognition (SR) technology, the automatic translation of voice to text, has been increasingly adopted to help clinicians complete their documentation in an efficient and cost-effective manner. One way in which SR can assist this process is commonly known as “back-end” SR, in which the clinician dictates into the telephone, the recorded audio is automatically transcribed to text by a speech recognition engine, and the text is edited by a professional medical transcriptionist and sent back to the EHR for the clinician to review and sign.

In this study, we analyzed errors at different processing stages of clinical documents collected from 2 health care institutions using the same back-end SR vendor. We defined a comprehensive schema to systematically classify and analyze these errors, focusing particularly on clinically significant errors (errors that could plausibly affect a patient’s future care). We found an average of 7 errors per 100 words in raw speech recognition transcriptions, and about 6% of those errors were clinically significant. Of the raw speech recognition transcriptions evaluated, 96.3% contained at least one error, and 63.6% had at least one clinically significant error. However, the rate of errors fell significantly after review by a medical transcriptionist, and it fell further still after the clinician reviewed the edited transcript.

What should readers take away from your report?

Response: Seven percent of words in unedited clinical documents created with speech recognition technology involve errors, and 1 in 250 words involves an error that may impact a patient’s future care. The comparatively low error rate in edited and signed notes highlights the crucial role that manual review plays in the speech recognition-assisted documentation process. Automated error detection and correction methods using natural language processing technology may also help reduce the number of errors in documents created with speech recognition.

What recommendations do you have for future research as a result of this work?

Response: The error classification schema we developed can be used to annotate errors in more notes, obtained from a wider variety of providers and clinical domains, to create a robust corpus of errors in SR-generated clinical documents. A corpus of this nature would support many important research tasks, from increasing the reliability of error prevalence estimates to serving as training data for the development of an automatic error detection system. Such work will be vital to ensuring the effective use of clinicians’ time and to improving and maintaining documentation quality, all of which can, in turn, increase patient safety.
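The headline figures quoted in the responses above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the rounded numbers as stated in this interview (7 errors per 100 words, about 6% of them clinically significant), not the exact values from the study, and the variable names are made up for the example.

```python
# Rounded figures as quoted in the interview (assumed; not the exact study values).
errors_per_100_words = 7
clinically_significant_share = 0.06

# Clinically significant errors per word of raw speech recognition output.
significant_per_word = (errors_per_100_words / 100) * clinically_significant_share

# Express the same rate as "1 in N words".
words_per_significant_error = 1 / significant_per_word

print(f"{significant_per_word * 100:.2f} clinically significant errors per 100 words")
print(f"about 1 in {round(words_per_significant_error)} words")
```

With these rounded inputs the result is roughly 1 in 238 words, which is in line with the approximate "1 in 250 words" figure given in the interview.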

Disclosures: This study was funded by Agency for Healthcare Research and Quality (AHRQ) grant R01HS024264.


Zhou L, Blackley SV, Kowalski L, et al. Analysis of Errors in Dictated Clinical Documents Assisted by Speech Recognition Software and Professional Transcriptionists. JAMA Network Open. 2018;1(3):e180530. doi:10.1001/jamanetworkopen.2018.0530



1 thought on “Medical Notes Using Speech Recognition: Less Than Perfect”

  1. Dr. Li Zhou,
    Interesting article.

    A couple of other major issues concerning verbal documentation are the lack of privacy and interference from background noise.

    Recently we have noticed a lot of clinicians using a specialized microphone called a stenomask. A stenomask enables you to accurately document patient information without being overheard by others while simultaneously eliminating all background noise. It functions like a handheld sound booth, guaranteeing clear voice communications in loud and busy clinical settings.

    If you would like to learn more please visit
