Deep-learning system detects fractures on X-rays across the musculoskeletal system

A deep-learning system detecting fractures across the musculoskeletal system can help reduce common diagnostic errors and improve clinical outcomes

Misdiagnosed fractures are the most common interpretational error made by physicians on musculoskeletal X-rays [1], a frequent cause of malpractice claims [2,3], and the leading cause of diagnostic errors in Emergency Departments [4]. Identifying fractures on X-rays is challenging because fractures can occur in any bone and their appearance varies with local anatomy and X-ray projection. Despite these clinical challenges, there are currently no reliable tools to aid clinicians in interpreting radiographs across the musculoskeletal system.

Output from deep-learning systems can improve diagnostic outcomes [5], yet prior deep-learning systems for fracture detection have been limited to individual bones, one anatomic region, or a single practice setting. This narrow anatomical coverage has limited the ability of deep-learning systems to broadly improve diagnostic accuracy.

In our publication in npj Digital Medicine, our team of machine learning scientists, clinical researchers, and practicing physicians describes a deep-learning system that detects fractures across the musculoskeletal system. We trained the system on data annotated by 18 expert orthopedic surgeons and 11 radiologists, and tested how well it emulates those physicians.

The quality of any deep-learning system’s predictions depends on the data used to train it. To that end, we trained on 715,343 manually annotated radiographs representing 314,866 patients from 15 hospitals, a sample more than twice as large as in prior published work. Building the system from more than a quarter of a million patients increased the likelihood that it performs well across different types of fractures. As shown in the figure below, the system predicts the presence or absence of a fracture and, when a fracture is present, produces a bounding box around the fracture site.

Figure 1: The deep-learning system.
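
To make the two-part output concrete, here is a minimal sketch of how a per-radiograph prediction might be represented. The `FracturePrediction` record, the `summarize` helper, the pixel-coordinate box layout, and the 0.5 threshold are all hypothetical choices for illustration; they are not taken from the paper or from the deployed system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class FracturePrediction:
    """Illustrative record: a fracture score plus an optional bounding box."""
    probability: float
    box: Optional[Box]

    def is_fracture(self, threshold: float = 0.5) -> bool:
        # The threshold is an assumed operating point, not the paper's.
        return self.probability >= threshold


def summarize(pred: FracturePrediction, threshold: float = 0.5) -> str:
    """Turn a prediction into a one-line, human-readable summary."""
    if pred.is_fracture(threshold) and pred.box is not None:
        x0, y0, x1, y1 = pred.box
        return (f"fracture suspected (p={pred.probability:.2f}) "
                f"in box ({x0}, {y0})-({x1}, {y1})")
    return f"no fracture flagged (p={pred.probability:.2f})"


if __name__ == "__main__":
    print(summarize(FracturePrediction(0.93, (120, 80, 210, 160))))
    print(summarize(FracturePrediction(0.04, None)))
```

Keeping the box optional mirrors the description above: a box is only meaningful when a fracture is predicted.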

On a test dataset of 16,019 radiographs representing 12,746 patients, a sample at least four times larger than in previously published work, we reported an overall AUC of 0.974, with a sensitivity of 95.2% and a specificity of 81.3%. More than half of the 16 anatomical regions had mean AUCs above 0.98; the foot was the lowest-performing region, with an AUC of 0.888, likely because of its visual complexity. Performance remained high even on fractures that are visually harder to detect, such as those without lucent lines or without callus formation: AUCs across all fracture types were above 0.94.
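
For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and AUC are defined for a binary fracture/no-fracture task. The labels, scores, and 0.5 threshold are made-up toy values, not data from the study, and the function names are our own for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int],
                            scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_fracture = s >= threshold
        if y == 1:
            if predicted_fracture:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_fracture:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random fracture case scores higher
    than a random non-fracture case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = fracture, 0 = no fracture (illustrative values only).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(toy_labels, toy_scores, 0.5)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
          f"AUC={auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes performance across all thresholds, which is why both kinds of figure are reported.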

Deploying the deep-learning system could improve clinical workflows by supporting more accurate and timely diagnosis. Clinicians without a specialization in musculoskeletal imaging, such as those in primary care or urgent care settings, have limited training in identifying fractures across their many distinct and often subtle presentations, and could therefore benefit from the deep-learning system.

Reducing misdiagnoses and improving diagnostic quality could improve patient outcomes and reduce healthcare costs. Misdiagnosed fractures can lead to malunion or attendant morbidity, which in turn can lead to more healthcare visits or unnecessary trips to the ER. These avoidable healthcare system touchpoints are costly to the patient, provider, and payer. Future work is needed to test prospectively whether our technology can reduce diagnostic errors and improve diagnostic outcomes.

To learn more, read our free and open access article: Assessment of a deep-learning system for fracture detection in musculoskeletal radiographs, published in npj Digital Medicine.


  1. Donald, J. J. & Barnard, S. A. Common patterns in 558 diagnostic radiology errors. J. Med. Imaging Radiat. Oncol. 56, 173–178 (2012).
  2. Berlin, L. Defending the ‘missed’ radiographic diagnosis. Am. J. Roentgenol. 176, 317–322 (2001).
  3. Whang, J. S., Baker, S. R., Patel, R., Luk, L. & Castro, A., 3rd. The causes of medical malpractice suits against radiologists in the United States. Radiology 266, 548–554 (2013).
  4. Hallas, P. & Ellingsen, T. Errors in fracture diagnoses in the emergency department – characteristics of patients and diurnal variation. BMC Emerg. Med. 6, 4 (2006).
  5. Lindsey, R. et al. Deep neural network improves fracture detection by clinicians. Proc. Natl. Acad. Sci. U. S. A. 115, 11591–11596 (2018).

Robert Lindsey

Chief Science Officer, Imagen Technologies


SteterTropfen 💧
3 months ago

Interesting study! As all doctors know, there are easy and hard-to-spot fractures. Do you have any idea what the performance of this software is for the hard ones? Maybe you can get a dataset with cases where doctors were sued for misdiagnosis.