Radiologists should learn about data science to promote effective development of AI in imaging.
Artificial Intelligence Begins with Data Analytics
It's said that "data is the new oil."1 Radiologists should agree — our bread and butter is interpreting complex visual data. Enter artificial intelligence (AI), with its predictions of doom for human radiologists.
Hype aside, AI and machine learning analytics are advanced techniques used in data science, a hybrid field of computer science and statistics. These are not the statistics we remember from medical school. Newer statistical techniques are increasingly concerned with classification, or grouping, of data. For example, given a Data Set X with three classes, A, B, and C, to which class does a newly introduced data point belong?
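To make the classification question concrete, here is a minimal sketch in Python using a k-nearest neighbors vote, which is only one of many possible classification methods. The data points, the two numeric features, the choice of k = 3, and the function name classify are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

# Hypothetical "Data Set X": each entry is ((feature_1, feature_2), class_label).
DATA_SET_X = [
    ((1.0, 1.2), "A"), ((0.8, 0.9), "A"), ((1.3, 1.0), "A"),
    ((4.0, 4.2), "B"), ((4.3, 3.9), "B"), ((3.8, 4.1), "B"),
    ((1.0, 4.0), "C"), ((0.7, 4.3), "C"), ((1.2, 3.8), "C"),
]


def classify(new_point, labeled_points, k=3):
    """Return the majority class among the k labeled points nearest to new_point."""
    # Sort the labeled points by Euclidean distance to the new point.
    by_distance = sorted(
        labeled_points, key=lambda item: math.dist(new_point, item[0])
    )
    # Collect the labels of the k closest points and take a majority vote.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# A new, unlabeled data point (also hypothetical).
predicted_class = classify((3.9, 4.0), DATA_SET_X)
print(predicted_class)
```

Here the new point sits among the class B examples, so the vote assigns it to B. Real datasets have many more features and far less tidy class boundaries, which is where the more sophisticated methods come in.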
To get started with data science, the venerable spreadsheet is a good entry point. But to truly understand data science, radiologists must become familiar with programming languages, functions, and analysis tools that many probably haven’t encountered before. Powerful apps like Tableau and MicroStrategy provide descriptive reporting, basic statistics, and even some predictive analytics for the non-coder. But data scientists tend to use the R or Python programming languages for analytics.2
R and Python allow for sophisticated analysis of complex datasets. R is an open-source programming language with many available statistical, financial, and scientific analysis packages. Methods like extreme gradient boosting, LASSO, random forests, support vector machines, and k-nearest neighbors are available in R.3 Python is object-oriented and well-suited for large datasets and machine learning using TensorFlow and Keras. For GPU-accelerated, high-performance deep learning, the hardware requirements and the difficulty of software implementation can be as lofty as the results are impressive.
The powerful convolutional neural network (CNN) is particularly well suited to classifying images, and meets a colloquial definition of AI. In the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), top-5 image classification accuracy soared from 72 to 97 percent between 2010 and 2017, largely because of CNNs, even exceeding human accuracy.4,5 Recently, the University of Virginia Health System began testing a product using this architecture to assist its radiologists.6
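For readers curious about what the "convolutional" part means, the sketch below shows the core operation in plain Python: a small filter (kernel) slides across an image and produces a map of where a pattern occurs. The 4x4 image, the hand-picked edge kernel, and the function name convolve2d are hypothetical. A real CNN learns many such kernels from training data and stacks them in layers, typically using a framework such as TensorFlow rather than hand-written loops.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the feature map.

    As in most deep learning frameworks, the kernel is applied without
    flipping, which is strictly speaking a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel element-wise with the image patch and sum.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale image: dark left half (0), bright right half (1).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
EDGE_KERNEL = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(IMAGE, EDGE_KERNEL)
for row in feature_map:
    print(row)
```

The feature map is nonzero only in the middle column, where the dark half meets the bright half. Detecting simple patterns like this edge, and then combining them into progressively more complex ones, is what makes the architecture a natural fit for images.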
If radiologists treat AI algorithms as a 'black box' and surrender our sound clinical judgment, we risk further commoditization. By doing the groundwork to understand classification statistics, data analytics, and the computing innovations associated with AI, we can earn a seat at the table — and lend our voice to safeguard our patients and ensure that the practice of radiology is improved, not hurt, by AI.
By Stephen M. Borstelmann, MD, associate professor of radiology at the University of Central Florida School of Medicine, Boca Raton, Fla. www.n2value.com
1. Fortune. Why Data is the New Oil.
2. KDnuggets. Four Main Languages for Analytics, Data Mining, Data Science.
3. James G, Witten D, Hastie T, et al. An Introduction to Statistical Learning. New York, NY: Springer; 2013.
4. ImageNet. ImageNet Large Scale Visual Recognition Challenge.
5. Andrej Karpathy Blog. What I learned from competing against a ConvNet on ImageNet.
6. American College of Radiology. Imaging 3.0 Case Study: An Extra Set of Eyes.