Introduction

Artificial intelligence (AI) refers to computational programs that can perform tasks requiring human intelligence (e.g., pattern recognition) by mimicking human thought processes [1]. Such programs have been under development since the 1970s and have long been used in gaming, social media, and robotics. Machine learning, developed since the 1980s, is relevant to medicine as a precursor of today's AI applications. Conventional machine learning is a method in which a computer automatically makes appropriate judgments from inspection images based on human-defined rules (e.g., handcrafted features). Since 2012, deep learning, which learns features automatically through repeated trial and error, has advanced rapidly. Deep learning is a type of machine learning whose sophisticated internal structure and learning methods enable more accurate results than conventional machine learning [2]. Deep learning identifies patterns progressively from large databases without being explicitly programmed. In the last 10 years, deep learning has led to new strategies of machine learning and plays a central role in image recognition as a breakthrough method [3].

AI has influenced every field of cardiovascular imaging, in all phases from acquisition to reporting [4,5,6]. Examples include AI-based measurement of lumen diameter on computed tomography and magnetic resonance images, recognition of the coronary calcium score, detection of obstructive coronary disease, and automated acquisition, segmentation, and report generation [7,8,9]. Compared with these modalities, echocardiography suffers from high observer variation in interpretation. AI can help minimize this observer variation and provide an accurate diagnosis in the field of echocardiography. In this review, we summarize the need for automated diagnosis in echocardiography, and discuss the results of applying AI to echocardiography and future perspectives.

AI in the COVID-19 era

With the development of echocardiography, the number of echocardiographic parameters in routine examinations has increased and examination techniques have become more complex. Commercially available portable devices have expanded the range of physicians who perform echocardiography beyond cardiologists to include physicians in emergency medicine, anesthesiology, and general internal medicine. Thus, there may be a need for these non-specialists to make decisions based on echocardiographic findings.

Moreover, with the outbreak of COVID-19 in 2020 [10], all lifestyles and industries were affected to some extent, for example by lockdowns on movement and a greater dependence on telecommunications. This wave of infections has also affected how we approach cardiovascular care and echocardiography. As shown in Supplemental Fig. 1, before the COVID-19 era, the patient was moved to the echo room and underwent an echocardiogram performed by an expert sonographer or cardiologist. However, in an environment with a rapidly increasing number of confirmed or suspected COVID-19 patients, non-specialists, such as emergency physicians, are increasingly required to perform this examination in the infection control room. In the usual settings of echocardiography, the risk of infection is high because practitioners must be in close proximity to the patient. Thus, personal protection is crucial in the COVID-19 era [11,12,13]. Many infected and sick patients need a quick, accurate, and reproducible diagnosis. Importantly, scanning time should be reduced to limit possible exposure. There have been many recommendations to reduce scanning time by acquiring a limited set of images and measuring off site [12]. The problem with this approach is that we may miss important findings when we focus mainly on acquiring many images quickly instead of performing a thorough echocardiographic study. Also, when performing measurements off site, we may realize that we have not taken the necessary images at the appropriate angles. All of this can lead to poor patient outcomes from missed diagnoses. AI can be an essential aid in solving these problems.

Overview of deep learning

To understand the role of deep learning in echocardiography, it is necessary to clarify the difference between conventional machine learning and deep learning. The biggest difference between deep learning and conventional machine learning is that deep learning uses multilayer neural networks. Using neural networks, the accuracy of models has been dramatically improved, especially in the fields of visual object recognition, object detection, and word recognition.
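To make the term "multilayer neural network" concrete, the following minimal sketch stacks several nonlinear layers in Keras. The input size and layer widths are illustrative assumptions, not values from any study cited in this review.

```python
# A minimal sketch of a multilayer neural network, the core of deep
# learning. All sizes here are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),       # a small grayscale image
    tf.keras.layers.Flatten(),                       # flatten pixels to a vector
    tf.keras.layers.Dense(128, activation="relu"),   # hidden layer 1
    tf.keras.layers.Dense(64, activation="relu"),    # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Conventional machine learning would instead feed hand-engineered features into a shallow classifier; here, the stacked layers learn their own intermediate representations from the raw pixels.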

Learning with artificial intelligence can be classified into three categories: supervised learning, unsupervised learning, and reinforcement learning (Fig. 1). Supervised learning is a method of learning from training data labeled with the correct answers. Many medical image analysis studies have been conducted on images that humans correctly labeled beforehand. In the echocardiographic field, thousands of images with predefined labels are typically needed when building new models. Notably, a trained model can never present an answer it has not learned. For example, if a model for myocardial infarction is trained with only the options "with myocardial infarction" or "without myocardial infarction", it cannot determine the culprit coronary artery, because it lacks the specific training to do so (see the sketch after Fig. 1).

Fig. 1

Artificial intelligence including deep learning and their tasks
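As a concrete illustration of this constraint, the hypothetical sketch below trains a binary "with/without myocardial infarction" classifier; the data, shapes, and architecture are placeholders and do not reproduce any cited study.

```python
# Sketch of supervised learning: the model learns only the labels we provide.
# Data, shapes, and architecture are illustrative placeholders.
import numpy as np
import tensorflow as tf

# Hypothetical pre-labeled training set: 1,000 grayscale images, 64 x 64.
x_train = np.random.rand(1000, 64, 64, 1).astype("float32")
y_train = np.random.randint(0, 2, size=(1000,))  # 1 = with MI, 0 = without MI

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of "with MI"
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32)

# The output is only ever a probability over the two trained labels; the
# model cannot name a culprit coronary artery, because that answer was
# never among its labels.
prob_mi = model.predict(x_train[:1])
```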

Unsupervised learning is a method of learning without labels, one famous example being the AI that Google created in 2012, which could recognize a cat without manual labeling [14]. This AI came to recognize a "cat" autonomously as it read unlabeled web images and videos, a learning process similar to human learning. Reinforcement learning, exemplified by Google's AlphaGo AI beating a professional Go player, is a method of finding ways to "maximize future value" through trial and error [15]. It is similar to supervised learning, but the machine predicts future values in a broader sense (in the case of Go, winning moves). Unsupervised learning and reinforcement learning are not yet in general use in imaging, so we focus on supervised learning in this review.

An overview of studies on the diagnostic ability of current deep learning models in the field of echocardiography is given in Table 1 [16,17,18,19,20,21,22,23,24,25,26,27]. The accuracy of these AI models is around 80–90%. Sample sizes in these studies ranged from hundreds to tens of thousands. Recently, AI methods have become more sophisticated, and it is now possible to create accurate models without using a very large number of images. In the future, many institutions will be able to develop such research from a reasonable sample size.

Table 1 Clinical deep learning studies for echocardiography

Steps of echocardiographic AI

Echocardiography is a useful tool in the diagnosis and management of cardiovascular diseases. Accurate and reproducible echocardiographic studies are required for good clinical practice. Echocardiography requires not only adequate acquisition of images but also proper interpretation of them. Even the same physician may come to different conclusions on the same images. Thus, more objective and reproducible methods are needed. The process of applying AI to echocardiography involves four steps, which we humans perform subconsciously during each examination. However, when developing an AI application, it is necessary to divide these steps into separate tasks leading to a diagnosis (Fig. 2).

Fig. 2

Steps to develop artificial intelligence models for echocardiography

1st step: maintaining image quality

The first step in the application of AI to echocardiography is the evaluation of image quality. For humans, experience is required to obtain accurate images, and poor-quality images are often acquired by inexperienced examiners. Since image quality is directly linked to diagnostic accuracy, it is important to evaluate image quality correctly [28]. A recent paper showed that the accuracy of AI-based classification of image quality was excellent (score error: 0.11 ± 0.09). This approach could also be generalized to other deep learning applications on cardiovascular images, where there are frequent gaps in clinical labeling [16].
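In practice, such a quality score could be used to gate downstream analysis. The sketch below is purely illustrative: `quality_model` is an assumed regressor returning a score in [0, 1], and the 0.5 threshold is our assumption, not a value from the cited study [16].

```python
# Hypothetical sketch: keep only frames whose predicted quality score
# clears a threshold before any measurement is attempted.
import numpy as np

def usable(frame: np.ndarray, quality_model, threshold: float = 0.5) -> bool:
    """Return True if the predicted quality score clears the threshold."""
    score = float(quality_model.predict(frame[np.newaxis, ...])[0])
    return score >= threshold

# frames_to_analyze = [f for f in frames if usable(f, quality_model)]
```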

In February 2020, the U.S. Food and Drug Administration (FDA) approved the first AI-based echocardiographic device that guides the user to acquire appropriate images of good quality. The system indicates how to adjust the probe to acquire a better image, and is designed to be easy to use even for beginners [26]. A condition for use of this device is that a cardiologist must review and approve the acquired images for patient evaluation, rather than leaving everything to the machine. The extent to which the machine can be trusted is an important problem that should be discussed in the future.

2nd step: view classification

The next step is the view classification and segmentation of cardiovascular structures. Echocardiographic images require many types of recordings because of the complexity of the cardiac structure. The view classification of cardiovascular structures can be useful for automated scanning or detection of appropriate views. Several investigators have reported good accuracy for view classification models (around 91–94% for 15-view classification) [17, 18]. Recently, we reported our newly developed view classification model, based on a convolutional neural network trained with 17,000 images. In this model, 1.9% of images were mislabeled. Clustering analyses showed that the neural network could sort heterogeneous input images into five pre-determined views (Fig. 3; a sketch of how such plots are produced follows the figure). In addition, to determine whether the 98.1% accuracy rate was acceptable for creating a feasible prediction model, we tested the prediction model for ejection fraction (EF) using the learning data with a 1.9% error rate. The accuracy of the prediction model for EF held up even with training data containing 1.9% mislabeled images. Thus, this approach may provide a clinically feasible method of view classification for the analysis of echocardiographic data [27]. Upon inspection of the misclassified images, we found that many of them were difficult even for experts to judge, suggesting that deep learning successfully imitated human recognition (Supplemental Fig. 2).

Fig. 3

Clustering analyses. Data through the last fully connected layer of the neural network are represented in two-dimensional space, showing organization into clusters according to view category
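Clustering plots like Fig. 3 are typically produced by embedding each image with the network's penultimate (last fully connected) layer and projecting the embeddings to two dimensions. The sketch below illustrates this with a hypothetical 5-view classifier; it is not the published model [27], and the data are placeholders.

```python
# Sketch: extract last-fully-connected-layer embeddings from a 5-view
# classifier and project them to 2-D for a clustering scatter plot.
import numpy as np
import tensorflow as tf
from sklearn.manifold import TSNE

n_views = 5
cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(64, activation="relu", name="embedding"),
    tf.keras.layers.Dense(n_views, activation="softmax"),  # 5 view classes
])

# Sub-model that stops at the penultimate (fully connected) layer.
embedder = tf.keras.Model(cnn.input, cnn.get_layer("embedding").output)

images = np.random.rand(200, 128, 128, 1).astype("float32")  # placeholder data
embeddings = embedder.predict(images)                        # shape (200, 64)
coords_2d = TSNE(n_components=2).fit_transform(embeddings)   # points to plot
```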

3rd step: measurements

Once the image quality has been assessed and appropriate views have been determined, the next step is to measure and quantify the morphological structure. For example, the different frameworks of conventional machine learning and deep learning for estimating the ejection fraction (EF) are shown in Fig. 4. In conventional machine learning for prediction of EF, after pre-processing of images, a human data scientist manually defines where the algorithm should focus its attention (region of interest (ROI) and segmentation) and extracts features of the target region. The extracted features are then used to create a classification model while performing dimension reduction. In deep learning, all feature extraction steps are embedded in the algorithm, allowing end-to-end learning to be performed.

Fig. 4

The flowchart for the estimation of ejection fraction. In deep learning, all feature extraction steps are embedded in the algorithm, allowing for end-to-end learning to be performed
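A minimal sketch of the end-to-end idea in Fig. 4 follows: raw frames go in and an EF estimate comes out, with no hand-engineered ROI or feature-extraction stage. The architecture and sizes are illustrative assumptions, not any published model's configuration.

```python
# Sketch of end-to-end EF regression: the network maps raw echo frames
# directly to a continuous EF value. All sizes are illustrative.
import tensorflow as tf

ef_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(112, 112, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # EF as a single continuous value (%)
])
# Mean absolute error matches how EF deviation is commonly reported.
ef_model.compile(optimizer="adam", loss="mae")
# ef_model.fit(frames, ef_labels, ...)  # images in, EF out: end-to-end
```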

Zhang et al. developed a method based on a deep learning approach for fully automated analysis of echocardiographic data [18]. Their model showed a mean percentage error of approximately 10% for EF from the apical 2-chamber view, and 20% for EF from the apical 4-chamber view. Another paper showed that EF based on AP2/AP4 views correlated well with the reference EF (mean absolute deviation = 2.9%) [20]. Our recent report adds to this by demonstrating better performance of a deep learning algorithm when five views are used. Results of receiver operating characteristic (ROC) analysis, used to assess the diagnostic ability for classification of reduced EF (< 50%), are shown in Fig. 5a. For the classification of patients with reduced EF, we compared the areas under the curve (AUCs) of each model, and the model with the largest AUC was the one based on the averaged 5 views (AUC: 0.99 ± 0.01 on fivefold cross-validation). The AUC of estimated LVEF based on the averaged 5 views was larger than the AUCs of estimated LVEF based on the other views (AUC by AP2 and AP4: 0.95 ± 0.01, AUC by AP2: 0.94 ± 0.01, and AUC by AP4: 0.93 ± 0.01 on fivefold cross-validation; p < 0.05 for each comparison) [21]. According to these results, it may be more accurate to build a prediction model for LVEF from multiple views in the clinical setting.

Fig. 5

Diagnostic ability for classification of low ejection fraction (a) and regional wall motion abnormality (b)
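For readers unfamiliar with this evaluation style, the sketch below computes a fivefold cross-validated AUC for a reduced-EF (< 50%) classifier. The model and data are placeholders; the AUC values quoted above come from the cited studies, not from this code.

```python
# Sketch of fivefold cross-validated AUC, reported as mean +/- SD over folds.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

X = np.random.rand(500, 20)                   # placeholder features
y = (np.random.rand(500) < 0.3).astype(int)   # 1 = reduced EF (< 50%)

aucs = []
for train_idx, test_idx in StratifiedKFold(
        n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf = RandomForestClassifier().fit(X[train_idx], y[train_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], prob))

print(f"AUC: {np.mean(aucs):.2f} +/- {np.std(aucs):.2f}")
```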

4th step: detection of abnormality

The last step is the detection of abnormality. One of the most important evaluations in echocardiography is the detection of regional wall motion abnormalities (RWMAs). The presence of wall motion abnormalities is directly linked to treatment decisions, and detection of RWMAs by echocardiography is a class I recommendation in multiple cardiovascular guidelines for patients complaining of chest pain [29, 30]. However, the evaluation of regional wall motion abnormalities has traditionally been subjective, relying on visual judgment of, for example, endocardial motion and myocardial thickness. Thus, an adequate objective method has long been desired.

Machine-learning models have been developed to identify and quantify RWMAs [24, 25]. Recently, our group developed an AI model for automated detection of RWMAs in myocardial infarction, using convolutional neural network architectures including ResNet, DenseNet, Inception-ResNet, Inception, and Xception [31,32,33]. We compared the AUCs of these deep learning algorithms for detecting wall motion abnormalities. The largest AUC was obtained with ResNet (AUC: 0.97), and there was no significant difference among the algorithms except for the Xception model (ResNet: AUC 0.97, DenseNet: AUC 0.95, Inception-ResNet: AUC 0.89, Inception: AUC 0.90, and Xception: AUC 0.85; Xception vs. the other algorithms, p < 0.05). For the detection of the presence of RWMAs, the AUC of the deep learning algorithm was similar to that of an experienced cardiologist/sonographer (0.97 vs. 0.95, p = 0.61) and significantly higher than that of resident physicians (0.97 vs. 0.83, p = 0.003) (Fig. 5b). According to our findings, a detection system for RWMAs may be of great value in the clinical setting.
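A comparison like this is commonly set up by giving each candidate backbone the same classification head and the same data, as in the hypothetical sketch below; the input size, weights, and head are our assumptions, not the study's exact configuration [31,32,33].

```python
# Sketch: build each keras.applications backbone with an identical binary
# head ("RWMA present?") so the architectures can be trained and compared
# by AUC on the same data. Configuration details are assumptions.
import tensorflow as tf

def build(backbone_fn):
    base = backbone_fn(include_top=False, weights=None,
                       input_shape=(224, 224, 3), pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC(name="auc")])
    return model

backbones = {
    "ResNet":           tf.keras.applications.ResNet50,
    "DenseNet":         tf.keras.applications.DenseNet121,
    "Inception-ResNet": tf.keras.applications.InceptionResNetV2,
    "Inception":        tf.keras.applications.InceptionV3,
    "Xception":         tf.keras.applications.Xception,
}
models = {name: build(fn) for name, fn in backbones.items()}
# for name, m in models.items(): m.fit(...)  # then compare validation AUCs
```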

Problems in AI

Building models with deep learning requires a large amount of data. To address this problem, the Japan Society of Ultrasonics in Medicine and the Japanese Society of Echocardiography began a joint multicenter study to collect labeled echocardiographic data from 2018 onward. With this dataset, an effective AI model could be developed in the near future. The lack of interpretability of AI solutions is often considered a major disadvantage; overcoming this limitation should be addressed in future studies.

Conclusions

Currently, there are two roles for AI in cardiovascular imaging. One is the automation of tasks usually performed by humans, such as image segmentation and measurement of cardiac structural and functional parameters. The other is the discovery of clinically important insights. Most reported applications have focused on the automation of tasks, and algorithms that can obtain cardiac measurements are also being developed. In the next stage, AI can be expected to expand and enrich existing knowledge. With the continual evolution of technology, cardiologists should become well versed in AI and be able to harness it in the clinical setting. AI can be incorporated into everyday clinical practice and become a valuable aid for many healthcare professionals dealing with cardiovascular diseases.