
Detecting hand joint ankylosis and subluxation in radiographic images using deep learning: A step in the development of an automatic radiographic scoring system for joint destruction

  • Keisuke Izumi ,

    Contributed equally to this work with: Keisuke Izumi, Kanata Suzuki

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing

    izz@keio.jp

    Affiliations Department of Internal Medicine, Division of Rheumatology, Keio University School of Medicine, Tokyo, Japan, Medical AI Center, Keio University School of Medicine, Tokyo, Japan, Division of Rheumatology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan

  • Kanata Suzuki ,

    Contributed equally to this work with: Keisuke Izumi, Kanata Suzuki

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Writing – original draft, Writing – review & editing

    Affiliations Medical AI Center, Keio University School of Medicine, Tokyo, Japan, Fujitsu Limited, Kanagawa, Japan

  • Masahiro Hashimoto,

    Roles Data curation, Investigation, Methodology, Writing – review & editing

    Affiliations Medical AI Center, Keio University School of Medicine, Tokyo, Japan, Department of Radiology, Keio University School of Medicine, Tokyo, Japan

  • Toshio Endoh,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Fujitsu Limited, Kanagawa, Japan

  • Kentaro Doi,

    Roles Investigation, Methodology, Project administration, Writing – review & editing

    Affiliation Fujitsu Limited, Kanagawa, Japan

  • Yuki Iwai,

    Roles Investigation, Methodology, Project administration, Writing – review & editing

    Affiliation Fujitsu Limited, Kanagawa, Japan

  • Masahiro Jinzaki,

    Roles Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Medical AI Center, Keio University School of Medicine, Tokyo, Japan, Department of Radiology, Keio University School of Medicine, Tokyo, Japan

  • Shigeru Ko,

    Roles Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Medical AI Center, Keio University School of Medicine, Tokyo, Japan, Department of Systems Medicine, Keio University School of Medicine, Tokyo, Japan

  • Tsutomu Takeuchi,

    Roles Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Department of Internal Medicine, Division of Rheumatology, Keio University School of Medicine, Tokyo, Japan, Medical AI Center, Keio University School of Medicine, Tokyo, Japan

  • Yuko Kaneko

    Roles Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Internal Medicine, Division of Rheumatology, Keio University School of Medicine, Tokyo, Japan

Abstract

We propose a wrist joint subluxation/ankylosis classification model as part of an automatic radiographic scoring system for X-ray images. In managing rheumatoid arthritis, the evaluation of joint destruction is important. The modified total Sharp score (mTSS), conventionally used to evaluate joint destruction of the hands and feet, should ideally be automated because the time required depends on the skill of the evaluator and there is variability between evaluators. Since joint subluxation and ankylosis are given large scores in mTSS, we aimed to estimate subluxation and ankylosis using a deep neural network as a first step in developing an automatic radiographic scoring system for joint destruction. We randomly extracted 216 hand X-ray images from an electronic medical record system for the learning experiments. These images were acquired from patients who visited the rheumatology department of Keio University Hospital in 2015. Using our newly developed annotation tool, well-trained rheumatologists and radiologists assigned mTSS labels to the wrist, metacarpal phalangeal joints, and proximal interphalangeal joints in the images. We identified 21 X-ray images containing one or more subluxated joints and 42 X-ray images with ankylosis. To predict subluxation/ankylosis, we conducted five-fold cross-validation with deep neural network models: AlexNet, ResNet, DenseNet, and Vision Transformer. The best performance on wrist subluxation/ankylosis classification was as follows: accuracy, precision, recall, F1 value, and AUC were 0.97±0.01/0.89±0.04, 0.92±0.12/0.77±0.15, 0.77±0.16/0.71±0.13, 0.82±0.11/0.72±0.09, and 0.92±0.08/0.85±0.07, respectively. Although the classification model based on a deep neural network was trained with a relatively small dataset, it showed good accuracy. In conclusion, we provide data collection and model training schemes for mTSS prediction, an important step toward building an automated scoring system.

Introduction

Rheumatoid arthritis (RA) is an inflammatory disease in which the joints are destroyed as the disease progresses. RA is the most common rheumatic disease, and joint destruction affects patients' daily activities and quality of life, so the evaluation of joint destruction is important in RA practice. One of the characteristic tasks in assessing RA is estimating the van der Heijde-modified total Sharp score (mTSS) [1] from X-ray images. Clinical studies widely use mTSS as a quantitative evaluation of joint destruction. However, mTSS has some challenges: the evaluation requires skill, time, and effort, and the scores given by different evaluators are not consistent. Consequently, mTSS is not used in daily practice because it is cumbersome, and an automatic radiographic scoring system is required.

Image recognition technology using deep neural networks (DNNs) has exceeded human performance in various tasks, such as classification [2], object detection [3], and object segmentation [4]. A DNN can automatically extract image features in its middle layers, which enables it to handle high-dimensional images without complicated preprocessing. In the medical field, DNNs are beginning to be applied to the discrimination of diabetic retinopathy, skin cancer, gastrointestinal cancer, and other conditions [5–9], and are expected to become an alternative to diagnostic work.

DNNs have also been used in several studies of rheumatic diseases. Burlina et al. predicted myositis from ultrasound images of muscles [10]. Lin et al. predicted methotrexate liver damage from electronic medical records [11]. Reddy et al. predicted readmission within 30 days of discharge for systemic lupus erythematosus [12]. As in other medical fields, the use of data from various domains, such as ultrasound images and medical records, is increasing. This study applies DNNs to the diagnosis of subluxation and ankylosis from X-ray images and verifies their effectiveness.

A previous study predicted mTSS end-to-end by combining two DNNs, one that detects joint areas and one that predicts their scores [13]. That study used a single X-ray image and a procedure different from the standard mTSS derivation, and its reported accuracy was low. We aim to predict joint subluxation and ankylosis, which form part of the mTSS procedure, and to achieve high accuracy within this limited scope as a first step in developing an automatic radiographic scoring system for bone destruction. Subluxation/ankylosis prediction is important for improving the accuracy of mTSS because subluxation and ankylosis are given large scores in mTSS.

In this study, we trained DNNs end-to-end to predict wrist subluxation and ankylosis on X-ray images. We developed a dedicated annotation tool for RA to efficiently collect the data needed to train the models. Our work is the first step in developing an automatic radiographic scoring system for bone destruction in X-ray images using deep learning, and the proposed model contributes toward improving mTSS accuracy. Finally, we discuss the model behavior by visualizing the parts of the input that contribute to the prediction results of the trained model.

Method

Patients and dataset

We randomly extracted 216 hand X-ray images from the electronic medical record system of Keio University Hospital; the images were acquired from patients who visited the hospital's rheumatology department in 2015. Each X-ray image included a pair of hands. Patients with RA who met the 1987 ACR classification criteria or the 2010 EULAR/ACR classification criteria were selected. The study protocol was approved by the Ethics Committee at Keio University School of Medicine (No. 20160316), and written informed consent was waived because of the retrospective study design.

We annotated subluxation/ankylosis in the proximal interphalangeal (PIP), metacarpal phalangeal (MP), and wrist joints in the extracted X-ray images using our newly developed image labeling software (annotation tool), based on the agreement between a well-trained rheumatologist and a radiologist. After annotation, 21 X-ray images containing one or more subluxated joints and 42 X-ray images with ankylosis were identified and used to train the DNNs.

Annotation tool

In this study, we developed a dedicated annotation tool for mTSS [14]. Fig 1 shows an overview of the annotation tool. Since mTSS requires scoring by comparing images from two different time points, the tool always displays two images. A window is displayed for each image, and the display can be enlarged, reduced, moved, and reset (Fig 1a). X-ray images were saved in DICOM format and displayed at 2010 × 1670 pixels with 1024 gradations. The image resolution was chosen in consultation with rheumatologists to ensure that it was sufficient for scoring mTSS.

Fig 1. Developed annotation tool for automatic radiographic scoring system.

Windows (a) are the image windows that show X-ray images to be annotated, and windows (b) to (d) are the annotation input windows. Since our tool works on the electronic medical record system, the annotators annotated the images during their spare time.

https://doi.org/10.1371/journal.pone.0281088.g001

The annotators input scores on a dedicated mTSS screen. Since the X-ray images and input screens are displayed in separate windows, the annotators can choose a convenient layout to work with. The input window has a simple hand diagram that shows the current input location (Fig 1b), a finding selection screen (Fig 1c), and an input image selection screen (Fig 1d). The scoring order of the tool was set according to the mTSS rules. The annotators could use either the mouse or the keyboard to annotate.

Annotating medical images is usually very laborious, and healthcare professionals are often too busy to do much annotation during off-hours. However, since the amount of training data affects the performance of DNNs, we developed a system to make annotation work efficient. Our tool was designed to operate on the electronic medical record system, so the annotators were able to annotate in their spare time and we could efficiently collect data for DNN training. Although there are DNN techniques for small datasets, such as semi-supervised and unsupervised learning [15–17], annotation tools built into the clinical workflow are effective for continuously improving model performance. We plan to release our annotation tool in the future.

Model

We conducted learning experiments with DNN models for image classification: AlexNet [2], ResNet [18], DenseNet [19], and Vision Transformer (ViT) [20]. We identified a suitable model for RA X-ray image classification by comparing multiple models with different network structures. We briefly describe the four models used in our experiments. AlexNet is composed of multiple convolutional layers and is a pioneering model that improved the performance of image recognition tasks. ResNet uses shortcut connections to learn residual functions that reference the inputs of layers, rather than learning only the optimal outputs of layers; this makes it possible to train much deeper networks than ordinary DNNs. DenseNet uses "Dense blocks," in which all sub-blocks are densely skip-connected, as its main components; while keeping the basic idea of ResNet, it scales to a large multi-layer model by increasing the number of residual connections. ViT is based on the Transformer [21], which has attracted attention in the natural language processing field; it does not use convolutional layers and instead treats image patches like words, thereby achieving high performance at a low computational cost.
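
To make the model comparison concrete, the following is a minimal sketch, not the authors' code, of instantiating the four architectures with a two-class output head in PyTorch/torchvision. The specific variants (ResNet-50, DenseNet-121, ViT-B/16) are assumptions, as the paper does not state the exact depths used.

```python
import torch.nn as nn
from torchvision import models

def build_model(name: str, num_classes: int = 2) -> nn.Module:
    """Build one of the four compared architectures with a 2-way head."""
    if name == "alexnet":
        m = models.alexnet(weights=None)  # no ImageNet pre-training, per the paper
        m.classifier[6] = nn.Linear(m.classifier[6].in_features, num_classes)
    elif name == "resnet":
        m = models.resnet50(weights=None)  # depth is an assumption
        m.fc = nn.Linear(m.fc.in_features, num_classes)
    elif name == "densenet":
        m = models.densenet121(weights=None)  # variant is an assumption
        m.classifier = nn.Linear(m.classifier.in_features, num_classes)
    elif name == "vit":
        m = models.vit_b_16(weights=None)  # variant is an assumption
        m.heads.head = nn.Linear(m.heads.head.in_features, num_classes)
    else:
        raise ValueError(f"unknown model: {name}")
    return m
```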

Fig 2 shows an overview of the entire training process. In the experiment, the number of outputs of the fully connected (FC) layer at the end of each model was changed to two to indicate the presence or absence of subluxation/ankylosis findings. The model was trained on input images x separately for each binary classification task (ankylosis and subluxation). For the loss function L, we used softmax cross-entropy:

L = -\sum_{t} \sum_{c} \hat{y}_{t,c} \log y_{t,c} \quad (1)

where t is the data index, c is the class index, y is the class probability vector produced by the softmax function, and \hat{y} is the teaching signal, a one-hot vector. During model training, the parameters were optimized by minimizing L. The loss function L is common to all models.
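
As a hedged illustration of Eq (1), the per-batch training step could look as follows in PyTorch; nn.CrossEntropyLoss combines the softmax and the cross-entropy over one-hot teaching signals, so it implements the loss above.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # softmax cross-entropy, Eq (1)

def train_step(model, optimizer, images, labels):
    # images: (B, 3, 224, 224) float tensor; labels: (B,) int tensor with 0/1
    optimizer.zero_grad()
    logits = model(images)            # (B, 2) raw scores from the 2-way FC head
    loss = criterion(logits, labels)  # averaged over the batch
    loss.backward()                   # backpropagation
    optimizer.step()                  # Adam / AdamW parameter update
    return loss.item()
```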

Fig 2. Overview of preprocessing and training phase of our method.

Annotated images were resized and augmented before being input to the DNN model. We trained two models, one to predict each label: subluxation and ankylosis. In the experiment, we also ran the same procedure with the ResNet in the figure replaced by other DNN models (AlexNet, DenseNet, ViT).

https://doi.org/10.1371/journal.pone.0281088.g002

Training setup

We performed five-fold cross-validation. The images were randomly divided into five datasets (Fold numbers 1 to 5), with images showing subluxation/ankylosis findings allocated equally across the datasets. Each fold had approximately 172 training samples and 43 test samples. To train the DNN models, we used an NVIDIA GeForce RTX 2070 GPU, and each training run took approximately 2–3 hours.
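
The fold construction described above amounts to a stratified split. A minimal sketch (assumed, not the authors' exact code) using scikit-learn:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Binary image-level labels; e.g., 21 of 216 images contain subluxation.
labels = np.zeros(216, dtype=int)
labels[:21] = 1

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(np.zeros((216, 1)), labels), start=1):
    # Positives are spread evenly, giving ~172 training / ~43 test images per fold.
    print(f"Fold {fold}: {len(train_idx)} train, {len(test_idx)} test, "
          f"{labels[test_idx].sum()} positives in test")
```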

We used Adam [22] and AdamW [23] as optimizers in the experiment and compared their performance. Adam is often used for training DNNs, and AdamW is an improved version of Adam. We used the hyperparameters proposed in the original studies for each optimizer [22, 23]. Owing to the difference between natural and medical images, we did not use weight parameters pre-trained on ImageNet [2]. The batch size was set to 64, and all models were trained for 500 epochs. Before being input to the model, the annotated image data were converted to grayscale bitmap format (244 × 244 pixels, 256 gradations). The pixel values were normalized per channel using means [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225]. To increase the robustness of the model, we applied several augmentations: brightness and saturation adjustments, horizontal flipping, and random-position cropping. The final input size was 224 × 224 pixels. During the evaluation phase, images cropped from the center were used as input.
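
A sketch of this preprocessing pipeline under stated assumptions: the grayscale image is replicated to three channels so that the per-channel normalization above applies, and the jitter strengths are illustrative, since the paper does not give them.

```python
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

train_tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # grayscale, replicated to 3 channels
    transforms.Resize((244, 244)),
    transforms.ColorJitter(brightness=0.2, saturation=0.2),  # strengths assumed
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(224),                   # random-position crop to 224 x 224
    transforms.ToTensor(),
    normalize,
])

eval_tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((244, 244)),
    transforms.CenterCrop(224),                   # center crop at evaluation time
    transforms.ToTensor(),
    normalize,
])
```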

Evaluation and analysis

The trained models were evaluated in terms of accuracy, precision (synonymous with positive predictive value), recall (true positive rate, synonymous with sensitivity), specificity (true negative rate), and F1 value (the harmonic mean of precision and recall). We also calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. We conducted five training runs with different random seeds for each experimental setup (DNN model / dataset / optimizer), and the above evaluation was performed for each trial.
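
For reference, the reported metrics can be computed from model outputs roughly as follows (a sketch; y_true and y_prob are NumPy arrays of ground-truth labels and positive-class softmax probabilities):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

def evaluate(y_true, y_prob, threshold=0.5):
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "accuracy":    accuracy_score(y_true, y_pred),
        "precision":   precision_score(y_true, y_pred),  # positive predictive value
        "recall":      recall_score(y_true, y_pred),     # sensitivity
        "specificity": tn / (tn + fp),                   # true negative rate
        "f1":          f1_score(y_true, y_pred),
        "auc":         roc_auc_score(y_true, y_prob),    # threshold-free
    }
```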

In addition, we visualized the activated parts of the trained model using gradient-weighted class activation mapping (Grad-CAM) [24]. Grad-CAM calculates the contribution of regions of the input image from the gradient information of a DNN layer as follows:

\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}} \quad (2)

G^c = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c A^k \right) \quad (3)

where Z is the number of pixels in the feature map, k denotes the channel, and i, j denote the pixel position. The weight α of the feature map A is calculated from the gradient information obtained through backpropagation (Eq 2); the gradients for all classes other than the target output class are set to 0. The contribution map G^c for class c is then obtained as the weighted sum of α and A (Eq 3). Since it is difficult to quantitatively evaluate the basis of a DNN model's judgment, it is important to check whether the model's activations are consistent with the knowledge of doctors. We discuss the trained subluxation/ankylosis classification model using these visualizations in the results section.
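
A hedged sketch of Eqs (2) and (3) using forward/backward hooks in PyTorch; the choice of target layer (e.g., ResNet's layer3, referred to as Residual Block-3 in the paper) is an assumption.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_layer, class_idx):
    """Return the Grad-CAM contribution map G^c for one image (C, H, W)."""
    feats, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

    logits = model(image.unsqueeze(0))     # (1, 2)
    model.zero_grad()
    logits[0, class_idx].backward()        # gradients of the target class only
    h1.remove(); h2.remove()

    A, dA = feats["a"], grads["g"]         # feature maps and their gradients, (1, K, H, W)
    alpha = dA.mean(dim=(2, 3), keepdim=True)            # Eq (2): spatially averaged gradients
    cam = F.relu((alpha * A).sum(dim=1, keepdim=True))   # Eq (3): weighted sum over channels
    cam = F.interpolate(cam, size=image.shape[-2:],      # upsample to input resolution
                        mode="bilinear", align_corners=False)
    return cam.squeeze()
```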

Results and discussion

Subluxation classification

Table 1 shows the classification results for wrist subluxation. As described in the previous section, we used four models (AlexNet, ResNet, DenseNet, and ViT) and two optimizers (Adam and AdamW). The 3rd–6th columns show the performance of the trained models; the higher the number, the better the performance. Focusing on the average AUC, ResNet with the Adam optimizer achieved the best performance. Over the overall dataset, ResNet achieved the following averaged results: accuracy 0.97±0.01, precision 0.92±0.12, recall 0.77±0.16, F1 value 0.82±0.11, and AUC 0.92±0.08. The performance of AlexNet and DenseNet was also competitive, confirming that convolutional models are effective at recognizing RA radiographic images. There was no significant difference between Adam and AdamW, so the choice of optimizer did not appear to affect model performance. ViT performed the worst among the four models, probably because it was too large for the dataset and could not generalize well.

Fig 3 shows the training results of ResNet trained with Dataset 2. Fig 3(a) shows the learning curve; the horizontal axis represents the learning progress, and the vertical axis represents the loss value. The curve shows the mean and standard deviation across the five trials, and it can be confirmed that training converged in all trials. Fig 3(b) shows the ROC curve; the horizontal axis represents the false positive rate, and the vertical axis represents the true positive rate. Although some variation in the performance of the trained models was observed, generally good results were obtained.

Fig 3. Average learning and ROC curves for subluxation classification.

Both results were obtained from the training of ResNet with Adam optimizer and Dataset 2.

https://doi.org/10.1371/journal.pone.0281088.g003

This result can be considered good, given that mTSS scoring depends on the skill of the evaluator. Although images with subluxation findings accounted for only approximately 10% of the training dataset, the model predicted subluxation with greater accuracy than ankylosis (described in the next section). This may be because the hand shape or finger posture of patients with subluxation differs markedly from that of patients without subluxation.

Fig 4 shows typical examples of true positive, false positive, true negative, and false negative results in the classification of subluxation with the trained ResNet. Looking at examples where the DNN model made incorrect decisions (Fig 4d), the model seems to have responded to distal interphalangeal (DIP) joints, which are not subject to mTSS. Another false example (Fig 4b) suggests that even a well-trained rheumatologist might find it difficult to make the right decision. Since human judgments vary from person to person, mTSS judgments were decided based on agreement between doctors. In machine learning, combining the outputs of multiple models usually improves accuracy (ensemble learning; see the sketch below). A system that combines the results of machine learning models with the judgment of doctors has therefore been suggested [9].
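
As a toy illustration of the ensemble idea (not from the paper), averaging the softmax outputs of several independently trained models:

```python
import torch

def ensemble_predict(models, image):
    """Average class probabilities over independently trained models."""
    probs = [torch.softmax(m(image.unsqueeze(0)), dim=1) for m in models]
    return torch.stack(probs).mean(dim=0)  # (1, 2) averaged class probabilities
```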

Fig 4. Examples of X-ray images in subluxation dataset.

The caption of each image indicates the prediction result.

https://doi.org/10.1371/journal.pone.0281088.g004

Ankylosis classification

Table 2 shows the classification results for wrist ankylosis. As in the subluxation classification, ResNet with the Adam optimizer showed the best performance. Over the overall dataset, the averaged results were: accuracy 0.89±0.04, precision 0.77±0.15, recall 0.71±0.13, F1 value 0.72±0.09, and AUC 0.85±0.07. Although the ankylosis classification was less accurate than the subluxation classification, the AUC was 0.85 or higher in both classification tasks. In addition, the learning and ROC curves of ResNet trained with Dataset 2 (Fig 5) showed the same trend as in the subluxation classification. Together with the results of the previous subsection, these results demonstrate the effectiveness of the DNN-based classification method.

Fig 5. Average learning and ROC curves for ankylosis classification.

Both results were obtained from training ResNet with Adam optimizer and Dataset 2.

https://doi.org/10.1371/journal.pone.0281088.g005

Fig 6 shows typical examples of true positive, false positive, true negative, and false negative results in classifying X-ray images with ankylosis findings using the trained ResNet. The results show that the classification accuracy for X-ray images with ankylosis findings (the positive class) was particularly low. This may be because the image area representing an ankylosis finding is only a few tens of pixels square, which is very small relative to the input image. This might be addressed by increasing the resolution of the input image or by introducing an attention mechanism [21] that emphasizes important parts of the image.

Fig 6. Examples of X-ray images in ankylosis dataset.

The caption of each image indicates the prediction result.

https://doi.org/10.1371/journal.pone.0281088.g006

Visualizing contributing parts of input image.

We visualized the parts of the input image that contribute to the output of the ankylosis classification model using Grad-CAM, as described in the previous section. We visualized ResNet, which showed the best performance in the learning experiments. Fig 7a shows the input image, and Fig 7b shows the image with all contributing regions highlighted. The contribution map was obtained from the convolutional layer in Residual Block-3. The model responded strongly to the position of the PIP joint of the middle finger of the right hand, consistent with the subluxation/ankylosis findings of the doctors. Joints with suspected findings (the PIP/DIP joints of the left ring finger and the wrist of the right hand) were also captured correctly. This indicates that the model could correctly recognize the image features that determine mTSS.

Fig 7. Visualized parts contributing to the prediction result using Grad-CAM.

The contribution to the prediction increases from blue through green and yellow to red.

https://doi.org/10.1371/journal.pone.0281088.g007

However, there were some cases where the model focused on areas unrelated to the findings. It was also confirmed that the accuracy of the contribution map decreased from deeper to shallower layers. This may be due to the small amount of training data or variation in the quality of the X-ray images. In the future, it will be necessary to enlarge the image dataset and conduct experiments with high-resolution images.

Limitations

The proposed method has several issues with respect to building an automatic radiographic scoring system for bone destruction. One is that the proposed model estimated the presence or absence of subluxation/ankylosis from the entire hand X-ray image rather than from each joint. Considering the actual mTSS determination process, it is also necessary to estimate findings for each joint of the foot as well as the hand; training such a model appropriately was difficult because the joints of the foot are more complicated than those of the hand and the number of cases is small. In our experiments, we used DNN models commonly applied to general image recognition; in the future, we plan to develop a dedicated model that considers the characteristics of mTSS.

The other limitation concerns the quantity of data samples and the quality of the findings. The dataset used in this study was very small compared to those generally used in DNN experiments, and we need to collect larger datasets with high-quality annotations. Training is expected to take longer as the dataset and model scale up, but this can be mitigated with more powerful computing hardware.

Related work

In Table 3, we summarize related previous studies in which joint destruction in patients with RA was evaluated from X-ray images using artificial intelligence [25–28]. Miyama et al. [25] developed a classification model for joint space narrowing (JSN) and erosion using VGG-16 on a small number of patients; the accuracy of the erosion classification was worse than that of JSN. Ahalya et al. [26] developed a classification model to identify RA from hand X-ray images using GoogLeNet, with only 10 epochs for pre-trained models and 50 epochs for customized CNN models. Wang et al. [27] classified the severity of JSN in the hand using a relatively large number of images and EfficientNet. Üreten et al. [28] classified RA, osteoarthritis (OA), and normal hand radiographs using VGG-16.

Conclusion

In this study, we proposed a DNN model for subluxation/ankylosis classification as the first step toward an automatic radiographic scoring system. We collected X-ray image data by developing a dedicated annotation tool for mTSS. Through learning experiments with several DNNs (AlexNet, ResNet, DenseNet, and ViT), we constructed models capable of detecting subluxation and ankylosis on hand X-ray images from a relatively small number of samples. ResNet showed the best performance in both subluxation and ankylosis classification tasks. In addition, we visualized the parts of the input images contributing to the output of the ankylosis classification model using Grad-CAM; the results indicated that the model could correctly recognize the image features that determine mTSS. In conclusion, we provided data collection and model training schemes for mTSS prediction and made an important contribution toward building an automated scoring system. In the future, we plan to extend this study and our other studies [29, 30] to estimate joint destruction automatically and more accurately.

References

  1. van der Heijde D. How to read radiographs according to the Sharp/van der Heijde method. Journal of Rheumatology, 26(3), pp. 743–745, 1999. pmid:10090194
  2. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
  3. Tan M, Pang R, Le QV. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 10781–10790, 2020.
  4. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. ECCV 2018, LNCS vol. 11211, pp. 833–851. Springer, Cham, 2018.
  5. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA, 316(22), pp. 2402–2410, 2016. pmid:27898976
  6. Haenssle HA, Fink C, Schneiderbauer R, Toberer F, Buhl T, Blum A, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Annals of Oncology, 29, pp. 1836–1842, 2018. pmid:29846502
  7. Horie Y, Yoshio T, Aoyama K, Yoshimizu S, Horiuchi Y, Ishiyama A, et al. Diagnostic outcomes of esophageal cancer by artificial intelligence using convolutional neural networks. Gastrointestinal Endoscopy, 89(1), pp. 25–32, 2019. pmid:30120958
  8. Hirasawa T, Aoyama K, Tanimoto T, Ishihara S, Shichijo S, Ozawa T, et al. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer, 21(4), pp. 653–660, 2018. pmid:29335825
  9. Uchino E, Suzuki K, Sato N, Kojima R, Tamada Y, Hiragi S, et al. Classification of glomerular pathological findings using deep learning and nephrologist–AI collective intelligence approach. International Journal of Medical Informatics, 141, 104231, 2020. pmid:32682317
  10. Burlina P, Billings S, Joshi N, Albayda J. Automated diagnosis of myositis from muscle ultrasound: Exploring the use of machine learning and deep learning methods. PLoS ONE, 12(8), e0184059, 2017. pmid:28854220
  11. Lin C, Karlson EW, Dligach D, Ramirez MP, Miller TA, Mo H, et al. Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. Journal of the American Medical Informatics Association, 22(e1), pp. e151–e161, 2015. pmid:25344930
  12. Reddy BK, Delen D. Predicting hospital readmission for lupus patients: An RNN-LSTM-based deep-learning methodology. Computers in Biology and Medicine, 101, pp. 199–209, 2018.
  13. Hirano T, Nishide M, Nonaka N, Seita J, Ebina K, Sakurada K, et al. Development and validation of a deep-learning model for scoring of radiographic finger joint destruction in rheumatoid arthritis. Rheumatology Advances in Practice, 3(2), 2019. pmid:31872173
  14. Izumi K, Hashimoto M, Suzuki K, Endoh T, Doi K, Iwai Y, et al. Detecting Hand Joint Ankylosis in Radiographic Images Using Deep Learning: A Step in Developing Automatic Radiographic Scoring System for Bone Destruction. Arthritis Rheumatol, 70(suppl 10), 2018.
  15. Miyato T, Maeda S, Koyama M, Nakae K, Ishii S. Distributional Smoothing with Virtual Adversarial Training. In Proceedings of the International Conference on Learning Representations, 2016.
  16. Berthelot D, Carlini N, Goodfellow I, Oliver A, Papernot N, Raffel C. MixMatch: A Holistic Approach to Semi-Supervised Learning. In Advances in Neural Information Processing Systems, 2019.
  17. Kingma DP, Welling M. Auto-Encoding Variational Bayes. In Proceedings of the International Conference on Learning Representations, 2014.
  18. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
  19. Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
  20. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, 2021.
  21. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention Is All You Need. In Advances in Neural Information Processing Systems, pp. 6000–6010, 2017.
  22. Kingma DP, Ba JL. Adam: A Method for Stochastic Optimization. In Proceedings of the International Conference on Learning Representations, pp. 1–13, 2015.
  23. Loshchilov I, Hutter F. Decoupled Weight Decay Regularization. In Proceedings of the International Conference on Learning Representations, 2019.
  24. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626, 2017.
  25. Miyama K, Bise R, Ikemura S, Kai K, Kanahori M, Arisumi S, et al. Deep learning-based automatic-bone-destruction-evaluation system using contextual information from other joints. Arthritis Research & Therapy, 24(1), 227, 2022. pmid:36192761
  26. Ahalya RK, Umapathy S, Krishnan PT, Joseph Raj AN. Automated evaluation of rheumatoid arthritis from hand radiographs using machine learning and deep learning techniques. Proceedings of the Institution of Mechanical Engineers, Part H, 236(8), pp. 1238–1249, 2022. pmid:35822205
  27. Wang HJ, Su CP, Lai CC, Chen WR, Chen C, Ho LY, et al. Deep Learning-Based Computer-Aided Diagnosis of Rheumatoid Arthritis with Hand X-ray Images Conforming to Modified Total Sharp/van der Heijde Score. Biomedicines, 10(6), 1355, 2022. pmid:35740376
  28. Üreten K, Maraş HH. Automated Classification of Rheumatoid Arthritis, Osteoarthritis, and Normal Hand Radiographs with Deep Learning Methods. Journal of Digital Imaging, 35(2), pp. 193–199, 2022. pmid:35018539
  29. Izumi K, Suzuki K, Hashimoto M, Endoh T, Doi K, Iwai Y, et al. Automatic detection of hand joint region, ankylosis and subluxation in radiographic images using deep learning: development of artificial intelligence-based radiographic evaluation system for bone destruction. Annals of the Rheumatic Diseases, 78(suppl 2), pp. 1364–1364, 2019.
  30. Izumi K, Suzuki K, Hashimoto M, Endoh T, Doi K, Iwai Y, et al. Automatic finger joint bone erosion score prediction considering 2-time-point X-rays of patients with rheumatoid arthritis by deep learning. Annals of the Rheumatic Diseases, 79(suppl 1), pp. 1841–1841, 2020.