
REVIEW article

Front. Oncol., 04 October 2023
Sec. Cancer Molecular Targets and Therapeutics

Use and accuracy of decision support systems using artificial intelligence for tumor diseases: a systematic review and meta-analysis

Robert Oehring1, Nikitha Ramasetti1, Sharlyn Ng1, Roland Roller2, Philippe Thomas2, Axel Winter1, Max Maurer1, Simon Moosburner1, Nathanael Raschzok1, Can Kamali1, Johann Pratschke1, Christian Benzing1, Felix Krenzien1,3*
  • 1Department of Surgery, Charité – Universitätsmedizin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
  • 2Speech and Language Technology Lab, German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
  • 3Berlin Institute of Health (BIH), Berlin, Germany

Background: For therapy planning in cancer patients, multidisciplinary team meetings (MDM) are mandatory. Given the high number of cases discussed and the significant workload of clinicians, a Clinical Decision Support System (CDSS) may improve the clinical workflow.

Methods: This review and meta-analysis aims to provide an overview of the systems utilized and evaluate the correlation between a CDSS and MDM.

Results: A total of 31 studies were identified for final analysis. Analysis of different cancers shows a concordance rate (CR) of 72.7% for stage I-II and 73.4% for III-IV. For breast carcinoma, CR for stage I-II was 72.8% and for III-IV 84.1%, P ≤ 0.00001. CR for colorectal carcinoma is 63% for stage I-II and 67% for III-IV, for gastric carcinoma 55% and 45%, and for lung carcinoma 85% and 83%, respectively, all P > 0.05. Analysis of SCLC and NSCLC yields a CR of 94.3% and 82.7%, P = 0.004, and for adenocarcinoma and squamous cell carcinoma in lung cancer a CR of 90% and 86%, P = 0.02.

Conclusion: CDSS has already been implemented in clinical practice, and while the findings suggest that its use is feasible for some cancers, further research is needed to fully evaluate its effectiveness.

1 Introduction

Cancer is one of the leading causes of death worldwide (1). In 2020, 10 million people worldwide died from cancer (2). Interdisciplinary tumor boards or multidisciplinary team meetings (MDMs) are the backbone in treatment planning for patients with tumor disease (3). MDMs are usually held on a weekly basis, with the goal of finding the best treatment based on current guidelines and medical evidence. Indeed, medical guidelines strongly recommend discussing patients in MDMs prior to the actual treatment (4).

The goal of MDMs is to weigh potential treatment options based on available patient data and radiological exams. A complete set of the required patient data including performance status, tumor stage and co-morbidities is required for effective decision-making (5). In most countries, data are currently entered manually into simple online forms such as the Giessen Tumor Documentation System (GTDS) in preparation for MDMs (6). Administrative and procedural difficulties in retrieving patient information are not uncommon, usually due to missing pathology and radiology results or incomplete information on referral forms from other medical institutions (7). Thus, missing data can lead to delays in diagnosis and treatment (8). Moreover, excessive workload and time pressure adversely affect MDMs (9), which can in turn lead to unstructured case discussions and variability in the quality of decision-making.

To overcome the current problems in conventional MDMs, automated processes and decision support systems might help. There is increasing research on artificial intelligence (AI) and machine learning (ML) techniques applied in MDM (Figure 1). AI is currently viewed as a branch of engineering that implements novel concepts and solutions to resolve complex challenges. With rapid advancements in technology, computers may someday be as intelligent as humans (10). Today, the natural language processing (NLP) model ChatGPT can hold conversations and produce meaningful text, such as e-mails or essays, when given prompts in a dialogue format (11). In medicine, AI can be divided into two main branches: virtual and physical (10). ML is an area of AI that aims to process large amounts of qualitative information to identify patterns of relevant information.

Figure 1 Possible workflow of AI supporting MDMs. An automated program using artificial intelligence (machine learning, natural language processing) runs in the background of the hospital information system and can extract relevant data for MDM from the system. Afterwards, the tumor board protocol can be automatically prepared and filled out with all relevant patient data in preparation for the MDM. At the same time, the program could provide treatment suggestions based on the available data and support these with existing guidelines or studies. Based on this, the physicians in the MDM can then make the therapy decision. In the end, both physicians and patients could benefit. Created with BioRender.com.

The objective of this review is to provide an overview and systematic analysis of the current usage and accuracy of AI-based decision support systems in MDM. Specifically, the review will focus on studies that evaluate the consistency between AI-based decision support systems and MDM decisions.

2 Methods

This review was conducted according to the PRISMA guidelines for systematic reviews (12) and was registered with the International Prospective Register of Systematic Reviews (PROSPERO ID: 411462).

2.1 Eligibility criteria

The studies considered for this review met the following criteria:

● The studies verified the consistency of AI-based systems in MDM, regardless of cancer type.

● The studies thoroughly compared the consistency of treatment regimens established by AI and MDM, specifically the correspondence between AI decisions and those made by a multidisciplinary team or using established standards such as guidelines.

● Only studies with adult patients aged 18 and above were included.

● The studies were available in full text and written in English.

● Only retrospective and prospective studies were considered.

2.2 Exclusion criteria

· The study does not fulfill the inclusion criteria.

· The article is a systematic review or meta-analysis.

2.3 Literature search methodology

The present review was conducted according to the PRISMA guideline for systematic reviews (Figure 2) (13). The literature search on PubMed (MEDLINE) was carried out until November 2022 using a MeSH keyword search. The search terms were the following: (machine learning) AND (tumor board); (machine learning) AND (multidisciplinary team meetings); (machine learning) AND (multidisciplinary cancer teams); (artificial intelligence) AND (multidisciplinary cancer teams); (artificial intelligence) AND (multidisciplinary team meetings); (artificial intelligence) AND (tumor board); IBM Watson for Oncology; (machine learning) AND (multidisciplinary team); Watson for Oncology; (artificial intelligence) AND (multidisciplinary team); (clinical decision support system) AND (multidisciplinary team meetings); (clinical decision support system) AND (multidisciplinary team); (clinical decision support system) AND (multidisciplinary cancer teams); (clinical decision support system) AND (tumor board).
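As a rough illustration, the fourteen search strings listed above follow a regular pattern: three technology terms combined with four meeting terms, plus the two Watson for Oncology terms. The Python sketch below rebuilds that list. The names `TECH_TERMS`, `MEETING_TERMS`, `STANDALONE_TERMS` and `build_queries` are hypothetical, and the snippet only assembles the strings; it does not query PubMed or apply the 2015 date limit.

```python
from itertools import product

# Term lists copied from the search strategy described in the text.
TECH_TERMS = [
    "machine learning",
    "artificial intelligence",
    "clinical decision support system",
]
MEETING_TERMS = [
    "tumor board",
    "multidisciplinary team meetings",
    "multidisciplinary cancer teams",
    "multidisciplinary team",
]
# Terms that were searched on their own, without an AND combination.
STANDALONE_TERMS = ["IBM Watson for Oncology", "Watson for Oncology"]


def build_queries():
    """Return every (technology) AND (meeting) pair plus the standalone terms."""
    combined = [
        f"({tech}) AND ({meeting})"
        for tech, meeting in product(TECH_TERMS, MEETING_TERMS)
    ]
    return combined + STANDALONE_TERMS


queries = build_queries()
for query in queries:
    print(query)
```

Generating the strings this way makes it easy to check that no combination was skipped when the search is repeated or extended.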

Figure 2 Flow diagram of the study selection process. This figure was designed according to the PRISMA-Statement (13).

For the search terms “Watson for Oncology” and “IBM Watson for Oncology”, the search was limited to literature from 2015 onwards, because commercial use of Watson for Oncology began in 2015 (14). For all other search terms, no time limit was set. In total, 4078 records were identified through database searching. Preliminary screening of titles and abstracts and removal of duplicates (n = 823) yielded 139 articles. The aim was to include studies that focused on CDSS and then review the concordance. However, the very general selection of search terms resulted in a large list of papers that deal with AI in oncology but do not cover any CDSS; in most cases, this was recognizable from the title and abstract. After the initial selection process, the articles were read in full, and care was taken to review both treatment recommendations and concordance between CDSS and MDM. Excluded were articles that compared CDSS to a guideline, articles that investigated how CDSS influences the actions of MDMs, articles that investigated the acceptance of CDSS by physicians or patients, and articles in which CDSS provided a prognosis or could decide on possible inclusion in a trial. Meta-analyses or reviews on the topic were also excluded. Studies were included if they had analyzed the concordance rate between MDM and CDSS. Full-text articles were assessed independently by two researchers (RO, SN); in case of disagreement, an additional independent arbitrator (FK) was consulted for resolution. Finally, 31 articles were included. No separate checks on study quality, such as patient selection or study population, were done.

2.4 Statistical analysis

Review Manager (RevMan) 5.4.1 (The Cochrane Collaboration, 2020) software was used to analyze the extracted data. To make the results easier to interpret, forest plots were generated. The primary objective was to assess the level of agreement between treatment decisions made by Watson for Oncology (WFO) and the multidisciplinary team (MDT) for various cancer types. The data were analyzed dichotomously, and odds ratios (ORs) with corresponding 95% confidence intervals were calculated for each variable (stage, histology type, etc.). Heterogeneity among the studies was evaluated using the I2 test; I2 > 50% indicated considerable heterogeneity, and otherwise heterogeneity was considered absent. P < 0.05 was considered significant. Because not all studies could be included in the meta-analysis due to the unavailability of data, an additional descriptive analysis was performed for these studies.
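To make the quantities in this section concrete, the following Python sketch computes an odds ratio with a 95% confidence interval from a 2×2 table and the I2 statistic from Cochran's Q. It is a minimal illustration under stated assumptions, not a reproduction of the RevMan analysis: it uses the logit (Woolf) standard error and fixed-effect inverse-variance weights, the pooling method used in RevMan is not stated in the text, and the three study tables are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error (Woolf method) for a 2x2 table.

    a, b: concordant / discordant cases in group 1 (e.g. stage I-II)
    c, d: concordant / discordant cases in group 2 (e.g. stage III-IV)
    All four cells must be non-zero in this simple sketch.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a two-sided confidence interval (z=1.96 gives 95%)."""
    log_or, se = log_odds_ratio(a, b, c, d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def i_squared(effects, standard_errors):
    """I2 in percent from Cochran's Q with inverse-variance weights."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical 2x2 tables (a, b, c, d) for three studies; not data from this review.
studies = [(20, 10, 10, 20), (30, 10, 25, 15), (15, 15, 18, 12)]
effects, errors = zip(*(log_odds_ratio(*table) for table in studies))

for table in studies:
    odds_ratio, lower, upper = odds_ratio_ci(*table)
    print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"I2 = {i_squared(effects, errors):.1f}%")
```

An odds ratio whose confidence interval includes 1 corresponds to a non-significant difference at the 5% level, and an I2 above 50% would be read as considerable heterogeneity under the threshold used here.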

3 Results

3.1 Study characteristics

Most of the studies which matched the review criteria used Watson for Oncology (WFO). Twenty-three of the 31 studies were on WFO and concordance (Table 1). Other Clinical Decision Support Systems (CDSSs) included were OncoDoc (15), Lung Cancer Assistant (LCA) (17) and the Multidisciplinary meeting Assistant and Treatment sElector (MATE) (16). Two studies using a decision tree model based on Dutch guidelines were included (41). In addition to the CDSSs mentioned above, there were two prototype decision tree models created by the working groups of Andrew et al. and Lin et al. that conducted a concordance study (18, 42).

Table 1 Overview of studies on decision support systems using artificial intelligence for tumor diseases; n refers to the actual number analyzed.

Across all studies, a total of 16,472 participants were included. The number of included subjects varied greatly between studies. Five studies had a very small number of cases (< 100) (20, 24, 25, 27, 40), while others had a relatively large number of cases (> 1000) (16–18, 36, 43). Three studies examined multiple tumor entities and included only a small number of participants in the subgroups (21, 38, 44).

Thirteen studies of breast cancer were conducted, involving a total of 7786 subjects (15, 16, 18–21, 27, 34, 36, 38, 43–45). The number of participants per study varied widely, ranging from 55 (27) to 1,977 (36). The most common treatment decisions reviewed in this study were for breast cancer, with the MDM and CDSS evaluated in multiple studies. Colorectal cancer was the subject of eight studies (21, 23, 25, 37–39, 41, 44), followed by lung cancer with seven studies (17, 21, 22, 30, 31, 35, 38). Gastric cancer was reviewed in three studies (21, 24, 29), while cervical (21, 32) and prostate cancers (33, 44) were each the focus of two studies. Thyroid cancer was examined in two studies (26, 40), and there was one study each for ovarian cancer (21), basal cell carcinoma (42), and hepatocellular carcinoma (HCC) (28).

Of the analyses evaluating the concordance rate between therapy decisions and CDSS, the majority were retrospective. Only three analyses were prospective (15, 16, 44).

3.2 Clinical decision support systems

As seen in Table 1, there are several AI-based CDSSs used regularly in clinical oncology. The most common is Watson for Oncology (WFO); its use is widespread in the US and in Asia. Other systems such as OncoDoc, LCA or other decision tree-based CDSSs are often prototypes that are used only in a single hospital, region or country. The CDSSs that were reviewed for concordance with treatment decisions are described below.

3.2.1 Watson for oncology

WFO is an AI CDSS developed by IBM Corporation (USA) in cooperation with oncologists from Memorial Sloan Kettering Cancer Center (USA) (46). For supported cases, the treatment recommendations provided by WFO fall into three possible categories: ‘Recommended’, ‘For consideration’ and ‘Not recommended’ (14).

3.2.2 OncoDoc

OncoDoc is a CDSS based on clinical practice guidelines (CPGs) that allows physician discretion in the decision-making process. CPGs are organized in decision trees. Decision parameters are dynamically instantiated by the physicians. It was developed in collaboration with the medical oncology department of the Pitié-Salpêtrière Hospital (France) and has first been applied to the treatment of breast cancer (47).

3.2.3 Lung cancer assistant

LCA is a CDSS prototype designed in the United Kingdom. Probabilistic and guideline rule-based decision support are used to aid clinicians’ decision-making in lung cancer MDMs (17).

3.2.4 Oncoguide

Oncoguide is an open access, interactive decision support software developed in the Netherlands with the help of a multidisciplinary team. The Dutch CPGs for colorectal cancer were converted into decision trees and then validated with patient data. Supporting information from the CPGs, such as scientific evidence for specific treatment decisions, is presented with the recommendations (41, 44).

3.2.5 MATE

MATE (Multidisciplinary meeting Assistant and Treatment sElector) is a CDSS developed in the United Kingdom and used in breast cancer MDMs. It requires manual input of patient data by a physician, assesses patient eligibility for clinical trials and presents ranked recommendations together with supporting evidence (16).

3.3 Results of meta-analysis and concordance rate

First, we conducted an overall meta-analysis of patients with different cancer stages (see Figure 3). In studies concerning WFO, treatment was deemed concordant if it was categorized as ‘Recommended’ or ‘For consideration’. A total of 18 studies were included in the analysis. The results showed a concordance rate of 72.7% (1992/2739) for stages I-II and 73.4% (2289/3117) for stages III-IV across various carcinomas, although this difference was not statistically significant (P=0.18). However, the meta-analysis revealed significant statistical heterogeneity (I2 = 88%) across different cancer stages. As a result, we conducted a subgroup meta-analysis to examine specific cancer types and stages. In the case of breast cancer, five studies were included in the analysis (see Figure 4), revealing a concordance rate of 72.8% (1209/1661) for stages I-II and 84.1% (557/662) for stages III-IV, P≤ 0.00001.
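The pooled percentages in this paragraph follow directly from the reported counts. As a quick arithmetic check, the Python sketch below recomputes them from the concordant/total pairs given in the text; the dictionary layout and the function name `concordance_rate` are illustrative choices.

```python
# (concordant cases, total cases) as reported in the text for each group.
POOLED_COUNTS = {
    ("all cancers", "I-II"): (1992, 2739),
    ("all cancers", "III-IV"): (2289, 3117),
    ("breast cancer", "I-II"): (1209, 1661),
    ("breast cancer", "III-IV"): (557, 662),
}


def concordance_rate(concordant, total):
    """Concordance rate in percent: concordant decisions / all decisions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * concordant / total


for (cancer, stage), (concordant, total) in POOLED_COUNTS.items():
    rate = concordance_rate(concordant, total)
    print(f"{cancer}, stage {stage}: {rate:.1f}% ({concordant}/{total})")
```

Note that these are crude pooled rates; the P values quoted in the text come from the meta-analysis and cannot be recovered from the pooled counts alone.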

Figure 3 Overall concordance of various cancers in stages I–II and III-IV.

Figure 4 Overall concordance in breast cancer in stages I-II and III-IV.

The concordance rates for different cancer types and stages were as follows: for colorectal carcinoma (Figure 5), 63% (245/392) for stages I-II and 67% (669/993) for stages III-IV; for gastric carcinoma (Figure 6), 55% (33/60) for stages I-II and 45% (127/282) for stages III-IV; for cervical cancer (Figure 7), 73% (105/144) for stages I-II and 68% (88/130) for stages III-IV; and for lung carcinoma (Figure 8), 85% (137/162) for stages I-II and 83% (494/593) for stages III-IV. However, none of these differences were statistically significant (P > 0.05). In addition, we analyzed different types of lung cancer, including SCLC and NSCLC, in three studies (Figure 9). The results showed a concordance of 94.3% (134/142) for SCLC and 82.7% (416/503) for NSCLC, with a statistically significant difference (P = 0.004). Analysis of histopathology subtypes in lung cancer revealed a concordance rate of 90% (450/495) for adenocarcinoma and 86% (230/266) for squamous cell carcinoma (Figure 10), with a statistically significant difference (P = 0.02). For ECOG 0-1, the concordance rate was 66.6% (330/495), while for ECOG 2-5 it was 58% (69/120) (Figure 11), although this difference was not statistically significant (P = 0.23).

Figure 5 Overall concordance in colorectal cancer in stages I-II and III-IV.

Figure 6 Overall concordance in gastric cancer in stages I-II and III-IV.

Figure 7 Overall concordance in cervical cancer in stages I-II and III-IV.

Figure 8 Overall concordance in lung cancer in stages I-II and III-IV.

Figure 9 Overall concordance in different lung cancer types for SCLC and NSCLC.

Figure 10 Overall concordance in NSCLC for histopathology type for adenocarcinoma and squamous cell carcinoma.

Figure 11 Overall concordance for ECOG 0-1 and 2-5.

Breast cancer has been analyzed by various CDSSs, showing generally high concordance. In the study of Somashekhar et al., the overall concordance rate between WFO and MDM was approximately 93%, with 62% at the ‘Recommended’ level and 31% at the ‘For consideration’ level (19). Across the different stages, the concordance was above 80% (19), as it was in the study of Zhou et al. (21). The other CDSSs also showed high concordance rates of 93.4%, 93.2% and 85.3% using OncoDoc2, MATE and the clinical decision tree system based on Oncoguide, respectively (15, 16, 44). McNamara et al. conducted a study to analyze the concordance of WFO with decisions made by oncology experts and its impact on decisions made by newcomers to oncology. In breast cancer, the overall concordance rate among experts was found to be 87.9%. Novice oncologists had a concordance rate of 75.5% without the use of WFO, which improved to 95.3% with WFO (27).

In a study by Zhao et al., concordance rates between MDM and WFO were found to be only 77% for the adjuvant treatment group and 27.5% for the metastatic group (34). Xu et al. conducted an interesting study on the influence of WFO on treatment decisions, which showed that treatment decisions changed in only 5% of cases after reviewing WFO recommended treatment options for patients (36). However, there were also studies on breast cancer with low concordance rates, such as a study by Suwanvecho et al., which found a concordance rate of only 59.3% (38). In a study by Pan et al., the overall concordance rate was only 69.4%. Interestingly, the concordance rate was worse in the adjuvant chemotherapy group, whereas in the neoadjuvant chemotherapy group, the overall concordance rate was 96.7% (43).

Studies evaluating the use of WFO in patients with colorectal carcinoma have shown highly variable results. Some studies, including Zhou et al., Lee et al., and Mao et al., reported low overall concordance rates of 64%, 48.9%, and 66.9%, respectively, for colorectal cancer (21, 23, 37). However, other studies, such as Kim et al. and Aikemu et al., reported good agreement rates with overall concordance of 87% and 91%, respectively, for colorectal cancer (25, 39). Additionally, two studies that did not use WFO as a clinical decision support system also reported good overall concordance rates above 80% (41, 44).

Several studies have been conducted on lung cancer and WFO. Kim et al. achieved a high concordance rate of 92.4% (35). Zhou et al. showed an overall concordance rate of 83%, with 92% for SCLC and 80% for NSCLC (21). In contrast, Liu et al. reported an overall concordance rate of only 65.8%, with 83% for SCLC but only 61.1% for NSCLC (22). Two of the studies discussed in this paper were conducted for NSCLC only: You et al. recorded a high overall concordance rate of 85.16% compared to the other studies, and Yao et al. achieved 73.3%, which was higher than in the work of Liu et al. (30, 31). Sesen et al. used the LCA system, in which the guideline rule-based decision support of LCA achieved an exact concordance rate of 0.57 with the recorded treatments. For the probabilistic LCA decision aid, the result was worse, with 0.27 and 0.76 for the exact and partial concordance rates, respectively. In this study, MDM was not performed, but patient treatment from the English National Lung Cancer Audit Database was compared with the LCA decision (17).

The overall concordance rate for gastric cancer was low at 54.5% by Tian et al. (32). In a study by Choi et al., concordance at the recommended level was also low at 41.5%, but higher at the recommendation level at 87.5%. For various stages and low ECOG scores, consensus was also low (24).

Two cervical cancer studies were found for this review. In both studies, overall agreement was below 75% with 64% and 72.8%, respectively (21, 32).

Yu et al. showed an overall concordance of 73.6% for prostate cancer. Looking at the different stages, there was a higher concordance for lower stages (33). Ebben et al. showed a similar overall concordance (78.8%) in their study, but using a different CDSS (44).

For thyroid cancer, the results are diverse. The study of Yun et al. showed an overall concordance of only 48% (40), in contrast to the 77% overall concordance shown in the study by Kim et al. (26).

For ovarian cancer Zhou et al. showed a concordance rate above 90% overall and for stages as well (21).

Andrew et al. studied a machine-learning algorithm to predict multidisciplinary team treatment recommendations in the management of basal cell carcinoma (42). They stated that the choice of conventional treatment (surgical excision or radiotherapy) by the MDT could be reliably predicted based on the patient’s age, tumor phenotype and lesion size. The algorithm reliably predicted the MDT decision outcome for 45.1% of nasal basal cell carcinomas (42).

Zhang et al. conducted a study on hepatocellular carcinoma (HCC), where only surgically treated patients were included. The study aimed to compare the concordance between the decision made by WFO and the decision made by surgeons regarding the need for surgery, without comparing with MDM. The overall concordance rate was found to be 72%. In subgroup analyses, concordance varied from 66% for major hepatectomy to 88% for BCLC stage 0-A, indicating a higher level of agreement in less complex cases (28) (Table 2).

Table 2 Concordance rate between AI system and MDM; Not every subgroup analysis has been included in the table.

4 Discussion

The objective of this review was to provide an overview and systematic analysis of the current research landscape, usage and accuracy of AI-based decision support systems in MDM. AI-based CDSS and MDM decisions have been evaluated according to consistency.

4.1 Limitation and disadvantages

While conducting this review on concordance, we found that many studies from Asia focused on the WFO system. WFO was originally based on vast cancer treatment experience in North America and the National Comprehensive Cancer Network guidelines (14). Therefore, it is not surprising that there have been numerous studies on its concordance in other countries, and this origin could affect the results and match rates. Treatment recommendations for different types of cancer can differ significantly between countries, for example gastric cancer treatment in the US and Chinese populations (48). As another example, WFO recommends three immunotherapies for metastatic NSCLC, namely pembrolizumab, nivolumab, and atezolizumab, which are not yet approved by the China Food and Drug Administration (CFDA) (21). Although WFO does not require all information, studies have shown that entering more data into the system could increase the concordance rate (20). It therefore also seems important to collect as much data as possible from the population in which the system is used.

When considering other CDSSs used, the studies available for analysis are limited, making it more challenging to draw conclusions about the consistency of treatment decisions compared to MDM.

Another significant issue is the variability in the definition of concordance. The overall WFO concordance rate is often reported as the sum of ‘Recommended’ and ‘For consideration’, with these two categories sometimes being reported separately. It is crucial to carefully examine how the overall match is evaluated, as a high overall agreement may not always reflect a high share of ‘Recommended’ matches but may largely consist of ‘For consideration’ matches.
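The effect of the chosen definition can be shown with a small Python sketch. It assumes each case is labeled with the WFO category under which the MDM's chosen treatment appeared, and compares a strict definition (only ‘Recommended’ counts as concordant) with a broad one (‘Recommended’ or ‘For consideration’). The cohort of 100 cases is hypothetical and merely mirrors the 62%/31% split reported by Somashekhar et al. (19); the names `STRICT`, `BROAD` and `concordance` are illustrative.

```python
from collections import Counter

# Two possible definitions of a concordant case.
STRICT = {"Recommended"}
BROAD = {"Recommended", "For consideration"}


def concordance(categories, accepted):
    """Percentage of cases whose WFO category is in the accepted set."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cases given")
    return 100.0 * sum(counts[label] for label in accepted) / total


# Hypothetical cohort of 100 cases (illustrative only, not study data).
cases = (
    ["Recommended"] * 62
    + ["For consideration"] * 31
    + ["Not recommended"] * 7
)

print(f"strict: {concordance(cases, STRICT):.1f}%")
print(f"broad:  {concordance(cases, BROAD):.1f}%")
```

On this cohort, the same data give 62% under the strict definition and 93% under the broad one, which is why the definition needs to be checked before rates from different studies are compared.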

When treating cancer patients, an MDM is an integral part of treatment planning and approach (3, 49). Studies have already shown that oncology patients benefit from a multidisciplinary approach to health care (50–52). Therefore, discussion in an MDM should be considered fundamental in treatment decision-making. Consequently, the decisions of the CDSS should only be compared with the decisions of the MDM. In some studies, however, the CDSS was compared only with the actual treatment of patients (26–28, 30, 36, 38, 43). In some studies, the CDSS decision was even compared only to national guidance (17, 40). Moreover, the decisions of an MDM or actual treatment are not always consistent with the guidelines (53).

4.2 Concordance analyses

This review highlights a range of different tumor types with particular focus on breast, lung and colorectal cancer. These cancers are among the most frequently diagnosed worldwide (2), so it is understandable that more studies have been conducted on these types. The number of studies conducted for each cancer type allows for reasonable conclusions to be drawn about the agreement rate between the CDSS and MDM. However, for other tumor types, such as HCC, thyroid, prostate, cervical, ovarian and basal cell cancer, only a few studies have been conducted, making it difficult to draw definitive conclusions. Among the various tumor types, breast cancer studies are the most consistent, with high agreement rates observed across different CDSSs. These studies also tend to involve larger sample sizes, with most studies including more than 1000 patients compared to studies on other tumor types (16–18, 36, 43).

The review demonstrates a wide range of concordance rates across different studies, with some studies showing rates above 90% (19, 35, 39), while others are below 60% (29, 40). Therefore, it is crucial to only use a CDSS in clinical practice when there is a high concordance rate to ensure high confidence in decision-making. Breast cancer studies have shown the highest overall concordance rate, exceeding 90% in some studies (15, 16, 19), but still show a wide range, with some reported concordance rates below 60% (38). The concordance rates for gastric, thyroid and basal cell cancer are consistently the lowest. Regarding the agreement rates for individual stages, no general statement can be made, as they vary between studies and tumor types. The meta-analysis for different carcinomas showed no significant difference between stage I-II and stage III-IV (Figure 3). For breast cancer, however, there was a significant difference, with a higher concordance rate at advanced stages. For colorectal carcinoma, the studies that performed a staging analysis also showed low concordance rates. Thus, it is important to note that some studies showed a high overall concordance rate when no differential stage analysis was performed.

However, a lower ECOG score seems to be associated with a higher concordance in the results. Furthermore, in studies comparing the treatment recommendation for NSCLC and SCLC, SCLC shows a higher concordance rate than NSCLC. In NSCLC, adenocarcinoma has a higher concordance rate than squamous cell carcinoma.

4.3 Comparison to work done in this field

Jie et al. (14) published a meta-analysis on the application of WFO in 2021. However, only studies on WFO were included (n = 9). Since then, multiple studies on AI in MDM have been published. The main purpose of the review by Jie et al. was to analyze the concordance rate between MDM and CDSS, which was similar to our review. In comparison to our study, only WFO was analyzed and fewer studies were included. One important difference was the concordance rate between the stages. The study by Jie et al. showed a higher agreement for lower stages, but without statistical significance, and a slightly higher overall agreement in comparison to our study. In our study, there was no significant difference in this regard, except for breast cancer. However, the subdivision was different: in contrast to us, Jie et al. subdivided into stages I-III and IV. Gastric cancer also showed the lowest agreement rates in Jie et al. A low ECOG score also seemed to be associated with a higher agreement rate in Jie et al. They also showed a higher consistency for SCLC compared to NSCLC, which was similar to our study.

4.4 Future perspective

In the near future, CDSS could be used in daily clinical routine. However, it is necessary to train the various systems based on large patient data sets. Moreover, verification of the accuracy of these data must take place on large patient collectives. The highest medical evidence is desirable and can be reached by conducting multicenter studies. This is certainly a major obstacle, since many hospitals use their own hospital information systems, making it more difficult to develop systems that can be used between different hospitals.

Should these systems prove to be highly accurate, the use of CDSS in MDM could save time and improve quality. However, complete decision-making power should not yet be granted to a CDSS, given the importance and complexity of the decisions made during MDMs. It is conceivable, though, that the CDSS makes decision proposals that the medical staff only has to approve. Furthermore, the system should also recognize and flag complex or individual cases and provide the latest scientific studies for them. Lastly, the automatic preparation of MDM cases is also a conceivable support for the medical staff.

5 Conclusion

This review and meta-analysis provides a basic overview of previous work in the field of AI and MDM. In particular, the concordance rate between CDSSs and MDM was assessed and compared. WFO is certainly the most widely used system, especially in the USA and Asia. Consequently, most of the currently available studies and data concern this system. The use of WFO already allows some conclusions to be drawn, although the results are very heterogeneous. Some tumors show higher concordance rates than others. For instance, breast and lung cancer exhibit higher concordance rates than gastric cancer when using CDSS, yet WFO does not appear to be utilized in Europe. However, promising alternatives such as OncoDoc2 and Oncoguide exist in this region. AI holds the potential to revolutionize hospital workflows and enhance diagnostics and therapies for patients. However, to fully realize these benefits, it is crucial to conduct further studies on the concordance between CDSS and MDM decisions.

This systematic review provides a comprehensive overview of the current state of research and indicates that the use of CDSS in clinical practice is feasible, but additional research is required to fully evaluate its potential impact.

Author contributions

RO, SN and FK developed the hypothesis, constructed the search algorithm, and systematically performed the literature search. RO wrote the manuscript. FK and SN critically revised the manuscript and interpreted the data. FK edited the revision of the manuscript. All the authors read and approved the final manuscript.

Funding

Funding was received from the Joint Federal Committee Innovation Fund (grant number 01VSF21047). We acknowledge financial support from the Open Access Publication Fund of Charité – Universitätsmedizin Berlin and the German Research Foundation (DFG).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Keywords: artificial intelligence, multidisciplinary team meetings, clinical decision support system, machine learning, concordance between CDSS and MDM

Citation: Oehring R, Ramasetti N, Ng S, Roller R, Thomas P, Winter A, Maurer M, Moosburner S, Raschzok N, Kamali C, Pratschke J, Benzing C and Krenzien F (2023) Use and accuracy of decision support systems using artificial intelligence for tumor diseases: a systematic review and meta-analysis. Front. Oncol. 13:1224347. doi: 10.3389/fonc.2023.1224347

Received: 17 May 2023; Accepted: 11 September 2023;
Published: 04 October 2023.

Edited by:

Gennaro Daniele, Agostino Gemelli University Polyclinic (IRCCS), Italy

Reviewed by:

Francesco Bianco, University of Illinois Chicago, United States
Tayana Soukup, Imperial College London, United Kingdom

Copyright © 2023 Oehring, Ramasetti, Ng, Roller, Thomas, Winter, Maurer, Moosburner, Raschzok, Kamali, Pratschke, Benzing and Krenzien. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Felix Krenzien, felix.krenzien@charite.de
