Introduction

“Artificial intelligence (AI), in general while not well defined, is the capability of a machine to imitate intelligent human behavior” (Mintz and Brodie 2019, 73) or the use of specifically tasked computer software to undertake tasks usually necessitating the intelligence of the human brain (Bærøe et al. 2020). Such task-directed systems may lexically “think” and “act” in a “human” manner and, further, may even “think” and “act” rationally (Stanila 2018).

Broadly, AI integrates large volumes of data, gaining knowledge and experience in problem-solving at a rate and scale impossible for humans. In medicine it is employed to achieve high levels of accuracy in the predictive tasks of diagnosis, prognosis, and therapeutics and, hence, to improve healthcare. As distinct from a simple data repository and administrative (appointment and billing) system, such as an electronic medical record (EMR), AI in medicine (AIM) has the capacity to improve its performance through “auto-learning” in real-world applications (Reddy et al. 2019).

AIM can be physical, such as in robotic surgery, or virtual, relating to digital image manipulation, neural networks, and machine and deep learning (Hamet and Tremblay 2017). Examples include the following:

  1. Digital imaging. AIM is well established for tasks of image processing and interpretation, such as the analysis of radiological images (Zhou et al. 2000), skin lesions (Ercal et al. 1994), or retinal photographs (Gardner et al. 1996).

  2. Artificial neural networks, analogous to human decision-making processes, which employ mathematical and statistical data-modelling processes to deal algorithmically with unmanageably complex problems (Baxt 1995).

  3. Machine learning, whereby computers “learn” from a process of repetitive data examination using predetermined processes to answer a particular question. It is an iterative process that is critically dependent on the integrity of a training data set if it is to generate reliable results or predictions relevant to diagnosis and treatment (Schwarzer et al. 2000). Machine learning can lead to deep learning.

  4. Deep learning, whereby machine learning contemporaneously merges multiple data sets that are iteratively evaluated in sequential “convolutional neural networks.” These operational steps may be invisible to both developers and users (Lakhani and Sundaram 2017; Schirrmeister et al. 2017). A minimal code sketch of such a network follows this list.
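To make item 4 concrete, the following is a minimal, purely illustrative sketch of a small convolutional neural network for image classification, written in PyTorch. The class name TinyLesionClassifier, the single-channel 64x64 input size, and the two diagnostic classes are hypothetical choices for this example and do not describe any system cited above.

```python
# Illustrative sketch only: a tiny convolutional neural network of the kind
# referred to in item 4. Layer sizes and class labels are arbitrary.
import torch
from torch import nn

class TinyLesionClassifier(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        # Two convolutional feature-extraction stages ...
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # ... followed by a fully connected classification head.
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16 * 16, n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# One gradient step on random stand-in data; real training would iterate over a
# large, curated, and representative set of labelled images.
model = TinyLesionClassifier()
images = torch.randn(4, 1, 64, 64)        # batch of four synthetic "images"
labels = torch.randint(0, 2, (4,))        # synthetic benign/malignant labels
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
```

The intermediate representations computed by the convolutional stages are precisely the operational steps that, at scale, may be opaque to developers and users.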

AIM is being implemented in a number of ways, most recognizably in:

  1. assessing the risk of disease onset;

  2. making estimates of treatment success/assessing efficacy;

  3. managing or alleviating complications of treatment;

  4. assisting ongoing patient care; and

  5. clinical research and drug development (Becker 2019).

In all of these instances the “concept of using AI in medicine should be as a decision support system with the final action being from humans” (Mintz and Brodie 2019, 79), alternatively one of “speeding up or aiding human investigation” (Ching et al. 2018, 3). Despite the overtly positive valence evident in the extensive AIM literature, caveats and limitations exist, typically around the integrity of data inputs, from the simplest to the most complex applications (Min et al. 2016). It is evident that human inputs to and control over decision support systems must be “meaningful” if the ethical consequences of augmenting human agency in patient interactions are to be addressed (Braun et al. 2020).

It has been proposed that these developments may result in an embodied form of AIM where “natural language conversational agents” may be capable of passing the “Turing test,” being indistinguishable from a human agent in health encounters, and that this would assist physicians, empower patients, and allow nudging towards positive health behaviours (Laranjo et al. 2018). Yet, the potential for unintended consequences relating to the safety (Ash et al. 2004; Fraser et al. 2018), efficacy (Becker 2019), rights (Stanila 2018), and procedural and distributional justice (Gill 2018; Reddy et al. 2019; Risse 2019) of this modality and other forms of AIM already incorporated into routine practice requires careful assessment (Schönberger 2019). Our trust in AIM must be justifiable and justified (Bærøe et al. 2020). It has been cautioned that AIM algorithms—if uncritically adopted—may “become the repository of the collective medical mind” (Char et al. 2018, 981).

Healthcare comprises the largest area of AI investment since 2016 (Buch et al. 2018) and is characterized as the pre-eminent means of progress in public and individual health (Fogel and Kvedar 2018), with the consequence that medical technology increasingly affects “not only the way doctors encounter and treat patients but also how they [patients] understand their ailments and complaints” (Hoffman et al. 2018, 246).

AIM may fundamentally change the roles of humans working in medical disciplines reliant on pattern recognition skills, which may be drastically reconfigured or rendered potentially obsolete (Fogel and Kvedar 2018; Coiera 2018, 2019), as in the following four examples (examples 2 to 4 also indicate aspects of AI where specific ethical caution is needed):

  1. It is posited that robotic surgery may replace much human surgery by the late 2050s (Fogel and Kvedar 2018).

  2. A patient-facing digital symptom checking programme is claimed to outperform “the average human doctor on a subset of the Royal College of General Practitioners exam” (Fraser et al. 2018). Yet this conclusion was based on a flawed validation process (Goldhahn et al. 2018) and is rejected by many patients and their advocates as hyperbole (Mittelman et al. 2018).

  3. Machine learning appears to outperform psychiatrists in suicide prediction (Passos et al. 2016; Walsh et al. 2017), raising the possibility of ethical justification for increased remote electronic surveillance of digitally connected “e-patients” at risk (Fonseka et al. 2019).

  4. It has been proposed that dermatologists may become obsolete in the diagnosis of skin malignancy, yet it has been established that errors in AI arise from misinterpretation of lesions in persons with darker skin, potentially perpetuating health inequities (Adamson and Smith 2018).

The technologization of medicine is a contemporary positivist metaphor (Salvador 2018) that demands scrutiny since it will affect patients, current practitioners, and students, and all these groups must come to a deep understanding of “the difference between what a machine says and what we must do” (Coiera 2019, 166).

The “AIM argument” has been insufficiently teased out in relation to the soundness of its premises, and these premises require further enquiry to objectively assess how physicians should respond. This paper highlights the identified and unidentified epistemic, ontologic, ethical, legal, and sociopolitical challenges that AIM poses for the contemporary physician and their patients.

Ontologic and Epistemic Issues of AIM

Ontological Differences

The meaning of ontology in AI differs from that in philosophy; rather than interrogating the nature of being, existence, categorization, and objective reality, ontology in AI pertains to the development of “machine-processable semantics of information sources that can be communicated between different agents (software and humans)” (Fensel 2001). AI “ontology” describes a machine-readable, precisely defined, and constrained model of concepts relating to a real-life phenomenon, permitting domains of data to be constructed that “capture” knowledge and are then manipulated algorithmically to permit “knowledge sharing and reuse” (Fensel 2001, 11).
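As a deliberately simplified illustration of this sense of “ontology,” the sketch below encodes a tiny concept hierarchy and one constrained relation in Python. The concept names, the hasFindingSite relation, and the is_a helper are invented for this example and do not reproduce any published ontology of the kind Fensel describes.

```python
# A minimal, hypothetical sketch of a machine-readable "ontology": precisely
# defined concepts, a constrained relation, and a simple inference over them.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Concept:
    name: str
    parent: Optional["Concept"] = None   # subsumption ("is-a") hierarchy

@dataclass(frozen=True)
class Relation:
    name: str
    domain: Concept                      # which concepts the relation may link
    range_: Concept

disease   = Concept("Disease")
melanoma  = Concept("Melanoma", parent=disease)
body_site = Concept("BodySite")
has_site  = Relation("hasFindingSite", domain=disease, range_=body_site)

def is_a(child: Optional[Concept], ancestor: Concept) -> bool:
    """Walk the subsumption hierarchy: the kind of inference ontologies permit."""
    while child is not None:
        if child == ancestor:
            return True
        child = child.parent
    return False

assert is_a(melanoma, disease)           # Melanoma is-a Disease
```

Shared, constrained definitions of this kind are what allow separately developed software agents to exchange and reuse “captured” knowledge.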

“E-patients” are “extended” individuals informed by and responsive to both physical and virtual communities of other “e-people”—relatives and friends who access information on their behalf (Kovachev et al. 2017)—whose decisions can be influenced by the “wisdom” or opinions of unrelated or previously unconnected persons with whom their opinions and beliefs are shared (Colineau and Paris 2010).

AIM can be instantiated as an “expert iDoctor,” being an artificial member of the healthcare team “theoretically capable of replacing the judgment of primary care physicians” (Karches 2018, 91), as exemplified in the following headlines from IEEE Spectrum, which personify technology as an active agent:

Laser Destroys Cancer Cells Circulating in the Blood. The first study of a new treatment in humans demonstrates a non-invasive, harmless cancer killer;

Smart Knife Detects Cancer in Seconds

By excluding mention of human agency, these statements imply autonomous machine function, potentially denigrating human capacities and skills (Karches 2018), and hence the actors in a clinical encounter are the patient, their various influences, the physician, and an instantiated “machine entity” in a therapeutic triad (Swinglehurst et al. 2014).

By de-emphasizing human agency, instantiated AIM raises the question of a new ontological argument supporting the existence of AIM as a “higher being.” The metaphysical non-inferiority of AI was demonstrated in 2016 when an AI programme constructed a valid refutation of Gödel’s ontological argument, thereby demonstrating that “artificial intelligence systems—particularly higher-order automated theorem provers—are capable of assisting in the discovery and elucidation of new and philosophically relevant knowledge” (Benzmüller and Paleo 2016). If machines can rationally disprove the existence of God, perhaps “the singularity” is closer, as machine rationality is—at least—non-inferior to human rationality.

Epistemological Differences

The epistemology of AIM revolves around the deployability of parallel “learner” and “classifier” algorithms, which probabilistically transform data into knowledge used to generate predictions. This raises epistemic concerns relating to matters such as the following (a synthetic illustration of the first concern appears after this list):

  1. biased training data (for instance, relating to race and gender);

  2. inconclusive correlations (for instance, predicting defendant recidivism);

  3. intelligibility (“black box” inexplicable functions);

  4. predictive inaccuracy (for instance, discharging asthma patients with pneumonia from hospital); and

  5. discriminatory outcomes (predicting defendant recidivism, related to (1)) (Schönberger 2019).
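The following synthetic sketch illustrates the first of these concerns: a classifier trained on data in which one group is heavily under-represented tends to perform worse for that group. The data, group labels, and simulated risk relationship are fabricated for illustration, and scikit-learn and NumPy are assumed to be available; no clinical inference is intended.

```python
# Synthetic demonstration of biased training data: the model mostly "sees"
# group A, so its decision threshold fits group A and misclassifies group B
# more often. Nothing here reflects real patients or real risk factors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate(n: int, shift: float):
    """Synthetic 'biomarker' whose relationship to disease differs by group."""
    x = rng.normal(size=(n, 1))
    y = (x[:, 0] + 0.8 * shift + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return x, y

# Training set: 950 patients from group A, only 50 from group B.
x_a, y_a = simulate(950, shift=0.0)
x_b, y_b = simulate(50, shift=1.0)
model = LogisticRegression().fit(np.vstack([x_a, x_b]), np.concatenate([y_a, y_b]))

# Evaluate on fresh, equally sized samples from each group.
for name, shift in [("group A", 0.0), ("group B", 1.0)]:
    x_test, y_test = simulate(1000, shift)
    print(name, "accuracy:", round(model.score(x_test, y_test), 2))
# Typically prints a noticeably lower accuracy for group B, whose risk
# relationship the model has barely seen during training.
```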

Hence there remains fundamental disquiet about the potential agency that AIM may be delegated to have over human autonomy since it is “not appropriate to manage and decide about humans in the same way we manage and decide about objects or data, even if this is technically conceivable” (European Group on Ethics in Science and New Technologies 2018, 9).

Epistemic challenges also arise for students and physicians related to the use of information by e-patients (Kaczmarczyk et al. 2013; Masters 2017; Osborne and Kayser 2018; Grote and Berens 2019); how physicians relate to such patients, with resultant challenges to historical conceptions of privacy and confidentiality; unanticipated effects on healthcare equity; whether there is a discernible “medical IT ethics”; and whether Big Data can be employed for overtly coercive behavioural modification via “hypernudging” (Yeung 2017) under the guise of qualified paternalism (Souto-Otero and Beneito-Montagut 2016; Grote and Berens 2019).

Techne and Phronesis in AIM

Collecting, correctly analysing, and deploying information (Aristotelian techne) is not equivalent to possessing knowledge and judgement based on experience and expertise to achieve a good purpose (phronesis). The democratization of information and challenges to the sociological role of experts in modern “knowledge societies” mean that physicians are no longer the sole custodians and mediators of a “body of knowledge and its application” (Grundmann 2017, 27). Expert physicians deploy scholarly and generalizable propositional knowledge. Non-propositional knowledge is derived from experience and cognitive resources and may, with time, become “more” propositional (for instance, through the Delphi approach) and then dynamically inform practice (Rycroft-Malone et al. 2004).

In comparison to most physicians, lay persons typically use less granular or rigorous propositional knowledge together with various sources of non-propositional knowledge (some of it intensely personal and value-laden), derived from sources such as relatives or a distant network of web-based contacts. Such sources of information, including unverified opinions and advice, are afforded high degrees of salience simply through the individual’s efforts and engagement with information-seeking (Gray et al. 2005) and can be weighed against medical expert information; the result may be trust in or mistrust of medical opinion and advice.

Trust

The erosion of implicit trust in medicine and distrust per se predates the internet-driven expansion of information (Mechanic and Schlesinger 1996; Meyer et al. 2008). A lack of personalized medical care can foster patients’ trust in online information which appears to be personalized and salient when accessed through non-Bayesian search engines specifically designed and patented to resonate with one’s interests and beliefs (Merriman and O’Connor 1999; Krishan, Chang, and Lambert 2002; Mason et al. 2002; Kublickis 2007). Personalized, salient “misinformation” may enable harmful beliefs and behaviours such as not interfering “with the natural process of inflammation” (Ritschl et al. 2018) and vaccination refusal (Davis 2019; Dyer 2019; Heywood 2019), at odds with evidence-based best practice.

Expertise

Expertise is ascribed to a person through the process of consultation; the status of social and political stakeholders may be misunderstood, since “not all stakeholders are per se experts” (Grundmann 2017, 45). Lay persons as “influencers” can “claim” expertise (Leach 2019); patronage then ascribes expertise to them, affirming the consequent. In contrast, licensure as a (medical) expert has an overtly public objective of independently certifying that licensed experts possess particular knowledge, deploy certain skills, and conform to certain behavioural standards (LaRosa and Danks 2018). However, licensure extends to professionals’ use of devices but not the devices themselves. The use of AI may affect and/or erode trust in the autonomy of doctors as the controllers of AI rather than simply being the professional group “licensed” for its use, devaluing the input of the physician (LaRosa and Danks 2018; Karches 2018).

Automation Bias and Complacency

With automation bias, humans preferentially accept automated/computerized recommendations as a “heuristic replacement of vigilant information seeking and processing” (Mosier and Skitka 1996, 203). Delegating to clinical decision support systems may enable false-positive errors of commission (inappropriately acting on incorrect advice) and false-negative errors of omission (inaction due to non-notification) (Goddard, Roudsari, and Wyatt 2011).

In contrast, automation complacency arises when humans ascribe higher accuracy and lower error rates to technology compared to humans, and insufficiently scrutinize technologies’ operations (Cohen and Smetzer 2017). In both situations, AIM is afforded an unwarranted expertise which has “no basis for generalisation to truly novel situations, since it is simply grounded in past experiences” when persons lack “understanding of the ‘mechanisms’ by which the behavior or actions are generated” (LaRosa and Danks 2018, 2). In medical encounters, the third element is the trust relationship between doctor (and related institutions) and patient, which is affected by their trust in the predictive veracity of AIM (LaRosa and Danks 2018). Time constraints, cognitive load, user cognitive style, accountability frameworks, and heavy workload—typical of many medical encounters—are established drivers of automation bias (Goddard, Roudsari, and Wyatt 2011). Automation bias is particularly problematic in instances where there is no true “cut-point” between normality and abnormality (Goldenberg, Moss, and Zareba 2006).

The Veracity or “Truthfulness” of AIM Prediction Models

The performance of any AIM is critically sensitive to the fidelity of its data inputs, as exemplified by false-negative misdiagnoses of skin lesions in persons with pigmented skin (Adamson and Smith 2018). Such failures reflect inappropriate overfitting to the training data (Coiera 2019), also evident in other decision-support programmes (Kim, Coiera, and Magrabi 2017; Fraser, Coiera, and Wong 2018; Coiera 2018, 2019). This raises the question of whether—even for unidimensional tasks—the physical interaction between physician and patient could or should be undertaken by a robotic physician, a question that seems unanswerable until the overfitting problem is better characterized (Gichoya et al. 2018).
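A hedged, synthetic sketch of the overfitting problem referred to above: an over-flexible model memorises a small, noisy training set and then performs markedly worse on new cases, whereas constraining its capacity narrows the gap. The data are random numbers with a weak injected signal, scikit-learn is assumed to be available, and nothing clinical is implied.

```python
# Synthetic demonstration of overfitting: an unconstrained decision tree fits
# its small, noisy training set almost perfectly yet generalizes poorly.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X_train = rng.normal(size=(200, 20))
y_train = (X_train[:, 0] + rng.normal(scale=2.0, size=200) > 0).astype(int)
X_test = rng.normal(size=(2000, 20))
y_test = (X_test[:, 0] + rng.normal(scale=2.0, size=2000) > 0).astype(int)

deep_tree = DecisionTreeClassifier(max_depth=None).fit(X_train, y_train)
print("training accuracy:", deep_tree.score(X_train, y_train))  # typically ~1.0 (memorised)
print("test accuracy:    ", deep_tree.score(X_test, y_test))    # typically far lower

# Constraining model capacity (or, in practice, curating larger and more
# representative training data) usually narrows the train/test gap.
shallow_tree = DecisionTreeClassifier(max_depth=2).fit(X_train, y_train)
print("shallow test accuracy:", shallow_tree.score(X_test, y_test))
```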

Patients’ Views on What Counts as Knowledge

Persons in their teens in the early millennium (net health consumers after 2030) identified the Internet as their primary source of health information; this information gains salience through the act of personalized searching (Gray et al. 2005). In 2011, 80 per cent of U.S. adult internet users sought information about at least one of fifteen healthcare topics, 23 per cent of social network users followed friends’ health updates, and routine “memorialization” of persons with certain health conditions occurred through social media (Fox 2011). Furthermore, “searching for health information on the Internet has a positive, relatively large, and statistically significant effect on an individual’s demand for health care” (Suziedelyte 2012, 1828). This behaviour has the potential to lead to poor quality care driven by patient satisfaction metrics unrelated to quality outcomes (Arnold, Kerridge, and Lipworth 2020), presenting physicians with “new issues on how to manage the information, make good clinical decisions, and impart that information back to individuals with disease” (Deane 2019, xx).

Patients are also presented with challenges. Since individuals have increasingly free access to the data within their medical record and an ability, through internet sources, to interpret their results, there is a risk of misinterpretation (Fraccaro et al. 2018) and distress (Deane 2019). Patients’ self-directed testing raises problems when symptoms are ignored in the presence of a negative self-determined test (Ickenroth et al. 2010). It is unclear whether patients are prepared to (or should) assume responsibility for harms that arise. Finally, medical practitioners may also lack understanding of the implications of test results and particularly of whether testing even advances patients’ interests (Arnold 2019).

Ethical Considerations of AIM

The Development of “Machine/AI Ethics” and Health

“Machine ethics,” and how intelligent systems interact with humans, is not simply the “accidental dilemma” of collisions between autonomous vehicles and humans, which are problematic (Fleetwood 2017) in ways that the traditional “trolley dilemma” is not. Autonomous vehicle behaviour is informed and governed by several forms of ethical decision-making algorithms (Leben 2017). In healthcare, there are broader issues with automated resource allocation, prioritization, benefit/loss dilemmas, and consequent existential threats (Kose and Pavaloiu 2017) when risk assessment algorithms are employed in decision-making (Rasmussen 2012; Nagler, van den Hoven, and Helbing 2018). Human input is needed as an active veto (Verghese, Shah, and Harrington 2018) to avoid automated decisions resulting in unfair outcomes (Broome 1990).

The potential for Big Data to personalize preferences and direct consumers’ attention into or out of what has been described as a locus of self-resonance—the “echo chamber” or “filter bubble”—also permits the possibility of “Big Nudging” (Souto-Otero and Beneito-Montagut 2016), employing personalized strategies to operationalize health and other governmental policies and affecting an individual’s autonomy through coercion, particularly when data from health devices linked to the “internet of things” are covertly reported to (for instance) health insurance decision algorithms (Bronsema et al. 2015; Helbing et al. 2019).

Moral Enhancement Through AIM, Distributive Justice, and Libertarian Paternalism

It is posited that ethically orientated and directed AI may be a partial solution to the contemporary “moral lag problem” (Klincewicz 2016).

Autonomy-enhancing agent-specific augmentation of moral judgement might overcome an agent’s inherent limitations, whereby “moral AI” may promote collective “moral distributive justice” (Savulescu and Maslen 2015) through the elimination of patients’ or physicians’ arbitrary decisions based on racial, gender, or other stereotypes and unconscious biases (Klincewicz 2016). Robotic “moral nudging” has also been proposed (Borenstein and Arkin 2016); however, moral nudging by any agent can be considered a form of libertarian paternalism (Hausman and Welch 2010) at best, or outright paternalism at worst. If, as a result of any form of nudging, the range of choices available to an agent is neither constrained, forbidden, nor inherently “troublesome,” then, rather than being coercive, nudging can steer agents away from “poor” choices affected by social/peer pressure and framing, heuristics, lack of due attention, inappropriate optimism, overconfidence, loss aversion, bias to the status quo, inherent resistance to change, and simple error (Sunstein and Thaler 2003). Crucially, nudging should not advance the interests of a third party, and in this sense, if algorithmic decisions are likely to improve the well-being of an agent, it must be considered whether this is the primary aim or a by-product of an AIM system implemented by a healthcare organization or government instrumentality. If AIM primarily serves these entities’ ends, AIM potentially constrains rather than augments an agent’s autonomy and/or acts as a coercive agent.

There is a significant burden of proof incumbent on machine ethicists to justify the development of artificial moral agents (AMAs) over and above the fact that their development is simply possible (van Wynsberghe and Robbins 2019). Van Wynsberghe and Robbins (amongst others) emphasize the complexity around:

  • the “inevitability” of AMAs, that AMAs can be relied upon to prevent harm occurring to humans and the related notion that harm is encompassed solely by “safety”;

  • the spurious conflation of the “black box” reasoning process of AIM as being both akin to and superior to the unpredictability of human decision-making;

  • the stipulation that AMAs must not be used for immoral purposes; and

  • a rejection of concerns related to “moral deskilling” as described.

Without specific reference to the term AMA, Biller-Andorno and Biller recently proposed that in certain situations of medical uncertainty the capacity for augmented moral imagination and ethical insight may be better provided by machine learning (Biller-Andorno and Biller 2019).

As discussed, data inputs are crucial for appropriate outcomes of machine learning, and hence the quality of data inputs arising from the electronic medical record (EMR) or other sources is likely to be insufficient for nuanced ethical guidance. This is because it is well documented that information in the EMR is rarely if ever questioned after it is first obtained, as witnessed by the near-ubiquitous practice (Tsou et al. 2017) and related critique of “cut and paste” or “cloned” entries in EMRs (Hirschtick 2006; Hartzband and Groopman 2008; Thielke, Hammond, and Helbig 2007; O’Donnell et al. 2009; O’Malley et al. 2010; Schenarts and Schenarts 2012; Thornton et al. 2013; Weis and Levy 2014). If incorrect, mutable, or absent, data inputs will capture only facets of, rather than a complete, “critical reality” and will suffer from “distortions of data” (Smith and Koppel 2013). If narrative in the EMR is replaced by structured data codes (Wasserman 2011), bias and inaccuracies are introduced into any subsequent electronic determinations and recommendations based on this information.
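A hypothetical sketch of that information loss follows: a free-text note is reduced to structured codes by a simplistic keyword lookup. The note, the code_map table, and the placeholder codes are all invented for illustration; real coding systems are vastly larger, but the discarding of qualifiers and context is the point at issue.

```python
# Hypothetical illustration of narrative being flattened into structured codes.
# The codes are placeholders, not entries from any real coding system.
narrative = ("Chest pain yesterday, settled spontaneously; patient doubts it is "
             "cardiac and declined further tests after a long discussion.")

code_map = {                 # simplistic keyword-to-code table (invented)
    "chest pain": "C-001",
    "declined": "C-002",
}

structured_entry = sorted(
    code for phrase, code in code_map.items() if phrase in narrative.lower()
)
print(structured_entry)      # ['C-001', 'C-002']
# Downstream algorithms see only the codes: the temporality ("yesterday",
# "settled"), the patient's own interpretation, and the shared decision-making
# context have all been discarded.
```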

The inherent risk of AIM prediction was evident with IBM’s oncology support software, particularly the fact that the “system was trained using synthetic data and was not refined enough to interpret ambiguous, nuanced, or otherwise ‘messy’ patient health records,” and was reliant on exclusively U.S. medical protocols and hence led to “missed diagnoses and erroneous treatment suggestions, breaching the trust of doctors and hospitals” (Cowls et al. 2019, xx).

Purportedly “stable” patterns in a person’s prior decision-making may be difficult to substantiate, and it is suggested that comparison and extrapolation from population data—the “wisdom of crowds”—will appropriately inform AMAs (Biller-Andorno and Biller 2019), implying that decisions may be validly based on populism. Unless ethical decision-making is to be replaced by automated argumentum ad populum, it is inappropriate to remove “the bias [constraints] of human knowledge” (Biller-Andorno and Biller 2019, 1482).

These authors further state that “future generations may find it quite unthinkable to do entirely without a GPS. Perhaps the role of AI-assisted ethical decision making will be similar” (Biller-Andorno and Biller 2019, 1483). However, this is a poor analogy, since there is no coherent link between the skills relevant to using GPS, which relate to unambiguous, verifiable outcomes, and the moral imagination and capacity for critical reflection that characterize ethical decision-making. AMAs may only be able to interpret moral situations once conventional deliberations have arrived at normative views and approximations. They may have limited applicability in truly novel situations and must not perpetuate or entrench biases and inequities such as may occur when AI is used in employment decisions (Caplan and Friesen 2017; Steels 2018; Israni and Verghese 2019).

Delegating to AMAs also runs the risk of succumbing to previously described automation bias/complacency (see above), particularly where there is no true “cut-point” of normality/abnormality, such as occurs in ethical conundrums. As Wallach et al. observe, human moral judgement “is a complex activity … a skill that many either fail to learn adequately or perform with limited mastery” (Wallach, Allen, and Smit 2008, 565). Apart from broadly shared transcultural values, it is evident that many cultures and individuals diverge from prevailing Western ethical systems and mores. Hence it can be impossible to agree upon criteria for judging the adequacy of moral decisions in multicultural societies. The irony of an “ethical GPS” based on biased datasets (see below) is, in this connection, disturbing to say the least.

If ethical prediction algorithms “prove to be useful, reliable, and convenient, they might easily become standard tools with widespread use” (Biller-Andorno and Biller 2019, 1480); if so, concern clearly attends the questions of with whom decisions regarding utility, reliability, and convenience will rest and of whether those decision-makers will have simple or complex reasons to adopt artificially intelligent predictive models of individuals’ “best interests” that ultimately constrain individuals’ autonomy.

Autonomy

Data mined from a personal health or third-party EMR, augmented by social media data, may assist in medical decision-making for a person temporarily or permanently incapacitated (the so-called “triple burden”) in the absence of an available human substitute decision-maker—through an “AI-assisted autonomy algorithm” (Lamanna and Byrne 2018). However, rational persons’ preferences are inherently fluid (Benhabib and Day 1981), and it is not clear a priori whether a person with capacity would agree with an algorithmically derived treatment recommendation based on preferences inferred from social media, let alone whether they would permit such a decision to be implemented. Though social media and internet activity do give details as to one’s interests (Lamanna and Byrne 2018), there is a discontinuity between human objectives relating to the definition of the good—which cannot always be inferred from constructed social media identities or the internet—as distinct from the predefined objectives of decision algorithms and benefit/loss analyses that may be engineered into systems to limit cost/expenditure to a third-party payer (potentially) at the expense of human objectives (Kose and Pavaloiu 2017). Furthermore, if a human substitute decision-maker is present, it is questionable whether that person’s decisions could be trumped by arguably more broadly informed AIM-derived decisions, thereby ascribing hegemony to AI on the basis of its presumed decision-making reliability (through automation bias).

Understanding AIM’s benefits and limitations may be more problematic for persons with suboptimal health literacy who may be inappropriately influenced by non-expert opinion or, alternatively, default to automation bias and overconfidence (Cohen and Smetzer 2017). There has been little debate as to the question of whether, through automation bias or overconfidence, healthcare may take a regressive and paternalistic turn dictated by AIM, rather than a path negotiated with physicians. It is also possible that automation bias and complacency will affect both the patient and the physician; in other words: doctor knows best—but the computer knows more and makes fewer mistakes.

AIM and the Potential for Patient Discrimination and Marginalization

AIM data sets must be unbiased regarding matters of age, race/ethnicity, gender identification, abilities, geographic location, and socio-economic status (Caplan and Friesen 2017; Parikh, Teeple, and Navathe 2019; Hwang, Kesselheim, and Vokinger 2019; Israni and Verghese 2019) to avoid automatically entrenching inequalities, as above. However, data sets are often incomplete as a result of patchy implementation of internet access, data uploading, and data capture (Alam et al. 2018) and a lack of operator skills, infrastructure, and suitable hardware (Hughes et al. 2018), regardless of internet speed and bandwidth. Inconsistent and non-uniform data sharing and access to EMR data (Wang and DeSalvo 2018) mean that inequities will be exacerbated if human services are withdrawn through neoliberal cost-containment imperatives (Graddy and Fingerhood 2018) and incorrect (even disingenuous) assumptions of equality of access. The current example in Australia of the transition of clinical care from physical encounters to virtual consultations over a matter of weeks during COVID-19 raises the possibility that “efficiency” in the context of pandemics may be extrapolated to the non-pandemic future (Arnold and Kerridge 2020).

Nonetheless, the potential for social media as a positive means of delivering simple primary healthcare interventions has been explored (Wu et al. 2018), since—with the emergence of the e-patient—physical proximity and geography are not constraints to social interactions (Collins and Wellman 2010) or agency (Rannenberg, Royer, and Deuker 2009). However, the same caveats regarding health literacy, biased datasets, and non-uniform access noted above also apply here.

Bias and Stigmatization

Though facial recognition software (including facial expression recognition and inference of mood) is well advanced, this software functions less well in non-Caucasian persons (Venditti, Fleming, and Kugelmeyer 2019)—as has been demonstrated in dermatological diagnosis (Adamson and Smith 2018)—potentially introducing new sources of bias through inaccurate data inputs, even if ethically sentient AI systems can be developed. At the most basic level, data inputs from facial recognition software may improve verification that a physician or healthcare professional is interacting with the correct patient and may show promise for genetic syndrome recognition (Mohapatra 2015). The fidelity of these data inputs cannot go unquestioned since the potential for significant misadventure from patient misidentification exists through overfitting, automation bias, automation complacency, and other human factors, particularly when practitioners are under heavy cognitive load.
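The sketch below gives a generic, hypothetical picture of how embedding-based identity verification of the kind used by facial-recognition systems operates: two images are mapped to numeric vectors and accepted as the same person when their cosine similarity exceeds a fixed threshold. The embeddings here are random stand-ins rather than outputs of any real model; the relevant point is that a threshold tuned on a non-representative population will shift false-accept and false-reject rates unevenly across groups, which is how misidentification risk arises.

```python
# Generic, hypothetical sketch of threshold-based identity verification.
# Random vectors stand in for face embeddings produced by a trained model.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(embedding_a: np.ndarray, embedding_b: np.ndarray,
                threshold: float = 0.8) -> bool:
    # A false accept here would mean interacting with the wrong patient record;
    # a false reject would block access for the right patient.
    return cosine_similarity(embedding_a, embedding_b) >= threshold

rng = np.random.default_rng(2)
stored = rng.normal(size=128)                          # enrolled patient "template"
probe_same = stored + rng.normal(scale=0.3, size=128)  # same person, new photo
probe_other = rng.normal(size=128)                     # a different patient

print(same_person(stored, probe_same))    # usually True
print(same_person(stored, probe_other))   # usually False
```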

Seemingly unaware of the extensive biomedical literature relating to the importance of non-verbal communication, Sikora and Burleson note “mounting evidence that body expression is as significant to communication” as verbal communication (Sikora and Burleson 2017, 548). Communication through gesture is inherently nuanced and individualized, and at present it is unlikely that AI interpretation of non-verbal communication has an acceptable degree of reliability (Sikora and Burleson 2017). If AI systems are unable to reliably interpret patients’ body language, facial expression, voice tone, and inflection—data easily available to physicians not distracted by data entry in the EMR—then erroneous judgements are likely to be made by autonomous or semi-autonomous algorithms. These “data inputs” are foundational aspects of trust in person-centred care.

Potential New Forms of Harm to Patients

Near Misses

The benefits of IT in medicine are often lauded, but there is comparatively little investigation of errors and misadventures related to the use of IT. Faulty or absent data inputs, lack of facility with the technology per se, and changes in decisions consequent on IT technology/automation bias have resulted in trivial, near-miss, or consequential harms, including fatalities (Kim, Coiera, and Magrabi 2017). Missed diagnoses in dermatology have already been discussed (Adamson and Smith 2018).

Quantifiable Harms

A systematic review of health IT outcomes confirmed that 53 per cent of studies identified quantifiable harms including (rarely) death, with near misses in 29 per cent of studies (Kim, Coiera, and Magrabi 2017). Regulation pertaining to product updates, modifications, and retesting of performance is necessary to assess whether programmes or devices diverge with such modifications (Hwang, Kesselheim, and Vokinger 2019).

Denial of Service

With system adaptability may come a susceptibility to adversarial attacks, which may compromise data reliability (Huang et al. 2017), create unforeseen privacy and confidentiality vulnerabilities, and permit the insertion of ransomware that paralyses rather than simply compromises patient care at the level of the individual practice (Susło, Trnka, and Drobnik 2017) and of the healthcare system. This has potential effects on physician training (Zhao et al. 2018), with nationwide rather than local implications (Hughes 2017). Clinicians are (and will remain) the backstop for such problems. Recalling that “problems with IT are pervasive in health care,” these problems can affect care delivery and cause patient harm (Kim, Coiera, and Magrabi 2017, 258).

Drug Errors

Medication dispensing is particularly subject to automation bias and complacency, resulting in medical errors when autocomplete prescribing functions are either not checked or presumed to be correct. Under high cognitive load—such as when multitasking and when being frequently interrupted or distracted (Papadakos and Bertman 2017)—humans default to heuristics, and failures in automation are less likely to be identified (Cohen and Smetzer 2017). Cognitive load is a common reason for inappropriate delegation to technology (Parasuraman and Manzey 2010), and hence system errors are less likely to be detected and corrected. Physicians routinely employ workarounds in response to poor user interfaces or user experiences and are known to deliberately defeat the inbuilt advantages of systems. For instance, “alert fatigue” results in deliberate deactivation of distracting medication interaction checkers and disabling of hard-stop alerts (Martin and Wilson 2019).

Potential Behavioural Changes and New Forms of Harm to Physicians

Agency of the Physician

Medicine’s “most cherished and defining values including care for the individual and meaningful physician–patient interactions” may be compromised by adherence to neoliberal principles of “efficiency, calculability, predictability and control” enabled by AI (Dorsey and Ritzer 2016, 15). Managerial control of physician agency may be achieved by soft or hard-stop guidelines, decision tools, the specification of tasks to be completed, tests that are mandated or impermissible, and the implementation of treatment pathways by non-human automated means in EMRs.

“Disruption”

When applied to clinician behaviour, “disruptive” is pejorative (Rosenstein and O’Daniel 2005), yet when referring to technology, disruption has a distinctly positive and iconoclastic valence (Downes and Nunes 2013) and is held to be an unchallenged good, with unquestioned enthusiasm for the potential of IT to enhance student teaching and clinical care (Robertson, Miles, and Bloor 2010). However, the potential hazards of technology itself are downplayed. Negative effects on students’ learning range from annoyance and interruptions to a diminution in scholarship and study (Selwyn 2016), and technology may be clinically disruptive for physicians (Papadakos and Bertman 2017; Dhillon et al. 2018; Dhillon, Gewertz, and Ley 2019). When clinically disconnected and designed for documentation to mitigate medico-legal risk and facilitate billing, technology “unnecessarily disrupts clinical work and frustrates clinicians, with less benefit than otherwise possible” (Coiera 2018, 2331).

Distracted Doctoring

The phenomenon of personal devices inappropriately used in the workplace and unnecessary technological interruptions clearly affecting patient safety has been well documented, with the result that limiting or quarantining the use of personal devices in healthcare settings has been advocated (Papadakos and Bertman 2017).

Interactions With the EMR

The incorporation of EMRs has often been noted to be largely driven by the coding and billing requirements of healthcare organizations, resulting in a cost/quality trade-off implemented in the context of a “non-cooperative oligopoly with caregivers and administrators focusing on competing objectives” (Sharma et al. 2016, 26). Some health information technology serves conformity with billing and documentation rather than the delivery of care (Dhir et al. 2015). Presently it is often unclear how “EHRs are used to capture and represent what clinicians are thinking about the patients and their problems” (Colicchio and Cimino 2018, 172). Clinicians are now required to be simultaneously care providers, scribes, and records managers. Many EMRs are primarily designed to facilitate coding and billing rather than patient care, and intelligent clinician input is needed to configure and optimize these records (Ashton 2018) for the purpose of delivering satisfactory patient care.

Scribes as a Workaround for the EMR

To counter this impost, the implementation of human or non-human scribing to accommodate the needs of the EMR has been suggested (Doval 2018; Bates and Landman 2018). Medical scribes demonstrably increase physician productivity (Walker et al. 2014) and increase hospital revenue over and above the costs of scribes (Slomski 2019). However, if EMR usage were straightforward and truly labour-saving, there would seem little need to employ or deploy scribes (Doval 2018; Bates and Landman 2018; Ashton 2018; Mosaly, Guo, and Mazur 2019; Slomski 2019).

Physician Well-Being, Burnout and Behavioural Changes

Burnout resulting in morbidity in medical professionals is well recognized (Shanafelt et al. 2015; Shanafelt, Dyrbye, and West 2017), and the EMR is cited as a frequent contributor amongst other organizational factors (West, Dyrbye, and Shanafelt 2018). Burnout has been repeatedly linked to EMR interactions (Mosaly, Guo, and Mazur 2019), yet some authors have proposed modifications to physicians’ work practices, effectively making the physician—not the EMR—the problem (Babbott et al. 2013). Junior doctors increasingly spend time dealing remotely with the EMR during supposedly personal time (Canham et al. 2018). However, some groups (Rassolian et al. 2017) claim that the EMR per se is not responsible for burnout (citing lower levels than other studies), as distinct from more global workplace factors, though it seems impossible to disentangle the workplace’s requirement for EMR engagement from the purported “protective effect” of face-to-face interactions on burnout.

Malign Effects on Patient–Physician Interactions

The term “acquired autism” has been used to describe the potential for EMR compliance to have malign effects on physician–patient interactions (Loper 2018). Physicians are confounded by “the industry-driven expectation to simultaneously serve as curators of the EHR and physicians to patients” (Loper 2018, 1009), and the resultant behavioural change may be antithetical to the interactions needed to establish or maintain a trusting therapeutic relationship. This has been characterized as the “prioritisation of machine objectives over human objectives” (Kose and Pavaloiu 2017, 203).

Deskilling

AIM may contribute to skills loss (Lu 2016). If AIM becomes a new unattainable benchmark, the question of whether it is ethical for humans to undertake certain procedures will arise, particularly if medical error is prevalent and increasingly seen to be preventable (James 2013). However, preventability may largely be a function of patients’ and physicians’ access to technology, and unequal access may create a hierarchical health system that is demonstrably unjust. If, as Biller-Andorno and Biller (2019) have contended, ethical AI is akin to using GPS (an analogy disputed above), then the ethical “navigational skills” of physicians may atrophy, and physicians run the risk of becoming “ethically lost” should the machine fail (Biller-Andorno and Biller 2019).

Legal Issues of AIM

Previously, the privacy and confidentiality of physical medical data—whether for health-related usage or research—were more easily managed through informed patient consent or formal requests to access physical records. However, notions of the “ownership” of data acquired by professionals about “their” patients have long been quashed by legislation separating access rights from physical ownership (Parkinson 1995).

The breadth and depth of information held in EHRs, the ease of authorized and unauthorized access, and the simplicity of transmission mean that electronic records are fundamentally different from paper records, particularly since (non-physical) security breaches may paralyse whole health systems (Hassan 2018) rather than affecting one person’s confidentiality (Sade 2010). Problems encountered to date with access to shared medical information will necessitate new, potentially cross-jurisdictional precedents (Polito 2012). Formal ethics certification for non-clinical health information professionals is non-standardized (Kluge, Lacroix, and Ruotsalainen 2018), creating cross-jurisdictional problems.

Students and healthcare professionals (Kuo et al. 2017) may inappropriately track unaware and non-consenting patients; though tracking is touted as a positive learning exercise, at least 50 per cent of students do not or cannot differentiate between tracking for educational purposes and curiosity-based inquiries (Brisson and Tyler 2016; Brisson et al. 2018). The latter actions are illegal and are actionable in most jurisdictions (De Simone 2019). Hence medical schools’ “informatics and EMR curricula need to teach students to engage meaningfully and judiciously with patients’ data” (Stern 2016, 1397), possibly with registries of consenting patients (Brisson et al. 2018).

In addition to these considerations encompassing conditions of use, system transparency, data content, and quality, it is important to articulate what “privacy protections exist for patients whose data are used” and how this aligns with jurisdictional privacy legislation (Evans and Whicher 2018, 860). Data held in electronic health records may be de-identified and, through data linkage, generate beneficial research outcomes. There is thus a tension between beneficence (for the public) and private confidentiality, with the duty of “easy rescue” invoked to override contemporary notions of privacy and confidentiality, particularly in circumstances of minimal risk as defined by research regulators (Mann, Savulescu, and Sahakian 2016). Further concepts such as altruism (McCann, Campbell, and Entwistle 2010), supererogation (Schaefer, Emanuel, and Wertheimer 2009), and the avoidance of “free-riding” (Allhoff 2005) are relevant to this argument.

Sociopolitical

Justified Innovation?

The question of justified innovation has been posed with regard to, for instance, implementing a predictive algorithm for the management of acute psychosis that purports to offer improved clinical outcomes in comparison to conventional physician-delivered care (Martinez-Martin, Dunn, and Roberts 2018). Despite preliminary work (Koutsouleris et al. 2016), there is a clear difference between statistical and clinical validation, and achieving adequate informed consent is problematic when the algorithmic decision-making process is opaque to clinicians, patients, or courts (Martinez-Martin, Dunn, and Roberts 2018). Furthermore, a predictive model developed in one location/jurisdiction may not generalize and has considerable potential to reinforce or exacerbate biases, compounded by a heuristic bias towards the implementation of such predictive models and a resourcing bias, since economic efficiencies are related to physician time-based costs. This will affect the fiduciary dimension of the relationship between the patient, clinicians, and healthcare organizations, public or private. Will non-insured patients be able to opt out of the use of such a predictive algorithm? Do patients with an acute psychosis have capacity to determine whether AI should be involved in their care, particularly if clinicians cannot explain the derivation of AI-derived recommendations? It is feasible that such patients’ autonomy may be constrained by a non-human team member—the AI algorithm.

Is There a “Moral Imperative” to Adopt AIM?

If patients employ AIM as a prelude to the medical consultation in a “flipped classroom” manner, it might seem necessary or even obligatory for person-centric physicians—at a minimum—to support or supplement their own individual human functioning through the “use of technology in order to help people become faster and more accurate at the tasks they are performing” (Luxton 2019). This autonomy-promoting rationale has been used in IBM’s Watson Health™ application, yet automation bias and complacency may “create [new] opportunities for error in diagnosis and treatment … more visible and potentially detrimental outcomes than what might have happened without the new technology” (Luxton 2019, 133), or may lead patients and physicians to develop unrealistic or unfulfillable expectations based on idealized extrapolations from a “superintelligent machine” (Luxton 2019).

Grote and Berens (2019) have recently argued for the incorporation of AIM on the basis of the implications of the “equal weight view,” whereby the presence of a differing opinion should cast doubt on one’s own position, lest one be anchored in the “steadfast view” that epistemically privileges one’s own position. Appeal to an algorithm through normative alignment with a supposed “epistemic authority” risks ceding authority to technology; again, this invokes automation bias, if not complacency (Grote and Berens 2019).

Discussion

Artificial intelligence, machine learning, information technology, and the Internet have arisen within the cultural context of contemporary society (González 2017) and reflexively influence the ongoing construction of society and our interpersonal relations. IT may not simply be an artefact or tool; the means with which humans employ IT permits IT to function as an actor in human interactions (Introna 2005). In that context, AIM and related autonomous systems appear to offer on one hand “utopian freedom” and on the other “existential dystopia” (Salla et al. 2018). The hyperbole surrounding past and present promises regarding AIM and the future of medicine provokes a range of opinions, spanning fear, scepticism, disappointment, and ambivalence through to qualified or unqualified enthusiasm and optimism. AIM innovations are “predicted to drive the greatest evolutionary progress in human history, accelerating the emergence of new technology innovations and affecting the way we humans live and act” (Salla et al. 2018, 1).

Yet, there is discomfort that the “advance of technology effaces something important” (Karches 2018, 92), insofar as technology appears to be a “background assumption about the world that shapes the way the world appears to us” (Karches 2018, 93). Contemporaneously, society is influenced and constructed by the medicalization narrative—the framing of normal processes such as ageing, pregnancy, and death as medical events—with medicine interpreted as a hegemonic, self-creating, and self-sustaining activity (Illich 1975; Scott-Samuel 2003; Moynihan et al. 2002; Moynihan 2003). Prevalent scientism (LeDrew 2018) asserts that “some of the essential non-academic areas of human life can be reduced to (or translated into) science,” accompanied by scientific expansionist claims that “all beliefs that can be known (or even rationally maintained) must and can be included within the boundaries of science” (Stenmark 1997, 19).

The related secular belief system of datafication underpins “precision medicine” (Van Dijck 2014), whereby technology permits “new phenomena and areas of ordinary life [to become] subject to measurement, attention, and medical interpretation” (Hofmann and Svenaeus 2018, 8). Datafication is reflected in prevalent genetic determinism (Dar-Nimrod and Heine 2011), supersedes post-modern relativism, and places an overreliance on screening, biomarkers, and the predictive role of -omics for patient care (Mandl and Manrai 2019). Regardless of the data incorporated in AIM, “these new algorithmic decision-making tools come with no guarantees of fairness, equitability, or even veracity” (Beam and Kohane 2018, 1318). This concern may be compounded when future persons with access to “democratized” information (Doval 2018) and personal versions of AI drive a values-poor and data-centric form of shared decision-making as a new norm (Coiera 2018). A personally controlled electronic medical record (EMR) incorporating data from primary care, hospital interactions, consultative doctor–patient interactions, and network-based “collaborations to integrate genetic and genomic knowledge into clinical care” may embolden patients to accept, override, or even ignore information or physician recommendations based on algorithmic decision aids (Herr et al. 2018, 143) in the name of person-centric care.

The potential denigration of human capacities and skills (Karches 2018) is exemplified by robotic neurosurgery for compressive radiculopathy, which illustrates the confusion AIM may generate for patients (Schiff and Borenstein 2019). Consent is predicated on the patient understanding that the surgeon, rather than the AIM system, determines the need for surgery; misunderstanding on this point is common and provokes patient anxiety, as does the perceived extent and invasiveness of robotic surgery (Müller and Bostrom 2016). If the patient’s understanding of the role of AI in their care is unclear, all parties are exposed to potential liabilities which may be difficult to disentangle, with potential corporeal and legal adverse effects. The clinical encounter extends beyond the patient (and their various influences) and the physician, instantiating “AIM” as a team member in a therapeutic triad (Swinglehurst et al. 2014).

The traditional doctor–patient dyad has been affected by population diversity, globalized access to information, and increasing technologization of much of the urbanized populace (Swinglehurst et al. 2014), with the emergence of the e-patient, “equipped, enabled, empowered and engaged in their health and health care decisions” (Ferguson 2007, 6), who may either doubt or refute the notion of physician omniscience and actively engage in their own care (Hay et al. 2008). Reflecting income and education, access to and facility with technology has always been a determinant of health (Frank and Mustard 1994), and it is suggested that access to technology may potentially mitigate some social health disparities (Wangberg et al. 2007), at least amongst those with sufficient access. However, the potential for the amplification of health disparities with unequal access to (or dependence on) technology has largely been overlooked.

Future doctors must be prepared to interact with e-patients (Farnan et al. 2013), particularly those with an increasing level of familiarity with medical jargon and facility with biostatistics and research evaluation. Physicians should have an awareness of the role the Internet plays in patients’ lives and the ability to guide patients to reputable non-commercial sites. Physicians must now productively and safely use email, social media, electronic devices, and medical apps and manage their digital footprint which will be accessed by patients and, consequently, negotiate digital boundaries with patients (Masters 2017).

A critical examination of the potential for unintended consequences that AIM implementation may pose for humanistic patient care is needed so that AIM facilitates optimal patient care (Israni and Verghese 2019). However, AIM will fail to deliver more effective care until there is acknowledgment that what impedes the quality of care is not simply “a lack of data or analytics but changing the behavior of millions of patients and clinicians” (Emanuel and Wachter 2019, 2281).

Conclusion: What is to Be Done?

The rapid and uncritical assimilation of AIM into the physician–patient encounter is touted to bring in a new era of precision medicine and person-centricity through a greater ability to manipulate data both from the narrow perspective of the patient and from a wider perspective. It has the potential to bring the “power” of broad data linkage to enable truly preventative personalized medicine (Grote and Berens 2019). The e-patient, and how the physician relates to them—whether devoid of technology, using technology, or partially or fully delegating care to technology—creates an ontologically distinct situation from prior care models. There are both potential advantages and disadvantages of such technology in advancing the interests of patients, and ontological and epistemic concerns for physicians and patients relating to the instantiation of AIM as a dependent, semi-autonomous, or fully autonomous agent in the encounter. Libertarian paternalism potentially exercised by AIM (and those who control it) creates challenges to conventional assessments of patient and physician autonomy. The legal relationship between AI and its users remains unclear and cannot be settled at present; progress in AIM and its implementation in patient care will necessitate an iterative discourse.

Though AI purports to free physicians to become more humanistic, the implementation of AIM may also be characterized as a new normative force to be exploited by neoliberal governmentality (Hilgers 2010). If this is couched as person-centric care, neoliberal imperatives for efficiency in healthcare create the threat of the physician and patient being actively disengaged from one another, “rendering unnecessary the bodily expertise and caring attentiveness that characterize pre-technological practices” (Karches 2018, 95) and potentially affecting distributional justice and “fairness in healthcare” (Grote and Berens 2019, 205).

Physicians should neither uncritically accept nor unreasonably resist developments in AI but must actively engage in and contribute to the discourse, since AIM will affect their roles and the nature of their work. The premises of the “AIM argument” require further teasing out in an ongoing dialectic. It will not be sufficient for future physicians to have simple techne in the use of AIM and IT; they will need to learn new conceptions of how physicians’ phronesis can be augmented by AIM to benefit patient care and will need to consider any consequences for the patient–physician relationship and the outcomes of care. The physician’s moral imaginative capacity must engage with the questions of the beneficence, autonomy, and justice of AIM and whether its integration in healthcare has the potential to interfere with patients’ and physicians’ ends, in breach of non-maleficence.