
How Good is Good Enough? Quantifying the Impact of Benefits, Accuracy, and Privacy on Willingness to Adopt COVID-19 Decision Aids

Published: 26 March 2022


Abstract

An increasing number of data-driven decision aids are being developed to provide humans with advice to improve decision-making around important issues such as personal health and criminal justice. For algorithmic systems to support human decision-making effectively, people must be willing to use them. Yet, prior work suggests that accuracy and privacy concerns may both deter potential users and limit the efficacy of these systems.

We expand upon prior research by empirically modeling how accuracy and privacy influence intent to adopt algorithmic systems. We focus on an algorithmic system designed to aid people in a globally-relevant decision context with tangible consequences: the COVID-19 pandemic. We use statistical techniques to analyze surveys of 4,615 Americans to (1) evaluate the effect of both accuracy and privacy concerns on reported willingness to install COVID-19 apps; (2) examine how different groups of users weigh accuracy relative to privacy; and (3) empirically develop the first statistical models, to our knowledge, of how the amount of benefit (e.g., error rate) and degree of privacy risk in a data-driven decision aid may influence willingness to adopt.


1 INTRODUCTION

Systems designed to provide data-driven, algorithmic advice as input to human decisions are increasingly prevalent [13, 16, 88]. Such systems have the potential to outperform decisions made by algorithms or humans alone [8]. They have been used for a wide variety of applications such as identifying health conditions and suggesting appropriate care [13, 88] or predicting the likelihood of a defendant skipping bail to aid judges’ decision-making [37, 46]. While many names are used to describe these systems, we will refer to them as data-driven decision aids.

To maximize the value that data-driven decision aids can bring to individuals and society, people must be willing to adopt them. Prior work has illustrated, however, that ensuring adoption is not easy [34, 35, 36, 63]. Humans quickly lose trust in data-driven advice when it has inaccuracies—whether stated [3, 47] or observed [21, 23, 75, 94]. This holds true even when empirical results show that humans perform worse without the advice, a phenomenon called algorithm aversion [21]. In addition to the accuracy concerns that may deter people from using algorithmic systems, users may forgo or stop using systems designed to help them due to privacy concerns about the quantity of personal information these systems may collect to power their algorithms [25, 28, 55, 64, 78, 84].

Thus, we must understand how humans react to inaccuracy and lack of privacy in data-driven decision aids to ensure that future technological advances are not stymied by low adoption.

Background: COVID-19 Apps. The spread of the 2019 novel coronavirus disease (COVID-19) has spurred the development of a new set of data-driven decision aids, including COVID-19 mobile contact-tracing apps [26, 85]. Contact-tracing is a longstanding epidemiological tool for identifying at-risk individuals in order to contain the spread of a disease [24]. Contact-tracing apps—a new type of data-driven decision aid not widely implemented [20] until the COVID-19 pandemic—are designed to support the scaling of manual contact-tracing efforts. These apps work by tracking contacts among all system users and allowing validated reports of infection (e.g., confirmed with a passcode from a medical provider) to be quickly input into the system. Contact-tracing apps combine these two pieces of information—contact events and verified infection reports—to notify individuals who have come into contact with someone who has tested positive for the virus and who has input that information into the system. These notifications serve as advice for potentially-exposed individuals to consider getting tested and to self-isolate. For the remainder of the paper, we refer to this type of system as a ‘COVID-19 app’.

There are two primary design patterns for COVID-19 apps: centralized approaches with a trusted party, e.g., France’s StopCOVID app, and decentralized approaches that need no trusted party, e.g., DP3T and Google-Apple Exposure Notification Framework-based apps [5, 15, 18, 22, 83]. Choosing between these designs can be seen as an attempt to weigh functionality, accuracy, and privacy. Centralized approaches can leverage location information to deliver hotspot information or provide governments with epidemiological insights [68], but must store user data centrally and/or track their users’ movements. Decentralized COVID-19 apps aim to preserve users’ privacy, but may not be able to deliver all the same features. It is important to note, however, that the advantages of each approach are not absolute; centralized apps may still fail to provide accurate or actionable information (e.g., when mobile coverage is weak) and decentralized apps may still disclose private information (e.g., who has downloaded the app via Bluetooth beacon exploits, retrospective tracking of individuals who have tested positive, or information leakage from implementation bugs) [10, 66, 68, 83, 86].

The benefit of any data-driven decision aid is limited by its adoption rate. This is especially true for systems like COVID-19 contact-tracing apps, whose effectiveness scales quadratically with adoption [15]. As such, it is imperative to understand the factors that impact users’ willingness to start using these systems. There are many considerations that may influence people’s willingness to adopt COVID-19 apps [68]. For example, a person may weigh the app’s features, the app’s benefits to themselves and their community [79], who is offering the app [39], how well the app will preserve the user’s privacy [15, 54, 83], and the app’s accuracy.1 In this work, we focus on how the latter two considerations, privacy and accuracy, as well as the primary public health benefit of the app, reduction in infection rate, influence Americans’ intent to adopt COVID-19 apps. To do so, we ran two surveys, the first sampled to be census-representative of the United States population and the second using Amazon Mechanical Turk.
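As a back-of-the-envelope illustration of this quadratic scaling (our gloss of the intuition, not a result reproduced from [15]): an exposure can only be detected if both the infected person and their contact run the app. If a fraction \( p \) of the population has installed it, a given contact pair is covered with probability

\[ p \times p = p^2, \]

so halving adoption roughly quarters the number of exposures the system can notify users about.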

What sets our examination apart from prior work? First, we believe that we are the first to build predictive models to estimate decision-aid adoption on the basis of both privacy and accuracy. We take a quantitative approach – validated in prior work measuring user acceptance of different levels of accuracy [47] – and conduct large-scale human-subject experiments with 4,615 respondents. We utilize this data to develop statistical models of the impact of accuracy and privacy on decision aid adoption rates, specifically, predicting how numeric increases in benefits such as a reduction in the false negative (FN) rate relate to increases in intention to adopt. Our quantitative models allow us to better estimate user response to potential app designs. For example, Saxena et al. [76] have shown that using Bluetooth as a method for detecting proximity—the technique used by the Apple-Google infrastructure underlying many contact tracing apps [5]—has an approximate error rate of 7–15%. Our empirical models allow us to estimate user adoption rates for apps with such error rates.

Second, we study decision aids that are broadly relevant to all members of society, significantly increasing the external validity of this work. Many prior experiments on data-driven decision aids asked survey respondents to imagine themselves as decision makers or in unfamiliar circumstances. However, respondents often have little experience making the kinds of decisions being studied, e.g., setting bail in the criminal justice system, and therefore have difficulty imagining themselves in the required context. This effect, called ego-centrism [27], makes it difficult for respondents to extract themselves sufficiently from their own life experiences and biases to give maximally realistic responses in hypothetical scenarios. This, in turn, limits the external validity of prior work examining how users will react to a novel data-driven decision aid. Unlike the decision aids often studied in prior work, the consequences of COVID-19 are very real for everyone, meaning respondents need not escape their ego-centrism and can respond based on their own personal experiences. Thus, COVID-19 provides a unique opportunity to study people’s willingness to adopt data-driven decision aids with high external validity: in a tangible, high-risk situation that is relevant to the entire population.

In summary, our work investigates three research questions:

(RQ1).

Do accuracy (false negatives and false positives) and/or privacy influence whether people are willing to install a COVID-19 app?

(RQ2).

When considering installing a COVID-19 app, do people with different sociodemographic characteristics weigh accuracy and/or privacy considerations differently?

(RQ3).

How does the amount of public health benefit, accuracy, and/or privacy offered by a COVID-19 app influence people’s reported willingness to adopt COVID-19 apps?

While these are interesting questions to ask across various geographical and cultural contexts, we focus on answering them in the context of the United States. In sum, we find a significant, predictive relationship between the amount of public health benefit, false negative and false positive rate, and privacy risk offered by a particular app and respondents’ reported willingness to install (RQ3). We find that both privacy and accuracy significantly influence installation intent (RQ1) and that respondents with different sociodemographics (ages, genders, political leanings), experiences (knowing someone who died of COVID-19), and internet skills weigh accuracy vs. privacy concerns differently (RQ2).

By empirically developing statistical models of people’s willingness to adopt data-driven decision aids in the COVID-19 context, we take an important step towards anticipating how users will react to other data-driven decision aids. These results can inform the design of future, more complex data-driven decision aids, including machine learning-powered models, to maximize their uptake and impact. Our findings also offer a methodological template for future quantitative modeling of how accuracy rates and privacy risk may affect the adoption, and ultimately the efficacy, of new data-driven decision aids.


2 RELATED WORK

Prior work has considered how accuracy [3, 47, 61, 75, 94, 95], privacy [3, 45, 52, 92, 96], and fairness [37, 50, 51, 74, 77, 81, 89] may impact societal acceptability and adoption of data-driven decision aids. Here, we review the relevant findings of this body of work.

Impact of Inaccuracies. There is a significant body of work studying the way inaccuracies impact users’ trust in machine learning systems. Dietvorst et al. [21], building off of earlier work by Dzindolet et al. [23], describe a phenomenon they term algorithm aversion, in which humans stop trusting an algorithm once they see it make a mistake, even if the algorithm outperforms humans on the task. Yin et al. [93] show that both a model’s stated accuracy on held-out data and its observed accuracy affect users’ trust in the model. Yu et al. [94, 95] explore the way that users’ trust in a model changes over time when observing errors, finding that system errors have an out-sized impact on user trust. Salem et al. [75] found that users are generally willing to work with robots that exhibit faulty behavior, although their trust in the efficacy of those robots significantly diminishes after observing errors. Panniello et al. [61] look at connecting the accuracy of recommendation systems to users’ trust in the recommended services and products and their willingness to purchase these services and products. They find that accuracy has only a limited impact on purchasing behavior, but does increase consumers’ trust. Finally, Kay et al. [47] systematically investigate how users perceive accuracy errors and establish an instrument for measuring an individual user’s willingness to tolerate inaccuracies; we leverage this pre-validated methodology in our work.

Impact of Privacy Risks. There has been significant work on the impact of privacy risk on users’ willingness to use various technologies. Most relevant to our work are examinations of privacy concerns related to IoT, location-tracking, and medical technologies. For instance, Hsu et al. [45] study the impact of privacy risks on users’ willingness to adopt and use Internet-of-Things devices, finding that privacy risks have only a weak effect on adoption. Xu et al. [92] and Zhao [96] each study the ways the privacy risks of location-based services impact users’ willingness to adopt the services powered by this technology. The former work finds that the intention of providers to implement privacy protections increases trust in the service and reduces perceived risk. The latter highlights that privacy concerns over location-based service systems could suppress users’ willingness to adopt them. Li et al. [52] investigate similar questions in the context of wearable medical devices, showing that potential users perform a privacy calculus informed by the device’s benefits, the health information’s sensitivity, the user’s attitude toward emerging technology, policy protections, and perceived prestige. Angst et al. [3] investigate users’ hesitance to adopt electronic medical records due to perceived privacy risks, demonstrating that concern for information privacy is a driving force behind the decision. Despite this large body of relevant work, to our knowledge, no prior work – particularly in the health domain – has examined how both privacy and accuracy affect users’ willingness to adopt a decision aid.

COVID-19 Apps. There have been many studies of the acceptance of COVID-19 contact-tracing apps [39, 53, 68, 79, 87], with most studies finding that users are concerned about app privacy, accuracy, and costs (e.g., mobile data use). However, to our knowledge no prior work quantified the impact of specific levels of accuracy and privacy on potential users’ intent to adopt these apps. End-user considerations for app adoption may vary based on the architecture of the contact-tracing app available to them [68]. Ahmed et al. [2] provide a comprehensive survey of COVID-19 app proposals and their underlying techniques. For a complete list of COVID-19 apps and proposals, we refer the reader to the citations contained within this work. Several important privacy-preserving COVID-19 app proposals include DP3T [83], PACT [15], BlueTrace [9], and the Google-Apple Exposure Notification Framework-based apps [5]. As described in Section 1, these apps generally broadcast rotating identifiers over Bluetooth. When someone tests positive for COVID-19, information is disseminated that allows all devices that were near the tested person to determine that their owners were likely exposed to COVID-19. We note that similar technology-based contact tracing solutions were proposed to combat the 2014 Ebola outbreak, but were not widely deployed [20, 73].
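To make the data flow of these decentralized designs concrete, the following is a deliberately simplified sketch of rotating-identifier exposure notification. It is our illustration, not the DP3T or Google-Apple specification: real protocols derive identifiers from cryptographic key schedules, rotate them on fixed intervals, and distribute positive reports through verified servers. All names here (Phone, rotate, check_exposure) are hypothetical.

```python
# Toy sketch of decentralized, rotating-identifier exposure notification.
# Real protocols (DP3T, Google-Apple Exposure Notification) use specified
# cryptography and verified report upload; this only illustrates the data flow.
import secrets


class Phone:
    def __init__(self):
        self.broadcast_log = []   # identifiers this phone has broadcast over time
        self.heard_log = set()    # identifiers heard from nearby phones
        self.current_id = None
        self.rotate()

    def rotate(self):
        """Start broadcasting a fresh random identifier (rotated periodically)."""
        self.current_id = secrets.token_hex(16)
        self.broadcast_log.append(self.current_id)

    def hear(self, other: "Phone"):
        """Record the identifier currently broadcast by a phone within range."""
        self.heard_log.add(other.current_id)

    def check_exposure(self, published_ids: set) -> bool:
        """Compare locally heard identifiers against identifiers published
        after their owner reported a verified positive test."""
        return bool(self.heard_log & published_ids)


# Usage: two phones come within Bluetooth range; one owner later tests positive.
alice, bob = Phone(), Phone()
bob.hear(alice)                        # proximity event recorded locally on bob's phone
published = set(alice.broadcast_log)   # alice uploads her identifiers after a positive test
print(bob.check_exposure(published))   # True: bob learns he was likely exposed
```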

Fairness and the Social Context of Data-driven Decision Aids. A related problem to the one we study in this work is the social impact of outsourcing critical decisions to data-driven decision aids. Prior work has shown that data-driven decision aids, and artificial intelligence more broadly, disproportionately harm marginalized populations and reinforce existing social inequities. For instance, ProPublica found that the COMPAS algorithm used to predict recidivism in the United States consistently overstated the risks for racial minority groups [4]. More recent work has continued to highlight and explore the way automated decision making can harm both individuals and communities, e.g., [11, 12, 57]. That work has spurred research evaluating the fairness of these decision systems, e.g., [37, 50, 51, 74, 77, 81, 89]. Issues of fairness and potential harm are incredibly important and should be considered deeply when deploying data-driven decision aids such as those discussed in this work. These issues are orthogonal, but complementary, to the issues of privacy and accuracy we consider in this work.


3 METHODOLOGY

We conducted two surveys intended to answer our research questions. Our first survey examined how inaccuracies and/or privacy leaks impact respondents’ willingness to install a COVID-19 app (RQ1). Additionally, we gathered demographic information to understand how identity and life experience impact this decision (RQ2). With our second survey, we develop empirical models to predict how the amount of personal and communal health benefit, and degree of privacy risk, for a particular COVID-19 app affects people’s intent to install (RQ3).

Our questionnaires evaluate respondents’ willingness to install Bluetooth proximity-based contact-tracing apps, agnostic of app architecture (e.g., centralized vs. decentralized). As detailed in the Introduction, either centralized or decentralized apps may have accuracy issues or may experience privacy leaks [10, 66, 68, 83, 86]. For example, regarding privacy, the MIT Pathcheck Foundation—which creates decentralized apps for multiple government jurisdictions using the Google-Apple notification framework—states that “while [contact-tracing] apps aim to obscure the identity of the person who is infected, accidental release of information sufficient to identify the person can occur on rare occasions, similar to accidental release of protected health information” [66].

In this section, we briefly discuss our questionnaires, questionnaire validation, sampling approaches, analysis approaches, and the limitations of our work. All studies were IRB approved by a federally-recognized ethics review board. The full content for our first survey, including question wording, is included in Table 1 and the content for our second survey is included in Table 2.

Table 1. Questions for Survey 1

Scenarios (each participant randomly assigned to one)

Proximity Scenario: Imagine that there is a mobile phone app intended to help combat the coronavirus. This app will collect information about who you have been near (within 6 feet), without revealing their identities. The app will use this information to alert you if you have been near someone who was diagnosed with coronavirus. If you decide to inform the app that you have been diagnosed with coronavirus, the app will inform those you’ve been near that they are at risk, without revealing your identity.

Location Scenario: Imagine that there is a mobile phone app intended to help combat the coronavirus. This app will collect information about your location. The app will use this information to notify you, without revealing anyone’s identity:

  • if you have been near someone who tested positive for coronavirus

  • about locations near you that were recently visited by people who tested positive for coronavirus

If you decide to report to the app that you have been diagnosed with coronavirus, the app will inform those you’ve been near that they are at risk without revealing your identity.

Control Branches

Control - Accuracy: Imagine that this app will work perfectly. It will never fail to notify you when you are at risk nor will it ever incorrectly notify you when you are not at risk.

Control - Privacy: Imagine that this app perfectly protects your privacy. It will never reveal any information about you to the US government, to a tech company, to your employer, or to anyone else.

Control - Both: Imagine that this app works perfectly and protects your privacy perfectly. It will never fail to notify you when you are at risk nor will it ever incorrectly notify you when you are not at risk. It will also never reveal any information about you to the US government, to a tech company, to your employer, or to anyone else.

Experimental Branches

Accuracy - False Negative: Imagine that this app occasionally fails to notify you when you have been near someone who was diagnosed with coronavirus.

Accuracy - False Positive: Imagine that this app occasionally notifies you that you have been near someone who has coronavirus when you actually have not been exposed.

Privacy: Imagine that this app might reveal information about [who you have been near/your location] to [entity]. This information may be used for a purpose unrelated to the fight against coronavirus.

  • First, each participant was randomly assigned to either the Proximity Scenario or the Location Scenario, seen in the first part of the table. Then, each participant was randomized into one of the branches in the second part of the table (either a control branch or an experimental branch). All participants were asked “Would you install this app?” after each branch in the second part of the table, with answer choices “Yes” and “No.” Additionally, the respondents were offered the answer choice “It depends on the [risk, chance of information being revealed, etc.]” in the experimental branches.


Table 2. Description of Survey 2

  • First, all participants were assigned to the same scenario, the proximity scenario, seen in the first part of the table. Then, each participant was randomized into one of the branches in the second part of the table. They were then asked if they would be willing to install the app.

3.1 State of COVID-19 Pandemic During Our Surveys

In order to properly contextualize our surveys for the reader, we briefly recall the socio-political context in which they were conducted. Both surveys were conducted with respondents located in the United States during May 2020, just four months after the first recorded American COVID-19 case on 21 January 2020. COVID-19 apps, like the ones that we study in this work, were first discussed broadly in April 2020 as a supplement to traditional contact tracing. By early May, the first contact-tracing apps had been deployed, such as North Dakota’s app, which had serious privacy flaws [56, 58]. During this early phase of the pandemic, there was significant uncertainty about the practices and tools that would help mitigate the spread of COVID-19. For instance, the World Health Organization did not suggest that everyone should wear face masks until June 2020. When we conducted our surveys, most American states were in a state of emergency, and stay-at-home orders were starting to be lifted after several weeks. By the end of May, official statistics had recorded 100,000 American COVID-19 deaths [14]. Thus, our surveys were conducted during a moment of significant distress and fear, during which COVID-19 apps seemed like they might become a significant tool in suppressing the spread of COVID-19.

3.2 First Survey (RQ1 and RQ2)

In our first survey we looked at how accuracy and/or privacy considerations might influence willingness to adopt (RQ1) and how the relative weight of these considerations might vary with respondent demographics (RQ2). We used a vignette survey [6] to examine these questions. In a vignette survey, respondents are given a short description of a hypothetical situation and then asked a series of questions about how they would respond. This initial situation can also be supplemented with condition-specific descriptions that augment the vignette shown to all respondents, which allows researchers to isolate specific, explicit differences between respondents in different branches. Vignette surveys have been shown to maximize external validity when examining hypothetical scenarios.

Questionnaire. We used a 2-by-2 between-subjects design to study accuracy and privacy concerns around contact-tracing apps that collect two different types of data. We framed our questionnaire by asking respondents to imagine that there exists a contact-tracing app that uses Bluetooth proximity information (Proximity Scenario) or GPS location information (Location Scenario) to identify contact, assigning respondents to the two groups at random, half to each. Each participant was then randomly assigned to either the control branch or the experimental branch, and given a series of branch-specific contexts. In the control branch, the respondents were, at random, assigned to one of the following three contexts: (1) the app has perfect accuracy, (2) the app has perfect privacy, or (3) the app has both perfect accuracy and perfect privacy. In the experimental branch, respondents were randomly assigned to one of three contexts: (1) the app has false negatives (“this app occasionally fails to notify you...”), (2) the app has false positives (“this app occasionally notifies you...when you actually have not been exposed”), or (3) the app might leak private information. For this final context, we asked about information leakage to four entities, drawn from prior work [39]: “non-profit institutions verified by the government”, “technology companies”, “the US government”, and “your employer.” In both branches, respondents were asked “Would you install this app?” to which they could respond “Yes”, “No”, or, in the experimental branch only, “It depends on the [risk, chance of information being revealed, etc.]”. See Table 1 for a more detailed description of this questionnaire.

Validation. The questionnaire was validated through expert reviews with multiple researchers. Additionally, we included three attention-check questions—one general attention check and two scenario-specific attention checks—to ensure respondents understood the decision scenario.

Sample. We contracted with the survey research firm Cint to administer the survey to 789 Americans in May 2020; we quota sampled on age, gender, income, and race to ensure the sample was representative of US population demographics.

Analysis. We answer RQ1 using \( \chi^2 \) proportion tests, corrected for multiple comparisons using the Bonferroni-Holm method where appropriate, to compare responses to our different sets of questions. We answer RQ2 by constructing two mixed-effects binomial logistic regression models. Our dependent variable (DV) is willingness to install the app, with “Yes” and “It depends on the risk” as a positive outcome and “No” as a negative outcome. We model responses to the accuracy and privacy questions separately, controlling for data type and entity in the privacy model, and both data and accuracy type in the accuracy model. We include as additional variables the respondents’ age, gender, race, internet skill (as measured using the Web Use Skill Index [38]), level of educational attainment, party affiliation, and whether the respondent knows someone who died due to complications from COVID-19. We include a mixed-effects term to account for our repeated measures design.
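For concreteness, the following is a minimal Python sketch of this analysis pipeline. It is illustrative only: the file name, column names, and branch labels (e.g., survey1_responses.csv, install, branch) are hypothetical, and the per-respondent random intercept used in our actual mixed-effects models is noted but omitted for brevity.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.proportion import proportions_chisquare
from statsmodels.stats.multitest import multipletests

df = pd.read_csv("survey1_responses.csv")  # hypothetical file and column names

# RQ1: chi-squared proportion tests of willingness to install, comparing each
# experimental branch against the control branch, Holm-corrected.
control = df[df.branch == "control_both"]
pvals = []
for cond in ["accuracy_fn", "accuracy_fp", "privacy"]:
    exp = df[df.branch == cond]
    stat, p, _ = proportions_chisquare(
        [exp.install.sum(), control.install.sum()],  # counts of positive responses
        [len(exp), len(control)],                     # sample sizes
    )
    pvals.append(p)
reject, p_holm, _, _ = multipletests(pvals, method="holm")

# RQ2: binomial logistic regression of willingness to install apps with accuracy
# errors ("Yes"/"It depends" coded 1, "No" coded 0).  Our actual models also
# include a per-respondent random intercept for the repeated-measures design;
# that mixed-effects term is omitted from this sketch.
acc = df[df.branch.isin(["accuracy_fn", "accuracy_fp"])]
model = smf.logit(
    "install ~ C(question, Treatment(reference='FN')) + C(data_type) + age"
    " + C(gender) + covid_death + internet_skill + C(party) + C(education)",
    data=acc,
).fit()
print(model.summary())  # exponentiate coefficients to obtain odds ratios
```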

3.3 Second Survey (RQ3)

We use the data from this survey to predict people’s intent to install COVID-19 apps based on the amount of public health (infection rate reduction) and individual health (notification of at-risk status – e.g., accuracy) benefit, or privacy risk, of a hypothetical COVID-19 tracking app (RQ3).

Questionnaire. All questions were asked in the context of an architecture-agnostic Bluetooth proximity-based contact-tracing app, as the type of information compromised (location vs. contacted individuals), as well as the entity that could compromise the information (Employer, Government, etc.) had relatively little effect on willingness to install in our first survey (see Section 4.1).

Participants in this survey were randomly assigned to one of four branches: public benefit, false negative, false positive, or explicit privacy. As the degree of privacy risk for contact-tracing apps is not yet quantified, we drew estimates of privacy risk from respondents’ own perceptions. To this end, respondents in the first three branches were first asked to estimate the risk that information collected by this contact tracing app would be leaked (implicit privacy assessment), using a modified version of the Paling Perspective scale [60] (see Table 2), which has been validated for eliciting perceptions of digital security risk [42] as well as health risk. Then, each respondent was given a context about the benefit (or privacy risk) of the contact tracing app. Each context provides a baseline of the benefit, or privacy risk, for individuals who do not install the app (see Table 2 for per-branch baselines). All survey respondents were then asked “Would you install this app?,” with answer options “Yes” and “No”.

In the public benefit branch, respondents were given the context that individuals with the app installed were infected X% less often (see Table 2 for values of X) than those who did not have the app installed. For both the false positive and false negative branches, respondents were told that individuals without the app would be notified of a COVID-19 exposure 1 in 100 times. Respondents in the false negative branch were told that the app has a false negative rate of X%; respondents in the false positive branch were told that the app had a false negative rate of \( 0\% \), but a false positive rate of X%. Finally, respondents in the explicit privacy branch were told explicitly of the privacy risk of the app: X in 1,000 people who use this app will have their information compromised. They were then asked the same question as the false negative branch. See Table 2 for an overview.

Although we have used percentages above for brevity, no information in this survey was expressed in terms of percentages, following best practices from prior work on how to measure the acceptability of different levels of machine-learning system accuracy [47] and prior work in health risk communication and numeracy showing that people reason more accurately with frequency formats than with percentages [29, 30, 48, 72]. Similarly, we avoided technical terms like “false negative” and “false positive,” instead describing the practical ramifications of the situations. See Table 2 for questionnaire wording.

Validation. The questionnaire and respondent answers were validated in the same way as Survey 1.

Sample. We surveyed 3,826 Amazon Mechanical Turk workers located in the United States. These workers were split into different survey branches (see above), so all results sections note the number of responses analyzed. There are always concerns about the generalizability of crowdsourced results [41, 59, 70]. Recent work has shown that Amazon Mechanical Turk results generalize well [59, 62], including in the security and privacy domain [70]. To further address generalizability concerns in our application area in particular, we also conducted Survey 1 on Amazon Mechanical Turk. We found only one significant difference,2 with a small effect size, between the two samples. Our goal is not to provide precise point estimates of phenomena in the entire U.S. population [7]; given the quantitative nature of the RQ3 survey, the sample size required, the lack of differences on the most relevant questions (about accuracy and privacy vs. leaks to particular entities), and prior work on the validity of crowdsourced results, we chose to proceed with Amazon Mechanical Turk.

Analysis. We develop predictive models using the data obtained in this survey via binomial logistic regression analysis.3 For benefits and accuracy, we construct models with willingness to install as the dependent variable, the amount of benefit or error (e.g., chance of FN) as the independent variable, and the respondents’ perceived implicit privacy risk for the app as a control variable. To evaluate the impact of privacy on decision making, we construct a binomial logistic regression model with willingness to install as the dependent variable and explicit privacy risk as the independent variable. To further evaluate the impact of privacy on decision making and to validate our measurement of the implicit privacy perceptions, we use \( \chi^2 \) proportion tests to compare the proportion of respondents willing to install given an FN rate in the implicit and explicit privacy conditions.
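A minimal sketch of these models, again with hypothetical file and column names (survey2_false_negative_branch.csv, install, fn_rate, implicit_privacy_risk, leak_risk), might look as follows; the other branches are fit analogously.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

fn = pd.read_csv("survey2_false_negative_branch.csv")  # hypothetical file/columns

# Willingness to install (0/1) as a function of the stated false-negative rate,
# controlling for the respondent's implicit privacy-risk estimate.
fn_model = smf.logit("install ~ fn_rate + implicit_privacy_risk", data=fn).fit()
print(np.exp(fn_model.params))      # odds ratios
print(np.exp(fn_model.conf_int()))  # 95% confidence intervals on the odds-ratio scale

# Explicit-privacy branch: install as a function of the stated leak risk.
priv = pd.read_csv("survey2_explicit_privacy_branch.csv")  # hypothetical
priv_model = smf.logit("install ~ leak_risk", data=priv).fit()
```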

Model Validation. We did not have strong prior knowledge of how exactly the outcomes varied with the quantitative values for the benefit, error, or risk for the app. Thus, we considered models of varying complexity (polynomial degree) to account for the observed responses. 80% of the data were used to fit and select the best model based on the average RMSE estimates across 5-fold cross-validation at each of 10 potential polynomial degrees; final performance is quoted on the remaining 20% test set. We use first-degree models, which offered the lowest RMSE.
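The selection procedure can be sketched as follows (illustrative scikit-learn code with the same hypothetical column names as above; our actual implementation may differ in details such as fold seeding):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

fn = pd.read_csv("survey2_false_negative_branch.csv")  # hypothetical file/columns
X = fn[["fn_rate", "implicit_privacy_risk"]].to_numpy()
y = fn["install"].to_numpy()

# Hold out 20% of responses for final evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

def cv_rmse(degree: int) -> float:
    """Mean RMSE of predicted install probabilities across 5-fold CV."""
    errs = []
    for tr, va in KFold(n_splits=5, shuffle=True, random_state=0).split(X_tr):
        pipe = make_pipeline(PolynomialFeatures(degree),
                             LogisticRegression(max_iter=1000))
        pipe.fit(X_tr[tr], y_tr[tr])
        prob = pipe.predict_proba(X_tr[va])[:, 1]
        errs.append(np.sqrt(np.mean((prob - y_tr[va]) ** 2)))
    return float(np.mean(errs))

best_degree = min(range(1, 11), key=cv_rmse)  # degree 1 gave the lowest RMSE in our data
final = make_pipeline(PolynomialFeatures(best_degree),
                      LogisticRegression(max_iter=1000))
final.fit(X_tr, y_tr)
print(final.score(X_te, y_te))  # accuracy quoted on the held-out 20% test set
```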

Important Considerations for Interpreting Models of Human Behavior. Models of human behavior notoriously achieve relatively low accuracy compared to many models developed in computer science and related fields, with \( R^2 \) for “good” models of human behavior approximating 30–40% explanation of variance and prediction accuracy between 60 and 70% [31, 33, 49, 69, 90]. This lower accuracy is due to a number of factors including high levels of variance in human behavior, estimation of performance on single-period-in-time measurements—which under-count predictive power for repeated decisions such as app installation—and compounded behavioral and self-report biases (see Limitations for more detail on how we mitigate these biases) [1, 43].

3.4 Limitations

As with all surveys, the answers represented in these results are people’s self-reported intentions regarding how to behave. As shown in prior literature on security, these intentions are likely to align directionally with actual behavior, but are likely to over-estimate actual behavior [32, 71]. As described in the questionnaire validation sections above, we took multiple steps to minimize self-report biases. The goal of this work is to show how willingness to adopt may be influenced by privacy/accuracy considerations, and thus model results should not be interpreted as exact adoption estimates.


4 RESULTS

4.1 Both Accuracy & Privacy Influence Whether People Want to Install a COVID App

Flaws in both accuracy and privacy significantly4 relate to respondents’ reported willingness to adopt COVID-19 apps, as shown in Figure 1. When considering apps purported to be perfect, we find that respondents do not significantly differentiate between perfect privacy vs. perfect accuracy vs. perfect accuracy & privacy (\( \chi^2 \) prop. omnibus test, p = 0.178).


Fig. 1. Reported willingness to install a COVID-19 contact tracing app depending on the app context. Std. error bars are shown.

We find that significantly (\( \chi^2 \) prop. test, \( p\lt 0.001 \)) more people say their decision would depend on the amount of accuracy error than the amount of privacy risk (the yellow bars in Figure 1). We examine in more depth how the amount of accuracy error influences reported willingness to adopt below.

Respondents differentiate between types of accuracy error. When provided no information about the false positive rate, 8% fewer respondents reported being willing to install an app with false negatives compared to one with false positives or one with privacy leaks to any of the entities examined (\( \chi^2 \) prop. tests BH corrected, both with \( p\lt 0.01 \)).

Finally, focusing on privacy leaks, we find that respondents’ reported willingness to install did not significantly differ (\( \chi^2 \) prop. tests BH corrected, all with \( p\gt 0.05 \)) based on what data the app might leak to a particular entity, except for hypothetical leaks to the respondents’ employer. Only 23% of respondents were willing to install an app that might leak their locations to their employer while 31% were willing to install an app that might leak information about who they have been near (their proximity data) to their employer.

4.2 Some Americans Weigh Accuracy or Privacy Considerations More Highly than Others

In order to examine whether some Americans weigh accuracy or privacy considerations more highly than others, we construct two mixed-effects logistic regression models as described in Section 3.2. We evaluate model fit by building our models with an 80:20 train-test split;5 however, we note that these models are intended to provide descriptive insight into how respondents weigh accuracy and privacy considerations and should be interpreted as such. In the remainder of this section we report descriptively on a model built on the full data set.

First, considering respondents’ willingness to install apps with accuracy errors (Table 3 (left)), we validate that even when controlling for demographic variance, respondents were more comfortable with false positives (spurious exposure notifications) than false negatives (missed notifications after an exposure). Additionally, emphasizing the relevance of ego-centricity [27] in people’s considerations of data-driven decision aids, we find that those who know someone who died from COVID-19 were more likely to report being willing to install an app with accuracy errors than those who do not know someone who died.

Table 3. Left: Mixed-effects logistic regression model of willingness to install apps with accuracy errors. Right: Mixed-effects logistic regression model of willingness to install apps with privacy errors.

Installing Apps With Accuracy Errors
Variable | OR | CI | p value
Question: False Positive | 1.65 | [1.18, 2.32] | \( \lt \)0.01**
Data: proximity | 0.94 | [0.55, 1.61] | 0.83
Age | 0.99 | [0.97, 1] | 0.1
Gender: female | 1.27 | [0.2, 8.21] | 0.8
COVID19 Death | 5.56 | [2.35, 13.13] | \( \lt \)0.01***
High medical risk | 1.60 | [0.84, 3.03] | 0.15
Internet Skill | 1.77 | [1.22, 2.56] | \( \lt \)0.01**
Pol. leaning: Democrat | 0.60 | [0.34, 1.05] | 0.07
Edu.: BA+ | 1.24 | [0.61, 2.51] | 0.55
Edu.: SC | 1.54 | [0.7, 3.36] | 0.28

Installing Apps With Privacy Errors
Variable | OR | CI | p value
Entity: Tech Company | 1.18 | [0.79, 1.76] | 0.41
Entity: Employer | 0.81 | [0.54, 1.2] | 0.29
Entity: NonProfit | 2.38 | [1.6, 3.56] | \( \lt \)0.01***
Data: proximity | 1.97 | [0.92, 4.21] | 0.08
Age | 0.93 | [0.91, 0.95] | \( \lt \)0.01***
Gender: female | 0.38 | [0.17, 0.82] | 0.01*
COVID19 Death | 1.14 | [0.37, 3.52] | 0.82
High medical risk | 1.95 | [0.79, 4.8] | 0.15
Internet Skill | 1.78 | [1.23, 2.58] | \( \lt \)0.01**
Pol. leaning: Democrat | 2.70 | [1.23, 5.94] | 0.01*
Edu.: BS+ | 0.55 | [0.2, 1.51] | 0.25
Edu.: SC | 0.82 | [0.27, 2.47] | 0.72

  • In both models, the question baseline is FN, the data baseline is location, the political-leaning baseline is Republican, and the mixed-effects term controls for the within-subjects design. COVID19 Death refers to respondents who know someone who died from complications due to COVID-19. In both tables, OR is the odds ratio.

Second, considering privacy risk (Table 3 (right)), we find that respondents were more comfortable installing an app with potential privacy leaks to a nonprofit organization verified by the government than an app with potential leaks to any other entity (their employer, a technology company, or the U.S. government). In this controlled model we find that the type of data leaked (proximity vs. location data) did not significantly affect reported willingness to install (see the previous section for a more detailed examination of this point).

Women and respondents who are younger are less likely to report that they would install an app with privacy errors. The gender finding aligns with past work showing that women may be more privacy sensitive than men [44, 67]. Further, Democrats are more likely than Republicans to report intent to install an app with privacy risks, potentially reflecting the increasingly politicized nature of privacy [91] and the COVID-19 pandemic itself at the time of the survey.

Finally, those who have higher internet skills are more willing to install an app that has either errors in accuracy or privacy leaks. This is in line with findings from prior work showing that those with higher skills are more willing to install COVID-19 apps in general [39], perhaps due to greater confidence in their ability to install and use these apps [19, 40].

4.3 Amount of Public Health and Individual Benefit Influence Willingness to Install

In the findings above, we validate that the individual considerations of accuracy and privacy both impact reported willingness to install. Further, we find that the amount of accuracy error is especially important to people’s decision making about whether or not to use a data-driven decision aid: at least 30% of our respondents reported that their installation decision depended on the amount of error.

Thus, in our second survey we examine whether we can predict how a quantified amount of public health benefit (i.e., infection rate reduction) or individual benefit (i.e., FN and FP rates) impacts intended adoption rates. Figure 2 provides an overview of these findings. To examine the relationship between amount of benefit and willingness to install beyond visual inspection, we construct logistic regression models as described in Section 3.3.


Fig. 2. Willingness vs. amount of public (top left) and individual (top right) benefit, and FP rate (bottom).

With respect to public health benefit, on 20% test data, we can predict reported willingness to install the app with 63.6% accuracy (null model accuracy: 54.0%, threshold = 0.5).6 We find that, for every 1% reduction to infection rate offered by the app, respondents are 4% more likely to report that they would install (O.R. 95% CI: [0.95, 0.98], \( p\lt 0.001 \)).

With respect to accuracy, we can predict reported willingness to install a COVID-19 app based on the false negative rate with 70.0% accuracy (null model accuracy: 52.0%, threshold = 0.5) and based on false positive rate with 62.5% accuracy (null model accuracy: 41.0%, threshold = 0.5). For every 1% increase (O.R. 95% CI: [1.01, 1.02], \( p\lt 0.001 \)) in app sensitivity, respondents are 1% more likely to report that they would install. For every 1% decrease (O.R. 95% CI: [0.98,0.99], \( p\lt 0.001 \)) in false positive rate, respondents are 1% more likely to report that they would install.
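To make the odds-ratio reporting above concrete, the conversion is as follows (our gloss; the value of 0.96 below is simply an illustrative figure near the midpoint of the reported confidence interval, not an additional estimate). For a logistic regression coefficient \( \beta \), a one-unit increase in the predictor multiplies the odds of installing by the odds ratio

\[ \mathrm{OR} = e^{\beta}, \]

and a one-unit decrease multiplies the odds by \( 1/\mathrm{OR} \). For example, an odds ratio of roughly \( 0.96 \) per one-percentage-point increase in infection rate implies \( 1/0.96 \approx 1.04 \), i.e., about a 4% increase in the odds of installing for each percentage point of infection-rate reduction.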

4.4 Amount of Privacy Risk Also Influences Willingness to Install

Next, we model willingness to install based on explicit privacy risk. We can predict willingness to install based on privacy risk with 65.5% accuracy (null model accuracy: 34.5%, threshold = 0.5). For privacy risk (recall that magnitude of privacy risk is far smaller than benefit rates or accuracy errors), we observe that a 52% decrease in privacy risk results in a 1% increase in intent to install (O.R. 95% CI: [0.3, 0.74], \( p\lt 0.01 \)).

Further, we confirm the relevance of privacy risk, implicit or explicit, in respondent decision making – and validate the equivalency of explicitly stated privacy risk and our measurements of respondents’ implicit privacy perceptions – by comparing the proportion of respondents who were willing to install a COVID-19 app given an explicit statement of privacy risk (privacy risks were drawn from the portion of the implicit risk distribution reported by the majority of respondents) vs. their own implicit perception. We find no significant difference between the proportion of respondents who intend to install an app with a given false negative rate when relying on their own implicit privacy assumptions vs. an explicit statement of the risk of privacy leak.

Finally, further confirming the relevance of benefits, accuracy, and privacy, modeling respondents’ response to benefit/error rates (e.g., infection rates, FN rate, FP rate)—including implicit privacy risk as a control in the regression—significantly improves model fit in all three models (likelihood ratio tests [65], \( p\lt 0.05 \) for all models).
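As an illustration of this nested-model comparison, the likelihood-ratio test can be sketched as follows (hypothetical file and column names, as before):

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

fn = pd.read_csv("survey2_false_negative_branch.csv")  # hypothetical file/columns

# Nested models: does adding the implicit privacy-risk control improve fit
# over the error-rate term alone?
reduced = smf.logit("install ~ fn_rate", data=fn).fit(disp=False)
full = smf.logit("install ~ fn_rate + implicit_privacy_risk", data=fn).fit(disp=False)

lr_stat = 2 * (full.llf - reduced.llf)       # likelihood-ratio statistic
df_diff = full.df_model - reduced.df_model   # one extra parameter
p_value = chi2.sf(lr_stat, df_diff)
print(f"LR = {lr_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 -> control improves fit
```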


5 DISCUSSION AND CONCLUSION

In this work we statistically analyzed two surveys of a total of 4,615 Americans to better understand people’s willingness to use data-driven decision aids when these systems are imperfect. Understanding how to encourage adoption is critical as the importance of these systems grows.

We specifically focus our study on Americans’ intent to adopt data-driven decision aids in the context of COVID-19. We do so for three reasons. First, virtually everyone has been personally impacted by the global pandemic. As such, it is easy for respondents to imagine themselves as decision makers. This overcomes limitations of prior work in which subjects were asked to imagine themselves in an unfamiliar situation [27]. Second, the accuracy and privacy risks we ask respondents to imagine are far from theoretical [56, 58]. Third, given the immense need for reductions in COVID-19 infections, it is critical to understand how well COVID-19 data-driven decision aids need to perform to meet both individual and public health needs.

Our results offer clear evidence that both inaccuracies and privacy risks can impede the adoption of important data-derived decision aids. Specifically, we find that:

  • (RQ1). Users are more willing to install COVID-19 apps that have perfect accuracy, perfect privacy, or both than apps with accuracy or privacy flaws. This finding clearly motivates our other research questions, as it is natural to inquire how perfect the app must be.

  • (RQ2). Respondents with different socio-demographic characteristics, life experiences, or technological comfort consider accuracy and privacy differently. Namely, we find individuals with higher internet skill and those that knew someone who died from COVID-19 were more willing to tolerate accuracy errors. Additionally, women and younger respondents were less likely to install an app that risked disclosing private information. Users with higher internet skill, on the other hand, were more likely to install such an app. These results remind developers that the life experiences and proclivities of certain groups can significantly impact the adoption rate of a data-driven decision aid like a COVID-19 app. For instance, significantly higher tolerance for accuracy errors among respondents who knew someone who died from COVID-19 might indicate that these users considered the inconveniences that the app might pose to be worth it. Similarly, women’s lower tolerance for privacy risks is in line with prior work; this preference persists despite the pressing public health need.

  • (RQ3). Finally, we found that people care about the amount of privacy risk, in addition to amount of accuracy and/or benefit; but are largely neutral regarding to whom their data might leak. While proactively computing the risk of a data leak may be difficult when developing an app, our results indicate that increased efforts on the part of app developers can increase users’ trust and willingness to adopt. Put another way, it is important for developers to invest resources to decrease the chances of a data leak, even if they are not able to reduce the chances of data leakage to zero. Similarly, efforts to improve the accuracy of a system or efficacy of a system in addressing its stated goals, even by a few percentage points, will reap benefits in adoption.

Interestingly, users are less willing to make use of these systems in the presence of false negatives than false positives. In our study context of COVID-19, false negatives represent a threat to personal and communal safety, whereas false positives are merely inconvenient (requiring additional testing or quarantine). Of course, not using a COVID-19 app will guarantee that the user will never see any possible exposure notifications that the app would have generated. In that sense, not installing a COVID-19 app gives the user a 100% false negative rate and a 0% false positive rate. However, respondents may have assumed that installing an additional app on their device might have had other side effects, like privacy risks or decreased battery performance, that would outweigh the benefit of installing an app that fails to accomplish its main goal of producing notifications.

Since we first conducted our surveys, the COVID-19 pandemic has started to slow in the United States. Despite the initial excitement around COVID-19 apps, COVID-19 apps largely failed to live up to their promise [80, 82]. Reports indicate that accuracy problems may have contributed to the lack of uptake, along with significant logistical and policy failures [80]. While our work cannot make strong causal claims as to why COVID-19 apps were a failure, our results indicate that lax privacy controls in early deployed apps [56, 58] and concerns over the app’s accuracy or efficacy may have reduced enthusiasm for COVID-19 app adoption.

5.1 Implications Beyond COVID-19 Apps

While our results offer insight into data-driven decision aids in the context of a global pandemic, more broadly, they speak to the importance of both accuracy and privacy in modeling data-driven decision aid adoption. Although the goal is always to create perfect decision aids, inaccuracy is inherent. Moreover, because of the data-hungry nature of many machine-learning based decision aids, it is doubtful that invasive data collection will disappear in the near future. As such, our work suggests that decision aid designers should address these factors explicitly and consider them closely when deploying their systems. Critically, developers must remember that justifying low accuracy or high risk of privacy breaches with high utility and social good is unlikely to be a successful strategy for earning user trust. Especially for data-driven decision aids that need widespread adoption in order to be useful, ensuring that the system maintains appropriate accuracy thresholds and addresses privacy risk before deployment is critical to achieving and maintaining adoption rates.

Footnotes

  1. Due to the emerging nature of this work, no peer-reviewed publications on this topic were available for reference; thus, we include references to research-based pre-prints and popular press articles.

  2. MTurk workers were significantly less likely to say they would install an app that was provided by a technology company and relied on proximity data: \( \chi^2 = 4.2621 \), \( p=0.039 \); Holm-Bonferroni correction was applied to account for multiple tests on the same sample.

  3. While we explored using more complex models such as decision trees, these models offered no performance benefits and thus we proceeded with the simplest modeling strategy.

  4. \( \chi^2 \) prop. tests in comparison to the control conditions, \( p\lt 0.05 \), Bonferroni-Holm (BH) correction.

  5. The accuracy model predicts reported willingness to install with an accuracy of 86.7% (null model: 51.3% accuracy) with threshold 0.5; the privacy model has an accuracy of 92.5%.

  6. As noted in Section 3.3, accuracy rates for models of human behavior are significantly lower than typical machine-learning model accuracy rates.

REFERENCES

  1. [1] Abelson Robert P.. 1985. A variance explanation paradox: When a little is a lot. Psychological Bulletin 97, 1 (1985), 129133.Google ScholarGoogle Scholar
  2. [2] Ahmed Nadeem, Michelin Regio A., Xue Wanli, Ruj Sushmita, Malaney Robert, Kanhere Salil S., Seneviratne Aruna, Hu Wen, Janicke Helge, and Jha Sanjay K.. 2020. A survey of COVID-19 contact tracing apps. IEEE Access 8 (2020), 134577134601. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  3. [3] Angst Corey M. and Agarwal Ritu. 2009. Adoption of electronic health records in the presence of privacy concerns: The elaboration likelihood model and individual persuasion. Management Information Systems Quarterly 33, 2 (June 2009), 339370.Google ScholarGoogle ScholarCross RefCross Ref
  4. [4] Angwin Julia, Larson Jeff, Mattu Surya, and Kirchner Lauren. 2016. Machine Bias: There's Software Used Across the Country to Predict Future Criminals. And it's Biased Against Blacks. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. (2016). (Accessed on Oct. 12, 2021).
  5. [5] Apple and Google. 2020. Privacy-Preserving Contact Tracing. https://www.apple.com/covid19/contacttracing. (2020). (Accessed on Oct. 12, 2021).
  6. [6] Atzmüller Christiane and Steiner Peter M.. 2010. Experimental vignette studies in survey research. Methodology 6, 3 (2010), 128–138.
  7. [7] Baker Reg, Blumberg Stephen J., Brick J. Michael, Couper Mick P., Courtright Melanie, Dennis J. Michael, Dillman Don, Frankel Martin R., Garland Philip, Groves Robert M., Kennedy Courtney, Krosnick Jon, and Lavrakas Paul J.. 2010. Research synthesis: AAPOR report on online panels. Public Opinion Quarterly 74, 4 (2010), 711–781.
  8. [8] Bansal Gagan, Nushi Besmira, Kamar Ece, Lasecki Walter S., Weld Daniel S., and Horvitz Eric. 2019. Beyond accuracy: The role of mental models in human-AI team performance. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing. 2–11.
  9. [9] Bay Jason, Kek Joel, Tan Alvin, Hau Chai Sheng, Yongquan Lai, Tan Janice, and Quy Tang Anh. 2020. BlueTrace: A Privacy-preserving Protocol for Community-driven Contact Tracing Across Borders. Government Technology Agency-Singapore, Technical Report (2020).
  10. [10] Bengio Yoshua, Ippolito Daphne, Janda Richard, Jarvie Max, Prud'homme Benjamin, Rousseau Jean-François, Sharma Abhinav, and Yu Yun William. 2020. Inherent privacy limitations of decentralized contact tracing apps. Journal of the American Medical Informatics Association 28, 1 (June 2020), 193–195.
  11. [11] Benjamin Ruha. 2019. Race after technology: Abolitionist tools for the New Jim Code. Social Forces 98, 4 (2019), 1–3.
  12. [12] Buolamwini Joy and Gebru Timnit. 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency. PMLR, 77–91.
  13. [13] Caruana Rich, Lou Yin, Gehrke Johannes, Koch Paul, Sturm Marc, and Elhadad Noemie. 2015. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1721–1730.
  14. [14] Centers for Disease Control and Prevention. 2020. United States Coronavirus (COVID-19) Death Toll Surpasses 100,000. https://www.cdc.gov/media/releases/2020/s0528-coronavirus-death-toll.html. (28 May 2020).
  15. [15] Chan Justin, Gollakota Shyam, Horvitz Eric, Jaeger Joseph, Kakade Sham, Kohno Tadayoshi, Langford John, Larson Jonathan, Singanamalla Sudheesh, Sunshine Jacob, et al. 2020. PACT: Privacy sensitive protocols and mechanisms for mobile contact tracing. arXiv preprint arXiv:2004.03544 (2020).
  16. [16] Chouldechova Alexandra and Roth Aaron. 2020. A snapshot of the frontiers of fairness in machine learning. Commun. ACM 63, 5 (2020), 82–89.
  17. [17] Condit Rich. 2020. Infection Fatality Rate – A critical missing piece for managing COVID-19. https://www.virology.ws/2020/04/05/infection-fatality-rate-a-critical-missing-piece-for-managing-covid-19/. (2020). (Accessed on May 8, 2020).
  18. [18] coronawarn. 2020. CoronaWarn. https://www.coronawarn.app/de/. (2020). (Accessed on Oct. 12, 2021).
  19. [19] Coutinho Savia. 2008. Self-efficacy, metacognition, and performance. North American Journal of Psychology 10, 1 (2008), 165–172.
  20. [20] Danquah Lisa O., Hasham Nadia, MacFarlane Matthew, Conteh Fatu E., Momoh Fatoma, Tedesco Andrew A., Jambai Amara, Ross David A., and Weiss Helen A.. 2019. Use of a mobile application for Ebola contact tracing and monitoring in northern Sierra Leone: A proof-of-concept study. BMC Infectious Diseases 19, 1 (2019), 810.
  21. [21] Dietvorst Berkeley J., Simmons Joseph P., and Massey Cade. 2015. Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General 144, 1 (2015), 114–126.
  22. [22] Dillet Romain. 2020. France Releases Contact-tracing App StopCovid. TechCrunch. https://techcrunch.com/2020/06/02/france-releases-contact-tracing-app-stopcovid-on-android/. (2020). (Accessed on Oct. 1, 2020).
  23. [23] Dzindolet Mary, Pierce Linda, Beck Hall, and Dawe Lloyd. 2002. The perceived utility of human and automated aids in a visual detection task. Human Factors 44, 1 (2002), 79–94.
  24. [24] Eames K. T. and Keeling M. J.. 2003. Contact tracing and disease control. Proceedings of the Royal Society B: Biological Sciences 270, 1533 (Dec. 2003), 2565–2571.
  25. [25] Egelman Serge, Felt Adrienne Porter, and Wagner David. 2013. Choice architecture and smartphone privacy: There's a price for that. In The Economics of Information Security and Privacy. Springer, 211–236.
  26. [26] EU eHealth Network. 2020. Mobile Applications to Support Contact Tracing in the EU's Fight Against COVID-19. https://ec.europa.eu/health/sites/health/files/ehealth. (2020). (Accessed on Apr. 27, 2020).
  27. [27] Epley Nicholas and Caruso Eugene M.. 2004. Egocentric ethics. Social Justice Research 17, 2 (2004), 171–187.
  28. [28] Friedman Arik, Knijnenburg Bart P., Vanhecke Kris, Martens Luc, and Berkovsky Shlomo. 2015. Privacy aspects of recommender systems. In Recommender Systems Handbook. Springer, 649–688.
  29. [29] Gigerenzer Gerd, Hertwig Ralph, van den Broek Eva, Fasolo Barbara, and Katsikopoulos Konstantinos V.. 2005. "A 30% chance of rain tomorrow": How does the public understand probabilistic weather forecasts? Risk Analysis: An International Journal 25, 3 (2005), 623–629.
  30. [30] Gigerenzer Gerd and Hoffrage Ulrich. 1995. How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review 102, 4 (1995), 684–704.
  31. [31] Glaeser Edward L., Sacerdote Bruce, and Scheinkman Jose A.. 1996. Crime and social interactions. The Quarterly Journal of Economics 111, 2 (1996), 507–548.
  32. [32] Glasgow Garrett, Butler Sarah, and Iyengar Samantha. 2020. Survey response bias and the 'privacy paradox': Evidence from a discrete choice experiment. Applied Economics Letters 28, 8 (2020), 625–629.
  33. [33] Goel Sharad, Hofman Jake M., Lahaie Sebastien, Pennock David M., and Watts Duncan J.. 2010. Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences (PNAS) 107, 41 (2010), 17486–17490.
  34. [34] Green Ben and Chen Yiling. 2019. Disparate interactions: An algorithm-in-the-loop analysis of fairness in risk assessments. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*'19).
  35. [35] Green Ben and Chen Yiling. 2019. The principles and limits of algorithm-in-the-loop decision making. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–24.
  36. [36] Grgić-Hlača Nina, Engel Christoph, and Gummadi Krishna P.. 2019. Human decision making with machine assistance: An experiment on bailing and jailing. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–25.
  37. [37] Grgić-Hlača Nina, Redmiles Elissa M., Gummadi Krishna P., and Weller Adrian. 2018. Human perceptions of fairness in algorithmic decision making. In Proceedings of the 2018 World Wide Web Conference on World Wide Web (WWW'18). ACM Press.
  38. [38] Hargittai Eszter. 2009. An update on survey measures of web-oriented digital literacy. Social Science Computer Review 27, 1 (2009), 130–137.
  39. [39] Hargittai Eszter, Redmiles Elissa M., Vitak Jessica, and Zimmer Michael. 2020. Americans' willingness to adopt a COVID-19 tracking app. First Monday 25, 11 (2020).
  40. [40] Hargittai Eszter and Shaw Aaron. 2015. Mind the skills gap: The role of Internet know-how and gender in differentiated contributions to Wikipedia. Information, Communication & Society 18, 4 (2015), 424–442.
  41. [41] Hargittai Eszter and Shaw Aaron. 2020. Comparing Internet experiences and prosociality in Amazon Mechanical Turk and population-based survey samples. Socius 6, 1 (2020).
  42. [42] Herley Cormac, Redmiles E. M., and Suri Siddharth. 2020. A deterministic choice model for security behavior. Telecommunications Policy Research Conference (2020).
  43. [43] Hofman Jake M., Sharma Amit, and Watts Duncan J.. 2017. Prediction and explanation in social systems. Science 355, 6324 (2017), 486–488.
  44. [44] Hoy Mariea Grubbs and Milne George. 2010. Gender differences in privacy-related measures for young adult Facebook users. Journal of Interactive Advertising 10, 2 (2010), 28–45.
  45. [45] Hsu Chin-Lung and Lin Judy Chuan-Chuan. 2016. An empirical examination of consumer adoption of Internet of Things services: Network externalities and concern for information privacy perspectives. Computers in Human Behavior 62 (2016), 516–527.
  46. [46] Jung Jongbin, Concannon Connor, Shroff Ravi, Goel Sharad, and Goldstein Daniel G.. 2020. Simple rules to guide expert classifications. Journal of the Royal Statistical Society: Series A (Statistics in Society) 183, 3 (2020), 771–800.
  47. [47] Kay Matthew, Patel Shwetak N., and Kientz Julie A.. 2015. How good is 85%? A survey tool to connect classifier evaluation to acceptability of accuracy. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI'15). Association for Computing Machinery, New York, NY, USA, 347–356.
  48. [48] Keller Carmen and Siegrist Michael. 2009. Effect of risk communication formats on risk perception depending on numeracy. Medical Decision Making 29, 4 (2009), 483–490.
  49. [49] Kleinberg Jon, Liang Annie, and Mullainathan Sendhil. 2017. The theory is predictive, but is it complete? An application to human perception of randomness. In Proceedings of the 2017 ACM Conference on Economics and Computation. 125–126.
  50. [50] Lee Min Kyung. 2018. Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management. Big Data & Society 5, 1 (2018).
  51. [51] Lee Min Kyung, Kusbit Daniel, Kahng Anson, Kim Ji Tae, Yuan Xinran, Chan Allissa, See Daniel, Noothigattu Ritesh, Lee Siheon, Psomas Alexandros, et al. 2019. WeBuildAI: Participatory framework for algorithmic governance. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–35.
  52. [52] Li He, Wu Jing, Gao Yiwen, and Shi Yao. 2016. Examining individuals' adoption of healthcare wearable devices: An empirical study from privacy calculus perspective. International Journal of Medical Informatics 88 (2016), 8–17.
  53. [53] Li Tianshi, Cobb Camille, Yang Jackie, Baviskar Sagar, Agarwal Yuvraj, Li Beibei, Bauer Lujo, and Hong Jason I.. 2021. What makes people install a COVID-19 contact-tracing app? Understanding the influence of app design and individual difference on contact-tracing app adoption intention. Pervasive and Mobile Computing (2021), 101439.
  54. [54] Li Tianshi, Yang Jackie, Faklaris Cori, King Jennifer, Agarwal Yuvraj, Dabbish Laura, and Hong Jason I.. 2020. Decentralized is not risk-free: Understanding public perceptions of privacy-utility trade-offs in COVID-19 contact-tracing apps. arXiv preprint arXiv:2005.11957 (2020). https://arxiv.org/abs/2005.11957.
  55. [55] Liu Bin, Kong Deguang, Cen Lei, Gong Neil Zhenqiang, Jin Hongxia, and Xiong Hui. 2015. Personalized mobile app recommendation: Reconciling app functionality and user privacy preference. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining. 315–324.
  56. [56] Melendez Steven. 2020. North Dakota's contact-tracing app shares location data. https://www.fastcompany.com/90508044/north-dakotas-covid-19-app-has-been-sending-data-to-foursquare-and-google. (2020). (Accessed on Oct. 12, 2021).
  57. [57] Mitchell Margaret, Wu Simone, Zaldivar Andrew, Barnes Parker, Vasserman Lucy, Hutchinson Ben, Spitzer Elena, Raji Inioluwa Deborah, and Gebru Timnit. 2019. Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 220–229.
  58. [58] Morse Jack. 2020. North Dakota launched a contact-tracing app. It's not going well. https://mashable.com/article/north-dakota-contact-tracing-app/. (2020). (Accessed on Oct. 12, 2021).
  59. [59] Mullinix Kevin J., Leeper Thomas J., Druckman James N., and Freese Jeremy. 2015. The generalizability of survey experiments. Journal of Experimental Political Science 2, 2 (2015), 109–138.
  60. [60] Paling John. 2003. Strategies to help patients understand risks. British Medical Journal 327, 7417 (2003), 745–748.
  61. [61] Panniello Umberto, Gorgoglione Michele, and Tuzhilin Alexander. 2016. Research note–In CARS we trust: How context-aware recommendations affect customers' trust and other business performance measures of recommender systems. Information Systems Research 27, 1 (2016), 182–196.
  62. [62] Paolacci Gabriele, Chandler Jesse, and Ipeirotis Panagiotis G.. 2010. Running experiments on Amazon Mechanical Turk. Judgment and Decision Making 5, 5 (2010), 411–419.
  63. [63] Poursabzi-Sangdeh Forough, Goldstein Daniel G., Hofman Jake M., Vaughan Jennifer Wortman, and Wallach Hanna. 2021. Manipulating and measuring model interpretability. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–52.
  64. [64] Rader Emilee and Slaker Janine. 2017. The importance of visibility for folk theories of sensor data. In Thirteenth Symposium on Usable Privacy and Security (SOUPS'17). 257–270.
  65. [65] Rao Jon N. K. and Scott Alistair J.. 1984. On chi-squared tests for multiway contingency tables with cell proportions estimated from survey data. The Annals of Statistics 12, 1 (1984), 46–60.
  66. [66] Raskar Ramesh, Nadeau Greg, Werner John, Barbar Rachel, Mehra Ashley, Harp Gabriel, Leopoldseder Markus, Wilson Bryan, Flakoll Derrick, Vepakomma Praneeth, et al. 2020. COVID-19 contact-tracing mobile apps: Evaluation and assessment for decision makers. arXiv preprint arXiv:2006.05812 (2020).
  67. [67] Redmiles Elissa M.. 2018. Net benefits: Digital inequities in social capital, privacy preservation, and digital parenting practices of US social media users. In Twelfth International AAAI Conference on Web and Social Media.
  68. [68] Redmiles Elissa M.. 2020. User concerns & tradeoffs in technology-facilitated contact tracing. arXiv preprint arXiv:2004.13219 (2020).
  69. [69] Redmiles Elissa M., Chachra Neha, and Waismeyer Brian. 2018. Examining the demand for spam: Who clicks? In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–10.
  70. [70] Redmiles Elissa M., Kross Sean, and Mazurek Michelle L.. 2019. How well do my results generalize? Comparing security and privacy survey results from MTurk, web, and telephone samples. In 2019 IEEE Symposium on Security and Privacy (SP'19). IEEE, 1326–1343.
  71. [71] Redmiles Elissa M., Zhu Ziyun, Kross Sean, Kuchhal Dhruv, Dumitras Tudor, and Mazurek Michelle L.. 2018. Asking for a friend: Evaluating response biases in security user studies. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 1238–1255.
  72. [72] Riederer Christopher, Hofman Jake M., and Goldstein Daniel G.. 2018. To put that in perspective: Generating analogies that make numbers easier to understand. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–10.
  73. [73] Sacks Jilian A., Zehe Elizabeth, Redick Cindil, Bah Alhoussaine, Cowger Kai, Camara Mamady, Diallo Aboubacar, Gigo Abdel Nasser Iro, Dhillon Ranu S., and Liu Anne. 2015. Introduction of mobile health tools to support Ebola surveillance and contact tracing in Guinea. Global Health: Science and Practice 3, 4 (2015), 646–659.
  74. [74] Saha Debjani, Schumann Candice, McElfresh Duncan C., Dickerson John P., Mazurek Michelle L., and Tschantz Michael Carl. 2020. Human comprehension of fairness in machine learning. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 152.
  75. [75] Salem Maha, Lakatos Gabriella, Amirabdollahian Farshid, and Dautenhahn Kerstin. 2015. Would you trust a (faulty) robot? Effects of error, task type and personality on human-robot cooperation and trust. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI'15). 141–148.
  76. [76] Saxena Nitesh et al. 2020. Smartphone-based Automated Contact Tracing: Is it Possible to Balance Privacy, Accuracy and Security? https://www.uab.edu/news/research/item/11299-smartphone-based-automated-contact-tracing-is-it-possible-to-balance-privacy-accuracy-and-security. (2020). (Accessed on May 8, 2020).
  77. [77] Saxena Nripsuta Ani, Huang Karen, DeFilippis Evan, Radanovic Goran, Parkes David C., and Liu Yang. 2019. How do fairness definitions fare? Examining public attitudes towards algorithmic definitions of fairness. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (AIES'19). Association for Computing Machinery, New York, NY, USA, 99–106.
  78. [78] Shih Fuming, Liccardi Ilaria, and Weitzner Daniel. 2015. Privacy tipping points in smartphones privacy preferences. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 807–816.
  79. [79] Simko Lucy, Calo Ryan, Roesner Franziska, and Kohno Tadayoshi. 2020. COVID-19 contact tracing and privacy: Studying opinion and preferences. arXiv preprint arXiv:2005.06056 (2020).
  80. [80] Singer Natasha. 2021. Why Apple and Google's Virus Alert Apps Had Limited Success. https://www.nytimes.com/2021/05/27/business/apple-google-virus-tracing-app.html. (May 2021). (Accessed on Oct. 21, 2021).
  81. [81] Srivastava Megha, Heidari Hoda, and Krause Andreas. 2019. Mathematical notions vs. human perception of fairness: A descriptive approach to fairness for machine learning. arXiv preprint arXiv:1902.04783 (2019).
  82. [82] Teague Vanessa. 2021. Not as Private as We Had Hoped: Unintended Privacy Problems in Some Centralized and Decentralized COVID-19 Exposure Notification Systems. Real World Cryptography 2021. https://www.youtube.com/watch?v=ne-2Le_egx8. (2021). (Accessed on Oct. 21, 2021).
  83. [83] Troncoso Carmela, Payer Mathias, Hubaux Jean-Pierre, Salathé Marcel, Larus James, Bugnion Edouard, Lueks Wouter, Stadler Theresa, Pyrgelis Apostolos, Antonioli Daniele, et al. 2020. Decentralized privacy-preserving proximity tracing. arXiv preprint arXiv:2005.12273 (2020).
  84. [84] Turjeman Dana and Feinberg Fred M.. 2019. When the Data Are Out: Measuring Behavioral Changes Following a Data Breach. SSRN. https://ssrn.com/abstract=3427254. (2019).
  85. [85] Valentino-DeVries Jennifer, Singer Natasha, and Krolik Aaron. 2020. A Scramble for Virus Apps That Do No Harm. https://www.nytimes.com/2020/04/29/business/coronavirus-cellphone-apps-contact-tracing.html. (2020). (Accessed on May 8, 2020).
  86. [86] Vaudenay Serge. 2020. Analysis of DP3T. Cryptology ePrint Archive, Report 2020/399. (2020). https://eprint.iacr.org/2020/399.
  87. [87] von Wyl Viktor, Höglinger Marc, Sieber Chloé, Kaufmann Marco, Moser André, Serra-Burriel Miquel, Ballouz Tala, Menges Dominik, Frei Anja, and Puhan Milo Alan. 2021. Drivers of acceptance of COVID-19 proximity tracing apps in Switzerland: Panel survey analysis. JMIR Public Health and Surveillance 7, 1 (2021), e25701.
  88. [88] Wang Dayong, Khosla Aditya, Gargeya Rishab, Irshad Humayun, and Beck Andrew H.. 2016. Deep Learning for Identifying Metastatic Breast Cancer. arXiv preprint arXiv:1606.05718 (2016).
  89. [89] Wang Ruotong, Harper F. Maxwell, and Zhu Haiyi. 2020. Factors influencing perceived fairness in algorithmic decision-making: Algorithm outcomes, development procedures, and individual differences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–14.
  90. [90] Watts Duncan J., Beck Emorie D., Bienenstock Elisa Jayne, Bowers Jake, Frank Aaron, Grubesic Anthony, Hofman Jake, Rohrer Julia M., and Salganik Matthew. 2018. Explanation, Prediction, and Causality: Three Sides of the Same Coin? OSF Preprints. http://osf.io/u6vz5. (2018).
  91. [91] Westin Alan F.. 2003. Social and political dimensions of privacy. Journal of Social Issues 59, 2 (2003), 431–453.
  92. [92] Xu Heng, Teo Hock-Hai, and Tan Bernard. 2005. Predicting the adoption of location-based services: The role of trust and perceived privacy risk. In 26th International Conference on Information Systems (ICIS'05). 897–910.
  93. [93] Yin Ming, Vaughan Jennifer Wortman, and Wallach Hanna. 2019. Understanding the effect of accuracy on trust in machine learning models. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
  94. [94] Yu Kun, Berkovsky Shlomo, Conway Dan, Taib Ronnie, Zhou Jianlong, and Chen Fang. 2016. Trust and reliance based on system accuracy. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization. 223–227.
  95. [95] Yu Kun, Berkovsky Shlomo, Taib Ronnie, Conway Dan, Zhou Jianlong, and Chen Fang. 2017. User trust dynamics: An investigation driven by differences in system performance. In Proceedings of the 22nd International Conference on Intelligent User Interfaces. 307–317.
  96. [96] Zhou Tao. 2011. The impact of privacy concern on user adoption of location-based services. Industrial Management & Data Systems 111, 2 (2011), 212–226.

• Published in

  Digital Threats: Research and Practice, Volume 3, Issue 3 (September 2022), 246 pages. EISSN: 2576-5337. DOI: 10.1145/3551648.

Publisher

  Association for Computing Machinery, New York, NY, United States

        Publication History

        • Published: 26 March 2022
        • Online AM: 26 March 2022
        • Accepted: 23 September 2021
        • Revised: 28 June 2021
        • Received: 7 December 2020
