"Alexa, stop spying on me!": speech privacy protection against voice assistants

Authors: Ke Sun, Chen Chen, Xinyu Zhang (University of California San Diego)

SenSys '20: Proceedings of the 18th Conference on Embedded Networked Sensor Systems, November 2020, Pages 298–311. https://doi.org/10.1145/3384419.3430727

Published: 16 November 2020

ABSTRACT

Voice assistants (VAs) have recently become highly popular as a general means of interacting with the Internet of Things. However, the always-on microphones on VAs pose a looming threat to users' privacy. In this paper, we propose MicShield, the first system that serves as a companion device to enforce privacy preservation on VAs. MicShield introduces a novel selective jamming mechanism, which obfuscates the user's private speech while passing legitimate voice commands to the VAs. It achieves this through a phoneme-level jamming control pipeline. Our implementation and experiments demonstrate that MicShield can effectively protect a user's private speech without affecting the VA's responsiveness.

Index Terms

1. Human-centered computing
  1. Ubiquitous and mobile computing
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections

Published in

SenSys '20: Proceedings of the 18th Conference on Embedded Networked Sensor Systems
November 2020
852 pages
ISBN: 9781450375900
DOI: 10.1145/3384419

Copyright © 2020 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
privacy protection
selective jamming
voice assistant
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 174 of 867 submissions, 20%