ABSTRACT
E-commerce privacy policies tend to contain many ambiguities in language that protect companies more than customers. The ambiguities found so far are divided into four patterns: mitigation (downplaying frequency), enhancement (emphasizing nonessential qualities), obfuscation (hedging claims and obscuring causality), and omission (removing agents). A number of phrases have been identified as creating ambiguities within these four categories. For example, when a customer accepts the terms and conditions of a privacy policy, mitigation words and phrases such as "occasionally" or "from time to time" effectively give the e-commerce vendor permission to send as many spam email offers as it deems necessary. Our study uses techniques based on Latent Semantic Analysis to discover the underlying semantic relations between words in privacy policies. Additional potential ambiguities and other word relations are found automatically. Words are clustered by privacy policy topic using principal directions. This yields a ranking of the most significant words within each clustered topic as well as a ranking of the privacy policy topics themselves. We also extract a signature that forms the basis of a typical privacy policy. These results lead to the design of Hermes, a system for analyzing privacy policies. Given an arbitrary privacy policy, our system provides a list of the potential ambiguities along with a score that represents its similarity to a typical privacy policy.
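The pipeline described above (a term-document matrix, a principal direction, a word ranking, and a similarity score) can be illustrated with a minimal sketch. This is not the Hermes implementation, which the abstract does not specify. The toy corpus, all function names, and the choice to treat the first left singular vector as the "signature" are assumptions made for illustration. Power iteration stands in for a full singular value decomposition so that the example needs only the standard library. The two mitigation phrases are the ones quoted in the abstract.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the paper's actual corpus is not given in the abstract.
POLICIES = [
    "we may occasionally send you email offers",
    "from time to time we may send email offers to you",
    "we collect your email address and share data with partners",
    "we share data with partners and may collect your address",
]

# Mitigation phrases quoted in the abstract.
MITIGATION_PHRASES = ("occasionally", "from time to time")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    counts = [Counter(tokenize(d)) for d in docs]
    matrix = [[float(c[w]) for c in counts] for w in vocab]
    return vocab, matrix


def leading_direction(matrix, iterations=200):
    """Approximate the first left singular vector by power iteration on A A^T."""
    n_terms, n_docs = len(matrix), len(matrix[0])
    u = [1.0 / math.sqrt(n_terms)] * n_terms
    for _ in range(iterations):
        # v = A^T u, then u = A v, then renormalize u to unit length.
        v = [sum(matrix[i][j] * u[i] for i in range(n_terms)) for j in range(n_docs)]
        u = [sum(matrix[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    return u


def top_terms(vocab, direction, k=5):
    """Rank terms by their loading on the given direction, largest first."""
    ranked = sorted(zip(vocab, direction), key=lambda pair: -pair[1])
    return [word for word, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def score_policy(text, vocab, signature):
    """Similarity of a policy's term-count vector to the assumed signature."""
    counts = Counter(tokenize(text))
    return cosine([float(counts[w]) for w in vocab], signature)


def flag_phrases(text, phrases=MITIGATION_PHRASES):
    """Return the listed phrases that occur in the text (simple substring match)."""
    lowered = text.lower()
    return [p for p in phrases if p in lowered]


if __name__ == "__main__":
    vocab, matrix = term_document_matrix(POLICIES)
    signature = leading_direction(matrix)
    new_policy = "From time to time we may send you offers by email"
    print("Top terms:", top_terms(vocab, signature))
    print("Flagged phrases:", flag_phrases(new_policy))
    print("Similarity score:", round(score_policy(new_policy, vocab, signature), 3))
```

Because the count matrix is non-negative and the starting vector is positive, the resulting direction has non-negative entries, so the score falls between 0 and 1. A fuller treatment would likely use several singular vectors, one per topic, rather than only the first.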
Index Terms
- Automatically identifying relations in privacy policies