Abstract
Social media offers a unique window into attitudes like racism and homophobia, exposure to which is an important, hard-to-measure, and understudied social determinant of health. However, individual geo-located observations from social media are noisy and geographically inconsistent. Existing areal units by which exposures are measured, such as Zip codes, average over administratively defined boundaries that are irrelevant to the attitudes being measured. Hence, to enable studies of online social environmental measures, such as attitudes on social media and their possible relationship to health outcomes, a method is first needed to define the collective, underlying degree of social media attitudes by region. To address this, we create the Socio-spatial Self-organizing Map (SS-SOM) pipeline to identify regions by their latent social attitude from Twitter posts. SS-SOMs use neural embeddings for text classification and augment traditional SOMs to generate a controlled number of non-overlapping, topologically constrained, and topically similar clusters. We find that SS-SOMs are robust to missing data, and that the measured exposure of a cohort of men who are susceptible to multiple racism- and homophobia-linked health outcomes changes by up to 42% when SS-SOM measures are used instead of Zip code-based measures.
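To make the clustering idea concrete, the following is a minimal sketch of a plain Kohonen self-organizing map, not the SS-SOM pipeline itself. It assumes each observation has already been reduced to a hypothetical feature triple (x, y, attitude score) on comparable scales; the function names, grid size, and decay schedules are illustrative assumptions, and the topological and topical augmentations described in the abstract are not reproduced.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid cell whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda cell: sum((wi - xi) ** 2 for wi, xi in zip(weights[cell], x)),
    )


def train_som(samples, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a standard SOM on a list of equal-length feature tuples.

    Returns a dict mapping each grid cell (row, col) to its weight vector.
    """
    rng = random.Random(seed)
    # Initialise every unit from a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(samples))
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for cell, w in weights.items():
                # Gaussian neighbourhood measured on the SOM grid, not in feature space.
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def assign_regions(weights, samples):
    """Label each sample with the grid cell of its best-matching unit."""
    return [best_matching_unit(weights, x) for x in samples]


if __name__ == "__main__":
    # Hypothetical observations: (x, y, attitude score), all scaled to [0, 1].
    observations = [
        (0.0, 0.0, 0.10), (0.1, 0.0, 0.12), (0.0, 0.1, 0.08),
        (1.0, 1.0, 0.90), (0.9, 1.0, 0.88), (1.0, 0.9, 0.92),
    ]
    som = train_som(observations, rows=1, cols=2)
    print(assign_regions(som, observations))
```

Because location and attitude score enter the same distance computation, each resulting cluster groups observations that are close in both space and attitude. The number of grid cells caps the number of regions, which is one way a SOM yields a controlled number of non-overlapping clusters.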
Index Terms
- Socio-spatial Self-organizing Maps: Using Social Media to Assess Relevant Geographies for Exposure to Social Processes