Studying Urban Space from Textual Data: Toward a Methodological Protocol to Extract Geographic Knowledge from Real Estate Ads

  • Conference paper
Computational Science and Its Applications – ICCSA 2022 Workshops (ICCSA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13378)

Abstract

Real estate ads are a rich source of information for studying the social representation of residential space. However, extracting knowledge from them poses methodological challenges, notably with respect to their spatial content. Using artificial intelligence techniques to find and extract knowledge and relationships from textual data improves on classical approaches to Natural Language Processing (NLP). This paper first conceptualizes what kind of information on urban space can be targeted in real estate ads. It then proposes an automated protocol based on artificial intelligence to extract named entities and the relationships among them. The extracted information is finally modeled as RDF graphs and queried through GeoSPARQL. First results are presented from a case study of real estate ads on the French Riviera, with a focus on toponymy. Perspectives for quantitative spatial analysis of the geolocated RDF models of real estate ads are also highlighted.
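To make the pipeline described in the abstract more concrete, the following is a minimal sketch of its overall shape: find toponyms and spatial relations in an ad, then emit RDF triples that carry a GeoSPARQL geometry. It is not the authors' implementation. The paper relies on AI-based named entity and relation extraction, whereas this sketch substitutes a simple rule-based matcher. The `EX` namespace, the gazetteer entries, their coordinates, the relation cues, and all function names are hypothetical placeholders chosen for illustration; only the GeoSPARQL terms `hasGeometry`, `asWKT`, and `wktLiteral` come from the standard vocabulary.

```python
import re

GEO = "http://www.opengis.net/ont/geosparql#"  # GeoSPARQL vocabulary namespace
EX = "http://example.org/ads/"                 # hypothetical namespace for this sketch

# Hypothetical gazetteer: toponym -> WKT point (lon lat); coordinates are placeholders.
GAZETTEER = {
    "Promenade des Anglais": "POINT(7.2500 43.6950)",
    "Port de Nice": "POINT(7.2850 43.6950)",
}

# Hypothetical textual cues mapped to illustrative predicate names.
RELATION_CUES = {"near": "near", "close to": "near", "in": "locatedIn"}


def extract_relations(ad_text):
    """Return (relation, toponym) pairs matched as '<cue> [the] <toponym>'."""
    found = []
    for toponym in GAZETTEER:
        for cue, relation in RELATION_CUES.items():
            pattern = r"\b" + re.escape(cue) + r"\s+(?:the\s+)?" + re.escape(toponym)
            if re.search(pattern, ad_text, flags=re.IGNORECASE):
                found.append((relation, toponym))
    return found


def slug(name):
    """Turn a toponym into a URI-safe local name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")


def ad_to_ntriples(ad_id, ad_text):
    """Serialize the extracted relations as N-Triples lines."""
    lines = []
    for relation, toponym in extract_relations(ad_text):
        place = f"<{EX}place/{slug(toponym)}>"
        geom = f"<{EX}geom/{slug(toponym)}>"
        # Link the ad to the place through the extracted spatial relation.
        lines.append(f"<{EX}ad/{ad_id}> <{EX}{relation}> {place} .")
        # Attach a geometry to the place so it can be queried spatially.
        lines.append(f"{place} <{GEO}hasGeometry> {geom} .")
        lines.append(f'{geom} <{GEO}asWKT> "{GAZETTEER[toponym]}"^^<{GEO}wktLiteral> .')
    return lines


if __name__ == "__main__":
    ad = "Bright flat near the Promenade des Anglais, close to Port de Nice."
    for line in ad_to_ntriples("1", ad):
        print(line)
```

Once triples of this kind are loaded into a store that supports GeoSPARQL, the geometries attached to each place can be used in spatial filters. A trained NER and relation extraction model would replace `extract_relations` in a realistic setting.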

Acknowledgement

This research was carried out thanks to a research grant by KCityLabs, KINAXIA Group (CIFRE Agreement with UMR ESPACE).

Corresponding author

Correspondence to Alicia Blanchi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Blanchi, A., Fusco, G., Emsellem, K., Cadorel, L. (2022). Studying Urban Space from Textual Data: Toward a Methodological Protocol to Extract Geographic Knowledge from Real Estate Ads. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13378. Springer, Cham. https://doi.org/10.1007/978-3-031-10562-3_37

  • DOI: https://doi.org/10.1007/978-3-031-10562-3_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10561-6

  • Online ISBN: 978-3-031-10562-3

  • eBook Packages: Computer Science, Computer Science (R0)
