Abstract
The geographical presentation of a house, meaning the surrounding scenery and topography, is a critical factor for a house buyer. Street maps are a common type of data in daily life and naturally encode this geographical presentation. This paper collects real estate data and the corresponding street maps for houses in the city of Los Angeles. In this case study, we propose attention-based multi-modal fusion, a method that uses a deep neural network to incorporate the geographical presentation from street maps into a real estate appraisal model. We first combine the house attribute features and the street map imagery features using an attention-based neural network. We then apply boosted regression trees to estimate the house price from the fused features. This work explores the potential of attention mechanisms and data fusion in real estate appraisal. The experimental results indicate that the proposed method is competitive with state-of-the-art methods.
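The two-stage idea summarized in the abstract (attention-weighted fusion of two feature sets, followed by boosted regression on the fused features) can be illustrated with a minimal sketch. This is not the authors' network or their boosted-tree implementation, neither of which is specified here. All names (`attention_fuse`, `fit_boosted_stumps`, `predict`) are hypothetical, the `query` vector stands in for parameters that a real model would learn, both modalities are assumed to be already embedded to the same length, and depth-1 stumps stand in for full regression trees.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_fuse(attr_vec, map_vec, query):
    """Fuse two same-length modality embeddings with soft attention.

    ASSUMPTION: `query` is a stand-in for a learned vector; a trained
    network would produce it rather than take it as an argument.
    """
    modalities = [attr_vec, map_vec]
    scale = math.sqrt(len(query))
    # One scaled dot-product score per modality, normalized to weights.
    weights = softmax([dot(query, m) / scale for m in modalities])
    fused = [sum(w * m[i] for w, m in zip(weights, modalities))
             for i in range(len(query))]
    return fused, weights

def fit_stump(X, r):
    """Best single-split stump for targets r under squared error."""
    mean_r = sum(r) / len(r)
    # (sse, feature, threshold, left value, right value); the default
    # threshold of +inf sends every row left, i.e. a constant predictor.
    best = (float("inf"), 0, float("inf"), mean_r, mean_r)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for thr in values[:-1]:  # splitting at the max leaves the right side empty
            left = [ri for row, ri in zip(X, r) if row[j] <= thr]
            right = [ri for row, ri in zip(X, r) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left)
                   + sum((v - rm) ** 2 for v in right))
            if sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left, right = stump
    return left if row[j] <= thr else right

def fit_boosted_stumps(X, y, n_rounds=50, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)
```

In use, each house's attribute embedding and street map embedding would be passed through `attention_fuse`, and the resulting fused vectors would form the rows of `X` given to `fit_boosted_stumps` together with the observed prices `y`. A practical system would replace the stumps with a library implementation of boosted trees.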
Acknowledgements
This study was supported by the Mitacs Accelerate Program (IT10011) through a collaboration between Data Nerds and the University of British Columbia (Okanagan). The authors thank Fang Shi, Shuo Liu (University of British Columbia), Dr. Huan Liu (China University of Geosciences) and Kaiqi Zhang (AECOM New York) for valuable discussions while the work was carried out.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Bin, J., Gardiner, B., Liu, Z. et al. Attention-based multi-modal fusion for improved real estate appraisal: a case study in Los Angeles. Multimed Tools Appl 78, 31163–31184 (2019). https://doi.org/10.1007/s11042-019-07895-5
DOI: https://doi.org/10.1007/s11042-019-07895-5