How Can We Implement a Multidimensional Data Warehouse Using NoSQL?

  • Conference paper
Enterprise Information Systems (ICEIS 2015)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 241)

Abstract

Traditional OLAP (On-Line Analytical Processing) systems store data in relational databases. Unfortunately, it is difficult to manage big data volumes with such systems. As an alternative, NoSQL (Not-only SQL) systems provide scalability and flexibility for an OLAP system. We define a set of rules to map star schemas and their optimization structure, a precomputed aggregate lattice, into two logical NoSQL models: column-oriented and document-oriented. Using these rules, we analyse and implement two decision support systems, one for each model (using HBase and MongoDB, respectively). We compare both systems during the phases of data loading (with data generated using the TPC-DS benchmark), lattice generation and querying.
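The abstract does not spell out the mapping rules, so the following is only a rough sketch of the general idea, not the paper's actual rules. It assumes a denormalized mapping in which each fact carries its dimension attributes, shows one possible document-oriented shape and one possible column-oriented shape, and enumerates a small aggregate lattice in plain Python. The sample data and the names `FACTS`, `to_document`, `to_column_row` and `aggregate_lattice` are hypothetical, and no MongoDB or HBase client is used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical facts, already joined with their dimensions (assumption).
FACTS = [
    {"store": {"city": "Toulouse", "country": "France"},
     "date": {"month": "2015-01", "year": 2015},
     "measures": {"sales": 10.0}},
    {"store": {"city": "Paris", "country": "France"},
     "date": {"month": "2015-02", "year": 2015},
     "measures": {"sales": 5.0}},
    {"store": {"city": "Berlin", "country": "Germany"},
     "date": {"month": "2015-01", "year": 2015},
     "measures": {"sales": 2.5}},
]


def to_document(fact):
    """Document-oriented shape: each dimension becomes a nested sub-document."""
    return {name: dict(attrs) for name, attrs in fact.items()}


def to_column_row(fact, row_key):
    """Column-oriented shape: one column family per dimension,
    with columns addressed as 'family:qualifier'."""
    row = {}
    for family, attrs in fact.items():
        for qualifier, value in attrs.items():
            row[f"{family}:{qualifier}"] = value
    return row_key, row


def aggregate_lattice(facts, dims, measure):
    """Precompute one summed aggregate per subset of grouping attributes.

    `dims` is a list of (dimension, attribute) pairs; the result maps each
    subset (a tuple of pairs) to {group values: summed measure}.
    """
    lattice = {}
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            totals = defaultdict(float)
            for fact in facts:
                key = tuple(fact[dim][attr] for dim, attr in group)
                totals[key] += fact["measures"][measure]
            lattice[group] = dict(totals)
    return lattice


LATTICE = aggregate_lattice(
    FACTS, [("store", "country"), ("date", "year")], "sales"
)
```

With two grouping attributes the lattice has four nodes, from the grand total (the empty tuple) up to the full (country, year) grouping. A query at a coarse level can then read a precomputed node instead of scanning the detailed facts.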

Acknowledgements

This work is supported by ANRT funding under a CIFRE-Capgemini partnership.

Author information

Corresponding author

Correspondence to Mohammed El Malki.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chevalier, M., El Malki, M., Kopliku, A., Teste, O., Tournier, R. (2015). How Can We Implement a Multidimensional Data Warehouse Using NoSQL? In: Hammoudi, S., Maciaszek, L., Teniente, E., Camp, O., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2015. Lecture Notes in Business Information Processing, vol 241. Springer, Cham. https://doi.org/10.1007/978-3-319-29133-8_6

  • DOI: https://doi.org/10.1007/978-3-319-29133-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29132-1

  • Online ISBN: 978-3-319-29133-8

  • eBook Packages: Computer Science, Computer Science (R0)
