A Novel Multivariate Mapping Method for Analyzing High-Dimensional Numerical Datasets

Aldana-Bobadilla, Edwin; Molina-Villegas, Alejandro

doi:10.1007/978-3-319-41561-1_23


Conference paper
First Online: 28 June 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9728)

Abstract

In modern science, dealing with high-dimensional datasets is a very common task due to the increasing availability of data. Multivariate data analysis poses challenges at both the theoretical and empirical levels. Several methods for dimensionality reduction, such as Principal Component Analysis, Low Variance Filter and High Correlated Columns, have been proposed. However, the reduction achieved by existing methods is sometimes insufficient for datasets where, for practical reasons, a greater reduction of the original dataset is required. In this paper, we propose a new method to transform a high-dimensional dataset into a one-dimensional one. We show that this transformation preserves the properties of the original dataset and is therefore suitable for many applications where a high degree of reduction is required.
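For readers unfamiliar with the existing methods named in the abstract, the following is a minimal Python sketch of a low variance filter, a highly correlated columns filter, and a PCA projection onto a single component. It does not implement the mapping proposed in the paper, which this page does not describe. The function names, thresholds, and synthetic data are assumptions chosen for illustration only.

```python
import numpy as np

def low_variance_filter(X, threshold=1e-3):
    """Return indices of columns whose variance exceeds `threshold` (assumed value)."""
    return np.flatnonzero(X.var(axis=0) > threshold)

def high_correlation_filter(X, threshold=0.95):
    """Greedily keep each column unless it is highly correlated with a kept one."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return np.array(kept, dtype=int)

def pca_project(X, n_components=1):
    """Project centered data onto its leading principal directions via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

# Synthetic data (illustrative only): column 1 nearly duplicates column 0,
# and column 3 is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b, np.full(200, 3.0)])

cols_var = low_variance_filter(X)          # drops the constant column
X1 = X[:, cols_var]
kept = high_correlation_filter(X1)         # drops the near-duplicate column
z = pca_project(X1[:, kept], n_components=1)  # one-dimensional PCA projection
```

The variance filter is applied first so that constant columns never reach the correlation step, where they would produce undefined correlation coefficients.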

Acknowledgments

The authors acknowledge the support of Consejo Nacional de Ciencia y Tecnología (CONACyT) and Centro de Investigación y de Estudios Avanzados (CINVESTAV).

Author information

Authors and Affiliations

CINVESTAV-Unidad Tamaulipas, Ciudad Victoria, Mexico
Edwin Aldana-Bobadilla
The National Commission for Knowledge and Use of Biodiversity, Mexico City, Mexico
Alejandro Molina-Villegas

Corresponding author

Correspondence to Edwin Aldana-Bobadilla.

Editor information

Editors and Affiliations

IBaI, Inst of Comp Vision and applied Comp Sci, Leipzig, Sachsen, Germany
Petra Perner

About this paper

Cite this paper

Aldana-Bobadilla, E., Molina-Villegas, A. (2016). A Novel Multivariate Mapping Method for Analyzing High-Dimensional Numerical Datasets. In: Perner, P. (eds) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_23

DOI: https://doi.org/10.1007/978-3-319-41561-1_23
Published: 28 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41560-4
Online ISBN: 978-3-319-41561-1
eBook Packages: Computer Science (R0)
