ABSTRACT
In this paper we address the use of local embeddings for data visualization in two and three dimensions and for classification. We advocate their use because they provide an efficient mapping from the original dimension of the data to a lower intrinsic dimension. We show how they can accurately capture the user's perception of similarity in high-dimensional data for visualization purposes. Moreover, we exploit the low-dimensional mapping provided by these embeddings to develop new classification techniques, and we show experimentally that their classification accuracy is comparable to that of a number of other classification procedures, while using fewer dimensions.
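To make the pipeline concrete, the following is a minimal sketch of one possible local embedding followed by nearest-neighbor classification in the reduced space. The abstract does not name a specific embedding or classifier, so locally linear embedding (LLE) and leave-one-out 1-NN are assumptions chosen for illustration. The synthetic swiss-roll data, the parameters `k`, `d`, and `reg`, and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def lle_weights(X, k, reg=1e-3):
    """Weights that best reconstruct each point from its k neighbors."""
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]              # neighbors centered on point i
        C = Z @ Z.T                        # local Gram matrix, k x k
        # Regularize so the system stays well conditioned (assumed value).
        C += reg * max(np.trace(C), 1e-12) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()        # each row sums to one
    return W

def lle_embed(X, k=8, d=2, reg=1e-3):
    """Map the rows of X to d dimensions with locally linear embedding."""
    W = lle_weights(X, k, reg)
    I = np.eye(X.shape[0])
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    return vecs[:, 1:d + 1]                # skip the constant eigenvector

def loo_1nn_accuracy(Y, labels):
    """Leave-one-out 1-NN accuracy measured in the embedded space."""
    nearest = knn_indices(Y, 1)[:, 0]
    return float(np.mean(labels[nearest] == labels))

# Hypothetical data: a swiss roll in 3-D, two classes split along the roll.
rng = np.random.default_rng(0)
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, 200)
h = rng.uniform(0.0, 10.0, 200)
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
labels = (t > np.median(t)).astype(int)

Y = lle_embed(X, k=8, d=2)
acc = loo_1nn_accuracy(Y, labels)
print("embedded shape:", Y.shape)
print("leave-one-out 1-NN accuracy in 2-D:", acc)
```

The two columns of `Y` can be plotted directly for visualization, and the same coordinates serve as input to the classifier. Because this form of LLE has no out-of-sample mapping, the sketch embeds all points together and evaluates by leave-one-out; the paper's own classification techniques may differ.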