Statistics, Data Mining, and Machine Learning in Astronomy
7 Dimensionality and Its Reduction

"A mind that is stretched by a new idea can never go back to its original dimensions." (Oliver Wendell Holmes)

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the question of the complexity as well as the size of the data set. For example, with the SDSS imaging data we could measure arbitrary numbers of properties or features for any source detected on an image (e.g., we could measure a series of progressively higher moments of the distribution of fluxes in the pixels that make up the source). From the perspective of efficiency we would clearly rather measure only those properties that are directly correlated with the science we want to achieve. In reality we do not know the correct measurement to use, or even the optimal set of functions or bases from which to construct these measurements. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set.

The techniques we will describe here are related to concepts we have discussed when describing Gaussian distributions (§3.5.2), density estimation (§6.1), and the concepts of information content (§5.2.2). We will start in §7.1 with an exploration of the problems posed by high-dimensional data. In §7.2 we will describe the data sets used in this chapter, and in §7.3 we will introduce perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). In the remainder of the chapter, we will introduce several alternative techniques which address some of the weaknesses of PCA.

7.1 The Curse of Dimensionality

Imagine that you have decided to purchase a car and that your initial thought is to purchase a "fast" car. Whatever the exact implementation of your search strategy, let us assume that your selection results in a fraction r < 1 of potential matches. If you expand the requirements for your
perfect car such that it is "red," has "8 cylinders," and a "leather interior" (with similar selection probabilities), then your selection probability would be r^4. If you also throw "classic car" into the mix, the selection probability becomes r^5. The more selection conditions you adopt, the tinier is the
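The exponential shrinkage described above is easy to verify numerically. The following sketch assumes an illustrative per-criterion selection fraction r = 0.1 and a hypothetical catalog of one million cars (neither value comes from the text); each additional independent criterion multiplies the surviving fraction by r, so k criteria retain roughly r^k of the candidates.

```python
# Curse of dimensionality in selection: each independent criterion
# ("fast", "red", "8 cylinders", ...) keeps a fraction r of candidates,
# so k criteria keep roughly r**k of the original catalog.

r = 0.1          # assumed selection fraction per criterion (illustrative)
n_cars = 10**6   # assumed catalog size (illustrative)

for k in range(1, 6):
    frac = r ** k
    print(f"{k} criteria: surviving fraction {frac:.0e}, "
          f"expected matches {n_cars * frac:.1f}")
```

With these assumed numbers, five criteria already reduce a million-car catalog to an expected ten matches, which is the essence of why high-dimensional selections so quickly run out of data.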