Information theoretic multi robot path planning

Information-Theoretic Multi-Robot Path Planning Cao Nannan (B.Sc., East China Normal University, 2009) A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE DEPARTMENT OF COMPUTE SCIENCE SCHOOL OF COMPUTING NATIONAL UNIVERSITY OF SINGAPORE 2012 DECLARATION I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously. Name: Date: Acknowledgements First of all, I am grateful to God for his great mercy, immeasurable love and consistent guidance. Second, I want to express my sincere gratitude to my supervisor, Assist. Prof. Low Kian Hsiang. During the period we work together, not only he share with me a lot of knowledge, but he also teach me how to work carefully and seriously. Without him, there would be no this thesis. I really appreciate his patience and support. Third, I want to thank all fellow brothers and sisters Zeng Yong, Luochen, Kang Wei, Xiao Qian, Prof. Tan and Zhengkui who always love me as a younger brother in family. And I really enjoy the fellowship time when we study bible and worship together. I also want to thank all friends in AI 1 lab and AI 3 lab, especially Lim Zhanwei, Ye Nan, Bai Haoyu, Xu Nuo, Chen Jie, Trong Nghia Hoang, Jiangbo and Ruofei who have helped me to check and revise my thesis. Last but not least, I would like to thank my parents who always support me and encourage me when I need. i Contents List of Tables 1 List of Figures 2 1 Introduction 4 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Background 9 2.1 Transect Sampling Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2 Gaussian Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3 Entropy and Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . 13 3 Related Work 15 3.1 Design-based vs. Model-based Strategies . . . . . . . . . . . . . . . . . . . . 15 3.2 Polynomial-time vs. Non-polynomial-time Strategies . . . . . . . . . . . . . 16 3.3 Non-guaranteed vs. Performance-guaranteed Sampling Paths . . . . . . . . 17 3.4 Multi-robot vs. Single-robot Strategies . . . . . . . . . . . . . . . . . . . . . 17 4 Maximum Entropy Path Planning 19 4.1 Notations and Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.2 iMASP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 ii 4.3 MEPP Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 4.4 Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 4.5 Performance Guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 5 Maximum Mutual Information Path Planning 26 5.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 5.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 5.3 Problem Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 5.4 M2 IPP Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5.5 Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5.6 Performance Guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 6 Experimental Results 35 6.1 Data Sets and Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . 35 6.2 Temperature Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 6.3 Plankton Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 6.4 Time Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 6.5 Criterion Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 7 Conclusions 50 Appendices 51 A Maximum Entropy Path Planning 52 A.1 Proof for Lemma 2.2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 A.2 Proof for Lemma 4.5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 A.3 Proof for Lemma 4.5.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 A.4 Proof for Corollary 4.5.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 A.5 Proof for Theorem 4.5.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 iii B Maximum Mutual Information Path Planning 61 B.1 Proof for Lemma 5.6.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 B.2 Proof for Other Lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 B.3 Proof For Lemma 5.6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 B.4 Proof For Theorem 5.6.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Bibliography 72 iv Abstract Research in environmental sensing and monitoring is especially important in supporting environmental sustainability efforts worldwide, and has recently attracted significant attention and interest. A key direction of this research lies in modeling and predicting the spatiotemporally varying environmental phenomena. One approach is to use a team of robots to sample the area and model the measurement values at unobserved points. For smoothly varying and hot-spot fields, there is some work which has been done to model the fields well. However, there is still a class of common environmental fields called anisotropic fields in which the spatial phenomena are highly correlated along one direction and less correlated along the perpendicular direction. We exploit the environmental structure to improve the sampling performance and time efficiency of planning for anisotropic fields. In this thesis, we cast the planning problem into a stagewise decision-theoretic problem. we adopt Gaussian Process to model spatial phenomena. Maximum entropy criterion and maximum mutual information criterion are used to measure the informativeness of the observation paths. It is found that for many GPs, correlation of two points exponentially decreases with the distance between the two points. With this property, for maximum entropy criterion, we propose a polynomial-time approximation algorithm, MEPP, to find the maximum entropy paths. We also provide a theoretical performance guarantee for this algorithm. For maximum mutual information criterion, we propose another polynomial-time approximation algorithm, M2 IPP. Similar to the MEPP, a performance guarantee is also provided for this algorithm. We demonstrate the performance advantages of our algorithms on two real data sets. To get lower prediction error, three priciples have also been proposed to select the criterion for different environmental fields. v List of Tables 3.1 Comparisons of different exploration strategies (DB: design-based, MB: modelbased, PT: polynomial-time NP: non-polynomial-time, NO: non-optimized, NG: non-guaranteed, PG: performance-guaranteed, UP: unknown-performance MR: multi-robot SR: single-robot). . . . . . . . . . . . . . . . . . . . . . . . 16 1 List of Figures 1.1 The density of chlorophyll-a in Gulf of Mexico. The values along the coastal line are close to each other, which is highly correlated. The values along the perpendicular direction changes a lot, which is less correlated. . . . . . . . . 5 2.1 Transect sampling task in a temperature field. . . . . . . . . . . . . . . . . . 10 2.2 The value of K(p1 , p2 ) exponentially decreases to zero and the posterior variance σp21 |p2 exponentially increases to prior variance as the distance between point p1 and point p2 linearly increases. . . . . . . . . . . . . . . . . . . . . 5.1 12 Visualization of applying m-order Markov property to maximum mutual information criterion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 5.2 Visualization of the approximation method of the M2 IPP algorithm. . . . . 29 6.1 Temperature fields which distributed over 25m × 150m are discretized into 5 × 30 grids with learned hyper-parameters. . . . . . . . . . . . . . . . . . . 6.2 Plankton density field which distributed over 314m×1765m is discretized into a 8 × 45 grid with 1 = 27.5273 m, 2 = 134.6415 m, σs2 = 1.4670, and σn2 = 0.2023. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 36 The results of ENT(π) for different algorithms with different number of robots on the temperature fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 36 40 The results of MI(π) for different algorithms with different number of robots on the temperature fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 2 LIST OF FIGURES 6.5 The results of ERR(π) for different algorithms with different number of robots on the temperature fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.6 The results of ENT(π) for different algorithms with different number of robots on the plankton density field. . . . . . . . . . . . . . . . . . . . . . . . . . . 6.7 45 The results of ERR(π) for different algorithms with different number of robots on the plankton density field. . . . . . . . . . . . . . . . . . . . . . . . . . . 6.9 44 The results of MI(π) for different algorithms with different number of robots on the plankton density field. . . . . . . . . . . . . . . . . . . . . . . . . . . 6.8 43 46 The running time of different algorithms with different number of robots on the temperature fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 6.10 The running time of different algorithms with different number of robots on the plankton density field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 6.11 Sampling points selected by different criteria. . . . . . . . . . . . . . . . . . 47 3 Chapter 1 Introduction 1.1 Motivation Research in environmental sensing and monitoring are especially important in supporting environmental sustainability efforts worldwide and has recently gained significant attention and practical interest. A key direction of this research lies in modeling and predicting the spatiotemporally varying environmental phenomena, which affects our natural and built-up habitats for aiding our understanding of them and the decision making of the policy makers. The spatiotemporal structures and properties of phenomena vary with the environmental physical/biochemical conditions. For example, the phenomena in some environmental fields could be smoothly varying. Other environmental fields could have a few hot spots. Due to different spatiotemporal structures, the sampling performance of different classes of sampling strategies will be different. The work of [Low, 2009] shows that adaptive sampling can exploit the hotspots well and non-adaptive sampling can map the smoothly varying environmental fields accurately. However, there is still a class of common environmental fields called anisotropic fields in which the spatial phenomena are highly correlated along one direction and less correlated along the perpendicular direction (e.g. Fig. 1.1). Due to ocean current, the anisotropic fields can be easily found in ocean phenomena. Typically, the anisotropic fields can be found in following spatial phenomena: 4 Chapter 1. Introduction 0 35 20 30 40 60 25 80 20 100 15 120 10 140 5 160 0 180 0 20 40 60 80 100 120 140 160 180 Figure 1.1: The density of chlorophyll-a in Gulf of Mexico. The values along the coastal line are close to each other, which is highly correlated. The values along the perpendicular direction changes a lot, which is less correlated. 1. Ocean phenomena: phytoplankton concentration [Franklin and Mills, 2007], sea surface temperature [Hosoda and Kawamura, 2005], salinity field [Budrikait˙e and Duˇcinskas, 2005] and velocity field of ocean current [Lynch and McGillicuddy Jr., 2001]; 2. Soil phenomena: heavy mental concentration [McGrath et al., 2004], surface soil moisture [Zhang et al., 2011], soil radioactivity [Rabesiranana et al., 2009] and gold concentrations [Samal et al., 2011]; 3. Biological phenomena: pollen dispersal [Austerlitz et al., 2007], seed dispersal [Sánchez 5 Chapter 1. Introduction et al., 2011]; 4. Other phenomena: rainfall [Prudhomme and Reed, 1999], groundwater contaminant plumes [Rivest et al., 2012; Wu et al., 2005], air pollution [Boisvert and Deutsch, 2011]. So, for this class of environmental fields, how can we exploit the environmental structure to improve sampling performance? To monitor an environmental field in the ocean, land or forest, some work has been done to find the most informative set of static sensor placements [Guestrin et al., 2005; Krause et al., 2006; Das and Kempe, 2008b; Garnett et al., 2010]. However, if the area to monitor is very large, the number of sensors required will be large. For some applications, such as monitoring plankton bloom in ocean or pH value in a river, the movement of water discourages static sensor placements as well. In contrast, a team of robots (e.g., unmanned aerial vehicles, autonomous underwater vehicles [Rudnick et al., 2004]) which can move around to sample the area will be a desirable solution. To explore an environmental field, planning sampling paths for the robots become the fundamental problem. However, the work of [Ko et al., 1995; Guestrin et al., 2005] shows that the problem of selecting the most informative set of static points is NP-complete. And we are not aware of any work which can find the most informative paths in polynomial time without strong assumption. So, for anisotropic fields, can we also exploit the environmental structure to improve time efficiency of planning? 1.2 Objective To provide exploration strategy for multiple robots, this thesis aims to address the following issue: How can we exploit the environmental structure to improve sampling performance as well as the time efficiency of planning for anisotropic fields? In statistics community, some work [McBratney et al., 1981; Xiao et al., 2004; Webster 6 Chapter 1. Introduction and Oliver, 2007; Ward and Jasieniuk, 2009; Wackernagel, 2009] has been done to do sampling design for anisotropic fields. To tackle anisotropic effects, these work will adjust grid spacing so that the less correlated direction will be sampled more than other directions. However, firstly, these strategies are all for static sensors. As a result, they suffers from the disadvantages of static sensors which has been stated above. Secondly, these work did not consider the computational efficiency of planning. In robotics community, the work of [Low et al., 2009] has defined the information-theoretic Multi-Robot Adaptive Sampling Problem (iMASP). However, for any environmental field, the time complexity of iMASP exponentially increases with the length of planning horizon. To reduce the time complexity, the work of [Low et al., 2011] has assumed that the measurements in next stage only depends on the measurements in current stage. However, for the fields which have large correlations, this assumption is too strong. The work of [Singh et al., 2007] has proposed a quasi-polynomial algorithm to find the most informative paths with specified budget of cost. They proposed two heuristics, spatial-decomposition and branch-and-bound search, to reduce time complexity. However, spatial-decomposition violates the continuous spatial correlations of environmental fields. And no performance guarantee is provided for the branch-and-bound search algorithm. 1.3 Contributions To do point sampling and prediction, environmental fields are discretized into grids. The planning problem is cast into a stagewise decision-theoretic problem. With sampled observations, we adopt Gaussian Process [Rasmussen and Williams, 2006] to model spatial phenomena. Maximum entropy criterion [Shewry and Wynn, 1987] and maximum mutual information criterion [Guestrin et al., 2005] are proposed to measure the informativeness of observation paths. It is found that for many GPs, correlation of two points exponentially decreases with the distance between the two points. With this property, our work propose two information-theoretic algorithms which can trade off between sampling performance and time complexity. Especially, for anisotropic fields, if the robots explore the field along 7 Chapter 1. Introduction the small correlated direction, our algorithms can guarantee the sampling performance of observation paths with little planning time. The specific contributions of the thesis include: • Formalization of Maximum Entropy Path Planning (MEPP) algorithm: A polynomialtime approximation algorithm, MEPP, is proposed to find the maximum entropy paths. We also provide a theoretical performance guarantee on the sampling performance of the MEPP algorithm for a class of exploration tasks called transect sampling task. • Formalization of Maximum Mutual Information Path Planning (M2 IPP) algorithm: For maximum mutual information criterion, we propose another polynomial-time approximation algorithm, M2 IPP. A theoretical performance guarantee on the sampling performance of the M2 IPP algorithm for the transect sampling task is provided as well. • Evaluation of performance: We evaluate the sampling performance of our proposed algorithms on two real-world data sets. The performance is measured with three metrics: entropy, mutual information and prediction error. The results of our algorithms demonstrate advantages over other state-of-the-art algorithms. This thesis will be organized as follows. In chapter 2, some background is reviewed. In chapter 3, related work on exploration strategy is provided. In chapters 4 and 5, our two proposed algorithms are explained in detail. In chapter 6, experiments on two real-world data sets are presented. And we conclude this thesis in chapter 7. 8 Chapter 2 Background In this chapter, we review some background to formalize our problem. In section 2.1, a class of exploration tasks called transect sampling task to which our algorithms can be applied is presented. With sampled observations, we adopt Gaussian Process to model the environmental field, which is reviewed in section 2.2. Entropy and mutual information are used to measure the informativeness of the sampling paths, which are reviewed in section 2.3. 2.1 Transect Sampling Task For a discretized unobserved field, the transect sampling task [St˚ ahl et al., 2000; Thompson and Wettergreen, 2008] assumes that the number of columns is much larger than the number of sampling locations in each column. For example, the following figure 2.1 shows a temperature field which spans over a 25 m × 150 m area is discretized into a 5 × 30 grid of sampling locations (white dots). In this discretized field, each robot is constrained to explore forward from leftmost column to rightmost column, with one sampling location for each column. Thus, the action space for each robot given its current location comprises the 5 locations in the right adjacent column. For the constraint on exploring forward, the robots with limited maneuverability can explore the area with less complex planning paths 9 Chapter 2. Background which can be achieved more reliably. Figure 2.1: Transect sampling task in a temperature field. In this thesis, we assume that the robots will perform the transect sampling task. So the travelling cost of each robot is the horizontal length of the field. And the action space for each robot is limited. Multiple robots will be applied to explore the field. We assume that the number of robots will be less than the number of sampling locations in each column. Our proposed algorithms will find the paths with maximum entropy and the paths with maximum mutual information for multiple robots. 2.2 Gaussian Process With sampled observations, we adopt the Gaussian Process [Rasmussen and Williams, 2006] to model the environmental field. The GP model has been widely used to model environmental fields in spatial statistics [Webster and Oliver, 2007]. A Gaussian Process is a collection of random variables, any finite number of which have a multivariate Gaussian Distribution. To specify this distribution, a mean function M(·) and a symmetric positive definite covariance function K(·, ·) have to be defined for a Gaussian Process. For example, given a vector A of points and the corresponding vector ZA of random measurements on these points, P (ZA ) is a multivariate Gaussian Distribution. It can be specified with a mean vector µA and a covariance matrix ΣAA . For the mean vector µA , each entry corresponds to each point u in vector A with M(u). Similarly, in the covariance matrix ΣAA , each entry corresponds to each pair of points u, v in vector A with K(u, v). If we have the measurements zA for vector A, given any other unobserved point y, with Bayes rules, we can know that P (Zy |zA ) is also a Gaussian Distribution. For this Gaussian Distribution, 2 the posterior mean µy|A and the posterior variance σy|A which correspond to the predicted 10 Chapter 2. Background measurement value and the uncertainty at the unobserved point y are given by: µy|A = µy + ΣyA Σ−1 AA (zA − µA ) (2.1) 2 σy|A = K(y, y) − ΣyA Σ−1 AA ΣAy (2.2) where µy and µA are the prior means which are returned by the mean function M(·), ΣyA is the covariance vector and each entry of ΣyA corresponds to each point u in vector A with K(u, y). If there is a vector B of unobserved points, we have: µB|A = µB + ΣBA Σ−1 AA (zA − µA ) (2.3) ΣB|A = ΣBB − ΣBA Σ−1 AA ΣAB (2.4) where µB|A is a posterior mean vector and ΣB|A is a posterior covariance matrix. From (2.2) and (2.4), it is known that the posterior variance does not depend on the measurements of observed points zA . 2.2.1 Covariance Function In this thesis, we assume that the GP model has a constant mean function and a stationary covariance function. Hence, the mean function M(·) which can be learned with prior data or expert knowledge will return a constant prior mean for any point. The covariance function K(·, ·) will not depend on the locations of two points but the distance between two points. The covariance function used in this thesis is: K(u, v) 1 σs2 exp{− (u − v)T M −2 (u − v)} + σn2 δuv 2 (2.5) where σs2 is the signal variance, σn2 is the noise variance, M is a diagonal matrix with horizontal and vertical length scales 1 and 2, and if u equals v, δuv is 1, otherwise 0. With the covariance function 2.5, the following lemma shows the least measurement error in Gaussian Process: 11 Chapter 2. Background Lemma 2.2.1. In Gaussian Process, given an unobserved point y and any observed vector 2 A of points, if the noise variance is σn2 , the posterior variance σy|A should be larger than σn2 . The proof for above result is shown in Appendix A.1. By Lemma 2.2.1, the posterior variance of an unobserved point can be lower bounded. With the covariance function 2.5, correlation of two points will exponentially decrease with the distance between the two points. For example, given two points p1 and p2 , when the distance between point p1 and point p2 linearly increases, the value of K(p1 , p2 ) will exponentially decrease to zero and the posterior variance σp21 |p2 will exponentially increase to the prior variance, which is shown in Fig. 2.2. 1.4 1.2 1 0.8 0.6 0.4 K(p1,p2) σ2p 0.2 |p 1 0 6 5 4 3 2 1 0 1 distance 2 3 4 2 5 6 Figure 2.2: The value of K(p1 , p2 ) exponentially decreases to zero and the posterior variance σp21 |p2 exponentially increases to prior variance as the distance between point p1 and point p2 linearly increases. From Fig. 2.2, it can be known that correlation of two points exponentially decreases with the distance of the two points. And when K(p1 , p2 ) is close to zero, the information that point p2 can provide for point p1 is very little. Therefore, given an unobserved point y and a vector A of observed points, we can remove the points A˜ in vector A to approximate ˜ K(u, y) is close to zero. the posterior variance where for each point u in A, 12 Chapter 2. Background 2.3 Entropy and Mutual Information For a transect sampling task, with sampled observations, the uncertainty at each unobserved point can be obtained based on the GP model. With the uncertainty, entropy and mutual information are used to quantify the informativeness of observation paths. 2.3.1 Entropy Let X be the domain of the environmental field which is discretized into grid cell locations. Given observation paths P, let X \P be the unobserved part of the field. Let ZX denote the vector of random measurements on the points in X . Let ZP and ZX \P denote the vector of random measurements on the points in P and X \P, respectively. To minimize the uncertainty of the unobserved part, with entropy metric, the problem can be formalized as: P ∗ = arg min H(ZX \P |ZP ) (2.6) P∈T where T is the set of all possible paths in the field. For a vector A of a points, it can be shown that the joint entropy of the corresponding vector ZA of random measurements is: H(ZA ) = − p(ZA ) log p(ZA )d(ZA ) = 1 log((2πe)a |ΣAA |) 2 (2.7) As a result, for (2.6), we have: H(ZX \P |ZP ) = t 1 log(2πe) + log(|ΣX \P|P |) 2 2 (2.8) where t is the size of X \P and ΣX \P|P is the posterior covariance matrix which can be obtained with (2.4). With (2.8), the conditional entropy of the unobserved part for paths P can be obtained. If we use an exhaustive algorithm, the optimal paths can be found. However, the number of possible paths exponentially increases with the length of the columns in the field. If the field is large, it is intractable to solve this problem optimally. 13 Chapter 2. Background With the chain rule of entropy, we have: H(ZX ) = H(ZP ) + H(ZX \P |ZP ). (2.9) Because H(ZX ) is constant, the problem of minimizing the uncertainty of the unobserved part H(ZX \P |ZP ) is equivalent to: P ∗ = arg max H(ZP ). (2.10) P∈T In this thesis, we will present an efficient non-myopic algorithm, the MEPP, to find the paths with maximum entropy. 2.3.2 Mutual Information Another metric, mutual information, is also proposed to measure the informativeness of observation paths. Given observation paths P and unobserved part X \P, the mutual information between ZP and ZX \P is: I(ZP ; ZX \P ) = H(ZX \P ) − H(ZX \P |ZP ). (2.11) Based on the mutual information, the problem can be formalized as: P ∗ = arg max I(ZP ; ZX \P ) (2.12) P∈T where T is the set of all possible paths in the field. With (2.4) and (2.7), the mutual information for paths P can be evaluated in closed form. In this thesis, we will present another non-myopic algorithm, the M2 IPP, to find the paths with maximum mutual information. 14 Chapter 3 Related Work To monitor an environmental field, the robots need to sample locations which can give more information about the measurement values at unobserved points. Different work has developed various methods to select the sampling locations, which are shown in table 3.1. In particular, our strategies are model-based which can find sampling paths for multiple robots within polynomial time. Moreover, the performance of the sampling paths can be guaranteed. The differences between our work and other related work are compared below. 3.1 Design-based vs. Model-based Strategies To sample an unobserved area, some work [Rahimi et al., 2003; Batalin et al., 2004; Rahimi et al., 2005; Singh et al., 2006; Popa et al., 2006; Low et al., 2007] has designed various strategies. Based on a designed strategy, the robots adaptively sample new locations until the strategy condition is satisfied. Because the sampling locations are selected based on the designed strategy, the performance of the sampling paths cannot be quantified. Moreover, some of these strategies [Rahimi et al., 2003; Batalin et al., 2004; Singh et al., 2006] need to pass the area multiple times to sample new locations. However, these strategies will not be suitable for some robots which are energy-constrained. 15 Chapter 3. Related Work Table 3.1: Comparisons of different exploration strategies (DB: design-based, MB: model-based, PT: polynomial-time NP: non-polynomial-time, NO: non-optimized, NG: non-guaranteed, PG: performance-guaranteed, UP: unknown-performance MR: multi-robot SR: single-robot). ❤❤❤❤ ❤❤❤ ❤❤ Characteristics ❤❤❤❤ ❤❤❤ Exploration strategies ❤ ❤ Rahimi et al., 2003; Rahimi et al., 2005 Batalin et al., 2004 Popa et al., 2006 Singh et al., 2006 Low et al., 2007 Meliou et al., 2007 Singh et al., 2007; Singh et al., 2009 Zhang and Sukhatme, 2007 Low et al., 2008; Low et al., 2009 Binney et al., 2010 Low et al., 2011 MEPP M2 IPP DB MB PT NP × × × × × NO NG PG × × × × × × × × × × × × × × × × × × × × × UP × × × × × × × × × × MR SR × × × × × × × × × × × × × × × × Instead, our strategies, like those in [Meliou et al., 2007; Zhang and Sukhatme, 2007; Singh et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010; Low et al., 2011], assume that the environmental field is realized from a statistical model. Based on the model, the informativeness of the sampling paths can be quantified. The problem becomes how to find the most informative paths. In contrast to design-based strategies, mode-based strategies need some prior knowledge about the environmental field to train the model. With trained model, the paths can be planned before sampling the area. Because the sampling paths are already known, the robots do not need to pass the area multiple times. 3.2 Polynomial-time vs. Non-polynomial-time Strategies Among those model-based strategies, some of these strategies [Meliou et al., 2007; Zhang and Sukhatme, 2007; Singh et al., 2007; Low et al., 2008; Low et al., 2009; Binney et al., 2010] cannot find the sampling paths in polynomial time. For example, in the work of [ Meliou et al., 2007; Singh et al., 2007; Binney et al., 2010], the time complexity for the proposed algorithms is quasi-polynomial. In the work of [Zhang and Sukhatme, 2007; 16 Chapter 3. Related Work Low et al., 2008; Low et al., 2009], the time complexity for the proposed algorithms will exponentially increase with the length of planning horizon. However, our work, like another work [Low et al., 2011] can find the sampling paths in polynomial time. For those designbased strategies, because the sampling locations are selected based on designed strategies, the time complexity is not a main concern. 3.3 Non-guaranteed vs. Performance-guaranteed Sampling Paths Among those model-based strategies, some of these strategies [Meliou et al., 2007; Singh et al., 2007; Low et al., 2008; Low et al., 2009; Binney et al., 2010] cannot guarantee the performance of the sampling paths. Because the time complexity for these strategies are non-polynomial time, different heuristics (e.g., greedy heuristic, branch-and-bound search, anytime heuristic search algorithm) have been used to to reduce time complexity. However, no performance guarantee has been provided for these heuristics. Although the work of [ Zhang and Sukhatme, 2007] can find the optimal paths, they need assume that the information gain from each location is independent from other locations. However, this assumption violate the spatial correlations of environmental fields. Instead, our work, like another work [Low et al., 2011], can provide theoretical guarantees for the sampling paths. Although the work of [Low et al., 2011] can provide performance guarantees, it also need assume that the measurements in next stage only depend on the measurements in current stage. Our work relax this strong assumption by utilizing a longer path history. And theoretical guarantees are provided for the optimal paths of our algorithms. For those design-based strategies, the informativeness of the sampling locations cannot be quantified. As a result, the performance of those sampling paths is unknown. 3.4 Multi-robot vs. Single-robot Strategies Some work [Rahimi et al., 2003; Batalin et al., 2004; Rahimi et al., 2005; Singh et al., 2006; Popa et al., 2006; Zhang and Sukhatme, 2007; Meliou et al., 2007; Binney et al., 2010] can 17 Chapter 3. Related Work only generate a path for single robot. For a small sampling task, single robot is easy to coordinate and deploy. However, it will be difficult for single robot to accomplish a large sampling task. Instead, our work like those in [Singh et al., 2007; Low et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010; Low et al., 2011] can generate multiple paths for multiple robots. With multiple robots, a large sampling task can be completed easily and fast. 18 Chapter 4 Maximum Entropy Path Planning In this chapter, we propose the MEPP (Maximum Entropy Path Planning) algorithm, which can find the paths with maximum entropy. Before presenting our own work, we introduce the information-theoretic Multi-Robot Adaptive Sampling Problem (iMASP). Although the optimal paths can be theoretically found by the algorithm for iMASP, its time complexity exponentially increases with the length of planning horizon. To reduce time complexity and provide a tight performance guarantee, we exploit the covariance function for the property that correlation of two points exponentially decreases with the distance between the two points. With this property, the MEPP algorithm is proposed in section 4.3. In section 4.4, the analysis for its time complexity is provided, which shows the MEPP algorithm is polynomial time. We provide a performance guarantee for the MEPP algorithm in section 4.5. 4.1 Notations and Preliminaries Let the transect be discretized into a r × n grid of sampling locations. The columns of the field are indexed in an increasing order. The leftmost column is indexed as ‘1’, rightmost column as ‘n’. Each planning stage corresponds to each column with the same index. In each stage, every robot takes an observation which comprises its location and measurement. 19 Chapter 4. Maximum Entropy Path Planning We assume that there are k robots to explore the area and k is less than the number of rows. In stage i, let xi denote the row vector of these k sampling locations and Zxi denote the corresponding row vector of k random measurements. And let xji indicate the j-th (1 ≤ j ≤ k) location in vector xi . In addition, let xi:l represent the vector of all sampling locations from stage i to stage l (i.e., xi:l (xi , . . . , xl ) ) and Zxi:l denote the vector of all (Zxi , . . . , Zxl )). corresponding random measurements (i.e., Zxi:l Given vectors x1 , . . . , xn , the robots can sample the area from leftmost column to rightmost column. Given vector xi−1 of locations, we assume that the robots can deterministically move to vector xi of locations. Let Xi denote the set of all possible xi in stage i. Let X be a variable that can denote Xi in any stage. Because the sampling points in each stage are the same, the number of possible vectors |X| in each stage is the same. To save energy, we also assume that each robot will not cross the paths of other robots. As a result, given the number of rows r, the number of all possible vectors |X| in each stage is Crk . 4.2 i MASP To find the paths with maximum entropy, the work of [Low et al., 2009] has defined iMASP. Given observation paths x1:n , using the chain rule of entropy, we have n H(Zxi |Zx1:i−1 ). H(Zx1:n ) = H(Zx1 ) + (4.1) i=2 Based on (4.1), the work of [Low et al., 2009] has proposed the following n-stage dynamic programming equations to calculate maximum conditional entropy in each stage: ∗ Vi∗ (x1:i−1 ) = max H(Zxi |Zx1:i−1 ) + Vi+1 (x1:i ) (4.2) Vn∗ (x1:n−1 ) = max H(Zxn |Zx1:n−1 ) (4.3) xi ∈Xi xn ∈Xn for stages i = 1, . . . , n − 1. For the first stage, because there is no previous stage, x1:0 is a vector which has no element. Hence, H(x1 |x1:0 ) is equvilent to H(x1 ). Because the field is 20 Chapter 4. Maximum Entropy Path Planning modeled with Gaussian Process, the conditional entropy in each stage is defined as follows: H(Zxi |Zx1:i−1 ) = 1 log(2πe)k |Σxi |x1:i−1 |, 2 (4.4) where Σxi |x1:i−1 is defined in (2.4). Based on (4.4), the optimal paths of iMASP are x∗1:n (x∗1 , . . . , x∗n ) where for stages i = 1, . . . , n, given x∗1:i−1 , x∗i is the vector in (4.2) or (4.3) which returns the largest value. It can be computed that the time complexity of the algorithm for iMASP is O(|X|n (kn)3 ). As a result, the time complexity will exponentially increase with the length of planning horizon. To avoid this intractable complexity, an anytime heuristic search algorithm [Korf, 1990] has been used to approximate the optimal paths. However, no performance guarantee is provided for this heuristic search algorithm. 4.3 MEPP Algorithm To balance the time complexity and the performance guarantee, we exploit the covariance function for the property that correlation of two points exponentially decreases with the distance between the two points. As a result, when we predict the posterior variance of an unobserved point y given a vector A of points, we can remove the points A˜ from vector ˜ K(u, y) is a small A to approximate the posterior variance, where for each point u in A, value. With this property, H(Zxi |Zx1:i−1 ) can be approximated by H(Zxi |Zxi−m:i−1 ) where max K(xji , xji−m−1 ) is a small value. And we can prove that the entropy decrease for this 1≤j,j ≤k truncation can be bounded. Consequently, the joint entropy H(Zx1:n ) can be approximated by the following formula : n H(Zx1:n ) ≈ H(Zx1:m ) + H(Zxi |Zxi−m:i−1 ). (4.5) i=m+1 21 Chapter 4. Maximum Entropy Path Planning According to (4.5), the following dynamic programming equations are proposed to approximate maximum conditional entropy in each stage: me Vime (xi−m:i−1 ) = max H(Zxi |Zxi−m:i−1 ) + Vi+1 (xi−m+1:i ) (4.6) Vnme (xn−m:n−1 ) = max H(Zxn |Zxn−m:n−1 ) (4.7) xi ∈Xi xn ∈Xn for stages i = m + 1, . . . , n − 1. To get the optimal vector xme 1:m in first m stages, we can use the following equation: me xme 1:m = arg max H(Zx1:m ) + Vm+1 (x1:m ) (4.8) x1:m ∈X1:m where X1:m is the set of all possible x1:m over the first m stages. Based on (4.4), the optimal paths of the MEPP algorithm are xme 1:n me me me (xme 1:m , xm+1 , . . . , xn ) where x1:m is from (4.8) and me for stages i = m + 1, . . . , n, given xme i−m:i−1 , xi is the vector in (4.6) or (4.7) which returns the largest value. It can be found that when m = 1, the MEPP algorithm is the same as the Markov-Based iMASP in the work of [Low et al., 2011]. As a result, our work generalizes the work of [Low et al., 2011] by utilizing a longer path history. 4.4 Time Analysis Theorem 4.4.1. Let |X| be the number of possible vectors in each stage. Determining the optimal paths based on m-order Markov property for the MEPP algorithm requires O(|X|m+1 [n + (km)3 ]) time, where n is the number of columns. Given vector xi−m:i−1 , to get the posterior entropy H(Zxi |Zxi−m:i−1 ) over all possible xi ∈ Xi , we need |X| × O((km)3 ) = O(|X|(km)3 ) operations. And in each stage, there are |X|m possible xi−m:i−1 over m previous stages. Hence, in each stage, to get the optimal values for |X|m vectors, we need |X|m × O(|X|(km)3 ) = O(|X|m+1 (km)3 ) operations. Because we have used the stationary covariance function, the covariance function only depends on the distance between points. Thus, the entropy values calculated for one stage are the same 22 Chapter 4. Maximum Entropy Path Planning as the values in other stages. We can propagate the optimal values from stage n−1 to m+1 and the time needed is O(|X|m+1 (n − m − 1)). To get vector xme 1:m , we need to compute the joint entropy H(Zx1:m ) for all possible x1:m over first m stages. Hence, the time needed m 3 to get the vector xme 1:m is O(|X| (km) ). As a result, the time complexity for the MEPP algorithm is O(|X|m+1 [(n − m − 1) + (km)3 ] + |X|m (km)3 ) = O(|X|m+1 [n + (km)3 ]). Comparing with the iMASP which requires O(|X|n (kn)3 ), this algorithm scales well with large n. Though, it is less efficient than Markov-Based iMASP which needs O(|X|2 (n+ k 3 )), the MEPP algorithm is also efficient in practice, which is demonstrated in section 6.4 4.5 Performance Guarantees In section 4.3, we have defined the MEPP algorithm with the m-order Markov property. The following lemma shows the optimality of the results of the MEPP algorithm in terms of conditional entropy with m previous vectors: Lemma 4.5.1. Let xme 1:n be the optimal paths of the MEPP algorithm, for any other paths x1:n , we have: n H(Zxme |Zxme ) i i−m:i−1 H(Zxme )+ 1:m i=m+1 n ≥H(Zx1:m ) + H(Zxi |Zxi−m:i−1 ) (4.9) i=m+1 where Zxme and Zx1:n are the vectors of random measurements for the paths xme 1:n and x1:n 1:n respectively. The proof for this result is shown in Appendix A.2. From this lemma, given the optimal paths x∗1:n of iMASP, inequality (4.9) still holds. This is because if we consider the conditional entropy in each stage with all previous vectors, the joint entropy of the paths x∗1:n is the maximal one. However, if we consider the conditional entropy in each stage only with m previous vectors, the paths xme 1:n are the optimal paths. 23 Chapter 4. Maximum Entropy Path Planning Let ω1 and ω2 be the horizontal and vertical width of the grid cell. Let 2 2 /ω2 1 1 /ω1 and denote the normalized horizontal and vertical length scales, respectively. Given vector xi−m−1 and vector xi−m:i−1 , for any vector xi , the entropy decrease can be bounded by the following lemma: Lemma 4.5.2. Let ε 2 σs2 exp{− (m+1) }. Given vector xi−m−1 and vector xi−m:i−1 , for 2 2 1 any vector xi , the entropy decrease can be bounded by H(Zxi−m−1 |Zxi−m:i−1 ) − H(Zxi−m−1 |Zxi−m:i−1 , Zxi ) ≤ k 2 log{1 + ε2 } σn2 (σn2 + σs2 ) (4.10) The proof for this lemma is shown in Appendix A.3. With a similar proof, given vector xi−t:i−1 in t previous stages, where t ≥ m, the entropy decrease H(Zxi−t−1 |Zxi−t:i−1 ) − H(Zxi−t−1 |Zxi−t:i−1 , Zi ) is less than k 2 log{1 + ε2 2 (σ 2 +σ 2 ) }. σn n s As a result, with the chain rule of entropy, given vector xi and vector xi−m:i−1 in m previous stages, the entropy decrease for losing the vectors in all further previous stages can be bounded by following corollary: Corollary 4.5.3. Given vector xi and vector xi−m:i−1 in m previous stages, the entropy decrease for losing the vectors in all further previous stages can be bounded by H(Zxi |Zxi−m:i−1 ) − H(Zxi |Zx1:i−1 ) ≤ (i − m − 1)k 2 log{1 + ε2 }. σn2 (σn2 + σs2 ) (4.11) The proof for this corollary is shown in Appendix A.4. From this corollary, H(Zxi |Zxi−m:i−1 ) is close to H(Zxi |Zx1:i−1 ). Lemma 4.5.1 shows the optimality of the results of the MEPP algorithm with respect to the conditional entropy with m previous vectors. And corollary 4.5.3 shows the conditional entropy with m previous vectors is close to the conditional entropy with all previous vectors. As a result, the joint entropy of the optimal paths of the MEPP algorithm is close to the optimal paths of iMASP. The following theorem bounds the entropy decrease between the 24 Chapter 4. Maximum Entropy Path Planning ∗ optimal paths xme 1:n of the MEPP algorithm and the optimal paths x1:n of iMASP: ∗ Theorem 4.5.4. Let xme 1:n be the optimal paths of the MEPP algorithm and x1:n be the optimal paths of iMASP. Let ε 2 }, the entropy decrease between the two σs2 exp{− (m+1) 2 2 1 paths can be bounded by H(Zx∗1:n ) − H(Zxme ) ≤ (n − m)2 k 2 log{1 + 1:n ε2 }. σn2 (σn2 + σs2 ) The proof for the above result is shown in Appendix A.5. According to theorem 4.5.4, the performance guarantee is bounded by the number of columns n, the value of m, the number of robots k and the value of ε. And the value of ε depends on the value of m and the normalized horizontal length scale. Hence, there are a few ways to improve the performance bound: (a) transect sampling task with small number of columns, (b) environmental fields with small horizontal length scales or large horizontal discretization width, (c) using a small number of robots, (d) using a large value of m. In particular, for anisotropic fields, if the robots along the small correlated direction, the value of ε will be small. As a result, we can use a small m which incur little planning time to bound the sampling performance. 25 Chapter 5 Maximum Mutual Information Path Planning In this chapter, we propose another approximation algorithm, M2 IPP (Maximum Mutual Information Path Planning), to find the paths with maximum mutual information. Similar to maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time complexity will exponentially increase with the length of planning horizon. In the previous chapter, we have proposed the MEPP algorithm with m-order Markov property. The time complexity for this algorithm is polynomial and the performance can be guaranteed. However, in section 5.2, we show that the m-order Markov property cannot be applied to maximum mutual information criterion. To solve this problem, a different approximation method is proposed in section 5.3 . Based on this approximation method, the M2 IPP algorithm is proposed in section 5.4. In section 5.5, the analysis for its time complexity is provided, which shows the M2 IPP algorithm is also polynomial time. In section 5.6, we provide a performance guarantee for the M2 IPP algorithm. 26 Chapter 5. Maximum Mutual Information Path Planning 5.1 Notations With sampling locations xi in stage i, the row vector ui of unobserved locations in this stage can be determined. Let Zui denote the row vector of corresponding random measurements. With sampling locations xi:l from stage i to stage l, let ui:l denote the vector of all unobserved locations in these stages (i.e., ui:l (ui , . . . , ul )) and Zui:l denote the vector of all corresponding random measurements (i.e., Zui:l (Zui , . . . , Zul )). Given observation paths x1:n , let u1:n denote the unobserved part of the field. 5.2 Problem Definition With observation paths x1:n and unobserved part u1:n of the field (e.g., Fig. 5.1a), the mutual information between Zx1:n and Zu1:n is I(Zx1:n ; Zu1:n ) = H(Zx1:n ) − H(Zx1:n |Zu1:n ). (5.1) Given paths x1:n , with (5.1) and (4.4), the mutual information can be evaluated in closed form. As a result, if we use the exhaustive algorithm, the optimal paths can be found. However, to enumerate all possible paths in the field, the time complexity will exponentially increase with the length of planning horizon. In the previous chapter, we have applied the m-order Markov property to maximum entropy criterion to reduce time complexity. However, this property cannot be applied to maximum mutual information criterion. The reason is as follows. From (5.1), with the chain rule of entropy, we have n I(Zxi ; Zu1:n |Zx1:i−1 ). I(Zx1:n ; Zu1:n ) = I(Zx1 ; Zu1:n ) + (5.2) i=2 From (5.2), the conditional mutual information I(Zxi ; Zu1:n |Zx1:i−1 ) in stage i depends on vector x1:i−1 and vector u1:n . If we apply the m-order Markov property to (5.2), we have 27 Chapter 5. Maximum Mutual Information Path Planning following formula: n I(Zxi ; Zu1:n |Zxi−m:i−1 ). I(Zx1:n ; Zu1:n ) ≈ I(Zx1:m ; Zu1:n ) + (5.3) i=m+1 Although vector x1:i−1 can be approximated with vector xi−m:i−1 in (5.3), vector u1:n is unknown. Given current vector xi and vector xi−m:i−1 , we can only get vector ui−m:i of unobserved locations (e.g., Fig. 5.1b). With vector ui−m:i , the conditional mutual information in stage i cannot be determined. Therefore, we cannot propose a similar approximation algorithm with the m-order Markov property for maximum mutual information criterion. (a) (b) Figure 5.1: Visualization of applying m-order Markov property to maximum mutual information criterion. 5.3 Problem Analysis From the previous section, it is known that given current vector xi and vector xi−m:i−1 , the conditional mutual information I(Zxi ; Zu1:n |Zx1:i−1 ) cannot be approximated. To approximate the conditional mutual information in each stage, we need to approximate vector u1:n as well. We address this issue still by exploiting the covariance function for the property that correlation of two points exponentially decreases with the distance between the two points. According to this property, for vector xi , we can use vector ui−m:i+m in this stage to approximate vector u1:n . Due to small correlation, the information that other points in vector u1:n can provide is negligible. And we can bound the mutual information decrease in each stage incurred by ignoring other points. Consequently, (5.1) can be approximated 28 Chapter 5. Maximum Mutual Information Path Planning as follows: n−m−1 I(Zxi ; Zui−m:i+m |Zxi−m:i−1 ) I(Zx1:n ; Zu1:n ) ≈ I(Zx1:m ; Zu1:2m ) + i=m+1 + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ). (5.4) From (5.4), the approximated unobserved part for vector xi is vector ui−m:i+m . However, if we only use the m-order Markov property, we still only get vector ui−m:m . And some points in vector ui−m:i+m are still unknown (e.g., Fig. 5.2a). Thus, instead of using the m-order Markov property, we enumerate all possible paths in 2m previous stages. Different from maximum entropy path planning, the reward in each stage is the conditional mutual information for the vector in the middle of the paths (e.g., Fig. 5.2b). (a) (b) Figure 5.2: Visualization of the approximation method of the M2 IPP algorithm. Consequently, (5.4) can be rewritten as follows: n−1 I(Zx1:n ; Zu1:n ) ≈ I(Zx1:m ; Zu1:2m ) + I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) i=2m+1 + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ). (5.5) With current vector xi and vector xi−2m:i−1 in 2m previous stages, the approximated unobserved part of the field ui−2m:i for vector xi−m can be determined. As a result, the conditional mutual information for vector xi−m can be obtained. For the vectors in first m stages, there is no path history of m stages. We use vector u1:2m as their approximated unobserved part of the field. So, the conditional mutual information for the first m vec29 Chapter 5. Maximum Mutual Information Path Planning tors can be grouped together. Similarly, for vectors xn−m:n in last m + 1 stages, we use vector un−2m:n as their approximated unobserved part of the field and the conditional mutual information for the last m + 1 vectors can be grouped together. With the sum of the approximated values in all stages, we can approximate the maximum mutual information paths. 5.4 M2 IPP Algorithm From the previous section, with current vector xi and vector xi−2m:i−1 , the conditional mutual information for vector xi−m can be obtained. Consequently, the following dynamic programming formulas are proposed to approximate maximum conditional mutual information in each stage: mi (xi−2m+1:i ) Vimi (xi−2m:i−1 ) = max I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) + Vi+1 (5.6) Vnmi (xn−2m:n−1 ) = max I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ) (5.7) xi ∈Xi xn ∈Xn for stages i = 2m + 1, . . . , n − 1. To get the optimal vector xmi 1:2m in first 2m stages, the following equation can be used: mi xmi 1:2m = arg max I(Zx1:m , Zu1:2m ) + V2m+1 (x1:2m ) (5.8) x1:2m ∈X1:2m where X1:2m is set of all possible x1:2m over the first 2m stages. Based on (4.4), the optimal paths of the M2 IPP algorithm are xmi 1:n mi mi mi (xmi 1:2m , x2m+1 , . . . , xn ) where x1:2m is from (5.8) mi and for stages i = 2m + 1, . . . , n, given xmi i−2m:i−1 , xi is the vector in (5.6) or (5.7) which returns the largest value. 5.5 Time Analysis Theorem 5.5.1. Let |X| be the number of possible vectors in each stage. Determining the optimal paths of the M2 IPP algorithm requires O(|X|2m+1 (n + 2[r(2m + 1)]3 )), where r is 30 Chapter 5. Maximum Mutual Information Path Planning the number of rows, n is the number of columns, and m is the value used for approximated unobserved part of the field in each stage. Given vector xi−2m:i−1 , to get the conditional mutual information for vector xi−m over all possible xi ∈ Xi , we need |X| × O([r(2m + 1)]3 ) = O(|X|[r(2m + 1)]3 ) operations. And in each stage, there are |X|2m possible xi−2m:i−1 over 2m previous stages. Hence, in each stage, to get the optimal values for |X|2m vectors, we need |X|2m × O(|X|[r(2m + 1)]3 ) = O(|X|2m+1 [r(2m+1)]3 ) operations. Similar to the MEPP algorithm, the conditional mutual information calculated for one stage is the same as the values in other stages. Thus, we can propagate the optimal values from stage n − 2 to 2m + 1 and the time needed is O(|X|2m+1 (n−2m−2)). Subsequently, it requires O(|X|2m+1 [r(2m+1)]3 ) time to calculate the conditional mutual information for last m+1 vectors. To get the mutual information for the first m vectors, the time needed is O(|X|2m [r(2m)]3 ). As a result, the time complexity for the M2 IPP algorithm is O(|X|2m+1 (n − 2m − 2 + [r(2m + 1)]3 ) + |X|2m+1 [r(2m + 1)]3 + |X|2m [r(2m)]3 ) = O(|X|2m+1 (n + 2[r(2m + 1)]3 )). Comparing to the greedy algorithm (6.2) in section 6.1 which requires considering all unobserved points in each stage, our algorithm just need to consider the unobserved points in 2m+1 columns. As a result, for a transect sampling task with a large number of columns, our algorithm is still efficient. 5.6 Performance Guarantees In section 5.4, we have formulated the M2 IPP algorithm with the approximation method proposed in section 5.3. The following lemma shows the optimality of the results of the M2 IPP algorithm in terms of approximated conditional mutual information: 2 Lemma 5.6.1. Let xmi 1:n be the optimal paths of the M IPP algorithm, for any other paths 31 Chapter 5. Maximum Mutual Information Path Planning x1:n , we have: n−1 I(Zxmi ; Zumi I(Zxmi ; Zumi )+ 1:m 1:2m i=2m+1 i−m i−2m:i |Zxmi i−2m:i−m−1 ) + I(Zxmi ; Zumi |Zxmi )≥ n−m:n n−2m:n n−2m:n−m−1 n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) I(Zx1:m ; Zu1:2m ) + i=2m+1 + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ) (5.9) mi where umi 1:n and u1:n are the unobserved part of the field for observation paths x1:n and x1:n , respectively. The proof for this result is shown in Appendix B.1. From this lemma, given the optimal paths x1:n of the exhaustive algorithm, inequality (5.9) still holds. This is because if we consider the conditional mutual information in each stage with all previous vectors and the whole unobserved part of the field, the mutual information between the observation paths x1:n and the corresponding unobserved part u1:n is the maximal one. However, if we consider the conditional mutual information with approximated path history and approximated unobserved part in each stage, the paths xmi 1:n are the optimal paths. Similar to corollary 4.5.3, we can bound the mutual information decrease in each stage as well. Let ω1 and ω2 be the horizontal and vertical width of the grid cell. Let 2 2 /ω2 1 1 /ω1 and denote the normalized horizontal and vertical length scales, respectively. Given the approximated path history xi−2m:i−m−1 and approximated unobserved part ui−2m:i , the following lemma bounds the mutual information decrease for loosing the path history and the unobserved points in other stages: Lemma 5.6.2. Given vector xi and vector xi−2m:i−1 , the approximated unobserved part of the field ui−2m:i for vector xi−m can be obtained. Let ε 2 σs2 exp{− (m+1) }, if there are r 2 2 1 rows and n columns in the field, the mutual information decrease for loosing the path history 32 Chapter 5. Maximum Mutual Information Path Planning and the unobserved points in other stages can be bounded with following formulas: I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) − I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) = Ai−m − Bi−m (5.10) where Ai−m = H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zxi−m |Zx1:i−m−1 , Zu1:n ) , (5.11) Bi−m = H(Zxi−m |Zxi−2m:i−m−1 ) − H(Zxi−m |Zx1:i−m−1 ) (5.12) and ε2 } σn2 (σn2 + σs2 ) ε2 ≤ (i − 2m − 1)k 2 log{1 + 2 2 }. σn (σn + σs2 ) Ai−m ≤ (n − 2m − 1)rk log{1 + (5.13) Bi−m (5.14) The proof for this lemma is shown in Appendix B.3. With the definition of mutual information, (5.10), (5.11), (5.12) can be obtained. For Bi−m , with corollary 4.5.3, the inequality (5.14) can be obtained. For Ai−m , all points in vector (x1:i−1 , u1:n ), which are away from vector xi−m within m stages are in the vector (xi−m:i−1 , ui−2m:i ). As a result, other points provide little information about the vector xi−m . Then, the value for Ai−m can be bounded by inequality (5.13). The lemma 5.6.1 shows the optimality of the results of the M2 IPP algorithm in terms of approximated conditional mutual information. And the lemma 5.6.2 shows the mutual information decrease in each stage can be bounded. As a result, the mutual information of the results of the M2 IPP algorithm is close to the optimal results. The following lemma 2 bounds the mutual information decrease between the paths xmi 1:n of the M IPP algorithm and the optimal paths x1:n of the exhaustive algorithm: 2 Theorem 5.6.3. Let xmi 1:n be the optimal paths of the algorithm M IPP and x1:n be the optimal paths of the exhaustive algorithm. Let ε 2 }, if there are r rows and σs2 exp{− (m+1) 2 2 1 33 Chapter 5. Maximum Mutual Information Path Planning n columns in field, the mutual information decrease can be bounded with: I(Zx1:n ; Zu1:n ) − I(Zxmi ; Zumi ) 1:n 1:n (5.15) 1 ε2 ≤ [nr + (n − 2m)k](n − 2m)k log{1 + 2 2 }. 2 σn (σn + σs2 ) (5.16) The proof for above result is shown in Appendix B.4. According to the theorem 5.6.3, the performance guarantee is bounded by the number of columns n, the value of m, the number of robots k and the value of ε. And the value of ε depends on the value of m and the normalized horizontal length scale. As a result, there are a few ways to improve the performance bound: (a) transect sampling task with small number of columns, (b) environmental fields with small horizontal length scales or large horizontal discretization width, (c) using a small number of robots, (d) using a large value of m. Similar to the MEPP algorithm, if the robots explore the anisotropic fields along small correlated direction, we can also use a small m which incur little planning to bound the sampling performance. 34 Chapter 6 Experimental Results In the two previous chapters, we have provided the performance guarantees for the two proposed algorithms, MEPP and M2 IPP. In this chapter, we empirically evaluate the performance of these two algorithms on two real-world data sets. The results of our proposed algorithms are compared to two other existing algorithms based on three performance metrics. The data sets and performance metrics are described in section 6.1. The performance results for the two data sets are presented in sections 6.2 and 6.3. The time efficiency of the two proposed algorithms is shown in section 6.4. The algorithms are implemented in Matlab and the experiments are run on a PC with Intel Quad Core Q9550 2.83GHz and 4 GB RAM. In section 6.5, we will discuss how to select criterion to get lower prediction error for different environmental fields. 6.1 Data Sets and Performance Metrics In the two previous chapters, we have provided the performance guarantees for the proposed algorithms. To reveal the performance empirically, the algorithms are tested on two real-world data sets: (a) May 2009 temperature data of Panther Hollow Lake in Pittsburgh, PA spanning 25 m by 150 m, (b) June 2009 plankton density data of Chesapeake Bay spanning 314 m by 1765 m. The environmental fields in these two data sets 35 Chapter 6. Experimental Results are modeled with the Gaussian Process. The hyper-parameters (i.e., 1, 2, σs2 , σn2 ) are learned using maximum likelihood estimation (MLE). The learned hyper-parameters are 1 = 40.45 m, 1 = 27.5723 m, 2 = 16 m, σs2 = 0.1542, and σn2 = 0.0036 for the temperature field and 2 = 134.6415 m, σs2 = 2.152, and σn2 = 0.041 for the plankton density field. The temperature field which distributed over 25m × 150m is discretized into a 5 × 30 grid (e.g., 6.1d). To investigate how the algorithms perform under different vertical and horizontal correlations (specifically, length scales), we reduced the horizontal and/or vertical length scales of the original field to produce three other modified temperature fields (e.g., 6.1a, 6.1b, 6.1c). The remaining hyper-parameters (e.g., σs2 , σn2 ) are learned on the original field with the reduced length scales through MLE. The four temperature fields with learned hyper-parameters are shown in Fig. 6.1. 1 1 24 24 2 2 3 3 23.5 5 = 5.0 m, 1 5 23 5 (a) 10 2 15 = 5.0 m, 20 σs2 25 = 0.2364, = 0.0545 (b) 1 10 = 5.0 m, 2 15 20 = 16.0 m, σs2 25 30 = 0.2704, σn2 = 0.0563 1 24 2 24 2 3 3 23.5 23.5 4 4 5 5 23 5 1 23 5 30 σn2 1 (c) 23.5 4 4 10 = 40.45 m, 2 15 = 5.0 m, 20 σs2 25 = 0.3116, 23 30 σn2 5 = 0.0588 (d) 1 10 = 40.45 m, 2 15 20 = 16 m, σs2 25 = 0.3926, 30 σn2 = 0.0601 Figure 6.1: Temperature fields which distributed over 25m × 150m are discretized into 5 × 30 grids with learned hyper-parameters. The plankton density field with learned hyper-parameters is shown in Fig. 6.2. 14 2 12 4 10 6 8 8 5 10 15 20 25 30 35 40 45 Figure 6.2: Plankton density field which distributed over 314m × 1765m is discretized into a 8 × 45 grid with 1 = 27.5273 m, 2 = 134.6415 m, σs2 = 1.4670, and σn2 = 0.2023. We will compare the performance of our proposed algorithms with two other state-of- 36 Chapter 6. Experimental Results the-art algorithms which have also been used as baseline in the work of [Low et al., 2009; Low et al., 2011]: (a) greedy maximum entropy path planning (GMEPP). Given starting locations, the GMEPP algorithm greedily selects next vector of locations which maximize the joint entropy of the observation paths, which can be defined as follows: gme Vi (x1:i−1 ) = max H(Zxi |Zx1:i−1 ) xi ∈Xi (6.1) for stages i = 2, . . . , n; (b) greedy maximum mutual information path planning (GM2 IPP). Given starting locations, the GM2 IPP algorithm greedily selects next vector of locations which maximize the mutual information between the observation paths and the corresponding unobserved part of the field, which can be formulated as follows: gmi Vi (x1:i−1 ) = max I(Zx1:i ; ZX \x1:i ) xi ∈Xi (6.2) for stages i = 2, . . . , n. For the two algorithms, we will enumerate all possible starting locations to get the optimal paths. The performance of the algorithms are evaluated with three different metrics: (a) the joint entropy of the unobserved part of the field ENT(π): H(Zu1:n |Zx1:n ), where x1:n is the observation paths, which is denoted by π, and u1:n is the unobserved part of the field, (b) the mutual information between the observation paths and the unobserved part of the field MI(π): I(Zu1:n ; Zx1:n ), (c) the mean square relative prediction error ERR(π): |u1:n |−1 Σu∈u1:n {(zu −µu|x1:n )/¯ µ}2 , where zu is the measurement value at point u, and µu|x1:n is the posterior mean value at point u using (2.1), and µ ¯ = |u1:n |−1 Σu∈u1:n zu . Both smaller ENT(π) and larger MI(π) imply lower ERR(π). However, the observations paths with small ENT(π) are different from the paths with large MI(π). For entropy metric, to get smaller H(Zu1:n |Zx1:n ), according to H(Zx1:n , Zu1:n ) = H(Zx1:n ) + H(Zu1:n |Zx1:n ), we need to get larger H(Zx1:n ). To achieve larger H(Zx1:n ), the points in x1:n need to be uncertain to each other. So, to get smaller ENT(π), we need to select the points which are far away from each other. For mutual information metric, to get larger H(Zu1:n ) − H(Zu1:n |Zx1:n ), we need to 37 Chapter 6. Experimental Results get larger H(Zu1:n ) and smaller H(Zu1:n |Zx1:n ). To get smaller H(Zu1:n |Zx1:n ), similar to entropy metric, we need to select the points which are far away from each other. However, to get larger H(Zu1:n ), similar to getting larger H(Zx1:n ), the points in u1:n need to be far away from each other. Hence, the points in x1:n cannot be extremely far away from each other so that the points in x1:n are inside the unobserved points u1:n . Consequently, the unobserved points can be separated from each other and be far away from each other. As a result, to get larger MI(π), we need to select the points which are far away from each other and can separate the unobserved points from each other. For prediction error metric, it will show how accurately the field is being mapped by the sampling paths of different algorithms. 6.2 6.2.1 Temperature Data Results Entropy Metric Fig. 6.3 shows the results of ENT(π) for different algorithms with different number of robots on the temperature fields. In the experiments, we have used different values of m for the MEPP algorithm. For the M2 IPP algorithm, we have set m = 3 with one robot and m = 2 with two robots and three robots. With one robot, it can be observed that: (1) On fields a and b, the MEPP and GMEPP algorithms can achieve smaller ENT(π) than the GM2 IPP and M2 IPP algorithms: for field a or field b, because the correlations of the field are small, the points of the field are uncertain to each other. So, the joint entropy of each field is large. And the points selected by maximum mutual information criterion will not be far away from each other as much as possible, so ENT(π gmi ) and ENT(π mi ) are much larger. (2) With a small m (e.g., m = 2), ENT(π me ) is the smallest on fields a, b: there are two reasons which account for this. Firstly, for field a or field b, the horizontal length scale of the field is small, in each stage our m-order Markov property is enough to exploit the small horizontal correlation. Secondly, because our algorithm is non-myopic, the paths with maximum entropy can be found. (3) For fields c and d, increasing the value of m decreases 38 Chapter 6. Experimental Results the ENT(π me ) significantly: for field c or field d, because the horizontal length scale of the field is large, larger Markov property can exploit the horizontal correlation much more. With two robots, we have three similar observations as the one-robot case. With three robots, it can be observed that: (1) On fields a and b, the GM2 IPP algorithm can achieve comparable ENT(π) to that of the MEPP and GMEPP algorithms: when we increase the number of sampled locations in each column, the points selected by different criteria are similar. (2) With a small m (e.g., m = 2), ENT(π me ) is the smallest on fields a, b: the reason has been explained in the second observation under the one-robot case. (3) On field c, increasing the value of m decreases the ENT(π me ) significantly: the reason is the same as the third observation under one-robot case. (4) On field d, ENT(π me ) is the smallest: when we increase the number of robots, the large vertical correlation can be exploited more by our non-myopic algorithm. Summarizing the above observations, it can be observed that: (1) On fields a and b, with any number of robots, the MEPP algorithm with a small m can achieve smaller ENT(π) than other algorithms. (2) On fields c and d, the MEPP algorithm with a large m can achieve comparable ENT(π) to that of other algorithms. (3) On field d, increasing the number of robots (e.g. k = 3), the MEPP algorithm can achieve smaller ENT(π) than other algorithms. It can be found that the MEPP algorithm generalizes the work of [Low et al., 2011] and can still maintain tight performance bounds on fields with large length scales. 6.2.2 Mutual Information Metric Fig. 6.4 shows the results of MI(π) for different algorithms with different number of robots on the temperature fields. In the experiments, we have used different values of m for the M2 IPP algorithm. For the MEPP algorithm, we have used m = 7 with one robot and m = 5 with two robots and three robots. With one robot, it can be observed that: (1) On fields a and b, the M2 IPP and GM2 IPP algorithms can achieve larger MI(π) than the MEPP and GMEPP algorithms: since the points selected by the maximum entropy criterion will 39 Chapter 6. Experimental Results −64.8 1 2 3 4 5 6 7 m(field a) Ent −165 Ent −64.6 −160 −57 −122 −124 −126 −128 −130 −181 −182 −175 −183 1 2 3 4 5 6 7 m(field c) −110 −115 1 −180 −170 −105 −59 1 2 3 4 5 6 7 m(field b) GMEPP GM2IPP MEPP M2IPP −100 −58 −60 Ent Ent −64.4 1 2 3 4 5 6 7 m(field d) 2 3 4 5 m(field a) −128 −130 −132 −134 2 3 4 5 m(field b) 1 2 3 4 5 m(field d) −138 −138.4 −138.8 1 2 3 4 5 m(field c) (a) robot 1 (b) robot 2 −70 −45 Ent 1 −75 −46 −80 Ent −47 1 2 3 4 5 m(field a) 1 2 3 4 5 m(field b) −92 −89.3 −89.4 −89.5 −89.6 −93 1 2 3 4 5 m(field c) −94 1 2 3 4 5 m(field d) (c) robot 3 Figure 6.3: The results of ENT(π) for different algorithms with different number of robots on the temperature fields. be far away from each other, some points will be on the border of the field. So some of the unobserved points cannot be separated from each other. And for field a or field b, because the correlations of the field are small, the joint entropy of the unobserved part of the field by maximum entropy criterion will be much smaller than that by maximum mutual information criterion. As a result, MI(π gme ) and MI(π me ) are much smaller. (2) On fields a 40 Chapter 6. Experimental Results and b, MI(π mi ) is the largest: for field a or field b, the horizontal correlation of the field is small, our small Markov property is enough to exploit the horizontal correlation. In each stage, the M2 IPP algorithm consider all the unobserved points around current points and most of these unobserved points are on vertical line, so the large vertical correlation of field b can also be exploited. With two robots and three robots, it can be observed that: (1) When we increase the number of robots (e.g., k = 3), the MEPP algorithm and GMEPP algorithms can achieve comparable MI(π) to that of GM2 IPP algorithm: when we increase the number of sampled locations in each column, the points selected by different criteria are similar. (2) When we increase the number of robots, MI(π mi ) may be worse than other algorithms: because the value of m we use is small, when the number of robots is large, the performance bound of the M2 IPP algorithm is loose. 6.2.3 Prediction Error Metric Fig. 6.5 shows the results of ERR(π) for different algorithms with different number of robots on the temperature fields. In the experiments, we have set the MEPP algorithm and the M2 IPP algorithm with different values of m. For maximum entropy criterion, with one robot, it can be observed that: (1) With a small m (e.g., m = 2), ERR(π me ) is less than or equal to ERR(π gme ) on fields a and b: the reason is the same as the second observation under one-robot case in section 6.2.1. (2) For fields c and d, increasing the value of m decreases the ERR(π me ) significantly: the reason is the same as the third observation under one-robot case in section 6.2.1. With two robots and three robots, it can be observed that: With a small m (e.g., m = 2), ERR(π me ) is less than or equal to ERR(π gme ) on fields b, c and d: for field b, the reason has been explained under two-robot and three-robot cases in section 6.2.1. For fields c and d, there are two reasons which account for this. Firstly, because fields c and d are more correlated than fields a, b, all points selected can be used to predict other points. Although m-order Markov property is not large enough to exploit the large correlations completely, because our algorithm is non-myopic, the selected points 41 Chapter 6. Experimental Results 50 46 26 44 MI 1 2 3 m(field a) MI 48 27 1 1 39.4 GMEPP GM2IPP 39.2 MEPP M2IPP 1 39 2 3 m(field c) 1 2 MI 1 2 m(field b) 37 36.8 36.6 36.4 36.2 46 45.5 45 44.5 2 3 m(field d) 1 2 1 2 m(field d) m(field c) (a) robot 1 (b) robot 2 40.8 62 40.6 61 40.4 60 1 2 1 2 m(field b) m(field a) MI 64 62 60 58 56 54 m(field a) 39.6 48 46 44 42 40 38 42 41.5 41 40.5 2 3 m(field b) MI MI 28 34 32 30 28 41 40 39 1 2 1 2 m(field d) m(field c) (c) robot 3 Figure 6.4: The results of MI(π) for different algorithms with different number of robots on the temperature fields. can distribute more evenly than the points selected by the greedy algorithm. For maximum mutual information criterion, it can be observed that: With m = 2, ERR(π mi ) is comparable to ERR(π gmi ) on all fields under one-robot and two-robot cases: although the MI(π gmi ) is larger than the MI(π mi ), because our algorithm is non-myopic, the points can be distributed more evenly than the points selected by the greedy algorithm. 42 Chapter 6. Experimental Results Under three-robot case, ERR(π mi ) may be larger than ERR(π gmi ): the reason is the same as the second observation under three-robot case in section 6.2.2. −5 Err x 10 x 10 2 5.8 4.5 4 3.5 3 2.5 1 5.7 1 2 3 4 5 6 7 m(field a) −5 x 10 8 6 4 2 0 2 0 x 10 1 2 3 4 5 6 7 m(field c) 0 1 2 3 4 5 6 7 m(field b) −6 x 10 1 2 3 4 m(field a) GMEPP GM2IPP MEPP M2IPP 1 1 2 3 4 5 6 7 m(field d) 2 3 4 m(field b) 5 −7 2.5 GMEPP GM2IPP MEPP M2IPP 2 x 10 2 1.5 0 1 2 3 4 m(field c) 5 1 2 3 4 m(field d) 5 (b) robot 2 −6 −7 x 10 x 10 3.5 Err 1 −6 (a) robot 1 8 6 4 2 3 2.5 1 2 3 4 m(field a) 5 1 2 3 4 m(field b) 5 −8 −7 2 5 x 10 3 Err Err 4 Err 5.9 5 4 3 2 1 −6 −5 −6 x 10 x 10 x 10 Err 10 8 1.5 6 1 2 3 4 m(field c) 5 1 2 3 4 m(field d) 5 (c) robot 3 Figure 6.5: The results of ERR(π) for different algorithms with different number of robots on the temperature fields. 43 Chapter 6. Experimental Results 6.3 Plankton Data Results 6.3.1 Entropy Metric Fig. 6.6 shows the results of ENT(π) for different algorithms on plankton density field with different number of robots. In the experiments, we have used different values of m for the MEPP algorithm. For the M2 IPP algorithm, we have set m = 2 with one robot and m = 1 with two robots and three robots. For entropy metric, it can be observed that with any number of robots, the MEPP algorithm with a small m (e.g. m = 1) can achieve smallest ENT(π): this is because small horizontal and large vertical correlations can be exploited by our m-order Markov property and non-myopic algorithm as explained in section 6.2.1. 30 124 55 Ent Ent 120 GMEPP GM2IPP MEPP M2IPP 50 45 GMEPP GM2IPP MEPP M2IPP 20 Ent GMEPP GM2IPP MEPP M2IPP 122 10 0 118 40 −10 116 1 2 3 4 1 m (a) robot 1 2 m (b) robot 2 3 1 2 m 3 (c) robot 3 Figure 6.6: The results of ENT(π) for different algorithms with different number of robots on the plankton density field. 6.3.2 Mutual Information Metric Fig. 6.7 shows the results of MI(π) for different algorithms on plankton density field with different number of robots. In the experiments, with one robot, we have set m = 4 for the MEPP algorithm and m = 2 for the M2 IPP algorithm. With two robots and three robots, we have set m = 3 for the MEPP algorithm and m = 1 for the M2 IPP algorithm. For mutual information metric, it can be observed that with any number of robots, the M2 IPP algorithm can achieve MI(π) performance comparable to that of the GM2 IPP algorithm: because the horizontal correlation is small, small Markov property is enough to exploit the 44 Chapter 6. Experimental Results horizontal correlation. The large vertical correlation can be exploited by the unobserved 85 170 80 160 200 75 150 195 205 MI MI MI points in each stage. 70 140 190 65 130 185 gme gmi mepp m2ipp (a) robot 1 gme gmi mepp (b) robot 2 m2ipp gme gmi mepp m2ipp (c) robot 3 Figure 6.7: The results of MI(π) for different algorithms with different number of robots on the plankton density field. 6.3.3 Prediction Error Metric Fig. 6.8 shows the results of ERR(π) for different algorithms on plankton density field with different number of robots. In the experiments, we have set the MEPP algorithm and the M2 IPP algorithm with different values of m. For maximum entropy criterion, it can be observed that: ERR(π me ) with a small m (e.g. m = 1) is less than or equal to ERR(π gme ) with any number of robots: the reason is the same as the observation in section 6.3.1. For maximum mutual information criterion, it can be observed that: ERR(π mi ) is comparable to ERR(π gmi ): the reason is the same as the observation in section 6.3.2. 6.4 Time Efficiency Fig. 6.9 shows the running time of different algorithms to derive the paths with different number of robots on the temperature field. From this figure, when we use m ≤ 3, the MEPP algorithm is more efficient than GMEPP algorithm with any number of robots. When we use m = 1, the M2 IPP algorithm is much more efficient than the GM2 IPP algorithm with any number of robots. When we use m = 2, the time efficiency of the M2 IPP algorithm is 45 Chapter 6. Experimental Results x 10 −3 −3 14 x 10 10 8 3.5 3 4 3 2 1 1 2 3 4 2 1.5 6 1 GMEPP GM2IPP MEPP M2IPP 2.5 GMEPP GM2IPP MEPP M2IPP Err GMEPP GM2IPP MEPP M2IPP Err 12 Err −4 x 10 5 1 2 m m (a) robot 1 3 1 (b) robot 2 2 m 3 (c) robot 3 Figure 6.8: The results of ERR(π) for different algorithms with different number of robots on the plankton density field. close to that of the GM2 IPP algorithm. 4 10 10 10 10 10 1 0 GMEPP GM2IPP MEPP M2IPP −1 1 2 3 4 m 5 6 7 3 10 2 2 Time(s) 10 4 10 3 Time(s) Time(s) 10 3 10 1 10 0 2 10 1 10 0 10 10 −1 −1 10 10 1 (a) robot 1 2 3 m (b) robot 2 4 5 1 2 3 m 4 5 (c) robot 3 Figure 6.9: The running time of different algorithms with different number of robots on the temperature fields. Fig. 6.10 shows the running time of different algorithms with different number of robots on the plankton density field. From this figure, it can be observed that when we use m ≤ 3, the MEPP algorithm is more efficient than the GMEPP algorithm with any number of robots. It can be also observed that when we use m = 1, the M2 IPP algorithm can achieve significant computational gain over the GM2 IPP algorithm, which supports our time complexity analysis in section 5.5. 46 Chapter 6. Experimental Results 10 10 10 10 5 10 4 10 3 10 10 3 5 10 2 1 0 4 10 Time(s) Time(s) 10 6 4 Time(s) 10 2 10 1 10 10 −1 −1 10 1 2 3 4 1 2 m m (a) robot 1 2 10 1 GMEPP GM2IPP MEPP M2IPP 0 3 10 10 0 10 3 1 (b) robot 2 2 m 3 (c) robot 3 Figure 6.10: The running time of different algorithms with different number of robots on the plankton density field. 6.5 Criterion Selection In this section, we will discuss how to select the criterion for different environmental fields. To answer this question, some principles are proposed. As we know in section 6.1, the sampling points by the maximum entropy criterion will be faraway from each other. The sampling points by the maximum mutual information criterion will be faraway from each other and separate the unobserved points from each other. The following figures show how the two criteria select the sampling points in an environmental field. 7 7 6 6 5 5 4 4 3 3 2 2 1 1 0 0 0 1 2 3 4 5 6 7 8 9 10 (a) maximum entropy criterion 11 0 1 2 3 4 5 6 7 8 9 10 11 (b) maximum mutual information criterion Figure 6.11: Sampling points selected by different criteria. To get lower prediction error, we propose three principles as follows: 47 Chapter 6. Experimental Results 1. For highly correlated environmental fields, maximum entropy criterion is better. For small correlated environmental fields, maximum mutual information criterion is better. If the field is highly correlated, the measurement values are close to each other. As we know, the sampling points by maximum entropy criterion will be faraway from each other. Some points will be on the border of the area. So the measurement values at the unobserved points which are on the border can be predicted accurately. Although there are only a few sampling points inside the area, because the measurement values are close to each other, the measurement values at the unobserved points which are inside the area can also be predicted accurately. If we use maximum mutual information criterion, there will be too many sampling points which are inside the area. The measurement values at the unobserved points which are on the border can not be predicted accurately. So if the field is highly correlated, the maximum entropy criterion will be better. If the field is small correlated, the measurement values are different from each other. As we know, the sampling points by the mutual information criterion are distributed among the unobserved points. Hence, the measurement values at the unobserved points can be predicted accurately. If we use the maximum entropy criterion, some sampling points will be on the border, which can not provide much information for the inside unobserved points. So if the field is small correlated, the maximum mutual information criterion is better. 2. If the number of sampling points in each column is large, maximum entropy criterion is better. If the number of sampling points in each column is small, maximum mutual information criterion is better. If the number of sampling points in each column is large, there will be too many sampling points in the area. To distribute these sampling points well, maximum entropy criterion is better. If the number of sampling points in each column is small, to get more information from the unobserved points, the maximum mutual information criterion is better. 48 Chapter 6. Experimental Results 3. If the number of rows is large, maximum mutual information criterion is better. If the number of rows is small, maximum entropy criterion is better. If the number of rows is large, there are many unobserved points which are inside the area. As we know, some of the sampling points by the maximum entropy criterion will be on the border, which can not provide much information for the inside unobserved points. In contrast, the sampling points by the maximum mutual information criterion will be distributed among the unobserved points. As a result, when the number of rows is large, the sampling points by the maximum mutual information criterion can provide more information about the unobserved points. If the number of rows is small, there are less unobserved points which are inside the area. We need less sampling points which are inside the area. So if the number of rows is small, maximum entropy criterion is better. 49 Chapter 7 Conclusions In this thesis, we have studied the following problem: How can we exploit the environmental structure to improve sampling performance as well as the time efficiency of planning for anisotropic fields? To address the question, this thesis has provided the following novel contributions: • Formalization of MEPP: It is found that for many GPs, correlation of two points exponentially decreases with the distance between the two points. With this property, we found that m-order Markov property can be applied to maximum entropy criterion to reduce time complexity and guarantee the performance. Consequently, we propose the polynomial-time approximation algorithm, MEPP. For a class of exploration tasks called transect sampling task, a theoretical performance guarantee is provided for the MEPP algorithm. • Formalization of M2 IPP: For maximum entropy criterion, the m-order Markov property can be used to reduce time complexity and guarantee the performance. However, it is found that the m-order Markov property cannot be applied to maximum mutual information criterion. To solve this problem, another approximation method is provided. Based on this approximation method, we propose the M2 IPP algorithm. The time complexity of the M2 IPP algorithm is also polynomial. A theoretical per- 50 Chapter 7. Conclusions formance guarantee on the sampling performance of the M2 IPP algorithm for the transect sampling task is provided as well. • Evaluation of performance: We evaluate the performance of two proposed algorithms with two real-world data sets. The performance is measured with three metrics: entropy, mutual information and prediction error. The results of our proposed algorithms are compared with two other state-of-the-art algorithms: GMEPP and GM2 IPP. For maximum entropy criterion, the MEPP algorithm with a small m can achieve lower joint entropy of the observed part than the GMEPP algorithm on fields with small horizontal length scales. For the fields with large length scales, the MEPP algorithm with a large m can also achieve comparable performance. The prediction error by the MEPP algorithm is smaller than that by the GMEPP algorithm almost on all fields. When we use m ≤ 3, with any number of robots, the MEPP algorithm is more efficient than the GMEPP algorithm. For maximum mutual information criterion, the mutual information and the prediction error by the M2 IPP algorithm is comparable to that by the GM2 IPP algorithm in two data sets. On the fields with small number of columns, the time efficiency of the M2 IPP algorithm with m = 2 is close to that of the GM2 IPP algorithm. However on the fields with large number of columns, the M2 IPP algorithm with m = 1 is much more efficient than the GM2 IPP algorithm. To get lower prediction error for different environmental fields, we proposed three principles to select the criterion. 51 Appendix A Maximum Entropy Path Planning A.1 Proof for Lemma 2.2.1 Given any vector A of observed points and an unobserved point y, with (2.2), the posterior variance of point y is 2 σy|A = σy2 − ΣyA Σ−1 AA ΣAy (A.1) where ΣAA is a covariance matrix, ΣyA and ΣAy are covariance vectors. As we know, the 2 posterior variance σy|A should be larger than 0. So if σn2 > 0, we have: 2 σy|A = σs2 + σn2 − ΣyA Σ−1 AA ΣAy > 0 (A.2) where the values in the diagonal line of ΣAA are σs2 + σn2 . And if σn2 = 0, we have: 2 σy|A = σs2 − ΣyA Σ−1 BB ΣAy > 0 (A.3) where ΣBB is a covariance matrix and ΣBB = ΣAA − σn2 I. According to the covariance function, the covariance vectors ΣyA and ΣAy do not change. We define A ΣAA , B ΣBB , E σn2 I, Y ΣAy , YT ΣyA . And let W 52 Chapter A. Maximum Entropy Path Planning −1 Y, Y = AW, we have: Σ−1 AA ΣAy = A WT EW + WT ET B−1 EW > 0 (A.4) ⇒ WT BT B−1 EW + WT ET B−1 EW > 0 (A.5) ⇒ WT (B + E)T B−1 EW > 0 (A.6) ⇒ WT AT B−1 EW + WT AT W > WT AT W (A.7) ⇒ WT AT (B−1 E + I)W > WT AT W (A.8) ⇒ WT AT B−1 (E + B)W > WT AT A−1 AW (A.9) ⇒ (AW)T B−1 AW > (AW)T A−1 AW (A.10) ⇒ YT B−1 Y > YT A−1 Y (A.11) −1 ⇒ ΣY A Σ−1 BB ΣAY > ΣY A ΣAA ΣAY . (A.12) For inequality (A.4), because W is a vector and E = σn2 I, we have WT EW > 0. Because B is a covariance matrix, which is invertible and positive semi-definite, B−1 is positive semidefinite. Hence, WT ET B−1 EW ≥ 0 and inequality (A.4) holds. Since B is a covariance matrix, which is symmetric, we have BT = B. Hence, inequality (A.5) can be obtained from inequality (A.4). Other results can be obtained easily. Consequently, we have: −1 ΣyA Σ−1 AA ΣAy < ΣyA ΣBB ΣAy . (A.13) For (A.2), with above result (A.13), we have: 2 σy|A = σs2 + σn2 − ΣyA Σ−1 AA ΣAy > σs2 + σn2 − ΣyA Σ−1 BB ΣAy (A.14) > σn2 . (A.15) With inequality (A.3), we have inequality (A.15). Therefore, lemma 2.2.1 holds. 53 Chapter A. Maximum Entropy Path Planning A.2 Proof for Lemma 4.5.1 Given x1:m , (4.6) and (4.7) can be rewritten as follows: me Vm+1 (x1:m ) = = max me (x2:m+1 ) H(Zxm+1 |Zx1:m ) + Vm+2 max {H(Zxm+1 |Zx1:m ) xm+1 ∈Xm+1 xm+1 ∈Xm+1 + = max xm+2 ∈Xm+2 max xm+1 ∈Xm+1 ,xm+2 ∈Xm+2 me (x3:m+2 )]} [H(Zxm+2 |Zx2:m+1 ) + Vm+3 (A.16) (A.17) {H(Zxm+1 |Zx1:m ) me + H(Zxm+2 |Zx2:m+1 ) + Vm+3 (x3:m+2 )} (A.18) ... n = H(Zxi |Zxi−m:i−1 ). max xm+1 ∈Xm+1 ,...,xn ∈Xn (A.19) i=m+1 Therefore, given x1:m , we can get the vectors xm+1 , . . . , xn which have maximum value for n H(Zxi |Zxi−m:i−1 ). With (4.8), we have: i=m+1 me xme 1:m = arg max H(Zx1:m ) + Vm+1 (x1:m ) (A.20) x1:m ∈X1:m where X1:m is the set of all possible x1:m over the first m columns. With (A.20), the paths n H(Zxi |Zxi−m:i−1 ) can be obtained. Therefore, the x1:n having maximum H(Zx1:m ) + i=m+1 lemma 4.5.1 holds. 54 Chapter A. Maximum Entropy Path Planning A.3 Proof for Lemma 4.5.2 The following proof is for single-robot case. We will apply the result to multi-robot case. Let xA xi−m:i−1 , xp xi−m−1 and we have: σx2i−m−1 |xi−m:i−1 − σx2i−m−1 |(xi−m:i−1 ,xi ) = σx2p |xA − σx2p |(xA ,xi ) (A.21) H(Zxi−m−1 |Zxi−m:i−1 ) − H(Zxi−m−1 |Zxi−m:i−1 , Zxi ) = H(Zxp |ZxA ) − H(Zxp |ZxA , Zxi ). (A.22) For (A.21), we have: σx2p |xA − σx2p |(xA ,xi ) ≤ σx2p − σx2p |xi = σx2p − (σx2p − ≤ (A.23) K2 (xp , xi ) ) σx2i (A.24) ε2 . σx2i (A.25) The first inequality is from the assumption that the variance reduction σx2p |xA − σx2p |(xA ,xi ) is submodular. And the work of [Das and Kempe, 2008a] shows that if with measurements zxA , the correlation between Zxp and Zxi does not increase, the variance reduction is submodular. If we have the measurements zxA , because the points in A are close to xp , σx2p |xA is small. Due to the small correlation between xp and xi , the point xi can not reduce σx2p |xA any more. As a result, to predict the variance of point xp , the variance will decrease more when adding point xi to an empty set than adding point xi to a larger set. With (2.2), (A.24) can be obtained. Note that the distance between any pair of two points which are from stage i and stage i − m − 1 respectively is at least (m + 1) ∗ ω1 , so max K(xji−m−1 , xji ) should 1≤j,j ≤k 2 }. Hence, (A.25) can be obtained. be less than σs2 exp{− (m+1) 2 2 1 55 Chapter A. Maximum Entropy Path Planning For (A.22), we have: H(Zxp |ZxA ) − H(Zxp |ZxA , Zxi ) σx2p |xA 1 = log 2 2 σxp |(xA ,xi ) (A.26) 2 σx2p |(xA ,xi ) + σε2 1 xi ≤ log 2 2 σxp |(xA ,xi ) = ε2 1 log{1 + 2 } 2 σxp |(xA ,xi ) σx2i ≤ log{1 + ε2 } σn2 (σn2 + σs2 ) (A.27) (A.28) (A.29) With (4.4), (A.26) can be obtained. The second step (A.27) is using the result in (A.25). Then, the result (A.29) can be obtained with the lemma 2.2.1. If there are k robots, each vector x contains k points. Let x1:k indicate the k points of vector x. With the chain rule of entropy, we have: H(Zxp |ZxA ) − H(Zxp |ZxA , Zxi ) (A.30) = {H(Zx1p |ZxA ) − H(Zx1p |ZxA , Zxi )} + . . . + {H(Zxkp |Zx1:k−1 , ZxA ) − H(Zxkp |Zx1:k−1 , ZxA , Zxi )} p p (A.31) k ≤ {H(Zxj |ZxA ) − H(Zxj |ZxA , Zxi )}. j=1 p p (A.32) For each part of (A.31), due to submodularity, the variance reduction will be larger when 56 Chapter A. Maximum Entropy Path Planning adding xi into a smaller set. Then (A.32) can be obtained. For (A.32), we have: H(Zxj |ZxA ) − H(Zxj |ZxA , Zx1:k ) p p i = H(Zxj |ZxA ) − {H(Zxj |ZxA , Zx1 ) p p i + H(Zx2:k |Zxj , ZxA , Zx1 ) − H(Zx2:k |ZxA , Zx1 )} p i i (A.33) i i = H(Zxj |ZxA ) − H(Zxj |ZxA , Zx1 ) p p i + H(Zx2:k |ZxA , Zx1 ) − H(Zx2:k |Zxj , ZxA , Zx1 ) i i i p (A.34) i = H(Zxj |ZxA ) − H(Zxj |ZxA , Zx1 ) p p i k H(Zxj |ZxA , Zx1:j −1 ) − H(Zxj |Zxj , ZxA , Zx1:j −1 ) + j =2 ≤ k ∗ log{1 + i i i p (A.35) i ε2 }. σn2 (σn2 + σs2 ) (A.36) As a result, the entropy decrease H(Zxi−m−1 |Zxi−m:i−1 ) − H(Zxi−m−1 |Zxi−m:i−1 , Zxi ) can be bounded with the following result: H(Zxi−m−1 |Zxi−m:i−1 ) − H(Zxi−m−1 |Zxi−m:i−1 , Zxi ) ≤ k 2 log{1 + ε2 } σn2 (σn2 + σs2 ) (A.37) With a similar proof, we can know that if t ≥ m, H(Zxi−t−1 |Zxi−t:i−1 )−H(Zxi−t−1 |Zxi−t:i−1 , Zxi ) should be less than k 2 log{1 + ε2 2 (σ 2 +σ 2 ) }. σn n s Therefore, the lemma 4.5.2 holds. 57 Chapter A. Maximum Entropy Path Planning A.4 Proof for Corollary 4.5.3 With the chain rule of entropy, for H(Zx1:i−m−1 , Zxi |Zxi−m:i−1 ), we have: H(Zx1:i−m−1 , Zxi |Zxi−m:i−1 ) = H(Zxi |Zxi−m:i−1 ) + H(Zx1:i−m−1 |Zxi−m:i−1 , Zxi ). (A.38) And we also have: H(Zx1:i−m−1 , Zxi |Zxi−m:i−1 ) = H(Zx1:i−m−1 |Zxi−m:i−1 ) + H(Zxi |Zx1:i−m−1 , Zxi−m:i−1 ) = H(Zx1:i−m−1 |Zxi−m:i−1 ) + H(Zxi |Zx1:i−1 ). (A.39) With (A.38) and (A.39), we can get H(Zxi |Zxi−m:i−1 ) − H(Zxi |Zx1:i−1 ) = H(Zx1:i−m−1 |Zxi−m:i−1 ) − H(Zx1:i−m−1 |Zxi−m:i−1 , Zxi ). (A.40) For (A.40), with chain rule of entropy, we have: H(Zx1:i−m−1 |Zxi−m:i−1 ) − H(Zx1:i−m−1 |Zxi−m:i−1 , Zxi ) (A.41) i−m−1 H(Zxt |Zxt+1:i−1 ) − H(Zxt |Zxt+1:i−1 , Zxi ) = (A.42) t=1 ≤ (i − m − 1)k 2 log{1 + ε2 }. σn2 (σn2 + σs2 ) (A.43) With lemma 4.5.2, each part in (A.42) can be bounded. Then inequality (A.43) can be obtained. Therefore, corollary 4.5.3 holds. 58 Chapter A. Maximum Entropy Path Planning A.5 Proof for Theorem 4.5.4 ∗ Let xme 1:n be the optimal paths of the MEPP algorithm and x1:n be the optimal paths of iMASP, according to lemma 4.5.1, we have: n ) |Zxme H(Zxme i−m:i−1 i H(Zxme )+ 1:m i=m+1 n ≥ H(Zx∗1:m ) + H(Zx∗i |Zx∗i−m:i−1 ) (A.44) i=m+1 and let n H(Zxme |Zxme ) i i−m:i−1 θ = H(Zxme )+ 1:m i=m+1 n − {H(Zx∗1:m ) + H(Zx∗i |Zx∗i−m:i−1 )}. (A.45) i=m+1 From (A.44), we have θ ≥ 0. With the chain rule of entropy, the entropy decrease H(Zx∗1:n )− H(Zxme ) can be rewritten as: 1:n H(Zx∗1:n ) − H(Zxme ) 1:n n H(Zx∗i |Zx∗1:i−1 ) = H(Zx∗1:m ) + i=m+1 n − {H(Zxme )+ 1:m H(Zxme |Zxme )}. i 1:i−1 (A.46) i=m+1 To apply θ into (A.46), we have to replace each H(Zx∗i |Zx∗1:i−1 ) with H(Zx∗i |Zx∗i−m:i−1 ) and each H(Zxme |Zxme ) with H(Zxme |Zxme ). Let ∆∗i = H(Zx∗i |Zx∗i−m:i−1 ) − H(Zx∗i |Zx∗1:i−1 ) i 1:i−1 i i−m:i−1 me me me me and ∆me i = H(Zxi |Zxi−m:i−1 ) − H(Zxi |Zx1:i−1 ), where i = m + 1, . . . , n. (A.46) can be 59 Chapter A. Maximum Entropy Path Planning rewritten as: H(Zx∗1:n ) − H(Zxme ) 1:n n [H(Zx∗i |Zx∗i−m:i−1 ) − ∆∗i ] = H(Zx∗1:m ) + i=m+1 n ) − ∆me |Zxme [H(Zxme i ]} i−m:i−1 i − {H(Zxme )+ 1:m (A.47) i=m+1 n n ∆∗i H(Zx∗i |Zx∗i−m:i−1 ) − = H(Zx∗1:m ) + i=m+1 i=m+1 n n − [H(Z xme 1:m H(Z )+ xme i |Z xme i−m:i−1 ∆me i ] )− (A.48) i=m+1 i=m+1 n = ∗ [∆me i − ∆i ] − θ (A.49) i=m+1 n ≤ ∗ [∆me i − ∆i ] (A.50) i=m+1 n ∆me i . ≤ (A.51) i=m+1 Replacing the θ in (A.48), (A.49) can be obtained. With θ ≥ 0, we can get (A.50) from (A.49). Because each ∆∗i ≥ 0 (m + 1 ≤ i ≤ n) in (A.50), (A.51) can be obtained. For 2 (A.51), with corollary 4.5.3, ∆me i ≤ (i − m − 1)k log{1 + ε2 2 (σ 2 +σ 2 ) }, σn n s where m + 1 ≤ i ≤ n. Hence, for the entropy decrease H(Zx∗1:n ) − H(Zxme ), we have: 1:n H(Zx∗1:n )−H(Zxme ) ≤ (n − m)2 k 2 log{1 + 1:n ε2 }. σn2 (σn2 + σs2 ) (A.52) Therefore, theorem 4.5.4 holds. 60 Appendix B Maximum Mutual Information Path Planning B.1 Proof for Lemma 5.6.1 Given x1:2m , (5.6) and (5.7) can be rewritten as follows: mi V2m+1 (x1:2m ) = = max mi I(Zxm+1 ; Zu1:2m+1 |Zx1:m ) + V2m+2 (x2:2m+1 ) max {I(Zxm+1 ; Zu1:2m+1 |Zx1:m ) x2m+1 ∈X2m+1 x2m+1 ∈X2m+1 + = max x2m+2 ∈X2m+2 max mi [I(Zxm+2 ; Zu2:2m+2 |Zx2:m+1 ) + V2m+3 (x3:2m+2 )]} x2m+1 ∈X2m+1 ,x2m+2 ∈X2m+2 (B.1) (B.2) {I(Zxm+1 ; Zu1:2m+1 |Zx1:m ) mi + I(Zxm+2 ; Zu2:2m+2 |Zx2:m+1 ) + V2m+3 (x3:2m+2 )} (B.3) ... = max x2m+1 ∈X2m+1 ,...,xn ∈Xn { n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 )}. (B.4) i=2m+1 61 Chapter B. Maximum Mutual Information Path Planning As a result, given x1:2m , we can get vectors x2m+1 , . . . , xn which have maximum value for n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ). With (5.8), we i=2m+1 have: mi xmi 1:2m = arg max I(Zx1:m ; Zu1:2m ) + V2m+1 (x1:2m ) (B.5) x1:2m ∈X1:2m where X1:2m is the set of all possible x1:2m over first 2m columns. From (5.6) and (5.7), for mi each x1:2m , we can get the value V2m+1 (x1:2m ). With (B.5), the paths x1:n having maximum n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 )+I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ) I(Zx1:m ; Zu1:2m )+ i=2m+1 can be obtained. Therefore, lemma 5.6.1 holds. B.2 Proof for Other Lemmas Before we show the proof for lemma 5.6.2 and theorem 5.6.3, the following lemmas are needed. Lemma B.2.1. Let ε 2 σs2 exp{− (m+1) }. Given vector xi−m , vector xi−2m:i−m−1 and 2 2 1 vector ui−2m:i , for any vector xi−2m−1 , we have: H(Zxi−2m−1 |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zxi−2m−1 |Zxi−2m:i−m−1 , Zui−2m:i , Zxi−m ) ≤ k 2 log{1 + ε2 } σn2 (σn2 + σs2 ) (B.6) With submodularity and lemma 4.5.2, the result can be easily obtained. As a result, we know that if t ≤ i−2m−1, H(Zxt |Zxt+1:i−m−1 , Zui−2m:i )−H(Zxt |Zxt+1:i−m−1 , Zui−2m:i , Zxi−m ) should be less than k 2 log{1 + Corollary B.2.2. Let ε ε2 2 (σ 2 +σ 2 ) }. σn n s 2 σs2 exp{− (m+1) }. Given vector xi−m , vector xi−2m:i−m−1 and 2 2 1 62 Chapter B. Maximum Mutual Information Path Planning vector ui−2m:i , for any vector ui−2m−1 , we have: H(Zui−2m−1 |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zui−2m−1 |Zxi−2m:i−m−1 , Zui−2m:i , Zxi−m ) ≤ (r − k)k log{1 + ε2 } σn2 (σn2 + σs2 ) (B.7) Because the number of rows is r and the number of sampling locations in each column is k, the number of unobserved locations in each column is r − k. Hence, the size of vector ui−2m−1 is r−k. With a similar proof as lemma 4.5.2, the result can be obtained. As a result, if t ≤ i − 2m − 1, H(Zut |Zxi−2m:i−m−1 , Zut+1:i ) − H(Zut |Zxi−2m:i−m−1 , Zut+1:i , Zxi−m ) should 2 be less than (r − k)k log{1 + σ2 (σε2 +σ2 ) }. And if t ≥ i + 1, H(Zut |Zxi−2m:i−m−1 , Zui−2m:t−1 ) − n n s 2 H(Zut |Zxi−2m:i−m−1 , Zui−2m:t−1 , Zxi−m ) should also be less than (r − k)k log{1 + σ2 (σε2 +σ2 ) }. n n s 63 Chapter B. Maximum Mutual Information Path Planning Lemma B.2.3. Let ε 2 }. Given vector xi−m , vector xi−2m:i−m−1 and σs2 exp{− (m+1) 2 2 1 vector ui−2m:i , we have H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zxi−m |Zx1:i−m−1 , Zu1:n ) ≤ (n − 2m − 1)rk log{1 + ε2 } + σs2 ) σn2 (σn2 (B.8) Let Zx∆ = Zx1:i−m−1 \Zxi−2m:i−m−1 and Zu∆ = Zu1:n \Zui−2m:i . H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zxi−m |Zx1:i−m−1 , Zu1:n ) = H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) − [H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) + H(Zx∆ , Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i , Zxi−m ) − H(Zx∆ , Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i )] (B.9) = H(Zx∆ , Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zx∆ , Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i , Zxi−m ) (B.10) = [H(Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zu∆ |Zxi−2m:i−m−1 , Zui−2m:i , Zxi−m )] + [H(Zx∆ |Zxi−2m:i−m−1 , Zu1:n ) − H(Zx∆ |Zxi−2m:i−m−1 , Zu1:n , Zxi−m )] (B.11) i−2m−1 [H(Zut |Zxi−2m:i−m−1 , Zut+1:i ) − H(Zut |Zxi−2m:i−m−1 , Zut+1:i , Zxi−m )] = t=1 n [H(Zut |Zxi−2m:i−m−1 , Zu1:t−1 ) − H(Zut |Zxi−2m:i−m−1 , Zu1:t−1 , Zxi−m )] + t=i+1 i−2m−1 [H(Zxt |Zxt+1:i−m−1 , Zu1:n ) − H(Zxt |Zxt+1:i−m−1 , Zu1:n , Zxi−m )] + (B.12) t=1 ≤ [(n − 2m − 1)(r − k)k + (i − 2m − 1)k 2 ] log{1 + ≤ (n − 2m − 1)[(r − k)k + k 2 ] log{1 + = (n − 2m − 1)rk log{1 + ε2 } σn2 (σn2 + σs2 ) ε2 } σn2 (σn2 + σs2 ) ε2 }. σn2 (σn2 + σs2 ) (B.13) (B.14) (B.15) With the chain rule of entropy, (B.9), (B.11) and (B.12) can be obtained. With lemma 64 Chapter B. Maximum Mutual Information Path Planning B.2.1 and corollary B.2.2, each part in (B.12) can be bounded. Then inequality (B.13) can be obtained. Inequality (B.15) can be easily obtained. B.3 Proof For Lemma 5.6.2 With the definition of mutual information, we have: I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) = H(Zxi−m |Zx1:i−m−1 ) − H(Zxi−m |Zx1:i−m−1 , Zu1:n ) (B.16) I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) = H(Zxi−m |Zxi−2m:i−m−1 ) − H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ). (B.17) With (B.16) and (B.17), we have: I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) − I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) = Ai−m − Bi−m (B.18) Ai−m = H(Zxi−m |Zxi−2m:i−m−1 , Zui−2m:i ) − H(Zxi−m |Zx1:i−m−1 , Zu1:n ) (B.19) Bi−m = H(Zxi−m |Zxi−2m:i−m−1 ) − H(Zxi−m |Zx1:i−m−1 ). (B.20) With lemma B.2.3, the value of Ai−m can be bouned by: Ai−m ≤ (n − 2m − 1)rk log{1 + ε2 }. σn2 (σn2 + σs2 ) (B.21) With corollary 4.5.3, the value of Bi−m can be bounded by: Bi−m ≤ (i − 2m − 1)k 2 log{1 + ε2 }. + σs2 ) σn2 (σn2 (B.22) Therefore, the lemma 5.6.2 holds. 65 Chapter B. Maximum Mutual Information Path Planning B.4 Proof For Theorem 5.6.3 2 Let xmi 1:n be the optimal paths of the M IPP algorithm and x1:n be the optimal paths of the exhaustive algorithm, according to lemma 5.6.1, we have: n−1 I(Zxmi ; Zumi I(Zxmi ; Zumi )+ 1:m 1:2m i−m i=2m+1 i−2m:i |Zxmi i−2m:i−m−1 ) + I(Zxmi ; Zumi |Zxmi )≥ n−m:n n−2m:n n−2m:n−m−1 n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) I(Zx1:m ; Zu1:2m ) + i=2m+1 + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ), (B.23) and let n−1 θ = I(Zxmi ; Zumi )+ 1:m 1:2m I(Zxmi ; Zumi i=2m+1 i−m i−2m:i |Zxmi i−2m:i−m−1 ) + I(Zxmi ; Zumi |Zxmi )− n−m:n n−2m:n n−2m:n−m−1 n−1 I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) [I(Zx1:m ; Zu1:2m ) + i=2m+1 + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 )]. (B.24) 66 Chapter B. Maximum Mutual Information Path Planning From (B.23), we have θ ≥ 0. For the mutual information decrease I(Zx1:n , Zu1:n ) − I(Zxmi , Zumi ), it can be rewritten as: 1:n 1:n I(Zx1:n , Zu1:n ) − I(Zxmi , Zumi ) 1:n 1:n n−1 I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) = I(Zx1:m ; Zu1:n ) + i=2m+1 + I(Zxn−m:n ; Zu1:n |Zx1:n−m−1 ) n−1 |Zxmi I(Zxmi ; Zumi 1:n − [I(Zxmi ; Zumi )+ 1:m 1:n i=2m+1 i−m 1:i−m−1 ) + I(Zxmi ; Zumi |Zxmi )]. n−m:n 1:n 1:n−m−1 (B.25) To apply θ into the (B.25), we have to replace I(Zx1:m ; Zu1:n ) with I(Zx1:m ; Zu1:2m ), each I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) with I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ) where 2m + 1 ≤ i ≤ n − 1 and I(Zxn−m:n ; Zu1:n |Zx1:n−m−1 ) with I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ). With the definition of mutual information, we have : I(Zx1:m ; Zu1:n ) = H(Zx1:m ) − H(Zx1:m |Zu1:n ) (B.26) I(Zx1:m ; Zu1:2m ) = H(Zx1:m ) − H(Zx1:m |Zu1:2m ). (B.27) With the chain rule of entropy, we have: H(Zx1:m |Zu1:2m ) − H(Zx1:m |Zu1:n ) = H(Zx1 |Zu1:2m ) − H(Zx1 |Zu1:n ) + . . . + H(Zxm |Zx1:m−1 , Zu1:2m ) − H(Zxm |Zx1:m−1 , Zu1:n ) (B.28) m H(Zxt |Zx1:t−1 , Zu1:2m ) − H(Zxt |Zx1:t−1 , Zu1:n ) = (B.29) t=1 ≤ m(n − 2m)(r − k)k log{1 + ε2 }. σn2 (σn2 + σs2 ) (B.30) With a similar proof as lemma B.2.3, each part in (B.29) can be bounded. Then, inequality 67 Chapter B. Maximum Mutual Information Path Planning (B.30) can be obtained. Apply (B.30) to (B.26) and (B.27), we have: I(Zx1:m ; Zu1:n ) − I(Zx1:m ; Zu1:2m ) = A1:m (B.31) where A1:m ≤ m(n − 2m)(r − k)k log{1 + ε2 }. σn2 (σn2 + σs2 ) (B.32) With the lemma 5.6.2, each I(Zxi−m ; Zu1:n |Zx1:i−m−1 ) can be replaced with I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 ), where 2m + 1 ≤ i ≤ n − 1. With the definition of mutual information, we have: I(Zxn−m:n ; Zu1:n |Zx1:n−m−1 ) = H(Zxn−m:n |Zx1:n−m−1 ) − H(Zxn−m:n |Zx1:n−m−1 , Zu1:n ) (B.33) I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ) = H(Zxn−m:n |Zxn−2m:n−m−1 ) − H(Zxn−m:n |Zxn−2m:n−m−1 , Zun−2m:n ). (B.34) With the chain rule of entropy , we have: H(Zxn−m:n |Zxn−2m:n−m−1 ) − H(Zxn−m:n |Zx1:n−m−1 ) n H(Zxt |Zxn−2m:t−1 ) − H(Zxt |Zx1:t−1 ) = (B.35) t=n−m ≤ (m + 1)(n − 2m − 1)k 2 log{1 + ε2 }. σn2 (σn2 + σs2 ) (B.36) With corollary 4.5.3, inequality (B.36) can be obtained. 68 Chapter B. Maximum Mutual Information Path Planning With the chain rule of entropy , we have: H(Zxn−m:n |Zxn−2m:n−m−1 , Zun−2m:n ) − H(Zxn−m:n |Zx1:n−m−1 , Zu1:n ) n {H(Zxt |Zxn−2m:t−1 , Zun−2m:n ) = t=n−m − H(Zxt |Zx1:t−1 , Zu1:n )} ≤ (m + 1)(n − 2m − 1)rk log{1 + ε2 }. σn2 (σn2 + σs2 ) (B.37) (B.38) With a similar proof as lemma B.2.3, inequality (B.38) can be obtained. Apply the results (B.36) and (B.38) to (B.33) and (B.34), we have: I(Zxn−m:n ; Zu1:n |Zx1:n−m−1 ) − I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 ) = An−m:n − Bn−m:n (B.39) where ε2 } σn2 (σn2 + σs2 ) ε2 ≤ (m + 1)(n − 2m − 1)k 2 log{1 + 2 2 }. σn (σn + σs2 ) An−m:n ≤ (m + 1)(n − 2m − 1)rk log{1 + (B.40) Bn−m:n (B.41) 69 Chapter B. Maximum Mutual Information Path Planning With above results, (B.25) can be rewritten as: I(Zx1:n ; Zu1:n ) − I(Zxmi ; Zumi ) 1:n 1:n = A1:m + I(Zx1:m ; Zu1:2m ) n−1 [Ai−m − Bi−m + I(Zxi−m ; Zui−2m:i |Zxi−2m:i−m−1 )] + i=2m+1 + [An−m:n − Bn−m:n + I(Zxn−m:n ; Zun−2m:n |Zxn−2m:n−m−1 )] − {Ami ; Zumi ) 1:m + I(Zxmi 1:m 1:2m n−1 mi [Ami i−m − Bi−m + I(Zxmi ; Zumi + i−m i=2m+1 i−2m:i |Zxmi i−2m:i−m−1 )] mi + [Ami ; Zumi |Zxmi )]} n−m:n − Bn−m:n + I(Zxmi n−m:n n−2m:n n−2m:n−m−1 (B.42) n−1 (Ai−m − Bi−m ) + (An−m:n − Bn−m:n ) = A1:m + i=2m+1 n−1 mi mi mi (Ami i−m − Bi−m ) + (An−m:n − Bn−m:n )] − θ − [Ami 1:m + (B.43) i=2m+1 n−1 n−1 ≤ A1:m + mi mi Bi−m + Bn−m:n Ai−m + An−m:n + i=2m+1 i=2m+1 n−1 n−1 mi Ami i−m + An−m:n + − [Ami 1:m + i=2m+1 Bi−m + Bn−m:n ] (B.44) i=2m+1 70 Chapter B. Maximum Mutual Information Path Planning n−1 n−1 ≤ A1:m + mi mi Bi−m + Bn−m:n Ai−m + An−m:n + i=2m+1 (B.45) i=2m+1 ≤ [m(n − 2m)(r − k)k + (n − 2m − 1 + m + 1)(n − 2m − 1)rk ε2 1 } + (n − 2m − 1)(n − 2m − 2)k 2 + (m + 1)(n − 2m − 1)k 2 )] log{1 + 2 2 2 σn (σn + σs2 ) (B.46) ≤ [m(n − 2m)(r − k)k + (n − m)(n − 2m)rk 1 ε2 + (n − 2m)(n − 2m − 2)k 2 + (m + 1)(n − 2m)k 2 )] log{1 + 2 2 } 2 σn (σn + σs2 ) (B.47) = [m(n − 2m)(r − k)k + (n − m)(n − 2m)rk 1 ε2 + (n − 2m)(n − 2m)k 2 + m(n − 2m)k 2 )] log{1 + 2 2 } 2 σn (σn + σs2 ) 1 ε2 = [mr + (n − m)r + (n − 2m)k](n − 2m)k log{1 + 2 2 } 2 σn (σn + σs2 ) 1 ε2 = [nr + (n − 2m)k](n − 2m)k log{1 + 2 2 }. 2 σn (σn + σs2 ) (B.48) (B.49) (B.50) With results (B.31), (B.39) and lemma 5.6.2, (B.42) can be obtained. Applying θ in (B.42), (B.43) can be obtained. Due to θ ≥ 0, inequality (B.44) can be obtained. With results (B.32), (B.40), (B.41) and lemma 5.6.2, inequalities (B.45) and (B.46) can be obtained. Other inequalities can be obtained easily. Then, the value I(Zx1:n ; Zu1:n ) − I(Zxmi ; Zumi ) can be bounded with: 1:n 1:n ; Zumi ) I(Zx1:n ; Zu1:n ) − I(Zxmi 1:n 1:n ε2 1 }. ≤ [nr + (n − 2m)k](n − 2m)k log{1 + 2 2 2 σn (σn + σs2 ) Therefore, theorem 5.6.3 holds. 71 Bibliography [Austerlitz et al., 2007] F. Austerlitz, C. Dutech, P. E. Smouse, F. Davis, and V. L. Sork. Estimating anisotropic pollen dispersal: a case study in quercus lobata. Heredity, 99:193– 204, 2007. [Batalin et al., 2004] M. A. Batalin, M. Rahimi, Y. Yu, D. Liu, A. Kansal, G. S. Sukhatme, W. J. Kaiser, M. Hansen, G. J. Pottie, M. Srivastava, and D. Estrin. Call and response: Experiments in sampling the environment. In Proc. SenSys, pages 25–38, 2004. [Binney et al., 2010] J. Binney, A. Krause, and G. S. Sukhatme. Informative path planning for an autonomous underwater vehicle. In Proc. ICRA, pages 4791–4796, 2010. [Boisvert and Deutsch, 2011] J. B. Boisvert and C. V. Deutsch. Modeling locally varying anisotropy of CO2 emissions in the United States. Stoch. Environ. Res. Risk Assess., 25:1077–1084, 2011. [Budrikait˙e and Duˇcinskas, 2005] L. Budrikait˙e and K. Duˇcinskas. Modelling of geometric anisotropic spatial variation. In Proc. 10th International Conference on Mathematical Modelling and Analysis, pages 361–366, 2005. [Das and Kempe, 2008a] A. Das and D. Kempe. Algorithms for subset selection in linear regression. In Proc. STOC, pages 45–54, 2008. [Das and Kempe, 2008b] A. Das and D. Kempe. Sensor selection for minimizing worst-case prediction error. In Proc. IPSN, pages 97–108, 2008. 72 BIBLIOGRAPHY [Franklin and Mills, 2007] R. B. Franklin and A. L. Mills. Statistical analysis of spatial structure in microbial communities. In R. B. Franklin and A. L. Mills, editors, The Spatial Distribution of Microbes in rhe Environment, pages 31–60. Springer, 2007. [Garnett et al., 2010] R. Garnett, M. A. Osborne, and S. J. Roberts. Bayesian optimization for sensor set selection. In Proc. IPSN, pages 209–219, 2010. [Guestrin et al., 2005] C. Guestrin, A. Krause, and A. Singh. Near-optimal sensor placements in gaussian processes. In Proc. ICML, pages 265–272, August 2005. [Hosoda and Kawamura, 2005] K. Hosoda and H. Kawamura. Seasonal variation of space/time statistics of short-term sea surface temperature variability in the Kuroshio region. J. Oceanography, 61(4):709–720, 2005. [Ko et al., 1995] C. Ko, J. Lee, and M. Queyranne. An exact algorithm for maximum entropy sampling. Ops Research, 43:684–691, 1995. [Korf, 1990] R. E. Korf. Real-time heuristic search. Artificial Intelligence, 42(2-3):189 – 211, 1990. [Krause et al., 2006] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: maximizing information while minimizing communication cost. In Proc. IPSN, pages 2–10, 2006. [Low et al., 2007] K. H. Low, G. Gorden, J. M. Dolan, and P. Khosla. Adaptive sampling for multi-robot wide-area exploration. In Proc. ICRA, pages 755–760, 2007. [Low et al., 2008] K. H. Low, J. M. Dolan, and P. Khosla. Adaptive multi-robot wide-area exploration and mapping. In Proc. AAMAS, pages 23–30, 2008. [Low et al., 2009] K. H. Low, J. M. Dolan, and P. Khosla. Information-theoretic approach to efficient adaptive path planning for mobile robotic environmental sensing. In Proc. ICAPS, September 2009. 73 BIBLIOGRAPHY [Low et al., 2011] K. H. Low, J. M. Dolan, and P. Khosla. Active markov informationtheoretic path planning for robotic environmental sensing. In Proc. AAMAS, May 2011. [Low, 2009] K. H. Low. Multi-Robot Adaptive Exploration and Mapping for Environmental Sensing Applications. Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA, 2009. [Lynch and McGillicuddy Jr., 2001] D. R. Lynch and D. J. McGillicuddy Jr. Objective analysis for coastal regimes. Continental Shelf Research, 21:1299–1315, 2001. [McBratney et al., 1981] A. B. McBratney, R. Webster, and T. M. Burgess. The design of optimal sampling schemes for local estimation and mapping of regionalized variables – I: Theory and method. Computers & Geosciences, 7(4):331–334, 1981. [McGrath et al., 2004] D. McGrath, C. Zhang, and O. T. Carton. Geostatistical analyses and hazard assessment on soil lead in Silvermines area, Ireland. Environmental Pollution, 127:239–248, 2004. [Meliou et al., 2007] A. Meliou, A. Krause, C. Guestrin, and J. M. Hellerstein. Nonmyopic informative path planning in spatio-temporal models. In Proc. AAAI, pages 602–607, 2007. [Popa et al., 2006] D. O. Popa, M. F. Mysorewala, and F. L. Lewis. EKF-based adaptive sampling with mobile robotic sensor nodes. In Proc. IROS, pages 2451 – 2456, 2006. [Prudhomme and Reed, 1999] C. Prudhomme and D. W. Reed. Mapping extreme rainfall in a mountainous region using geostatistical techniques: A case study in Scotland. Int. J. Climatol., 19:1337–1356, 1999. [Rabesiranana et al., 2009] N. Rabesiranana, M. Rasolonirina, A. F. Solonjara, and R. Andriambololona. Investigating the spatial anisotropy of soil radioactivity in the region of Vinaninkarena, Antsirabe - Madagascar. In Proc. 4th High-Energy Physics International Conference, 2009. 74 BIBLIOGRAPHY [Rahimi et al., 2003] M. Rahimi, R. Pon, W. J. Kaiser, G. S. Sukhatme, D. Estrin, and M. Srivastava. Adaptive sampling for environmental robotics. In Proc. ICRA, pages 3537–3544, 2003. [Rahimi et al., 2005] M. Rahimi, M. Hansen, W. J. Kaiser, G. S. Sukhatme, and D. Estrin. Adaptive sampling for environmental field estimation using robotic sensors. In Proc. IROS, pages 3692–3698, 2005. [Rasmussen and Williams, 2006] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press,Cambridge, MA, 2006. [Rivest et al., 2012] M. Rivest, D. Marcotte, and P. Pasquier. Sparse data integration for the interpolation of concentration measurements using kriging in natural coordinates. J. Hydrology, 416-417:72–82, 2012. [Rudnick et al., 2004] D. L. Rudnick, R. E. Davis, C. C. Eriksen, D. Fratantoni, and M. J. Perry. Underwater gliders for ocean research. In Mar. Technol. Soc, 38(2), pages 73–84, 2004. [Samal et al., 2011] A. R. Samal, R. R. Sengupta, and R. H. Fifarek. Modelling spatial anisotropy of gold concentration data using GIS-based interpolated maps and variogram analysis: Implications for structural control of mineralization. J. Earth Syst. Sci., 120(4):583–593, 2011. [Sánchez et al., 2011] J. M. C. Sánchez, D. F. Greene, and M. Quesada. A field test of inverse modeling of seed dispersal. Amer. J. Botany, 98(4):698–703, 2011. [Shewry and Wynn, 1987] M. C. Shewry and H. P. Wynn. Maximum entropy sampling. Journal of Applied Statistics, 14:165–170, 1987. [Singh et al., 2006] A. Singh, R. Nowak, and P. Ramanathan. Active learning for adaptive mobile sensing networks. In Proc. IPSN, pages 60–68, 2006. 75 BIBLIOGRAPHY [Singh et al., 2007] A. Singh, A. Krause, C. Guestrin, W. Kaiser, and M. Batalin. Efficient planning of informative paths for multiple robots. In Proc. IJCAI, January 2007. [Singh et al., 2009] A. Singh, A. Krause, and W. J. Kaiser. Nonmyopic adaptive informative path planning for multiple robots. In Proc. IJCAI, pages 1843–1850, 2009. [St˚ ahl et al., 2000] G. St˚ ahl, A. Ringvall, and T. L¨ am˚ as. Guided transect sampling for assessing sparse populations. Forest Science Washington, 46:108–115, 2000. [Thompson and Wettergreen, 2008] D. R. Thompson and D. Wettergreen. Intelligent maps for autonomous kilometer-scale science survey. In Proc. iSAIRAS, February 2008. [Wackernagel, 2009] H. Wackernagel. Geostatistics for Gaussian processes. In Proc. NIPS Workshop on Kernels for Multiple Outputs and Multi-Task Learning: Frequentist and Bayesian Points of View, 2009. [Ward and Jasieniuk, 2009] S. M. Ward and M. Jasieniuk. Review: Sampling weedy and invasive plant populations for genetic diversity analysis. Weed Science, 57(6):593–602, 2009. [Webster and Oliver, 2007] R. Webster and M. Oliver. Geostatistics for Environmental Scientists. John Wiley and Sons, Inc., 2007. [Wu et al., 2005] J. Wu, C. Zheng, and C. C. Chien. Cost-effective sampling network design for contaminant plume monitoring under general hydrogeological conditions. Journal of Contaminant Hydrology, 77:41–65, 2005. [Xiao et al., 2004] X. Xiao, G. Gertner, G. Wang, and A. B. Anderson. Optimal sampling scheme for estimation landscape mapping of vegetation cover. Landscape Ecology, 20(4):375–387, 2004. [Zhang and Sukhatme, 2007] B. Zhang and G. S. Sukhatme. Adaptive sampling for estimating a scalar field using a robotic boat and a sensor network. In Proc. ICRA, pages 3673–3680, 2007. 76 BIBLIOGRAPHY [Zhang et al., 2011] J. G. Zhang, H. S. Chen, Y. R. Su, X. L. Kong, W. Zhang, Y. Shi, H. B. Liang, and G. M. Shen. Spatial variability and patterns of surface soil moisture in a field plot of karst area in southwest China. Plant Soil. Environ., 57(9):409–417, 2011. 77 [...]... multiple robots, a large sampling task can be completed easily and fast 18 Chapter 4 Maximum Entropy Path Planning In this chapter, we propose the MEPP (Maximum Entropy Path Planning) algorithm, which can find the paths with maximum entropy Before presenting our own work, we introduce the information- theoretic Multi- Robot Adaptive Sampling Problem (iMASP) Although the optimal paths can be theoretically... a path for single robot For a small sampling task, single robot is easy to coordinate and deploy However, it will be difficult for single robot to accomplish a large sampling task Instead, our work like those in [Singh et al., 2007; Low et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010; Low et al., 2011] can generate multiple paths for multiple robots With multiple... Entropy Path Planning (MEPP) algorithm: A polynomialtime approximation algorithm, MEPP, is proposed to find the maximum entropy paths We also provide a theoretical performance guarantee on the sampling performance of the MEPP algorithm for a class of exploration tasks called transect sampling task • Formalization of Maximum Mutual Information Path Planning (M2 IPP) algorithm: For maximum mutual information. .. the robots will perform the transect sampling task So the travelling cost of each robot is the horizontal length of the field And the action space for each robot is limited Multiple robots will be applied to explore the field We assume that the number of robots will be less than the number of sampling locations in each column Our proposed algorithms will find the paths with maximum entropy and the paths... joint entropy of the optimal paths of the MEPP algorithm is close to the optimal paths of iMASP The following theorem bounds the entropy decrease between the 24 Chapter 4 Maximum Entropy Path Planning ∗ optimal paths xme 1:n of the MEPP algorithm and the optimal paths x1:n of iMASP: ∗ Theorem 4.5.4 Let xme 1:n be the optimal paths of the MEPP algorithm and x1:n be the optimal paths of iMASP Let ε 2 },... chapter, we propose another approximation algorithm, M2 IPP (Maximum Mutual Information Path Planning) , to find the paths with maximum mutual information Similar to maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time complexity will exponentially increase with the length of planning horizon In the previous chapter, we have proposed the MEPP algorithm with... above Secondly, these work did not consider the computational efficiency of planning In robotics community, the work of [Low et al., 2009] has defined the information- theoretic Multi- Robot Adaptive Sampling Problem (iMASP) However, for any environmental field, the time complexity of iMASP exponentially increases with the length of planning horizon To reduce the time complexity, the work of [Low et al.,... non-myopic algorithm, the MEPP, to find the paths with maximum entropy 2.3.2 Mutual Information Another metric, mutual information, is also proposed to measure the informativeness of observation paths Given observation paths P and unobserved part X \P, the mutual information between ZP and ZX \P is: I(ZP ; ZX \P ) = H(ZX \P ) − H(ZX \P |ZP ) (2.11) Based on the mutual information, the problem can be formalized... is the set of all possible paths in the field With (2.4) and (2.7), the mutual information for paths P can be evaluated in closed form In this thesis, we will present another non-myopic algorithm, the M2 IPP, to find the paths with maximum mutual information 14 Chapter 3 Related Work To monitor an environmental field, the robots need to sample locations which can give more information about the measurement... of robots, (d) using a large value of m In particular, for anisotropic fields, if the robots along the small correlated direction, the value of ε will be small As a result, we can use a small m which incur little planning time to bound the sampling performance 25 Chapter 5 Maximum Mutual Information Path Planning In this chapter, we propose another approximation algorithm, M2 IPP (Maximum Mutual Information ... al., 2011] can generate multiple paths for multiple robots With multiple robots, a large sampling task can be completed easily and fast 18 Chapter Maximum Entropy Path Planning In this chapter,... MEPP (Maximum Entropy Path Planning) algorithm, which can find the paths with maximum entropy Before presenting our own work, we introduce the information- theoretic Multi- Robot Adaptive Sampling... Mutual Information Path Planning) , to find the paths with maximum mutual information Similar to maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time

Định dạng
Số trang	84
Dung lượng	891,34 KB