Information-Theoretic Multi-Robot Path Planning
Cao Nannan
(B.Sc., East China Normal University, 2009)
A THESIS SUBMITTED
FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
SCHOOL OF COMPUTING
NATIONAL UNIVERSITY OF SINGAPORE
2012
DECLARATION
I hereby declare that this thesis is my original work and it has been written by me in its
entirety. I have duly acknowledged all the sources of information which have been used
in the thesis. This thesis has also not been submitted for any degree in any university
previously.
Name:
Date:
Acknowledgements
First of all, I am grateful to God for his great mercy, immeasurable love and consistent
guidance.
Second, I want to express my sincere gratitude to my supervisor, Assist. Prof. Low
Kian Hsiang. During the period we worked together, not only did he share a great deal of
knowledge with me, but he also taught me how to work carefully and seriously. Without him,
there would be no thesis. I really appreciate his patience and support.
Third, I want to thank all fellow brothers and sisters Zeng Yong, Luochen, Kang Wei,
Xiao Qian, Prof. Tan and Zhengkui, who always love me as a younger brother in the family.
I really enjoyed the fellowship time when we studied the Bible and worshipped together. I also
want to thank all friends in the AI 1 lab and AI 3 lab, especially Lim Zhanwei, Ye Nan, Bai
Haoyu, Xu Nuo, Chen Jie, Trong Nghia Hoang, Jiangbo and Ruofei, who have helped me
check and revise my thesis.
Last but not least, I would like to thank my parents, who always support and
encourage me when I need it.
Contents

List of Tables  1
List of Figures  2

1 Introduction  4
1.1 Motivation  4
1.2 Objective  6
1.3 Contributions  7

2 Background  9
2.1 Transect Sampling Task  9
2.2 Gaussian Process  10
2.3 Entropy and Mutual Information  13

3 Related Work  15
3.1 Design-based vs. Model-based Strategies  15
3.2 Polynomial-time vs. Non-polynomial-time Strategies  16
3.3 Non-guaranteed vs. Performance-guaranteed Sampling Paths  17
3.4 Multi-robot vs. Single-robot Strategies  17

4 Maximum Entropy Path Planning  19
4.1 Notations and Preliminaries  19
4.2 iMASP  20
4.3 MEPP Algorithm  21
4.4 Time Analysis  22
4.5 Performance Guarantees  23

5 Maximum Mutual Information Path Planning  26
5.1 Notations  27
5.2 Problem Definition  27
5.3 Problem Analysis  28
5.4 M2IPP Algorithm  30
5.5 Time Analysis  30
5.6 Performance Guarantees  31

6 Experimental Results  35
6.1 Data Sets and Performance Metrics  35
6.2 Temperature Data Results  38
6.3 Plankton Data Results  44
6.4 Time Efficiency  45
6.5 Criterion Selection  47

7 Conclusions  50

Appendices  51

A Maximum Entropy Path Planning  52
A.1 Proof for Lemma 2.2.1  52
A.2 Proof for Lemma 4.5.1  54
A.3 Proof for Lemma 4.5.2  55
A.4 Proof for Corollary 4.5.3  58
A.5 Proof for Theorem 4.5.4  59

B Maximum Mutual Information Path Planning  61
B.1 Proof for Lemma 5.6.1  61
B.2 Proof for Other Lemmas  62
B.3 Proof for Lemma 5.6.2  65
B.4 Proof for Theorem 5.6.3  66

Bibliography  72
Abstract
Research in environmental sensing and monitoring is especially important in supporting environmental sustainability efforts worldwide, and has recently attracted significant attention and interest.
A key direction of this research lies in modeling and predicting spatiotemporally varying environmental phenomena. One approach is to use a team of robots to sample the area and model the
measurement values at unobserved points. For smoothly varying and hot-spot fields, existing work
can already model the fields well. However, there remains a class of common environmental fields,
called anisotropic fields, in which the spatial phenomena are highly correlated along one direction
and less correlated along the perpendicular direction. We exploit this environmental structure to
improve the sampling performance and time efficiency of planning for anisotropic fields.

In this thesis, we cast the planning problem as a stagewise decision-theoretic problem and
adopt the Gaussian process (GP) to model spatial phenomena. The maximum entropy criterion and
the maximum mutual information criterion are used to measure the informativeness of the observation paths. It is found that for many GPs, the correlation of two points decreases exponentially
with the distance between them. Exploiting this property, for the maximum entropy criterion we
propose a polynomial-time approximation algorithm, MEPP, to find the maximum entropy paths,
and we provide a theoretical performance guarantee for this algorithm. For the maximum mutual
information criterion, we propose another polynomial-time approximation algorithm, M2IPP, with
a similar performance guarantee. We demonstrate the performance advantages of our algorithms
on two real data sets. To obtain lower prediction error, three principles are also proposed for
selecting the criterion for different environmental fields.
List of Tables

3.1 Comparisons of different exploration strategies (DB: design-based, MB: model-based, PT: polynomial-time, NP: non-polynomial-time, NO: non-optimized, NG: non-guaranteed, PG: performance-guaranteed, UP: unknown-performance, MR: multi-robot, SR: single-robot).  16
List of Figures

1.1 The density of chlorophyll-a in the Gulf of Mexico. The values along the coastal line are close to each other (highly correlated), while the values along the perpendicular direction change considerably (less correlated).  5
2.1 Transect sampling task in a temperature field.  10
2.2 The value of K(p1, p2) decreases exponentially to zero and the posterior variance σ²_{p1|p2} increases exponentially to the prior variance as the distance between point p1 and point p2 increases linearly.  12
5.1 Visualization of applying the m-order Markov property to the maximum mutual information criterion.  28
5.2 Visualization of the approximation method of the M2IPP algorithm.  29
6.1 Temperature fields distributed over 25 m × 150 m, discretized into 5 × 30 grids with learned hyper-parameters.  36
6.2 Plankton density field distributed over 314 m × 1765 m, discretized into an 8 × 45 grid with ℓ1 = 27.5273 m, ℓ2 = 134.6415 m, σs² = 1.4670, and σn² = 0.2023.  36
6.3 The results of ENT(π) for different algorithms with different numbers of robots on the temperature fields.  40
6.4 The results of MI(π) for different algorithms with different numbers of robots on the temperature fields.  42
6.5 The results of ERR(π) for different algorithms with different numbers of robots on the temperature fields.  43
6.6 The results of ENT(π) for different algorithms with different numbers of robots on the plankton density field.  44
6.7 The results of MI(π) for different algorithms with different numbers of robots on the plankton density field.  45
6.8 The results of ERR(π) for different algorithms with different numbers of robots on the plankton density field.  46
6.9 The running time of different algorithms with different numbers of robots on the temperature fields.  46
6.10 The running time of different algorithms with different numbers of robots on the plankton density field.  47
6.11 Sampling points selected by different criteria.  47
Chapter 1
Introduction
1.1 Motivation
Research in environmental sensing and monitoring is especially important in supporting
environmental sustainability efforts worldwide, and has recently gained significant attention
and practical interest. A key direction of this research lies in modeling and predicting
spatiotemporally varying environmental phenomena, which affect our natural and built-up
habitats; such models aid our understanding of the phenomena and inform the decisions of policy makers.
The spatiotemporal structures and properties of phenomena vary with the environmental
physical/biochemical conditions. For example, the phenomena in some environmental fields
may vary smoothly, while other environmental fields may have a few hot spots. Due to
these different spatiotemporal structures, different classes of sampling strategies achieve
different sampling performance. The work of [Low, 2009] shows that adaptive sampling
can exploit hot spots well and that non-adaptive sampling can map smoothly varying
environmental fields accurately. However, there remains a class of common environmental
fields, called anisotropic fields, in which the spatial phenomena are highly correlated along
one direction and less correlated along the perpendicular direction (e.g., Fig. 1.1). Due to
ocean currents, anisotropic fields are easily found in ocean phenomena. Typically,
anisotropic fields can be found in the following spatial phenomena:
Figure 1.1: The density of chlorophyll-a in the Gulf of Mexico. The values along the coastal line
are close to each other (highly correlated), while the values along the perpendicular direction
change considerably (less correlated).
1. Ocean phenomena: phytoplankton concentration [Franklin and Mills, 2007], sea surface temperature [Hosoda and Kawamura, 2005], salinity field [Budrikaitė and Dučinskas, 2005] and velocity field of ocean current [Lynch and McGillicuddy Jr., 2001];
2. Soil phenomena: heavy metal concentration [McGrath et al., 2004], surface soil moisture [Zhang et al., 2011], soil radioactivity [Rabesiranana et al., 2009] and gold concentrations [Samal et al., 2011];
3. Biological phenomena: pollen dispersal [Austerlitz et al., 2007], seed dispersal [Sánchez et al., 2011];
4. Other phenomena: rainfall [Prudhomme and Reed, 1999], groundwater contaminant plumes [Rivest et al., 2012; Wu et al., 2005], air pollution [Boisvert and Deutsch, 2011].
So, for this class of environmental fields, how can we exploit the environmental structure
to improve sampling performance?
To monitor an environmental field in the ocean, land or forest, some work has been
done to find the most informative set of static sensor placements [Guestrin et al., 2005;
Krause et al., 2006; Das and Kempe, 2008b; Garnett et al., 2010]. However, if the area to
monitor is very large, the number of sensors required will be large. For some applications,
such as monitoring plankton blooms in the ocean or pH values in a river, the movement of water
discourages static sensor placements as well. In contrast, a team of robots (e.g., unmanned
aerial vehicles, autonomous underwater vehicles [Rudnick et al., 2004]) which can move
around to sample the area will be a desirable solution. To explore an environmental field,
planning sampling paths for the robots becomes the fundamental problem. However, the
work of [Ko et al., 1995; Guestrin et al., 2005] shows that the problem of selecting the most
informative set of static points is NP-complete, and we are not aware of any work that
can find the most informative paths in polynomial time without strong assumptions. So, for
anisotropic fields, can we also exploit the environmental structure to improve the time efficiency
of planning?
1.2 Objective
To provide an exploration strategy for multiple robots, this thesis aims to address the following
issue:
How can we exploit the environmental structure to improve the sampling performance as well as the time efficiency of planning for anisotropic fields?
In the statistics community, some work [McBratney et al., 1981; Xiao et al., 2004; Webster
and Oliver, 2007; Ward and Jasieniuk, 2009; Wackernagel, 2009] has been done on sampling
design for anisotropic fields. To tackle anisotropic effects, these works adjust the grid
spacing so that the less correlated direction is sampled more than other directions.
However, firstly, these strategies are all for static sensors; as a result, they suffer from
the disadvantages of static sensors stated above. Secondly, these works did
not consider the computational efficiency of planning. In the robotics community, the work of
[Low et al., 2009] has defined the information-theoretic Multi-Robot Adaptive Sampling
Problem (iMASP). However, for any environmental field, the time complexity of iMASP
increases exponentially with the length of the planning horizon. To reduce the time complexity,
the work of [Low et al., 2011] has assumed that the measurements in the next stage only
depend on the measurements in the current stage. However, for fields which have large
correlations, this assumption is too strong. The work of [Singh et al., 2007] has proposed
a quasi-polynomial algorithm to find the most informative paths within a specified cost budget.
They proposed two heuristics, spatial decomposition and branch-and-bound search,
to reduce the time complexity. However, spatial decomposition violates the continuous spatial
correlations of environmental fields, and no performance guarantee is provided for the
branch-and-bound search algorithm.
1.3 Contributions
To do point sampling and prediction, environmental fields are discretized into grids. The
planning problem is cast into a stagewise decision-theoretic problem. With sampled observations, we adopt the Gaussian process [Rasmussen and Williams, 2006] to model spatial
phenomena. The maximum entropy criterion [Shewry and Wynn, 1987] and the maximum mutual
information criterion [Guestrin et al., 2005] are proposed to measure the informativeness of
observation paths. It is found that for many GPs, the correlation of two points decreases
exponentially with the distance between them. With this property, our work proposes
two information-theoretic algorithms which can trade off between sampling performance
and time complexity. In particular, for anisotropic fields, if the robots explore the field along
the less correlated direction, our algorithms can guarantee the sampling performance of the
observation paths with little planning time. The specific contributions of the thesis include:

• Formalization of the Maximum Entropy Path Planning (MEPP) algorithm: A polynomial-time approximation algorithm, MEPP, is proposed to find the maximum entropy
paths. We also provide a theoretical guarantee on the sampling performance of the MEPP algorithm for a class of exploration tasks called the transect sampling task.

• Formalization of the Maximum Mutual Information Path Planning (M2IPP) algorithm:
For the maximum mutual information criterion, we propose another polynomial-time approximation algorithm, M2IPP. A theoretical guarantee on the sampling
performance of the M2IPP algorithm for the transect sampling task is provided as
well.

• Evaluation of performance: We evaluate the sampling performance of our proposed
algorithms on two real-world data sets. The performance is measured with three metrics: entropy, mutual information and prediction error. The results of our algorithms
demonstrate advantages over other state-of-the-art algorithms.

This thesis is organized as follows. In Chapter 2, some background is reviewed. In
Chapter 3, related work on exploration strategies is surveyed. In Chapters 4 and 5, our two
proposed algorithms are explained in detail. In Chapter 6, experiments on two real-world
data sets are presented. We conclude this thesis in Chapter 7.
Chapter 2
Background
In this chapter, we review the background needed to formalize our problem. In Section 2.1, we
present the transect sampling task, a class of exploration tasks to which our algorithms can be
applied. With sampled observations, we adopt the Gaussian process to model the
environmental field, which is reviewed in Section 2.2. Entropy and mutual information,
used to measure the informativeness of the sampling paths, are reviewed in Section 2.3.
2.1 Transect Sampling Task
For a discretized unobserved field, the transect sampling task [Ståhl et al., 2000; Thompson and Wettergreen, 2008] assumes that the number of columns is much larger than the
number of sampling locations in each column. For example, Figure 2.1 shows
a temperature field spanning a 25 m × 150 m area, discretized into a 5 × 30 grid
of sampling locations (white dots). In this discretized field, each robot is constrained to
explore forward from the leftmost column to the rightmost column, sampling one location in
each column. Thus, the action space for each robot given its current location comprises
the 5 locations in the right adjacent column. With this forward-exploration constraint,
robots with limited maneuverability can explore the area with less complex planned paths,
which can be executed more reliably.
Figure 2.1: Transect sampling task in a temperature field.
In this thesis, we assume that the robots perform the transect sampling task. Hence, the
travelling cost of each robot is the horizontal length of the field, and the action space for
each robot is limited. Multiple robots will be deployed to explore the field; we assume that
the number of robots is less than the number of sampling locations in each column.
Our proposed algorithms find the paths with maximum entropy and the paths with
maximum mutual information for multiple robots.
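As a rough illustration of why exhaustive planning over a transect is infeasible, the path space can be counted directly. This is a sketch under the constraints just described; the grid dimensions match the 5 × 30 example, and the variable names are our own.

```python
# Sketch: size of the path space in a transect sampling task.
# In an r x n grid, a robot starts at one of the r rows of column 1 and then
# picks one of the r rows of the next column at every step, so a single robot
# already has r**n candidate paths; exhaustive search is exponential in n.
r, n = 5, 30                      # the 5 x 30 temperature-field example
single_robot_paths = r ** n
print(single_robot_paths)         # 5**30 candidate paths for one robot alone
```

With multiple robots the joint path space is larger still, which is why the polynomial-time approximations developed in Chapters 4 and 5 matter.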
2.2 Gaussian Process
With sampled observations, we adopt the Gaussian process (GP) [Rasmussen and Williams, 2006]
to model the environmental field. The GP model has been widely used to model environmental fields in spatial statistics [Webster and Oliver, 2007]. A Gaussian process is a
collection of random variables, any finite number of which have a multivariate Gaussian
distribution. To specify this distribution, a mean function M(·) and a symmetric positive
definite covariance function K(·, ·) have to be defined for the GP. For example,
given a vector A of points and the corresponding vector Z_A of random measurements at
these points, P(Z_A) is a multivariate Gaussian distribution. It can be specified with a
mean vector µ_A and a covariance matrix Σ_AA. Each entry of the mean vector µ_A
corresponds to a point u in vector A with value M(u). Similarly, each entry of the covariance
matrix Σ_AA corresponds to a pair of points u, v in vector A with value K(u, v). If we have the
measurements z_A for vector A, then for any other unobserved point y, Bayes' rule gives
that P(Z_y | z_A) is also a Gaussian distribution. Its posterior mean µ_{y|A} and posterior
variance σ²_{y|A}, which correspond to the predicted measurement value and the uncertainty
at the unobserved point y, are given by:

µ_{y|A} = µ_y + Σ_{yA} Σ_{AA}^{-1} (z_A − µ_A)    (2.1)

σ²_{y|A} = K(y, y) − Σ_{yA} Σ_{AA}^{-1} Σ_{Ay}    (2.2)

where µ_y and µ_A are the prior means returned by the mean function M(·), and Σ_{yA}
is the covariance vector whose entries correspond to each point u in vector A with value
K(u, y). If there is a vector B of unobserved points, we have:

µ_{B|A} = µ_B + Σ_{BA} Σ_{AA}^{-1} (z_A − µ_A)    (2.3)

Σ_{B|A} = Σ_{BB} − Σ_{BA} Σ_{AA}^{-1} Σ_{AB}    (2.4)

where µ_{B|A} is the posterior mean vector and Σ_{B|A} is the posterior covariance matrix. From (2.2)
and (2.4), it is known that the posterior variance does not depend on the measurements z_A of the
observed points.
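As a concrete sketch of Eqs. (2.1)–(2.4), the posterior mean and covariance can be computed with a few lines of NumPy. The point sets, hyper-parameter values and function names below are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def sq_exp_cov(U, V, s2=1.0, ls=(1.0, 1.0), n2=0.1):
    """Covariance of Eq. (2.5): squared exponential plus noise on identical points."""
    M2inv = np.diag(1.0 / np.asarray(ls, float) ** 2)
    d = U[:, None, :] - V[None, :, :]
    K = s2 * np.exp(-0.5 * np.einsum('ijk,kl,ijl->ij', d, M2inv, d))
    return K + n2 * (d == 0).all(axis=2)

def gp_posterior(A, zA, B, mu=0.0):
    """Posterior mean (2.3) and covariance (2.4) of Z_B given measurements z_A."""
    W = np.linalg.solve(sq_exp_cov(A, A), sq_exp_cov(B, A).T).T  # Sigma_BA Sigma_AA^-1
    return mu + W @ (zA - mu), sq_exp_cov(B, B) - W @ sq_exp_cov(A, B)

A = np.array([[0.0, 0.0], [1.0, 0.0]])       # observed points (invented)
B = np.array([[0.5, 0.0]])                   # an unobserved point y
_, S1 = gp_posterior(A, np.array([1.0, 2.0]), B)
_, S2 = gp_posterior(A, np.array([5.0, -3.0]), B)
# S1 equals S2: the posterior covariance ignores z_A, and by Lemma 2.2.1
# the posterior variance stays above the noise variance (0.1 here).
```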
2.2.1 Covariance Function
In this thesis, we assume that the GP model has a constant mean function and a stationary
covariance function. Hence, the mean function M(·), which can be learned from prior data or
expert knowledge, returns a constant prior mean for any point. The covariance function
K(·, ·) does not depend on the locations of the two points but only on the distance between them.
The covariance function used in this thesis is:

K(u, v) = σ_s² exp{−(1/2)(u − v)ᵀ M⁻²(u − v)} + σ_n² δ_uv    (2.5)

where σ_s² is the signal variance, σ_n² is the noise variance, M is a diagonal matrix with
horizontal and vertical length scales ℓ₁ and ℓ₂, and δ_uv is 1 if u equals v and 0 otherwise.

With the covariance function (2.5), the following lemma gives the least possible posterior
variance in a Gaussian process:
Lemma 2.2.1. In a Gaussian process, given an unobserved point y and any observed vector
A of points, if the noise variance is σ_n², then the posterior variance σ²_{y|A} is larger than
σ_n².

The proof for the above result is shown in Appendix A.1. By Lemma 2.2.1, the posterior
variance of an unobserved point can be lower bounded.

With the covariance function (2.5), the correlation of two points decreases exponentially
with the distance between them. For example, given two points p1 and p2, as the distance
between them increases linearly, the value of K(p1, p2) decreases exponentially to zero and
the posterior variance σ²_{p1|p2} increases exponentially to the prior variance, as shown in Fig. 2.2.
Figure 2.2: The value of K(p1, p2) decreases exponentially to zero and the posterior variance
σ²_{p1|p2} increases exponentially to the prior variance as the distance between point p1 and point p2
increases linearly.
From Fig. 2.2, it can be seen that the correlation of two points decreases exponentially
with their distance. When K(p1, p2) is close to zero, the information
that point p2 can provide about point p1 is very little. Therefore, given an unobserved point y
and a vector A of observed points, we can approximate the posterior variance by removing
the points Ã in vector A for which K(u, y) is close to zero for each point u in Ã.
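This pruning idea can be checked numerically: with the squared-exponential kernel, observed points whose covariance with y is nearly zero can be dropped with negligible change to the posterior variance. The one-dimensional points and hyper-parameters below are invented for illustration.

```python
import numpy as np

def k(U, V, s2=1.0, l=1.0, n2=0.1):
    """1-D squared-exponential covariance in the spirit of Eq. (2.5)."""
    d2 = (U[:, None] - V[None, :]) ** 2 / l ** 2
    return s2 * np.exp(-0.5 * d2) + n2 * (d2 == 0)

def post_var(y, A):
    """Posterior variance of Eq. (2.2) at y given observed points A."""
    Kya = k(y, A)
    return (k(y, y) - Kya @ np.linalg.solve(k(A, A), Kya.T)).item()

y = np.array([0.0])
A_near = np.array([0.5, 1.0])
A_far = np.array([6.0, 7.0])          # K(u, y) here is roughly exp(-18)
A_all = np.concatenate([A_near, A_far])

# Dropping the far (nearly uncorrelated) points barely changes the result.
print(abs(post_var(y, A_all) - post_var(y, A_near)))  # negligibly small
```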
2.3 Entropy and Mutual Information
For a transect sampling task, with sampled observations, the uncertainty at each unobserved
point can be obtained based on the GP model. With the uncertainty, entropy and mutual
information are used to quantify the informativeness of observation paths.
2.3.1 Entropy
Let X be the domain of the environmental field, discretized into grid cell locations.
Given observation paths P, let X\P be the unobserved part of the field. Let Z_X denote
the vector of random measurements at the points in X, and let Z_P and Z_{X\P} denote the
vectors of random measurements at the points in P and X\P, respectively. To minimize the
uncertainty of the unobserved part under the entropy metric, the problem can be formalized as:

P* = arg min_{P∈T} H(Z_{X\P} | Z_P)    (2.6)

where T is the set of all possible paths in the field.
For a vector A of a points, it can be shown that the joint entropy of the corresponding
vector Z_A of random measurements is:

H(Z_A) = −∫ p(z_A) log p(z_A) dz_A = (1/2) log((2πe)^a |Σ_AA|)    (2.7)
As a result, for (2.6), we have:

H(Z_{X\P} | Z_P) = (t/2) log(2πe) + (1/2) log|Σ_{X\P|P}|    (2.8)

where t is the size of X\P and Σ_{X\P|P} is the posterior covariance matrix, which can be
obtained with (2.4). With (2.8), the conditional entropy of the unobserved part for paths P
can be obtained. With an exhaustive algorithm, the optimal paths can be found. However, the number of possible paths increases exponentially with the number of columns
in the field; if the field is large, it is intractable to solve this problem optimally.
With the chain rule of entropy, we have:

H(Z_X) = H(Z_P) + H(Z_{X\P} | Z_P).    (2.9)

Because H(Z_X) is constant, the problem of minimizing the uncertainty of the unobserved
part, H(Z_{X\P} | Z_P), is equivalent to:

P* = arg max_{P∈T} H(Z_P).    (2.10)

In this thesis, we will present an efficient non-myopic algorithm, the MEPP, to find the
paths with maximum entropy.
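The quantities above can be verified on a toy one-dimensional "field": joint entropy via Eq. (2.7), the posterior covariance via Eq. (2.4), and the chain-rule identity (2.9). All locations, indices and hyper-parameters below are made up for illustration.

```python
import numpy as np

def kernel(X, s2=1.0, l=1.5, n2=0.05):
    """Squared-exponential covariance matrix over 1-D locations X (cf. Eq. 2.5)."""
    d2 = (X[:, None] - X[None, :]) ** 2 / l ** 2
    return s2 * np.exp(-0.5 * d2) + n2 * np.eye(len(X))

def gauss_entropy(S):
    """Joint entropy of a Gaussian with covariance S, Eq. (2.7)."""
    a = len(S)
    return 0.5 * (a * np.log(2 * np.pi * np.e) + np.linalg.slogdet(S)[1])

X = np.linspace(0.0, 4.0, 9)                 # all grid locations
P = np.array([0, 4, 8])                      # indices of an observation path
rest = np.setdiff1d(np.arange(len(X)), P)    # the unobserved part X \ P

S = kernel(X)
S_P = S[np.ix_(P, P)]
S_rest_P = S[np.ix_(rest, rest)] - S[np.ix_(rest, P)] @ np.linalg.solve(
    S_P, S[np.ix_(P, rest)])                 # posterior covariance, Eq. (2.4)

# Chain rule (2.9): H(Z_X) = H(Z_P) + H(Z_{X\P} | Z_P)
print(np.isclose(gauss_entropy(S),
                 gauss_entropy(S_P) + gauss_entropy(S_rest_P)))  # True
```

The identity holds exactly because the determinant of S factors into |Σ_PP| times the determinant of the Schur complement, which is what makes maximizing H(Z_P) equivalent to minimizing H(Z_{X\P} | Z_P).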
2.3.2 Mutual Information
Another metric, mutual information, is also proposed to measure the informativeness of
observation paths. Given observation paths P and the unobserved part X\P, the mutual information between Z_P and Z_{X\P} is:

I(Z_P; Z_{X\P}) = H(Z_{X\P}) − H(Z_{X\P} | Z_P).    (2.11)

Based on the mutual information, the problem can be formalized as:

P* = arg max_{P∈T} I(Z_P; Z_{X\P})    (2.12)

where T is the set of all possible paths in the field. With (2.4) and (2.7), the mutual information for paths P can be evaluated in closed form. In this thesis, we will present another
non-myopic algorithm, the M2IPP, to find the paths with maximum mutual information.
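A minimal numerical sketch of Eqs. (2.11)–(2.12): the mutual information between a candidate path and the rest of the field, evaluated in closed form through Gaussian entropies. Locations, indices and hyper-parameters are invented for illustration.

```python
import numpy as np

def kernel(X, s2=1.0, l=1.0, n2=0.1):
    """Squared-exponential covariance matrix over 1-D locations X (cf. Eq. 2.5)."""
    d2 = (X[:, None] - X[None, :]) ** 2 / l ** 2
    return s2 * np.exp(-0.5 * d2) + n2 * np.eye(len(X))

def H(S):
    """Gaussian joint entropy, Eq. (2.7)."""
    return 0.5 * (len(S) * np.log(2 * np.pi * np.e) + np.linalg.slogdet(S)[1])

def mutual_info(S, idx):
    """I(Z_P; Z_{X\\P}) = H(Z_{X\\P}) - H(Z_{X\\P} | Z_P), Eq. (2.11)."""
    rest = np.setdiff1d(np.arange(len(S)), idx)
    S_p = S[np.ix_(idx, idx)]
    S_r = S[np.ix_(rest, rest)]
    S_r_p = S_r - S[np.ix_(rest, idx)] @ np.linalg.solve(S_p, S[np.ix_(idx, rest)])
    return H(S_r) - H(S_r_p)

S = kernel(np.linspace(0.0, 5.0, 12))
P = np.array([0, 5, 11])
# Mutual information is non-negative and symmetric in the two parts, so
# evaluating it from either side of the partition gives the same value.
print(mutual_info(S, P))
```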
Chapter 3
Related Work
To monitor an environmental field, the robots need to sample locations which give
more information about the measurement values at unobserved points. Various methods have
been developed to select the sampling locations, as shown in Table 3.1.
In particular, our strategies are model-based and can find sampling paths for multiple
robots within polynomial time. Moreover, the performance of the sampling paths can be
guaranteed. The differences between our work and other related work are compared below.
3.1 Design-based vs. Model-based Strategies
To sample an unobserved area, some work [Rahimi et al., 2003; Batalin et al., 2004; Rahimi
et al., 2005; Singh et al., 2006; Popa et al., 2006; Low et al., 2007] has designed various
strategies. Based on a designed strategy, the robots adaptively sample new locations until
the strategy's condition is satisfied. Because the sampling locations are selected based on the
designed strategy, the performance of the sampling paths cannot be quantified. Moreover,
some of these strategies [Rahimi et al., 2003; Batalin et al., 2004; Singh et al., 2006] need
to pass over the area multiple times to sample new locations, and are therefore unsuitable
for robots that are energy-constrained.
Table 3.1: Comparisons of different exploration strategies (DB: design-based, MB: model-based, PT: polynomial-time, NP: non-polynomial-time, NO: non-optimized, NG: non-guaranteed, PG: performance-guaranteed, UP: unknown-performance, MR: multi-robot, SR: single-robot). Strategies compared: Rahimi et al., 2003; Rahimi et al., 2005; Batalin et al., 2004; Popa et al., 2006; Singh et al., 2006; Low et al., 2007; Meliou et al., 2007; Singh et al., 2007; Singh et al., 2009; Zhang and Sukhatme, 2007; Low et al., 2008; Low et al., 2009; Binney et al., 2010; Low et al., 2011; MEPP; M2IPP. (The table's entry marks are not recoverable here; Sections 3.1–3.4 classify each strategy along these characteristics.)
Instead, our strategies, like those in [Meliou et al., 2007; Zhang and Sukhatme, 2007;
Singh et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010;
Low et al., 2011], assume that the environmental field is realized from a statistical model.
Based on the model, the informativeness of the sampling paths can be quantified, and the
problem becomes how to find the most informative paths.
In contrast to design-based strategies, model-based strategies need some prior knowledge
about the environmental field to train the model. With a trained model, the paths can be
planned before sampling the area. Because the sampling paths are already known, the
robots do not need to pass over the area multiple times.
3.2 Polynomial-time vs. Non-polynomial-time Strategies
Among the model-based strategies, some [Meliou et al., 2007; Zhang
and Sukhatme, 2007; Singh et al., 2007; Low et al., 2008; Low et al., 2009; Binney et al.,
2010] cannot find the sampling paths in polynomial time. For example, in the work of
[Meliou et al., 2007; Singh et al., 2007; Binney et al., 2010], the time complexity of the
proposed algorithms is quasi-polynomial. In the work of [Zhang and Sukhatme, 2007;
Low et al., 2008; Low et al., 2009], the time complexity of the proposed algorithms
increases exponentially with the length of the planning horizon. Our work, however, like
[Low et al., 2011], can find the sampling paths in polynomial time. For the design-based
strategies, because the sampling locations are selected based on designed strategies,
time complexity is not a main concern.
3.3 Non-guaranteed vs. Performance-guaranteed Sampling Paths
Among the model-based strategies, some [Meliou et al., 2007; Singh
et al., 2007; Low et al., 2008; Low et al., 2009; Binney et al., 2010] cannot guarantee the
performance of the sampling paths. Because the time complexity of these strategies is
non-polynomial, different heuristics (e.g., greedy heuristics, branch-and-bound search,
anytime heuristic search algorithms) have been used to reduce the time complexity. However,
no performance guarantee has been provided for these heuristics. Although the work of
[Zhang and Sukhatme, 2007] can find the optimal paths, it needs to assume that the information
gain from each location is independent of other locations; this assumption
violates the spatial correlations of environmental fields.
Instead, our work, like [Low et al., 2011], can provide theoretical guarantees for the
sampling paths. Although the work of [Low et al., 2011] provides performance guarantees,
it also needs to assume that the measurements in the next stage only depend on the
measurements in the current stage. Our work relaxes this strong assumption by utilizing a
longer path history, and theoretical guarantees are provided for the optimal paths of our
algorithms. For the design-based strategies, the informativeness of the sampling locations
cannot be quantified; as a result, the performance of those sampling paths is unknown.
3.4
Multi-robot vs. Single-robot Strategies
Some work [Rahimi et al., 2003; Batalin et al., 2004; Rahimi et al., 2005; Singh et al., 2006; Popa et al., 2006; Zhang and Sukhatme, 2007; Meliou et al., 2007; Binney et al., 2010] can only generate a path for a single robot. For a small sampling task, a single robot is easy to coordinate and deploy. However, it is difficult for a single robot to accomplish a large sampling task. In contrast, our work, like those in [Singh et al., 2007; Low et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010; Low et al., 2011], can generate multiple paths for multiple robots. With multiple robots, a large sampling task can be completed easily and quickly.
Chapter 4
Maximum Entropy Path Planning
In this chapter, we propose the MEPP (Maximum Entropy Path Planning) algorithm, which finds the paths with maximum entropy. Before presenting our own work, we introduce the information-theoretic Multi-Robot Adaptive Sampling Problem (iMASP). Although the optimal paths can in principle be found by the algorithm for iMASP, its time complexity increases exponentially with the length of the planning horizon.

To reduce the time complexity and provide a tight performance guarantee, we exploit the property of the covariance function that the correlation between two points decreases exponentially with the distance between them. With this property, the MEPP algorithm is proposed in section 4.3. In section 4.4, the analysis of its time complexity is provided, which shows that the MEPP algorithm runs in polynomial time. We provide a performance guarantee for the MEPP algorithm in section 4.5.
4.1
Notations and Preliminaries
Let the transect be discretized into an r × n grid of sampling locations. The columns of the field are indexed in increasing order: the leftmost column is indexed as '1' and the rightmost column as 'n'. Each planning stage corresponds to the column with the same index. In each stage, every robot takes an observation which comprises its location and measurement.
We assume that there are k robots exploring the area, where k is less than the number of rows. In stage i, let x_i denote the row vector of the k sampling locations and Z_{x_i} denote the corresponding row vector of k random measurements, and let x^j_i denote the j-th (1 ≤ j ≤ k) location in vector x_i. In addition, let x_{i:l} represent the vector of all sampling locations from stage i to stage l (i.e., x_{i:l} ≜ (x_i, . . . , x_l)) and Z_{x_{i:l}} denote the vector of all corresponding random measurements (i.e., Z_{x_{i:l}} ≜ (Z_{x_i}, . . . , Z_{x_l})).

Given vectors x_1, . . . , x_n, the robots sample the area from the leftmost column to the rightmost column. Given the vector x_{i-1} of locations, we assume that the robots can deterministically move to the vector x_i of locations. Let X_i denote the set of all possible x_i in stage i, and let X be a variable that can denote X_i in any stage. Because the sampling points in each stage are the same, the number of possible vectors |X| in each stage is the same. To save energy, we also assume that each robot does not cross the paths of the other robots. As a result, given the number of rows r, the number of possible vectors in each stage is |X| = C(r, k), the number of ways of choosing k of the r rows.
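The count |X| = C(r, k) above can be checked directly; a minimal sketch (the function name and the demo grid sizes are illustrative, not part of the thesis):

```python
from math import comb

def num_column_configs(r: int, k: int) -> int:
    """Number of possible location vectors |X| per stage: with k robots
    on distinct rows of an r-row column and paths that never cross,
    a placement is fixed by the set of rows chosen, so |X| = C(r, k)."""
    if not 0 < k < r:
        raise ValueError("assumes 0 < k < r (fewer robots than rows)")
    return comb(r, k)

# e.g., a 5-row grid like the temperature field of Chapter 6, with k = 2
print(num_column_configs(5, 2))  # -> 10
```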
4.2
iMASP
To find the paths with maximum entropy, the work of [Low et al., 2009] has defined iMASP. Given observation paths x_{1:n}, using the chain rule of entropy, we have

H(Z_{x_{1:n}}) = H(Z_{x_1}) + Σ_{i=2}^{n} H(Z_{x_i} | Z_{x_{1:i-1}}).    (4.1)

Based on (4.1), the work of [Low et al., 2009] has proposed the following n-stage dynamic programming equations to calculate the maximum conditional entropy in each stage:

V^*_i(x_{1:i-1}) = max_{x_i ∈ X_i} [ H(Z_{x_i} | Z_{x_{1:i-1}}) + V^*_{i+1}(x_{1:i}) ]    (4.2)

V^*_n(x_{1:n-1}) = max_{x_n ∈ X_n} H(Z_{x_n} | Z_{x_{1:n-1}})    (4.3)

for stages i = 1, . . . , n − 1. For the first stage, because there is no previous stage, x_{1:0} is a vector with no element; hence, H(Z_{x_1} | Z_{x_{1:0}}) is equivalent to H(Z_{x_1}). Because the field is
modeled with a Gaussian process, the conditional entropy in each stage is defined as follows:

H(Z_{x_i} | Z_{x_{1:i-1}}) = (1/2) log (2πe)^k |Σ_{x_i|x_{1:i-1}}|,    (4.4)

where Σ_{x_i|x_{1:i-1}} is defined in (2.4). Based on (4.4), the optimal paths of iMASP are x^*_{1:n} ≜ (x^*_1, . . . , x^*_n) where, for stages i = 1, . . . , n, given x^*_{1:i-1}, x^*_i is the vector in (4.2) or (4.3) which returns the largest value. It can be computed that the time complexity of the algorithm for iMASP is O(|X|^n (kn)^3). As a result, the time complexity increases exponentially with the length of the planning horizon. To avoid this intractable complexity, an anytime heuristic search algorithm [Korf, 1990] has been used to approximate the optimal paths. However, no performance guarantee is provided for this heuristic search algorithm.
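The Gaussian conditional entropy of (4.4), combined with the standard GP posterior covariance of (2.4), is straightforward to compute; a minimal numpy sketch, assuming a squared-exponential covariance with additive noise (the kernel form follows Chapter 2, but all function names and parameter values here are illustrative):

```python
import numpy as np

def sq_exp_cov(P, Q, l1, l2, sig_s2, sig_n2):
    """Squared-exponential covariance between location sets P and Q,
    with coordinates scaled by the length scales l1, l2. Noise variance
    sig_n2 is added on the diagonal when P and Q are the same set."""
    P, Q = np.atleast_2d(P), np.atleast_2d(Q)
    d = (P[:, None, :] - Q[None, :, :]) / np.array([l1, l2])
    K = sig_s2 * np.exp(-0.5 * np.sum(d ** 2, axis=-1))
    if P.shape == Q.shape and np.allclose(P, Q):
        K = K + sig_n2 * np.eye(len(P))
    return K

def cond_entropy(x_new, x_obs, l1, l2, sig_s2, sig_n2):
    """H(Z_x_new | Z_x_obs) = 0.5 * log((2*pi*e)^k |Sigma_new|obs|),
    with the posterior covariance from standard GP conditioning."""
    S_nn = sq_exp_cov(x_new, x_new, l1, l2, sig_s2, sig_n2)
    if len(x_obs) > 0:
        S_no = sq_exp_cov(x_new, x_obs, l1, l2, sig_s2, sig_n2)
        S_oo = sq_exp_cov(x_obs, x_obs, l1, l2, sig_s2, sig_n2)
        S_nn = S_nn - S_no @ np.linalg.solve(S_oo, S_no.T)
    k = len(np.atleast_2d(x_new))
    return 0.5 * np.log((2 * np.pi * np.e) ** k * np.linalg.det(S_nn))

# conditioning on a nearby observation can only reduce entropy
h0 = cond_entropy([[2.0, 1.0]], [], 1.0, 1.0, 1.0, 0.1)
h1 = cond_entropy([[2.0, 1.0]], [[1.0, 1.0]], 1.0, 1.0, 1.0, 0.1)
print(h0 > h1)  # -> True
```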
4.3
MEPP Algorithm
To balance the time complexity and the performance guarantee, we exploit the property of the covariance function that the correlation between two points decreases exponentially with the distance between them. As a result, when we predict the posterior variance of an unobserved point y given a vector A of points, we can remove from A the points Ã for which K(u, y) is a small value for each point u in Ã, and thereby approximate the posterior variance. With this property, H(Z_{x_i} | Z_{x_{1:i-1}}) can be approximated by H(Z_{x_i} | Z_{x_{i-m:i-1}}) when max_{1≤j,j′≤k} K(x^j_i, x^{j′}_{i-m-1}) is a small value, and we can prove that the entropy decrease incurred by this truncation can be bounded. Consequently, the joint entropy H(Z_{x_{1:n}}) can be approximated by the following formula:

H(Z_{x_{1:n}}) ≈ H(Z_{x_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i-m:i-1}}).    (4.5)
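The size of the truncation error is driven by how quickly the correlation decays across columns; a small sketch of the residual covariance ε used later in the bounds of section 4.5 (the function name and parameter values are illustrative):

```python
import math

def truncation_eps(m, sig_s2, l1, w1):
    """epsilon = sig_s^2 * exp(-(m+1)^2 / (2 * l1'^2)) with l1' = l1/w1:
    an upper bound on the covariance between a column and any column
    more than m stages away under a squared-exponential kernel."""
    l1_norm = l1 / w1
    return sig_s2 * math.exp(-(m + 1) ** 2 / (2 * l1_norm ** 2))

# with a short normalized horizontal length scale, even a small m
# leaves only a tiny residual correlation beyond the kept history
for m in (1, 2, 3):
    print(m, truncation_eps(m, sig_s2=0.24, l1=5.0, w1=5.0))
```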
According to (4.5), the following dynamic programming equations are proposed to approximate the maximum conditional entropy in each stage:

V^me_i(x_{i-m:i-1}) = max_{x_i ∈ X_i} [ H(Z_{x_i} | Z_{x_{i-m:i-1}}) + V^me_{i+1}(x_{i-m+1:i}) ]    (4.6)

V^me_n(x_{n-m:n-1}) = max_{x_n ∈ X_n} H(Z_{x_n} | Z_{x_{n-m:n-1}})    (4.7)

for stages i = m + 1, . . . , n − 1. To get the optimal vector x^me_{1:m} in the first m stages, we can use the following equation:

x^me_{1:m} = arg max_{x_{1:m} ∈ X_{1:m}} [ H(Z_{x_{1:m}}) + V^me_{m+1}(x_{1:m}) ]    (4.8)

where X_{1:m} is the set of all possible x_{1:m} over the first m stages. Based on (4.4), the optimal paths of the MEPP algorithm are x^me_{1:n} ≜ (x^me_{1:m}, x^me_{m+1}, . . . , x^me_n) where x^me_{1:m} is from (4.8) and, for stages i = m + 1, . . . , n, given x^me_{i-m:i-1}, x^me_i is the vector in (4.6) or (4.7) which returns the largest value. It can be seen that when m = 1, the MEPP algorithm is the same as the Markov-based iMASP in the work of [Low et al., 2011]. As a result, our work generalizes the work of [Low et al., 2011] by utilizing a longer path history.
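The dynamic program (4.6)-(4.8) can be sketched as follows. This is a simplified single-table illustration, not the thesis implementation (which is in Matlab): the Gaussian entropy terms are abstracted into caller-supplied reward functions, and the toy demo replaces H(Z_x | Z_hist) with a plain distance score, so the names mepp, reward and init_reward are all hypothetical.

```python
from itertools import product

def mepp(n, m, configs, reward, init_reward):
    """Sketch of the MEPP dynamic program (4.6)-(4.8).
    n: number of stages (columns); m: Markov order;
    configs: the possible location vectors per stage (the set X);
    reward(hist, x): stands in for H(Z_x | Z_hist) given the m previous
        vectors hist; init_reward(h): stands in for H(Z_x_{1:m}).
    Stationarity lets one table serve every stage, so values are
    propagated backwards from stage n down to stage m + 1."""
    # V[i][hist] = best value-to-go from stage i given m-history hist
    V = {n + 1: {h: 0.0 for h in product(configs, repeat=m)}}
    best = {}
    for i in range(n, m, -1):
        V[i], best[i] = {}, {}
        for hist in product(configs, repeat=m):
            vals = {x: reward(hist, x) + V[i + 1][hist[1:] + (x,)]
                    for x in configs}
            best[i][hist] = max(vals, key=vals.get)
            V[i][hist] = vals[best[i][hist]]
    # choose the first m vectors jointly, per (4.8)
    prefix = max(product(configs, repeat=m),
                 key=lambda h: init_reward(h) + V[m + 1][h])
    path = list(prefix)
    for i in range(m + 1, n + 1):
        path.append(best[i][tuple(path[-m:])])
    return path

# toy demo: one robot, rows {0, 1, 2}, reward = distance from the
# previous row (a crude stand-in for "far apart" = "more uncertain")
demo = mepp(4, 1, [0, 1, 2],
            reward=lambda hist, x: abs(x - hist[0]),
            init_reward=lambda h: 0.0)
print(demo)  # -> [0, 2, 0, 2]
```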
4.4
Time Analysis
Theorem 4.4.1. Let |X| be the number of possible vectors in each stage. Determining the optimal paths based on the m-order Markov property for the MEPP algorithm requires O(|X|^(m+1) [n + (km)^3]) time, where n is the number of columns.

Given vector x_{i-m:i-1}, to get the posterior entropy H(Z_{x_i} | Z_{x_{i-m:i-1}}) over all possible x_i ∈ X_i, we need |X| × O((km)^3) = O(|X|(km)^3) operations. And in each stage, there are |X|^m possible x_{i-m:i-1} over the m previous stages. Hence, in each stage, to get the optimal values for the |X|^m vectors, we need |X|^m × O(|X|(km)^3) = O(|X|^(m+1) (km)^3) operations. Because we have used a stationary covariance function, the covariance depends only on the distance between points. Thus, the entropy values calculated for one stage are the same as the values in the other stages. We can propagate the optimal values from stage n − 1 to stage m + 1, and the time needed is O(|X|^(m+1) (n − m − 1)). To get the vector x^me_{1:m}, we need to compute the joint entropy H(Z_{x_{1:m}}) for all possible x_{1:m} over the first m stages. Hence, the time needed to get the vector x^me_{1:m} is O(|X|^m (km)^3). As a result, the time complexity of the MEPP algorithm is O(|X|^(m+1) [(n − m − 1) + (km)^3] + |X|^m (km)^3) = O(|X|^(m+1) [n + (km)^3]).

Compared with iMASP, which requires O(|X|^n (kn)^3) time, this algorithm scales well with large n. Though it is less efficient than the Markov-based iMASP, which needs O(|X|^2 (n + k^3)) time, the MEPP algorithm is still efficient in practice, as demonstrated in section 6.4.
4.5
Performance Guarantees
In section 4.3, we have defined the MEPP algorithm with the m-order Markov property. The following lemma shows the optimality of the results of the MEPP algorithm in terms of the conditional entropy with m previous vectors:

Lemma 4.5.1. Let x^me_{1:n} be the optimal paths of the MEPP algorithm. For any other paths x_{1:n}, we have

H(Z_{x^me_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x^me_i} | Z_{x^me_{i-m:i-1}}) ≥ H(Z_{x_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i-m:i-1}})    (4.9)

where Z_{x^me_{1:n}} and Z_{x_{1:n}} are the vectors of random measurements for the paths x^me_{1:n} and x_{1:n}, respectively.

The proof of this result is shown in Appendix A.2. From this lemma, given the optimal paths x^*_{1:n} of iMASP, inequality (4.9) still holds. This is because if we consider the conditional entropy in each stage with all previous vectors, the joint entropy of the paths x^*_{1:n} is the maximal one. However, if we consider the conditional entropy in each stage with only the m previous vectors, the paths x^me_{1:n} are the optimal paths.
Let ω_1 and ω_2 be the horizontal and vertical widths of a grid cell, and let ℓ′_1 ≜ ℓ_1/ω_1 and ℓ′_2 ≜ ℓ_2/ω_2 denote the normalized horizontal and vertical length scales, respectively. Given vector x_{i-m-1} and vector x_{i-m:i-1}, for any vector x_i, the entropy decrease can be bounded by the following lemma:

Lemma 4.5.2. Let ε ≜ σ_s^2 exp{−(m+1)^2 / (2 ℓ′_1^2)}. Given vector x_{i-m-1} and vector x_{i-m:i-1}, for any vector x_i, the entropy decrease can be bounded by

H(Z_{x_{i-m-1}} | Z_{x_{i-m:i-1}}) − H(Z_{x_{i-m-1}} | Z_{x_{i-m:i-1}}, Z_{x_i}) ≤ k^2 log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}.    (4.10)
The proof of this lemma is shown in Appendix A.3. With a similar proof, given vector x_{i-t:i-1} in the t previous stages, where t ≥ m, the entropy decrease H(Z_{x_{i-t-1}} | Z_{x_{i-t:i-1}}) − H(Z_{x_{i-t-1}} | Z_{x_{i-t:i-1}}, Z_{x_i}) is less than k^2 log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}. As a result, with the chain rule of entropy, given vector x_i and vector x_{i-m:i-1} in the m previous stages, the entropy decrease from losing the vectors in all further previous stages can be bounded by the following corollary:

Corollary 4.5.3. Given vector x_i and vector x_{i-m:i-1} in the m previous stages, the entropy decrease from losing the vectors in all further previous stages can be bounded by

H(Z_{x_i} | Z_{x_{i-m:i-1}}) − H(Z_{x_i} | Z_{x_{1:i-1}}) ≤ (i − m − 1) k^2 log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}.    (4.11)

The proof of this corollary is shown in Appendix A.4. From this corollary, H(Z_{x_i} | Z_{x_{i-m:i-1}}) is close to H(Z_{x_i} | Z_{x_{1:i-1}}).

Lemma 4.5.1 shows the optimality of the results of the MEPP algorithm with respect to the conditional entropy with m previous vectors, and corollary 4.5.3 shows that the conditional entropy with m previous vectors is close to the conditional entropy with all previous vectors. As a result, the joint entropy of the optimal paths of the MEPP algorithm is close to that of the optimal paths of iMASP. The following theorem bounds the entropy decrease between the
optimal paths x^me_{1:n} of the MEPP algorithm and the optimal paths x^*_{1:n} of iMASP:

Theorem 4.5.4. Let x^me_{1:n} be the optimal paths of the MEPP algorithm and x^*_{1:n} be the optimal paths of iMASP, and let ε ≜ σ_s^2 exp{−(m+1)^2 / (2 ℓ′_1^2)}. The entropy decrease between the two paths can be bounded by

H(Z_{x^*_{1:n}}) − H(Z_{x^me_{1:n}}) ≤ (n − m)^2 k^2 log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}.
The proof of the above result is shown in Appendix A.5. According to theorem 4.5.4, the performance guarantee is determined by the number of columns n, the value of m, the number of robots k and the value of ε, and the value of ε depends on the value of m and the normalized horizontal length scale. Hence, there are a few ways to improve the performance bound: (a) a transect sampling task with a small number of columns, (b) environmental fields with small horizontal length scales or a large horizontal discretization width, (c) using a small number of robots, and (d) using a large value of m. In particular, for anisotropic fields, if the robots sample along the less correlated direction, the value of ε will be small. As a result, we can use a small m, which incurs little planning time, to bound the sampling performance.
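Theorem 4.5.4's bound is cheap to evaluate, which makes the trade-offs in ways (a)-(d) easy to see; a small sketch with illustrative (not experimental) parameter values:

```python
import math

def mepp_entropy_bound(n, m, k, l1_norm, sig_s2, sig_n2):
    """Upper bound from Theorem 4.5.4 on H(Z_x*) - H(Z_x_me):
    (n - m)^2 * k^2 * log(1 + eps^2 / (sig_n^2 * (sig_n^2 + sig_s^2)))
    with eps = sig_s^2 * exp(-(m+1)^2 / (2 * l1'^2))."""
    eps = sig_s2 * math.exp(-(m + 1) ** 2 / (2 * l1_norm ** 2))
    return (n - m) ** 2 * k ** 2 * math.log(
        1 + eps ** 2 / (sig_n2 * (sig_n2 + sig_s2)))

# increasing m tightens the bound, as way (d) in the text predicts
bounds = [mepp_entropy_bound(30, m, 2, 1.0, 0.25, 0.01) for m in (1, 2, 3)]
print(bounds[0] > bounds[1] > bounds[2])  # -> True
```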
Chapter 5
Maximum Mutual Information Path
Planning
In this chapter, we propose another approximation algorithm, M2IPP (Maximum Mutual Information Path Planning), to find the paths with maximum mutual information. As with maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time complexity increases exponentially with the length of the planning horizon. In the previous chapter, we proposed the MEPP algorithm with the m-order Markov property; its time complexity is polynomial and its performance can be guaranteed. However, in section 5.2, we show that the m-order Markov property cannot be applied to the maximum mutual information criterion. To solve this problem, a different approximation method is proposed in section 5.3. Based on this approximation method, the M2IPP algorithm is proposed in section 5.4. In section 5.5, the analysis of its time complexity is provided, which shows that the M2IPP algorithm also runs in polynomial time. In section 5.6, we provide a performance guarantee for the M2IPP algorithm.
5.1
Notations
With sampling locations x_i in stage i, the row vector u_i of unobserved locations in this stage can be determined. Let Z_{u_i} denote the row vector of corresponding random measurements. With sampling locations x_{i:l} from stage i to stage l, let u_{i:l} denote the vector of all unobserved locations in these stages (i.e., u_{i:l} ≜ (u_i, . . . , u_l)) and Z_{u_{i:l}} denote the vector of all corresponding random measurements (i.e., Z_{u_{i:l}} ≜ (Z_{u_i}, . . . , Z_{u_l})). Given observation paths x_{1:n}, let u_{1:n} denote the unobserved part of the field.
5.2
Problem Definition
With observation paths x_{1:n} and the unobserved part u_{1:n} of the field (e.g., Fig. 5.1a), the mutual information between Z_{x_{1:n}} and Z_{u_{1:n}} is

I(Z_{x_{1:n}}; Z_{u_{1:n}}) = H(Z_{x_{1:n}}) − H(Z_{x_{1:n}} | Z_{u_{1:n}}).    (5.1)

Given paths x_{1:n}, with (5.1) and (4.4), the mutual information can be evaluated in closed form. As a result, if we use the exhaustive algorithm, the optimal paths can be found. However, enumerating all possible paths in the field makes the time complexity increase exponentially with the length of the planning horizon.

In the previous chapter, we applied the m-order Markov property to the maximum entropy criterion to reduce the time complexity. However, this property cannot be applied to the maximum mutual information criterion, for the following reason. From (5.1), with the chain rule of entropy, we have

I(Z_{x_{1:n}}; Z_{u_{1:n}}) = I(Z_{x_1}; Z_{u_{1:n}}) + Σ_{i=2}^{n} I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{1:i-1}}).    (5.2)

From (5.2), the conditional mutual information I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{1:i-1}}) in stage i depends on vector x_{1:i-1} and vector u_{1:n}. If we apply the m-order Markov property to (5.2), we have
the following formula:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≈ I(Z_{x_{1:m}}; Z_{u_{1:n}}) + Σ_{i=m+1}^{n} I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{i-m:i-1}}).    (5.3)
Although vector x_{1:i-1} can be approximated by vector x_{i-m:i-1} in (5.3), vector u_{1:n} is unknown. Given the current vector x_i and vector x_{i-m:i-1}, we can only obtain the vector u_{i-m:i} of unobserved locations (e.g., Fig. 5.1b). With vector u_{i-m:i}, the conditional mutual information in stage i cannot be determined. Therefore, we cannot propose a similar approximation algorithm with the m-order Markov property for the maximum mutual information criterion.

Figure 5.1: Visualization of applying the m-order Markov property to the maximum mutual information criterion.
5.3
Problem Analysis
From the previous section, it is known that given the current vector x_i and vector x_{i-m:i-1}, the conditional mutual information I(Z_{x_i}; Z_{u_{1:n}} | Z_{x_{1:i-1}}) cannot be approximated. To approximate the conditional mutual information in each stage, we need to approximate vector u_{1:n} as well. We address this issue, again, by exploiting the property of the covariance function that the correlation between two points decreases exponentially with the distance between them. According to this property, for vector x_i, we can use the vector u_{i-m:i+m} in this stage to approximate vector u_{1:n}. Due to the small correlation, the information that the other points in vector u_{1:n} can provide is negligible, and we can bound the mutual information decrease in each stage incurred by ignoring those points. Consequently, (5.1) can be approximated as follows:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≈ I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=m+1}^{n-m-1} I(Z_{x_i}; Z_{u_{i-m:i+m}} | Z_{x_{i-m:i-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} | Z_{x_{n-2m:n-m-1}}).    (5.4)
From (5.4), the approximated unobserved part for vector x_i is the vector u_{i-m:i+m}. However, if we only use the m-order Markov property, we can still only obtain the vector u_{i-m:i}, and some points in vector u_{i-m:i+m} remain unknown (e.g., Fig. 5.2a). Thus, instead of using the m-order Markov property, we enumerate all possible paths in the 2m previous stages. Differently from maximum entropy path planning, the reward in each stage is the conditional mutual information for the vector in the middle of the paths (e.g., Fig. 5.2b).

Figure 5.2: Visualization of the approximation method of the M2IPP algorithm.

Consequently, (5.4) can be rewritten as follows:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) ≈ I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} | Z_{x_{n-2m:n-m-1}}).    (5.5)
With the current vector x_i and the vector x_{i-2m:i-1} in the 2m previous stages, the approximated unobserved part u_{i-2m:i} of the field for vector x_{i-m} can be determined. As a result, the conditional mutual information for vector x_{i-m} can be obtained. For the vectors in the first m stages, there is no path history of m stages; we use the vector u_{1:2m} as their approximated unobserved part of the field, so the conditional mutual information for the first m vectors can be grouped together. Similarly, for the vectors x_{n-m:n} in the last m + 1 stages, we use the vector u_{n-2m:n} as their approximated unobserved part of the field, and the conditional mutual information for the last m + 1 vectors can be grouped together. With the sum of the approximated values in all stages, we can approximate the maximum mutual information paths.
5.4
M2IPP Algorithm
From the previous section, with the current vector x_i and the vector x_{i-2m:i-1}, the conditional mutual information for vector x_{i-m} can be obtained. Consequently, the following dynamic programming equations are proposed to approximate the maximum conditional mutual information in each stage:

V^mi_i(x_{i-2m:i-1}) = max_{x_i ∈ X_i} [ I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}}) + V^mi_{i+1}(x_{i-2m+1:i}) ]    (5.6)

V^mi_n(x_{n-2m:n-1}) = max_{x_n ∈ X_n} I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} | Z_{x_{n-2m:n-m-1}})    (5.7)

for stages i = 2m + 1, . . . , n − 1. To get the optimal vector x^mi_{1:2m} in the first 2m stages, the following equation can be used:

x^mi_{1:2m} = arg max_{x_{1:2m} ∈ X_{1:2m}} [ I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + V^mi_{2m+1}(x_{1:2m}) ]    (5.8)

where X_{1:2m} is the set of all possible x_{1:2m} over the first 2m stages. Based on (4.4), the optimal paths of the M2IPP algorithm are x^mi_{1:n} ≜ (x^mi_{1:2m}, x^mi_{2m+1}, . . . , x^mi_n) where x^mi_{1:2m} is from (5.8) and, for stages i = 2m + 1, . . . , n, given x^mi_{i-2m:i-1}, x^mi_i is the vector in (5.6) or (5.7) which returns the largest value.
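The per-stage reward I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}}) in (5.6) reduces to a difference of Gaussian conditional entropies; a self-contained numpy sketch, written generically over index sets (the covariance matrix would come from the GP over the involved locations, and the stage-specific index bookkeeping is omitted; the function name and the demo matrix are illustrative):

```python
import numpy as np

def gaussian_cmi(Sigma, A, B, C):
    """Conditional mutual information I(Z_A; Z_B | Z_C) for a jointly
    Gaussian vector with covariance Sigma, via
    I = H(Z_A | Z_C) - H(Z_A | Z_B, Z_C)
      = 0.5 * log(|Sigma_A|C| / |Sigma_A|B,C|).
    A, B, C are index lists into Sigma."""
    def post_cov(tgt, obs):
        S_tt = Sigma[np.ix_(tgt, tgt)]
        if not obs:
            return S_tt
        S_to = Sigma[np.ix_(tgt, obs)]
        S_oo = Sigma[np.ix_(obs, obs)]
        return S_tt - S_to @ np.linalg.solve(S_oo, S_to.T)
    _, ld1 = np.linalg.slogdet(post_cov(A, list(C)))
    _, ld2 = np.linalg.slogdet(post_cov(A, list(B) + list(C)))
    return 0.5 * (ld1 - ld2)

# tiny check on a 3-variable Gaussian: conditioning on more data can
# only reduce the uncertainty about Z_A, so the CMI is non-negative
S = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
print(gaussian_cmi(S, [0], [1], [2]) >= 0.0)  # -> True
```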
5.5
Time Analysis
Theorem 5.5.1. Let |X| be the number of possible vectors in each stage. Determining the optimal paths of the M2IPP algorithm requires O(|X|^(2m+1) (n + 2[r(2m+1)]^3)) time, where r is the number of rows, n is the number of columns, and m is the value used for the approximated unobserved part of the field in each stage.

Given vector x_{i-2m:i-1}, to get the conditional mutual information for vector x_{i-m} over all possible x_i ∈ X_i, we need |X| × O([r(2m+1)]^3) = O(|X| [r(2m+1)]^3) operations. And in each stage, there are |X|^2m possible x_{i-2m:i-1} over the 2m previous stages. Hence, in each stage, to get the optimal values for the |X|^2m vectors, we need |X|^2m × O(|X| [r(2m+1)]^3) = O(|X|^(2m+1) [r(2m+1)]^3) operations. Similarly to the MEPP algorithm, the conditional mutual information calculated for one stage is the same as the values in the other stages. Thus, we can propagate the optimal values from stage n − 2 to stage 2m + 1, and the time needed is O(|X|^(2m+1) (n − 2m − 2)). Subsequently, it requires O(|X|^(2m+1) [r(2m+1)]^3) time to calculate the conditional mutual information for the last m + 1 vectors. To get the mutual information for the first m vectors, the time needed is O(|X|^2m [r(2m)]^3). As a result, the time complexity of the M2IPP algorithm is O(|X|^(2m+1) (n − 2m − 2 + [r(2m+1)]^3) + |X|^(2m+1) [r(2m+1)]^3 + |X|^2m [r(2m)]^3) = O(|X|^(2m+1) (n + 2[r(2m+1)]^3)).

Compared to the greedy algorithm (6.2) in section 6.1, which requires considering all unobserved points in each stage, our algorithm only needs to consider the unobserved points in 2m + 1 columns. As a result, for a transect sampling task with a large number of columns, our algorithm is still efficient.
5.6
Performance Guarantees
In section 5.4, we have formulated the M2IPP algorithm with the approximation method proposed in section 5.3. The following lemma shows the optimality of the results of the M2IPP algorithm in terms of the approximated conditional mutual information:

Lemma 5.6.1. Let x^mi_{1:n} be the optimal paths of the M2IPP algorithm. For any other paths x_{1:n}, we have

I(Z_{x^mi_{1:m}}; Z_{u^mi_{1:2m}}) + Σ_{i=2m+1}^{n-1} I(Z_{x^mi_{i-m}}; Z_{u^mi_{i-2m:i}} | Z_{x^mi_{i-2m:i-m-1}}) + I(Z_{x^mi_{n-m:n}}; Z_{u^mi_{n-2m:n}} | Z_{x^mi_{n-2m:n-m-1}}) ≥ I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}}) + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} | Z_{x_{n-2m:n-m-1}})    (5.9)

where u^mi_{1:n} and u_{1:n} are the unobserved parts of the field for the observation paths x^mi_{1:n} and x_{1:n}, respectively.

The proof of this result is shown in Appendix B.1. From this lemma, given the optimal paths x_{1:n} of the exhaustive algorithm, inequality (5.9) still holds. This is because if we consider the conditional mutual information in each stage with all previous vectors and the whole unobserved part of the field, the mutual information between the observation paths x_{1:n} and the corresponding unobserved part u_{1:n} is the maximal one. However, if we consider the conditional mutual information with the approximated path history and the approximated unobserved part in each stage, the paths x^mi_{1:n} are the optimal paths.
Similarly to corollary 4.5.3, we can bound the mutual information decrease in each stage as well. Let ω_1 and ω_2 be the horizontal and vertical widths of a grid cell, and let ℓ′_1 ≜ ℓ_1/ω_1 and ℓ′_2 ≜ ℓ_2/ω_2 denote the normalized horizontal and vertical length scales, respectively. Given the approximated path history x_{i-2m:i-m-1} and the approximated unobserved part u_{i-2m:i}, the following lemma bounds the mutual information decrease from losing the path history and the unobserved points in the other stages:

Lemma 5.6.2. Given vector x_i and vector x_{i-2m:i-1}, the approximated unobserved part u_{i-2m:i} of the field for vector x_{i-m} can be obtained. Let ε ≜ σ_s^2 exp{−(m+1)^2 / (2 ℓ′_1^2)}. If there are r rows and n columns in the field, the mutual information decrease from losing the path history and the unobserved points in the other stages can be bounded with the following formulas:

I(Z_{x_{i-m}}; Z_{u_{1:n}} | Z_{x_{1:i-m-1}}) − I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} | Z_{x_{i-2m:i-m-1}}) = A_{i-m} − B_{i-m}    (5.10)

where

A_{i-m} = H(Z_{x_{i-m}} | Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) − H(Z_{x_{i-m}} | Z_{x_{1:i-m-1}}, Z_{u_{1:n}}),    (5.11)

B_{i-m} = H(Z_{x_{i-m}} | Z_{x_{i-2m:i-m-1}}) − H(Z_{x_{i-m}} | Z_{x_{1:i-m-1}})    (5.12)

and

A_{i-m} ≤ (n − 2m − 1) r k log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))},    (5.13)

B_{i-m} ≤ (i − 2m − 1) k^2 log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}.    (5.14)
The proof of this lemma is shown in Appendix B.3. From the definition of mutual information, (5.10), (5.11) and (5.12) can be obtained. For B_{i-m}, inequality (5.14) follows from corollary 4.5.3. For A_{i-m}, all points in the vector (x_{1:i-1}, u_{1:n}) that are within m stages of the vector x_{i-m} are contained in the vector (x_{i-2m:i-m-1}, u_{i-2m:i}); as a result, the other points provide little information about the vector x_{i-m}, and the value of A_{i-m} can be bounded by inequality (5.13).
Lemma 5.6.1 shows the optimality of the results of the M2IPP algorithm in terms of the approximated conditional mutual information, and lemma 5.6.2 shows that the mutual information decrease in each stage can be bounded. As a result, the mutual information of the results of the M2IPP algorithm is close to that of the optimal results. The following theorem bounds the mutual information decrease between the paths x^mi_{1:n} of the M2IPP algorithm and the optimal paths x_{1:n} of the exhaustive algorithm:
Theorem 5.6.3. Let x^mi_{1:n} be the optimal paths of the M2IPP algorithm and x_{1:n} be the optimal paths of the exhaustive algorithm, and let ε ≜ σ_s^2 exp{−(m+1)^2 / (2 ℓ′_1^2)}. If there are r rows and n columns in the field, the mutual information decrease can be bounded with

I(Z_{x_{1:n}}; Z_{u_{1:n}}) − I(Z_{x^mi_{1:n}}; Z_{u^mi_{1:n}}) ≤ (1/2) [nr + (n − 2m)k] (n − 2m) k log{1 + ε^2 / (σ_n^2 (σ_n^2 + σ_s^2))}.    (5.15)
The proof of the above result is shown in Appendix B.4. According to theorem 5.6.3, the performance guarantee is determined by the number of columns n, the value of m, the number of robots k and the value of ε, and the value of ε depends on the value of m and the normalized horizontal length scale. As a result, there are a few ways to improve the performance bound: (a) a transect sampling task with a small number of columns, (b) environmental fields with small horizontal length scales or a large horizontal discretization width, (c) using a small number of robots, and (d) using a large value of m. Similarly to the MEPP algorithm, if the robots explore an anisotropic field along the less correlated direction, we can also use a small m, which incurs little planning time, to bound the sampling performance.
Chapter 6
Experimental Results
In the two previous chapters, we have provided performance guarantees for the two proposed algorithms, MEPP and M2IPP. In this chapter, we empirically evaluate the performance of these two algorithms on two real-world data sets. The results of our proposed algorithms are compared to those of two other existing algorithms based on three performance metrics. The data sets and performance metrics are described in section 6.1. The performance results for the two data sets are presented in sections 6.2 and 6.3. The time efficiency of the two proposed algorithms is shown in section 6.4. The algorithms are implemented in Matlab and the experiments are run on a PC with an Intel Quad Core Q9550 2.83 GHz processor and 4 GB RAM. In section 6.5, we discuss how to select a criterion to obtain lower prediction error for different environmental fields.
6.1
Data Sets and Performance Metrics
In the two previous chapters, we have provided performance guarantees for the proposed algorithms. To evaluate the performance empirically, the algorithms are tested on two real-world data sets: (a) May 2009 temperature data of Panther Hollow Lake in Pittsburgh, PA, spanning 25 m by 150 m, and (b) June 2009 plankton density data of Chesapeake Bay, spanning 314 m by 1765 m. The environmental fields in these two data sets are modeled with Gaussian processes. The hyper-parameters (i.e., ℓ_1, ℓ_2, σ_s^2, σ_n^2) are learned using maximum likelihood estimation (MLE). The learned hyper-parameters are ℓ_1 = 40.45 m, ℓ_2 = 16 m, σ_s^2 = 0.1542, and σ_n^2 = 0.0036 for the temperature field, and ℓ_1 = 27.5723 m, ℓ_2 = 134.6415 m, σ_s^2 = 2.152, and σ_n^2 = 0.041 for the plankton density field. The temperature field, which is distributed over 25 m × 150 m, is discretized into a 5 × 30 grid (e.g., Fig. 6.1d). To investigate how the algorithms perform under different vertical and horizontal correlations (specifically, length scales), we reduced the horizontal and/or vertical length scales of the original field to produce three other modified temperature fields (e.g., Figs. 6.1a, 6.1b, 6.1c). The remaining hyper-parameters (i.e., σ_s^2, σ_n^2) are learned on the original field with the reduced length scales through MLE. The four temperature fields with the learned hyper-parameters are shown in Fig. 6.1.
Figure 6.1: Temperature fields distributed over 25 m × 150 m, discretized into 5 × 30 grids, with learned hyper-parameters: (a) ℓ_1 = 5.0 m, ℓ_2 = 5.0 m, σ_s^2 = 0.2364, σ_n^2 = 0.0545; (b) ℓ_1 = 5.0 m, ℓ_2 = 16.0 m, σ_s^2 = 0.2704, σ_n^2 = 0.0563; (c) ℓ_1 = 40.45 m, ℓ_2 = 5.0 m, σ_s^2 = 0.3116, σ_n^2 = 0.0588; (d) ℓ_1 = 40.45 m, ℓ_2 = 16 m, σ_s^2 = 0.3926, σ_n^2 = 0.0601.
The plankton density field with learned hyper-parameters is shown in Fig. 6.2.
Figure 6.2: Plankton density field distributed over 314 m × 1765 m, discretized into an 8 × 45 grid, with ℓ_1 = 27.5273 m, ℓ_2 = 134.6415 m, σ_s^2 = 1.4670, and σ_n^2 = 0.2023.
We will compare the performance of our proposed algorithms with two other state-of-the-art algorithms, which have also been used as baselines in the work of [Low et al., 2009; Low et al., 2011]: (a) greedy maximum entropy path planning (GMEPP). Given starting locations, the GMEPP algorithm greedily selects the next vector of locations maximizing the joint entropy of the observation paths, which can be defined as follows:

V^gme_i(x_{1:i-1}) = max_{x_i ∈ X_i} H(Z_{x_i} | Z_{x_{1:i-1}})    (6.1)

for stages i = 2, . . . , n; (b) greedy maximum mutual information path planning (GM2IPP). Given starting locations, the GM2IPP algorithm greedily selects the next vector of locations maximizing the mutual information between the observation paths and the corresponding unobserved part of the field, which can be formulated as follows:

V^gmi_i(x_{1:i-1}) = max_{x_i ∈ X_i} I(Z_{x_{1:i}}; Z_{X \ x_{1:i}})    (6.2)

for stages i = 2, . . . , n. For these two algorithms, we enumerate all possible starting locations to get the optimal paths.
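The greedy baselines can be sketched in the same style; the sketch below follows only the GMEPP recursion (6.1), with the conditional entropy abstracted into a caller-supplied stand-in function (GM2IPP differs only in the per-stage score; all names here are hypothetical, and the demo score is a plain distance, not a real entropy):

```python
def gmepp(n, configs, cond_entropy):
    """Sketch of the greedy baseline (6.1): at each stage pick the
    vector maximizing the stand-in for H(Z_x_i | Z_x_{1:i-1}). All
    starting locations are enumerated and the best full path is
    returned, as in the experiments."""
    def run(start):
        path = [start]
        for _ in range(2, n + 1):
            path.append(max(configs,
                            key=lambda x: cond_entropy(tuple(path), x)))
        return path
    paths = [run(s) for s in configs]
    return max(paths, key=lambda p: sum(
        cond_entropy(tuple(p[:i]), p[i]) for i in range(1, n)))

# toy demo: one robot, rows {0, 1, 2}, score = distance to nearest
# previously visited row
demo = gmepp(3, [0, 1, 2],
             lambda hist, x: min(abs(x - h) for h in hist))
print(demo)  # -> [0, 2, 1]
```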
The performance of the algorithms is evaluated with three metrics: (a) the joint entropy of the unobserved part of the field, ENT(π) = H(Z_{u_{1:n}} | Z_{x_{1:n}}), where x_{1:n} denotes the observation paths π and u_{1:n} the unobserved part of the field; (b) the mutual information between the observation paths and the unobserved part of the field, MI(π) = I(Z_{u_{1:n}}; Z_{x_{1:n}}); and (c) the mean squared relative prediction error,

ERR(π) = |u_{1:n}|^{-1} Σ_{u ∈ u_{1:n}} {(z_u − μ_{u|x_{1:n}}) / μ̄}^2,

where z_u is the measurement value at point u, μ_{u|x_{1:n}} is the posterior mean at point u computed using (2.1), and μ̄ = |u_{1:n}|^{-1} Σ_{u ∈ u_{1:n}} z_u.

Both a smaller ENT(π) and a larger MI(π) imply a lower ERR(π). However, the observation paths with small ENT(π) differ from those with large MI(π). For the entropy metric, since H(Z_{x_{1:n}}, Z_{u_{1:n}}) = H(Z_{x_{1:n}}) + H(Z_{u_{1:n}} | Z_{x_{1:n}}), obtaining a smaller H(Z_{u_{1:n}} | Z_{x_{1:n}}) requires a larger H(Z_{x_{1:n}}), which in turn requires the points in x_{1:n} to be highly uncertain given one another. So, to obtain a smaller ENT(π), we should select points that are far away from each other. For the mutual information metric, to obtain a larger H(Z_{u_{1:n}}) − H(Z_{u_{1:n}} | Z_{x_{1:n}}), we need both a larger H(Z_{u_{1:n}}) and a smaller H(Z_{u_{1:n}} | Z_{x_{1:n}}). As with the entropy metric, a smaller H(Z_{u_{1:n}} | Z_{x_{1:n}}) calls for points that are far away from each other. However, a larger H(Z_{u_{1:n}}) requires the unobserved points in u_{1:n} to be far from each other as well, so the points in x_{1:n} cannot be pushed extremely far apart: they must lie among the unobserved points u_{1:n} so that the unobserved points are separated from one another. Consequently, to obtain a larger MI(π), we should select points that are far away from each other and that separate the unobserved points from each other. The prediction error metric shows how accurately the field is mapped by the sampling paths of the different algorithms.
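The three metrics can be computed directly from the prior covariance of a Gaussian field. Below is a minimal sketch under our own naming (`metrics` and its toy inputs are illustrative assumptions, not the thesis code), using H(Z_u | Z_x) = ½ log((2πe)^{|u|} det Σ_{u|x}) and the GP posterior mean:

```python
import numpy as np

def metrics(Sigma, mu, z, obs):
    """ENT, MI and ERR for an observation set `obs` (indices into the field).
    Sigma: prior covariance, mu: prior mean, z: true field values."""
    n = len(mu)
    uno = [i for i in range(n) if i not in obs]       # unobserved part
    Soo = Sigma[np.ix_(obs, obs)]
    Suo = Sigma[np.ix_(uno, obs)]
    Suu = Sigma[np.ix_(uno, uno)]
    K = Suo @ np.linalg.inv(Soo)
    post_cov = Suu - K @ Suo.T                        # Sigma_{u|x}
    post_mean = mu[uno] + K @ (z[obs] - mu[obs])      # mu_{u|x}

    def H(S):
        return 0.5 * np.linalg.slogdet(2 * np.pi * np.e * S)[1]

    ent = H(post_cov)                                 # ENT = H(Z_u | Z_x)
    mi = H(Suu) - ent                                 # MI  = I(Z_u; Z_x)
    zbar = z[uno].mean()                              # mean of true values
    err = np.mean(((z[uno] - post_mean) / zbar) ** 2) # relative sq. error
    return ent, mi, err

# toy 1-D transect with a squared-exponential covariance (made-up numbers)
pts = np.linspace(0.0, 5.0, 6)
Sigma = np.exp(-(pts[:, None] - pts[None, :]) ** 2 / 2.0) + 1e-9 * np.eye(6)
z = np.sin(pts) + 1.0                                 # a made-up "true" field
ent, mi, err = metrics(Sigma, np.zeros(6), z, obs=[0, 3])
```

As a sanity check, the MI value agrees with the joint-entropy identity I(Z_u; Z_x) = H(Z_x) + H(Z_u) − H(Z_x, Z_u).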
6.2 Temperature Data Results

6.2.1 Entropy Metric
Fig. 6.3 shows the results of ENT(π) for the different algorithms with different numbers of robots on the temperature fields. In the experiments, we have used different values of m for the MEPP algorithm. For the M2IPP algorithm, we have set m = 3 with one robot and m = 2 with two and three robots. With one robot, it can be observed that: (1) On fields a and b, the MEPP and GMEPP algorithms achieve smaller ENT(π) than the GM2IPP and M2IPP algorithms. Because the correlations of fields a and b are small, the points of each field are highly uncertain given one another, so the joint entropy of each field is large; and since the points selected by the maximum mutual information criterion are not pushed as far apart as possible, ENT(π^{gmi}) and ENT(π^{mi}) are much larger. (2) With a small m (e.g., m = 2), ENT(π^{me}) is the smallest on fields a and b, for two reasons. First, the horizontal length scale of these fields is small, so at each stage the m-order Markov property suffices to exploit the small horizontal correlation. Second, because our algorithm is non-myopic, the paths with maximum entropy can be found. (3) On fields c and d, increasing the value of m decreases ENT(π^{me}) significantly: because the horizontal length scale of these fields is large, a higher-order Markov property can exploit more of the horizontal correlation.

With two robots, we make three observations similar to those in the one-robot case.
With three robots, it can be observed that: (1) On fields a and b, the GM2IPP algorithm achieves ENT(π) comparable to that of the MEPP and GMEPP algorithms: when the number of sampled locations in each column increases, the points selected by the different criteria are similar. (2) With a small m (e.g., m = 2), ENT(π^{me}) is the smallest on fields a and b; the reason was explained in the second observation under the one-robot case. (3) On field c, increasing the value of m decreases ENT(π^{me}) significantly, for the same reason as the third observation under the one-robot case. (4) On field d, ENT(π^{me}) is the smallest: when the number of robots increases, the large vertical correlation can be exploited more fully by our non-myopic algorithm.
Summarizing the above observations: (1) On fields a and b, with any number of robots, the MEPP algorithm with a small m achieves smaller ENT(π) than the other algorithms. (2) On fields c and d, the MEPP algorithm with a large m achieves ENT(π) comparable to that of the other algorithms. (3) On field d, with more robots (e.g., k = 3), the MEPP algorithm achieves smaller ENT(π) than the other algorithms. Hence, the MEPP algorithm generalizes the work of [Low et al., 2011] and still maintains tight performance bounds on fields with large length scales.
6.2.2 Mutual Information Metric
Fig. 6.4 shows the results of MI(π) for the different algorithms with different numbers of robots on the temperature fields. In the experiments, we have used different values of m for the M2IPP algorithm. For the MEPP algorithm, we have used m = 7 with one robot and m = 5 with two and three robots. With one robot, it can be observed that: (1) On fields a and b, the M2IPP and GM2IPP algorithms achieve larger MI(π) than the MEPP and GMEPP algorithms: since the points selected by the maximum entropy criterion will
[Figure 6.3: The results of ENT(π) for different algorithms with different number of robots on the temperature fields. Panels (a)-(c) correspond to one, two, and three robots; each panel plots ENT against m for fields a-d, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
be far away from each other, some points will lie on the border of the field, so some of the unobserved points cannot be separated from each other. Moreover, for fields a and b, because the correlations of the field are small, the joint entropy of the unobserved part of the field under the maximum entropy criterion is much smaller than under the maximum mutual information criterion. As a result, MI(π^{gme}) and MI(π^{me}) are much smaller. (2) On fields a and b, MI(π^{mi}) is the largest: the horizontal correlation of these fields is small, so our small Markov order is enough to exploit it. In each stage, the M2IPP algorithm considers all the unobserved points around the current points, and since most of these unobserved points lie on a vertical line, the large vertical correlation of field b can also be exploited.
With two and three robots, it can be observed that: (1) When the number of robots increases (e.g., k = 3), the MEPP and GMEPP algorithms achieve MI(π) comparable to that of the GM2IPP algorithm: with more sampled locations in each column, the points selected by the different criteria are similar. (2) When the number of robots increases, MI(π^{mi}) may be worse than that of the other algorithms: because the value of m we use is small, the performance bound of the M2IPP algorithm becomes loose when the number of robots is large.
6.2.3 Prediction Error Metric
Fig. 6.5 shows the results of ERR(π) for the different algorithms with different numbers of robots on the temperature fields. In the experiments, we have run the MEPP and M2IPP algorithms with different values of m. For the maximum entropy criterion, with one robot, it can be observed that: (1) With a small m (e.g., m = 2), ERR(π^{me}) is less than or equal to ERR(π^{gme}) on fields a and b; the reason is the same as for the second observation under the one-robot case in Section 6.2.1. (2) On fields c and d, increasing the value of m decreases ERR(π^{me}) significantly; the reason is the same as for the third observation under the one-robot case in Section 6.2.1. With two and three robots, it can be observed that with a small m (e.g., m = 2), ERR(π^{me}) is less than or equal to ERR(π^{gme}) on fields b, c and d. For field b, the reason was explained under the two-robot and three-robot cases in Section 6.2.1. For fields c and d, there are two reasons. First, because fields c and d are more correlated than fields a and b, all selected points can be used to predict the other points. Second, although the m-order Markov property is not large enough to exploit the large correlations completely, our algorithm is non-myopic, so the selected points
[Figure 6.4: The results of MI(π) for different algorithms with different number of robots on the temperature fields. Panels (a)-(c) correspond to one, two, and three robots; each panel plots MI against m for fields a-d, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
can be distributed more evenly than the points selected by the greedy algorithm.

For the maximum mutual information criterion, it can be observed that with m = 2, ERR(π^{mi}) is comparable to ERR(π^{gmi}) on all fields under the one-robot and two-robot cases: although MI(π^{gmi}) is larger than MI(π^{mi}), because our algorithm is non-myopic, the points can be distributed more evenly than those selected by the greedy algorithm.
Under the three-robot case, ERR(π^{mi}) may be larger than ERR(π^{gmi}); the reason is the same as for the second observation under the three-robot case in Section 6.2.2.
[Figure 6.5: The results of ERR(π) for different algorithms with different number of robots on the temperature fields. Panels (a)-(c) correspond to one, two, and three robots; each panel plots ERR against m for fields a-d, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
6.3 Plankton Data Results

6.3.1 Entropy Metric
Fig. 6.6 shows the results of ENT(π) for the different algorithms on the plankton density field with different numbers of robots. In the experiments, we have used different values of m for the MEPP algorithm. For the M2IPP algorithm, we have set m = 2 with one robot and m = 1 with two and three robots. For the entropy metric, it can be observed that with any number of robots, the MEPP algorithm with a small m (e.g., m = 1) achieves the smallest ENT(π): the small horizontal and large vertical correlations can be exploited by the m-order Markov property and our non-myopic algorithm, as explained in Section 6.2.1.
[Figure 6.6: The results of ENT(π) for different algorithms with different number of robots on the plankton density field. Panels (a)-(c) correspond to one, two, and three robots; each panel plots ENT against m, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
6.3.2 Mutual Information Metric
Fig. 6.7 shows the results of MI(π) for the different algorithms on the plankton density field with different numbers of robots. In the experiments, with one robot, we have set m = 4 for the MEPP algorithm and m = 2 for the M2IPP algorithm. With two and three robots, we have set m = 3 for the MEPP algorithm and m = 1 for the M2IPP algorithm. For the mutual information metric, it can be observed that with any number of robots, the M2IPP algorithm achieves MI(π) comparable to that of the GM2IPP algorithm: because the horizontal correlation is small, a small Markov order is enough to exploit it. The large vertical correlation can be exploited by the unobserved
points in each stage.

[Figure 6.7: The results of MI(π) for different algorithms with different number of robots on the plankton density field. Panels (a)-(c) correspond to one, two, and three robots; each panel compares the gme, gmi, mepp, and m2ipp paths.]
6.3.3 Prediction Error Metric
Fig. 6.8 shows the results of ERR(π) for the different algorithms on the plankton density field with different numbers of robots. In the experiments, we have run the MEPP and M2IPP algorithms with different values of m. For the maximum entropy criterion, it can be observed that ERR(π^{me}) with a small m (e.g., m = 1) is less than or equal to ERR(π^{gme}) with any number of robots; the reason is the same as for the observation in Section 6.3.1. For the maximum mutual information criterion, ERR(π^{mi}) is comparable to ERR(π^{gmi}); the reason is the same as for the observation in Section 6.3.2.
6.4 Time Efficiency
Fig. 6.9 shows the running time taken by the different algorithms to derive the paths with different numbers of robots on the temperature fields. From this figure, when we use m ≤ 3, the MEPP algorithm is more efficient than the GMEPP algorithm with any number of robots. When we use m = 1, the M2IPP algorithm is much more efficient than the GM2IPP algorithm with any number of robots. When we use m = 2, the time efficiency of the M2IPP algorithm is
[Figure 6.8: The results of ERR(π) for different algorithms with different number of robots on the plankton density field. Panels (a)-(c) correspond to one, two, and three robots; each panel plots ERR against m, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
close to that of the GM2IPP algorithm.
[Figure 6.9: The running time of different algorithms with different number of robots on the temperature fields. Panels (a)-(c) correspond to one, two, and three robots; each panel plots running time (s, log scale) against m, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
Fig. 6.10 shows the running time of the different algorithms with different numbers of robots on the plankton density field. From this figure, it can be observed that when we use m ≤ 3, the MEPP algorithm is more efficient than the GMEPP algorithm with any number of robots. It can also be observed that when we use m = 1, the M2IPP algorithm achieves a significant computational gain over the GM2IPP algorithm, which supports our time complexity analysis in Section 5.5.
[Figure 6.10: The running time of different algorithms with different number of robots on the plankton density field. Panels (a)-(c) correspond to one, two, and three robots; each panel plots running time (s, log scale) against m, comparing the GMEPP, GM2IPP, MEPP, and M2IPP algorithms.]
6.5 Criterion Selection
In this section, we discuss how to select the criterion for different environmental fields. To answer this question, we propose some principles. As discussed in Section 6.1, the sampling points selected by the maximum entropy criterion are far away from each other, while those selected by the maximum mutual information criterion are far away from each other and also separate the unobserved points from each other. Fig. 6.11 illustrates how the two criteria select sampling points in an environmental field.
[Figure 6.11: Sampling points selected by different criteria: (a) the maximum entropy criterion; (b) the maximum mutual information criterion.]
To obtain lower prediction error, we propose the following three principles:
1. For highly correlated environmental fields, the maximum entropy criterion is better; for weakly correlated environmental fields, the maximum mutual information criterion is better.
If the field is highly correlated, the measurement values are close to each other. The sampling points selected by the maximum entropy criterion are far away from each other, and some lie on the border of the area, so the measurements at the unobserved points on the border can be predicted accurately. Although only a few sampling points lie inside the area, the measurements at the interior unobserved points can still be predicted accurately because the measurement values are close to each other. Under the maximum mutual information criterion, by contrast, too many sampling points lie inside the area, and the measurements at the unobserved points on the border cannot be predicted accurately. So if the field is highly correlated, the maximum entropy criterion is better. If the field is weakly correlated, the measurement values differ from each other. The sampling points selected by the mutual information criterion are distributed among the unobserved points, so the measurements at the unobserved points can be predicted accurately. Under the maximum entropy criterion, some sampling points lie on the border and cannot provide much information about the interior unobserved points. So if the field is weakly correlated, the maximum mutual information criterion is better.
2. If the number of sampling points in each column is large, the maximum entropy criterion is better; if the number of sampling points in each column is small, the maximum mutual information criterion is better.
If the number of sampling points in each column is large, there are many sampling points in the area, and the maximum entropy criterion distributes them better. If the number of sampling points in each column is small, then to gather more information about the unobserved points, the maximum mutual information criterion is better.
3. If the number of rows is large, the maximum mutual information criterion is better; if the number of rows is small, the maximum entropy criterion is better.
If the number of rows is large, there are many unobserved points inside the area. Some of the sampling points selected by the maximum entropy criterion lie on the border and cannot provide much information about the interior unobserved points, whereas the sampling points selected by the maximum mutual information criterion are distributed among the unobserved points. As a result, when the number of rows is large, the sampling points selected by the maximum mutual information criterion provide more information about the unobserved points. If the number of rows is small, there are fewer unobserved points inside the area, so fewer interior sampling points are needed and the maximum entropy criterion is better.
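The border-versus-interior intuition behind these principles can be reproduced with a small greedy maximum-entropy sketch (our own helper names and made-up hyperparameters; the greedy variant is a stand-in for the non-myopic algorithms of the thesis): on a one-dimensional transect, the first two maximum-entropy picks are the two endpoints.

```python
import numpy as np

def se_cov(pts, ls=1.0, s2=1.0, noise=1e-6):
    """Squared-exponential covariance matrix with a small jitter."""
    d2 = (pts[:, None] - pts[None, :]) ** 2
    return s2 * np.exp(-d2 / (2 * ls ** 2)) + noise * np.eye(len(pts))

def greedy_entropy_points(Sigma, budget):
    """Greedily add the point with maximum posterior variance
    (equivalently, maximum conditional entropy given the picks so far)."""
    n, picks = Sigma.shape[0], []
    for _ in range(budget):
        var = np.empty(n)
        for y in range(n):
            if y in picks:
                var[y] = -np.inf
                continue
            if picks:
                Saa = Sigma[np.ix_(picks, picks)]
                Sya = Sigma[y, picks]
                var[y] = Sigma[y, y] - Sya @ np.linalg.solve(Saa, Sya)
            else:
                var[y] = Sigma[y, y]
        picks.append(int(np.argmax(var)))
    return picks

pts = np.linspace(0.0, 10.0, 11)        # a 1-D transect of 11 cells
picks = greedy_entropy_points(se_cov(pts, ls=3.0), budget=3)
```

For a squared-exponential kernel, the conditional variance grows with distance from the already-picked points, so the second pick is the location farthest from the first — the opposite endpoint — matching the border behaviour described above.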
Chapter 7
Conclusions
In this thesis, we have studied the following problem:
How can we exploit the environmental structure to improve sampling performance as well as the time efficiency of planning for anisotropic fields?
To address the question, this thesis has provided the following novel contributions:
• Formalization of MEPP: For many GPs, the correlation between two points decreases exponentially with the distance between them. With this property, we found that the m-order Markov property can be applied to the maximum entropy criterion to reduce time complexity while guaranteeing performance. Consequently, we propose a polynomial-time approximation algorithm, MEPP. For a class of exploration tasks called the transect sampling task, a theoretical performance guarantee is provided for the MEPP algorithm.
• Formalization of M2IPP: For the maximum entropy criterion, the m-order Markov property can be used to reduce time complexity while guaranteeing performance. However, the m-order Markov property cannot be applied to the maximum mutual information criterion. To solve this problem, another approximation method is provided, on which we base the M2IPP algorithm. The time complexity of the M2IPP algorithm is also polynomial. A theoretical performance guarantee on the sampling performance of the M2IPP algorithm for the transect sampling task is provided as well.
• Evaluation of performance: We evaluate the two proposed algorithms on two real-world data sets, measuring performance with three metrics: entropy, mutual information and prediction error. The results of our proposed algorithms are compared with two other state-of-the-art algorithms: GMEPP and GM2IPP. For the maximum entropy criterion, the MEPP algorithm with a small m achieves lower joint entropy of the unobserved part than the GMEPP algorithm on fields with small horizontal length scales, while on fields with large length scales the MEPP algorithm with a large m achieves comparable performance. The prediction error of the MEPP algorithm is smaller than that of the GMEPP algorithm on almost all fields. When m ≤ 3, with any number of robots, the MEPP algorithm is more efficient than the GMEPP algorithm. For the maximum mutual information criterion, the mutual information and the prediction error of the M2IPP algorithm are comparable to those of the GM2IPP algorithm on the two data sets. On fields with a small number of columns, the time efficiency of the M2IPP algorithm with m = 2 is close to that of the GM2IPP algorithm, whereas on fields with a large number of columns, the M2IPP algorithm with m = 1 is much more efficient than the GM2IPP algorithm. To obtain lower prediction error for different environmental fields, we have proposed three principles for selecting the criterion.
Appendix A
Maximum Entropy Path Planning
A.1 Proof for Lemma 2.2.1
Given any vector A of observed points and an unobserved point y, with (2.2), the posterior variance of point y is

σ_{y|A}^2 = σ_y^2 − Σ_{yA} Σ_{AA}^{-1} Σ_{Ay}    (A.1)

where Σ_{AA} is a covariance matrix and Σ_{yA} and Σ_{Ay} are covariance vectors. The posterior variance σ_{y|A}^2 must be larger than 0. So, if σ_n^2 > 0, we have

σ_{y|A}^2 = σ_s^2 + σ_n^2 − Σ_{yA} Σ_{AA}^{-1} Σ_{Ay} > 0    (A.2)

where the diagonal entries of Σ_{AA} are σ_s^2 + σ_n^2. And if σ_n^2 = 0, we have

σ_{y|A}^2 = σ_s^2 − Σ_{yA} Σ_{BB}^{-1} Σ_{Ay} > 0    (A.3)

where Σ_{BB} is a covariance matrix with Σ_{BB} = Σ_{AA} − σ_n^2 I. According to the covariance function, the covariance vectors Σ_{yA} and Σ_{Ay} do not change.
We define A ≜ Σ_{AA}, B ≜ Σ_{BB}, E ≜ σ_n^2 I, Y ≜ Σ_{Ay}, Y^T ≜ Σ_{yA}, and let W ≜ A^{-1} Y, so that Y = AW. Then we have:

W^T E W + W^T E^T B^{-1} E W > 0    (A.4)
⇒ W^T B^T B^{-1} E W + W^T E^T B^{-1} E W > 0    (A.5)
⇒ W^T (B + E)^T B^{-1} E W > 0    (A.6)
⇒ W^T A^T B^{-1} E W + W^T A^T W > W^T A^T W    (A.7)
⇒ W^T A^T (B^{-1} E + I) W > W^T A^T W    (A.8)
⇒ W^T A^T B^{-1} (E + B) W > W^T A^T A^{-1} A W    (A.9)
⇒ (AW)^T B^{-1} AW > (AW)^T A^{-1} AW    (A.10)
⇒ Y^T B^{-1} Y > Y^T A^{-1} Y    (A.11)
⇒ Σ_{yA} Σ_{BB}^{-1} Σ_{Ay} > Σ_{yA} Σ_{AA}^{-1} Σ_{Ay}.    (A.12)
For inequality (A.4): because W is a vector and E = σ_n^2 I, we have W^T E W > 0; and because B is a covariance matrix, hence invertible and positive semi-definite, B^{-1} is positive semi-definite, so W^T E^T B^{-1} E W ≥ 0 and inequality (A.4) holds. Since the covariance matrix B is symmetric, B^T = B, and inequality (A.5) follows from inequality (A.4). The remaining steps follow easily. Consequently, we have

Σ_{yA} Σ_{AA}^{-1} Σ_{Ay} < Σ_{yA} Σ_{BB}^{-1} Σ_{Ay}.    (A.13)
For (A.2), with the above result (A.13), we have:

σ_{y|A}^2 = σ_s^2 + σ_n^2 − Σ_{yA} Σ_{AA}^{-1} Σ_{Ay}
          > σ_s^2 + σ_n^2 − Σ_{yA} Σ_{BB}^{-1} Σ_{Ay}    (A.14)
          > σ_n^2.    (A.15)

Inequality (A.15) follows from inequality (A.3). Therefore, Lemma 2.2.1 holds.
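Lemma 2.2.1 can be checked numerically: for a covariance built as Σ_AA = Σ_BB + σ_n^2 I from a squared-exponential signal kernel, the posterior variance of an unobserved point stays strictly between σ_n^2 and σ_s^2 + σ_n^2. A sketch with our own helper names and made-up hyperparameters:

```python
import numpy as np

def posterior_var(pts_A, y, s2=1.0, ls=1.0, noise=0.1):
    """sigma^2_{y|A} = (s2 + noise) - Sigma_yA Sigma_AA^{-1} Sigma_Ay,
    as in (A.1)-(A.2)."""
    def k(a, b):  # squared-exponential signal covariance
        return s2 * np.exp(-(a - b) ** 2 / (2 * ls ** 2))
    Saa = k(pts_A[:, None], pts_A[None, :]) + noise * np.eye(len(pts_A))
    Sya = k(y, pts_A)
    return (s2 + noise) - Sya @ np.linalg.solve(Saa, Sya)

# Lemma 2.2.1: the posterior variance stays above the noise variance sigma_n^2.
rng = np.random.default_rng(0)
pts_A = rng.uniform(0.0, 5.0, size=8)
v = posterior_var(pts_A, y=2.5)
assert 0.1 < v < 1.1   # sigma_n^2 < sigma^2_{y|A} < sigma_s^2 + sigma_n^2
```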
A.2 Proof for Lemma 4.5.1

Given x_{1:m}, (4.6) and (4.7) can be rewritten as follows:

V_{m+1}^{me}(x_{1:m})
= max_{x_{m+1} ∈ X_{m+1}} H(Z_{x_{m+1}} | Z_{x_{1:m}}) + V_{m+2}^{me}(x_{2:m+1})    (A.16)
= max_{x_{m+1} ∈ X_{m+1}} {H(Z_{x_{m+1}} | Z_{x_{1:m}}) + max_{x_{m+2} ∈ X_{m+2}} [H(Z_{x_{m+2}} | Z_{x_{2:m+1}}) + V_{m+3}^{me}(x_{3:m+2})]}    (A.17)
= max_{x_{m+1} ∈ X_{m+1}, x_{m+2} ∈ X_{m+2}} {H(Z_{x_{m+1}} | Z_{x_{1:m}}) + H(Z_{x_{m+2}} | Z_{x_{2:m+1}}) + V_{m+3}^{me}(x_{3:m+2})}    (A.18)
...
= max_{x_{m+1} ∈ X_{m+1}, ..., x_n ∈ X_n} Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i−m:i−1}}).    (A.19)

Therefore, given x_{1:m}, we can obtain the vectors x_{m+1}, ..., x_n that maximize Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i−m:i−1}}). With (4.8), we have

x_{1:m}^{me} = arg max_{x_{1:m} ∈ X_{1:m}} H(Z_{x_{1:m}}) + V_{m+1}^{me}(x_{1:m})    (A.20)

where X_{1:m} is the set of all possible x_{1:m} over the first m columns. With (A.20), the paths x_{1:n} maximizing H(Z_{x_{1:m}}) + Σ_{i=m+1}^{n} H(Z_{x_i} | Z_{x_{i−m:i−1}}) can be obtained. Therefore, Lemma 4.5.1 holds.
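The recursion (A.16)-(A.19) is an m-order Markov dynamic program over column selections. A memoized single-robot (k = 1) sketch — our own naming, a toy stand-in for the thesis's MEPP implementation — can be checked against brute-force path enumeration:

```python
import numpy as np
from functools import lru_cache

def cond_entropy(Sigma, x, given):
    """H(Z_x | Z_given) for a Gaussian field: 0.5 * log(2*pi*e*var)."""
    if not given:
        var = Sigma[x, x]
    else:
        Sgg = Sigma[np.ix_(given, given)]
        Sxg = Sigma[x, list(given)]
        var = Sigma[x, x] - Sxg @ np.linalg.solve(Sgg, Sxg)
    return 0.5 * np.log(2 * np.pi * np.e * var)

def mepp_value(Sigma, columns, m):
    """Max-entropy DP with an m-order Markov window, as in (A.16)-(A.20):
    returns max over paths of H(Z_{x_1:m}) + sum_i H(Z_{x_i} | Z_{x_i-m:i-1})."""
    n = len(columns)

    @lru_cache(maxsize=None)
    def V(i, window):                    # window: last m selected cells
        if i == n:
            return 0.0
        best = -np.inf
        for x in columns[i]:
            h = cond_entropy(Sigma, x, window)
            best = max(best, h + V(i + 1, (window + (x,))[-m:]))
        return best

    best = -np.inf
    def first(i, chosen):                # enumerate the first m columns jointly
        nonlocal best
        if i == m:
            joint = sum(cond_entropy(Sigma, chosen[j], chosen[:j])
                        for j in range(m))      # chain rule for H(Z_{x_1:m})
            best = max(best, joint + V(m, chosen))
            return
        for x in columns[i]:
            first(i + 1, chosen + (x,))
    first(0, ())
    return best

# toy 2 x 3 grid, one robot, window m = 1 (illustrative numbers)
pts = np.array([[i, j] for i in range(2) for j in range(3)], dtype=float)
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
Sigma = np.exp(-d2 / 2.0) + 1e-6 * np.eye(6)
columns = [[j, j + 3] for j in range(3)]
value = mepp_value(Sigma, columns, m=1)
```

Because the window only remembers the last m columns, the table has O(n |X_i|^m) states instead of the exponential number of full histories — the source of the polynomial running time claimed for MEPP.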
A.3 Proof for Lemma 4.5.2

The following proof is for the single-robot case; we then apply the result to the multi-robot case. Let x_A ≜ x_{i−m:i−1} and x_p ≜ x_{i−m−1}, so that

σ_{x_{i−m−1}|x_{i−m:i−1}}^2 − σ_{x_{i−m−1}|(x_{i−m:i−1}, x_i)}^2 = σ_{x_p|x_A}^2 − σ_{x_p|(x_A, x_i)}^2    (A.21)

H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}) − H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i}) = H(Z_{x_p} | Z_{x_A}) − H(Z_{x_p} | Z_{x_A}, Z_{x_i}).    (A.22)

For (A.21), we have:

σ_{x_p|x_A}^2 − σ_{x_p|(x_A, x_i)}^2 ≤ σ_{x_p}^2 − σ_{x_p|x_i}^2    (A.23)
= σ_{x_p}^2 − (σ_{x_p}^2 − K^2(x_p, x_i)/σ_{x_i}^2)    (A.24)
≤ ε^2/σ_{x_i}^2.    (A.25)

The first inequality follows from the assumption that the variance reduction σ_{x_p|x_A}^2 − σ_{x_p|(x_A, x_i)}^2 is submodular; the work of [Das and Kempe, 2008a] shows that if, given the measurements z_{x_A}, the correlation between Z_{x_p} and Z_{x_i} does not increase, then the variance reduction is submodular. If we have the measurements z_{x_A}, then because the points in A are close to x_p, σ_{x_p|x_A}^2 is small; due to the small correlation between x_p and x_i, the point x_i cannot reduce σ_{x_p|x_A}^2 much further. As a result, when predicting the variance at point x_p, the variance decreases more when x_i is added to an empty set than when it is added to a larger set. With (2.2), (A.24) can be obtained. Note that the distance between any pair of points from stage i and stage i−m−1, respectively, is at least (m+1)ω_1, so max_{1≤j,j′≤k} K(x_{i−m−1}^j, x_i^{j′}) must be less than ε ≜ σ_s^2 exp{−(m+1)^2 ω_1^2 / (2 ℓ_1^2)}. Hence, (A.25) can be obtained.

For (A.22), we have:

H(Z_{x_p} | Z_{x_A}) − H(Z_{x_p} | Z_{x_A}, Z_{x_i})
= (1/2) log (σ_{x_p|x_A}^2 / σ_{x_p|(x_A, x_i)}^2)    (A.26)
≤ (1/2) log ((σ_{x_p|(x_A, x_i)}^2 + ε^2/σ_{x_i}^2) / σ_{x_p|(x_A, x_i)}^2)    (A.27)
= (1/2) log{1 + ε^2/(σ_{x_p|(x_A, x_i)}^2 σ_{x_i}^2)}    (A.28)
≤ log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (A.29)

With (4.4), (A.26) can be obtained. Step (A.27) uses the result in (A.25). The result (A.29) then follows with Lemma 2.2.1.
If there are k robots, each vector x contains k points. Let x^{1:k} denote the k points of vector x. With the chain rule of entropy, we have:

H(Z_{x_p} | Z_{x_A}) − H(Z_{x_p} | Z_{x_A}, Z_{x_i})    (A.30)
= {H(Z_{x_p^1} | Z_{x_A}) − H(Z_{x_p^1} | Z_{x_A}, Z_{x_i})} + ... + {H(Z_{x_p^k} | Z_{x_p^{1:k−1}}, Z_{x_A}) − H(Z_{x_p^k} | Z_{x_p^{1:k−1}}, Z_{x_A}, Z_{x_i})}    (A.31)
≤ Σ_{j=1}^{k} {H(Z_{x_p^j} | Z_{x_A}) − H(Z_{x_p^j} | Z_{x_A}, Z_{x_i})}.    (A.32)

For each part of (A.31), due to submodularity, the variance reduction is larger when x_i is added to a smaller set; then (A.32) can be obtained. For (A.32), we have:

H(Z_{x_p^j} | Z_{x_A}) − H(Z_{x_p^j} | Z_{x_A}, Z_{x_i^{1:k}})
= H(Z_{x_p^j} | Z_{x_A}) − {H(Z_{x_p^j} | Z_{x_A}, Z_{x_i^1}) + H(Z_{x_i^{2:k}} | Z_{x_p^j}, Z_{x_A}, Z_{x_i^1}) − H(Z_{x_i^{2:k}} | Z_{x_A}, Z_{x_i^1})}    (A.33)
= H(Z_{x_p^j} | Z_{x_A}) − H(Z_{x_p^j} | Z_{x_A}, Z_{x_i^1}) + H(Z_{x_i^{2:k}} | Z_{x_A}, Z_{x_i^1}) − H(Z_{x_i^{2:k}} | Z_{x_p^j}, Z_{x_A}, Z_{x_i^1})    (A.34)
= H(Z_{x_p^j} | Z_{x_A}) − H(Z_{x_p^j} | Z_{x_A}, Z_{x_i^1}) + Σ_{j′=2}^{k} [H(Z_{x_i^{j′}} | Z_{x_A}, Z_{x_i^{1:j′−1}}) − H(Z_{x_i^{j′}} | Z_{x_p^j}, Z_{x_A}, Z_{x_i^{1:j′−1}})]    (A.35)
≤ k log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (A.36)

As a result, the entropy decrease H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}) − H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i}) can be bounded as follows:

H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}) − H(Z_{x_{i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i}) ≤ k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (A.37)

With a similar proof, if t ≥ m, then H(Z_{x_{i−t−1}} | Z_{x_{i−t:i−1}}) − H(Z_{x_{i−t−1}} | Z_{x_{i−t:i−1}}, Z_{x_i}) is also less than k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}. Therefore, Lemma 4.5.2 holds.
A.4 Proof for Corollary 4.5.3

With the chain rule of entropy, for H(Z_{x_{1:i−m−1}}, Z_{x_i} | Z_{x_{i−m:i−1}}), we have:

H(Z_{x_{1:i−m−1}}, Z_{x_i} | Z_{x_{i−m:i−1}}) = H(Z_{x_i} | Z_{x_{i−m:i−1}}) + H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i}).    (A.38)

And we also have:

H(Z_{x_{1:i−m−1}}, Z_{x_i} | Z_{x_{i−m:i−1}}) = H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}) + H(Z_{x_i} | Z_{x_{1:i−m−1}}, Z_{x_{i−m:i−1}}) = H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}) + H(Z_{x_i} | Z_{x_{1:i−1}}).    (A.39)

With (A.38) and (A.39), we get

H(Z_{x_i} | Z_{x_{i−m:i−1}}) − H(Z_{x_i} | Z_{x_{1:i−1}}) = H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}) − H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i}).    (A.40)

For (A.40), with the chain rule of entropy, we have:

H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}) − H(Z_{x_{1:i−m−1}} | Z_{x_{i−m:i−1}}, Z_{x_i})    (A.41)
= Σ_{t=1}^{i−m−1} [H(Z_{x_t} | Z_{x_{t+1:i−1}}) − H(Z_{x_t} | Z_{x_{t+1:i−1}}, Z_{x_i})]    (A.42)
≤ (i − m − 1) k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (A.43)

With Lemma 4.5.2, each term in (A.42) can be bounded; then inequality (A.43) can be obtained. Therefore, Corollary 4.5.3 holds.
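Identity (A.40) is the chain rule applied twice — both sides equal the conditional mutual information I(Z_{x_{1:i−m−1}}; Z_{x_i} | Z_{x_{i−m:i−1}}) — and can be verified numerically for any Gaussian; a sketch with our own helper names and an arbitrary SPD covariance:

```python
import numpy as np

def H(Sigma, idx):
    """Joint Gaussian entropy of Z_idx."""
    sub = Sigma[np.ix_(idx, idx)]
    return 0.5 * np.linalg.slogdet(2 * np.pi * np.e * sub)[1]

def cH(Sigma, a, b):
    """H(Z_a | Z_b) = H(Z_a, Z_b) - H(Z_b)."""
    return H(Sigma, a + b) - H(Sigma, b)

rng = np.random.default_rng(1)
A = rng.normal(size=(7, 7))
Sigma = A @ A.T + 7 * np.eye(7)                # a random SPD covariance

past, window, xi = [0, 1, 2], [3, 4, 5], [6]   # x_{1:i-m-1}, x_{i-m:i-1}, x_i
lhs = cH(Sigma, xi, window) - cH(Sigma, xi, past + window)
rhs = cH(Sigma, past, window) - cH(Sigma, past, window + xi)
assert abs(lhs - rhs) < 1e-9                   # identity (A.40)
```

Both differences are also nonnegative, since conditional mutual information is nonnegative — this is exactly why each Δ*_i and Δ^{me}_i in the proof of Theorem 4.5.4 is ≥ 0.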
A.5 Proof for Theorem 4.5.4

Let x_{1:n}^{me} be the optimal paths of the MEPP algorithm and x_{1:n}^* be the optimal paths of iMASP. According to Lemma 4.5.1, we have

H(Z_{x_{1:m}^{me}}) + Σ_{i=m+1}^{n} H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}) ≥ H(Z_{x_{1:m}^*}) + Σ_{i=m+1}^{n} H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*})    (A.44)

and let

θ = H(Z_{x_{1:m}^{me}}) + Σ_{i=m+1}^{n} H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}) − {H(Z_{x_{1:m}^*}) + Σ_{i=m+1}^{n} H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*})}.    (A.45)

From (A.44), we have θ ≥ 0. With the chain rule of entropy, the entropy decrease H(Z_{x_{1:n}^*}) − H(Z_{x_{1:n}^{me}}) can be rewritten as:

H(Z_{x_{1:n}^*}) − H(Z_{x_{1:n}^{me}}) = H(Z_{x_{1:m}^*}) + Σ_{i=m+1}^{n} H(Z_{x_i^*} | Z_{x_{1:i−1}^*}) − {H(Z_{x_{1:m}^{me}}) + Σ_{i=m+1}^{n} H(Z_{x_i^{me}} | Z_{x_{1:i−1}^{me}})}.    (A.46)

To apply θ to (A.46), we replace each H(Z_{x_i^*} | Z_{x_{1:i−1}^*}) with H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*}) and each H(Z_{x_i^{me}} | Z_{x_{1:i−1}^{me}}) with H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}). Let Δ_i^* = H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*}) − H(Z_{x_i^*} | Z_{x_{1:i−1}^*}) and Δ_i^{me} = H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}) − H(Z_{x_i^{me}} | Z_{x_{1:i−1}^{me}}), where i = m+1, ..., n. Then (A.46) can be rewritten as:
H(Z_{x_{1:n}^*}) − H(Z_{x_{1:n}^{me}})
= H(Z_{x_{1:m}^*}) + Σ_{i=m+1}^{n} [H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*}) − Δ_i^*] − {H(Z_{x_{1:m}^{me}}) + Σ_{i=m+1}^{n} [H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}) − Δ_i^{me}]}    (A.47)
= H(Z_{x_{1:m}^*}) + Σ_{i=m+1}^{n} H(Z_{x_i^*} | Z_{x_{i−m:i−1}^*}) − Σ_{i=m+1}^{n} Δ_i^* − [H(Z_{x_{1:m}^{me}}) + Σ_{i=m+1}^{n} H(Z_{x_i^{me}} | Z_{x_{i−m:i−1}^{me}}) − Σ_{i=m+1}^{n} Δ_i^{me}]    (A.48)
= Σ_{i=m+1}^{n} [Δ_i^{me} − Δ_i^*] − θ    (A.49)
≤ Σ_{i=m+1}^{n} [Δ_i^{me} − Δ_i^*]    (A.50)
≤ Σ_{i=m+1}^{n} Δ_i^{me}.    (A.51)

Substituting θ into (A.48) yields (A.49). With θ ≥ 0, (A.50) follows from (A.49). Because each Δ_i^* ≥ 0 (m+1 ≤ i ≤ n) in (A.50), (A.51) can be obtained. For (A.51), with Corollary 4.5.3, Δ_i^{me} ≤ (i − m − 1) k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}, where m+1 ≤ i ≤ n. Hence, for the entropy decrease H(Z_{x_{1:n}^*}) − H(Z_{x_{1:n}^{me}}), we have:

H(Z_{x_{1:n}^*}) − H(Z_{x_{1:n}^{me}}) ≤ (n − m)^2 k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (A.52)

Therefore, Theorem 4.5.4 holds.
Appendix B
Maximum Mutual Information Path Planning

B.1 Proof for Lemma 5.6.1
Given x_{1:2m}, (5.6) and (5.7) can be rewritten as follows:

V_{2m+1}^{mi}(x_{1:2m})
= max_{x_{2m+1} ∈ X_{2m+1}} I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} | Z_{x_{1:m}}) + V_{2m+2}^{mi}(x_{2:2m+1})    (B.1)
= max_{x_{2m+1} ∈ X_{2m+1}} {I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} | Z_{x_{1:m}}) + max_{x_{2m+2} ∈ X_{2m+2}} [I(Z_{x_{m+2}}; Z_{u_{2:2m+2}} | Z_{x_{2:m+1}}) + V_{2m+3}^{mi}(x_{3:2m+2})]}    (B.2)
= max_{x_{2m+1} ∈ X_{2m+1}, x_{2m+2} ∈ X_{2m+2}} {I(Z_{x_{m+1}}; Z_{u_{1:2m+1}} | Z_{x_{1:m}}) + I(Z_{x_{m+2}}; Z_{u_{2:2m+2}} | Z_{x_{2:m+1}}) + V_{2m+3}^{mi}(x_{3:2m+2})}    (B.3)
...
= max_{x_{2m+1} ∈ X_{2m+1}, ..., x_n ∈ X_n} {Σ_{i=2m+1}^{n−1} I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}})}.    (B.4)

As a result, given x_{1:2m}, we can obtain the vectors x_{2m+1}, ..., x_n that maximize Σ_{i=2m+1}^{n−1} I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}}). With (5.8), we have

x_{1:2m}^{mi} = arg max_{x_{1:2m} ∈ X_{1:2m}} I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + V_{2m+1}^{mi}(x_{1:2m})    (B.5)

where X_{1:2m} is the set of all possible x_{1:2m} over the first 2m columns. From (5.6) and (5.7), for each x_{1:2m}, we can compute the value V_{2m+1}^{mi}(x_{1:2m}). With (B.5), the paths x_{1:n} maximizing I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + Σ_{i=2m+1}^{n−1} I(Z_{x_{i−m}}; Z_{u_{i−2m:i}} | Z_{x_{i−2m:i−m−1}}) + I(Z_{x_{n−m:n}}; Z_{u_{n−2m:n}} | Z_{x_{n−2m:n−m−1}}) can be obtained. Therefore, Lemma 5.6.1 holds.
B.2 Proof for Other Lemmas

Before showing the proofs for Lemma 5.6.2 and Theorem 5.6.3, the following lemmas are needed.

Lemma B.2.1. Let ε ≜ σ_s^2 exp{−(m+1)^2 ω_1^2 / (2 ℓ_1^2)}. Given vector x_{i−m}, vector x_{i−2m:i−m−1} and vector u_{i−2m:i}, for any vector x_{i−2m−1}, we have:

H(Z_{x_{i−2m−1}} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:i}}) − H(Z_{x_{i−2m−1}} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:i}}, Z_{x_{i−m}}) ≤ k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (B.6)

With submodularity and Lemma 4.5.2, the result can easily be obtained. As a result, if t ≤ i−2m−1, then H(Z_{x_t} | Z_{x_{t+1:i−m−1}}, Z_{u_{i−2m:i}}) − H(Z_{x_t} | Z_{x_{t+1:i−m−1}}, Z_{u_{i−2m:i}}, Z_{x_{i−m}}) is also less than k^2 log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.

Corollary B.2.2. Let ε ≜ σ_s^2 exp{−(m+1)^2 ω_1^2 / (2 ℓ_1^2)}. Given vector x_{i−m}, vector x_{i−2m:i−m−1} and vector u_{i−2m:i}, for any vector u_{i−2m−1}, we have:

H(Z_{u_{i−2m−1}} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:i}}) − H(Z_{u_{i−2m−1}} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:i}}, Z_{x_{i−m}}) ≤ (r − k) k log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.    (B.7)

Because the number of rows is r and the number of sampling locations in each column is k, the number of unobserved locations in each column is r − k; hence, the size of vector u_{i−2m−1} is r − k. With a proof similar to that of Lemma 4.5.2, the result can be obtained. As a result, if t ≤ i−2m−1, then H(Z_{u_t} | Z_{x_{i−2m:i−m−1}}, Z_{u_{t+1:i}}) − H(Z_{u_t} | Z_{x_{i−2m:i−m−1}}, Z_{u_{t+1:i}}, Z_{x_{i−m}}) is less than (r − k) k log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}; and if t ≥ i+1, then H(Z_{u_t} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:t−1}}) − H(Z_{u_t} | Z_{x_{i−2m:i−m−1}}, Z_{u_{i−2m:t−1}}, Z_{x_{i−m}}) is also less than (r − k) k log{1 + ε^2/(σ_n^2 (σ_n^2 + σ_s^2))}.
Lemma B.2.3. Let \varepsilon \triangleq \sigma_s^2 \exp\{-(m+1)^2 / (2\ell_1^2)\}. Given vector x_{i-m}, vector x_{i-2m:i-m-1} and vector u_{i-2m:i}, we have

H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}})
        \le (n-2m-1)rk \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}          (B.8)
Let Z_{x_\Delta} = Z_{x_{1:i-m-1}} \setminus Z_{x_{i-2m:i-m-1}} and Z_{u_\Delta} = Z_{u_{1:n}} \setminus Z_{u_{i-2m:i}}. Then:

H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}})
= H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - [H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}})
        + H(Z_{x_\Delta}, Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}, Z_{x_{i-m}})
        - H(Z_{x_\Delta}, Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}})]                   (B.9)
= H(Z_{x_\Delta}, Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}})
        - H(Z_{x_\Delta}, Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}, Z_{x_{i-m}})       (B.10)
= [H(Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - H(Z_{u_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}, Z_{x_{i-m}})]
        + [H(Z_{x_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{1:n}}) - H(Z_{x_\Delta} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{1:n}}, Z_{x_{i-m}})]   (B.11)
= \sum_{t=1}^{i-2m-1} [H(Z_{u_t} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{t+1:i}}) - H(Z_{u_t} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{t+1:i}}, Z_{x_{i-m}})]
        + \sum_{t=i+1}^{n} [H(Z_{u_t} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{1:t-1}}) - H(Z_{u_t} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{1:t-1}}, Z_{x_{i-m}})]
        + \sum_{t=1}^{i-2m-1} [H(Z_{x_t} \mid Z_{x_{t+1:i-m-1}}, Z_{u_{1:n}}) - H(Z_{x_t} \mid Z_{x_{t+1:i-m-1}}, Z_{u_{1:n}}, Z_{x_{i-m}})]   (B.12)
\le [(n-2m-1)(r-k)k + (i-2m-1)k^2] \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.13)
\le (n-2m-1)[(r-k)k + k^2] \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.14)
= (n-2m-1)rk \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.15)

With the chain rule of entropy, (B.9), (B.11) and (B.12) can be obtained. With Lemma B.2.1 and Corollary B.2.2, each part in (B.12) can be bounded, which yields inequality (B.13). Inequality (B.14) holds because i \le n, and equality (B.15) follows from (r-k)k + k^2 = rk.
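Steps (B.9)-(B.11) rely on the chain rule of entropy and on the symmetry H(X | C) - H(X | C, D) = H(D | C) - H(D | C, X), both sides being I(X; D | C). This identity is easy to confirm numerically on a toy discrete distribution; the joint pmf below is an arbitrary choice for illustration, not anything from the text:

```python
import math
from itertools import product

# An arbitrary strictly positive joint pmf over three binary variables,
# indexed as (X, D, C); any such pmf works for this check.
raw = {(x, d, c): 1 + 2*x + 3*d + 5*c + x*d for x, d, c in product((0, 1), repeat=3)}
Z = sum(raw.values())
p = {a: v / Z for a, v in raw.items()}

def H(vars_idx, given_idx):
    """Conditional entropy H(vars | given) in nats, by direct summation
    over all outcomes (indices refer to positions X=0, D=1, C=2)."""
    total = 0.0
    for outcome, pr in p.items():
        joint = sum(q for o, q in p.items()
                    if all(o[i] == outcome[i] for i in vars_idx + given_idx))
        cond = sum(q for o, q in p.items()
                   if all(o[i] == outcome[i] for i in given_idx))
        total -= pr * math.log(joint / cond)
    return total

# Both sides of the symmetry equal I(X; D | C).
lhs = H((0,), (2,)) - H((0,), (1, 2))   # H(X|C) - H(X|C,D)
rhs = H((1,), (2,)) - H((1,), (0, 2))   # H(D|C) - H(D|C,X)
```

The same cancellation, applied to the joint variables (Z_{x_\Delta}, Z_{u_\Delta}) in place of D, is exactly what turns (B.9) into (B.10).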
B.3 Proof for Lemma 5.6.2
With the definition of mutual information, we have:

I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) = H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}})   (B.16)
I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) = H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}).   (B.17)

With (B.16) and (B.17), we have:

I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) - I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) = A_{i-m} - B_{i-m}   (B.18)

where

A_{i-m} = H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}, Z_{u_{i-2m:i}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}, Z_{u_{1:n}})   (B.19)
B_{i-m} = H(Z_{x_{i-m}} \mid Z_{x_{i-2m:i-m-1}}) - H(Z_{x_{i-m}} \mid Z_{x_{1:i-m-1}}).   (B.20)

With Lemma B.2.3, the value of A_{i-m} can be bounded by:

A_{i-m} \le (n-2m-1)rk \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.21)

With Corollary 4.5.3, the value of B_{i-m} can be bounded by:

B_{i-m} \le (i-2m-1)k^2 \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.22)

Therefore, Lemma 5.6.2 holds.
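The two bounds in Lemma 5.6.2 share the common factor log{1 + ε²/(σ_n²(σ_n² + σ_s²))} but differ in their combinatorial coefficients: (B.21) grows with the field size through (n-2m-1)rk, while (B.22) grows with the column index through (i-2m-1)k². Since i \le n-1 and k \le r, the coefficient in (B.22) never exceeds the one in (B.21). A small numeric sketch, with all parameter values chosen arbitrarily for illustration:

```python
import math

def log_term(eps, sigma_s2, sigma_n2):
    """The shared factor log{1 + eps^2 / (sigma_n^2 (sigma_n^2 + sigma_s^2))}."""
    return math.log(1 + eps**2 / (sigma_n2 * (sigma_n2 + sigma_s2)))

def A_bound(n, r, k, m, lt):
    """Right-hand side of (B.21)."""
    return (n - 2*m - 1) * r * k * lt

def B_bound(i, k, m, lt):
    """Right-hand side of (B.22)."""
    return (i - 2*m - 1) * k * k * lt
```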
B.4 Proof for Theorem 5.6.3
Let x^{mi}_{1:n} be the optimal paths of the M^2IPP algorithm and x_{1:n} be the optimal paths of the exhaustive algorithm. According to Lemma 5.6.1, we have:

I(Z_{x^{mi}_{1:m}}; Z_{u^{mi}_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^{mi}_{i-m}}; Z_{u^{mi}_{i-2m:i}} \mid Z_{x^{mi}_{i-2m:i-m-1}})
        + I(Z_{x^{mi}_{n-m:n}}; Z_{u^{mi}_{n-2m:n}} \mid Z_{x^{mi}_{n-2m:n-m-1}})
\ge I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}})
        + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}),   (B.23)

and let

\theta = I(Z_{x^{mi}_{1:m}}; Z_{u^{mi}_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^{mi}_{i-m}}; Z_{u^{mi}_{i-2m:i}} \mid Z_{x^{mi}_{i-2m:i-m-1}})
        + I(Z_{x^{mi}_{n-m:n}}; Z_{u^{mi}_{n-2m:n}} \mid Z_{x^{mi}_{n-2m:n-m-1}})
  - [I(Z_{x_{1:m}}; Z_{u_{1:2m}}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}})
        + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})].   (B.24)
From (B.23), we have θ ≥ 0.
The mutual information decrease I(Z_{x_{1:n}}; Z_{u_{1:n}}) - I(Z_{x^{mi}_{1:n}}; Z_{u^{mi}_{1:n}}) can be rewritten as:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) - I(Z_{x^{mi}_{1:n}}; Z_{u^{mi}_{1:n}})
= I(Z_{x_{1:m}}; Z_{u_{1:n}}) + \sum_{i=2m+1}^{n-1} I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}})
        + I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}})
  - [I(Z_{x^{mi}_{1:m}}; Z_{u^{mi}_{1:n}}) + \sum_{i=2m+1}^{n-1} I(Z_{x^{mi}_{i-m}}; Z_{u^{mi}_{1:n}} \mid Z_{x^{mi}_{1:i-m-1}})
        + I(Z_{x^{mi}_{n-m:n}}; Z_{u^{mi}_{1:n}} \mid Z_{x^{mi}_{1:n-m-1}})].   (B.25)

To apply \theta to (B.25), we have to replace I(Z_{x_{1:m}}; Z_{u_{1:n}}) with I(Z_{x_{1:m}}; Z_{u_{1:2m}}), each I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) with I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}) for 2m+1 \le i \le n-1, and I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}}) with I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}).
With the definition of mutual information, we have:

I(Z_{x_{1:m}}; Z_{u_{1:n}}) = H(Z_{x_{1:m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:n}})   (B.26)
I(Z_{x_{1:m}}; Z_{u_{1:2m}}) = H(Z_{x_{1:m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:2m}}).   (B.27)

With the chain rule of entropy, we have:

H(Z_{x_{1:m}} \mid Z_{u_{1:2m}}) - H(Z_{x_{1:m}} \mid Z_{u_{1:n}})
= H(Z_{x_1} \mid Z_{u_{1:2m}}) - H(Z_{x_1} \mid Z_{u_{1:n}}) + \ldots
        + H(Z_{x_m} \mid Z_{x_{1:m-1}}, Z_{u_{1:2m}}) - H(Z_{x_m} \mid Z_{x_{1:m-1}}, Z_{u_{1:n}})   (B.28)
= \sum_{t=1}^{m} [H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:2m}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:n}})]   (B.29)
\le m(n-2m)(r-k)k \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.30)

By a proof similar to that of Lemma B.2.3, each part in (B.29) can be bounded; then inequality (B.30) can be obtained.
Applying (B.30) to (B.26) and (B.27), we have:

I(Z_{x_{1:m}}; Z_{u_{1:n}}) - I(Z_{x_{1:m}}; Z_{u_{1:2m}}) = A_{1:m}   (B.31)

where

A_{1:m} \le m(n-2m)(r-k)k \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.32)

With Lemma 5.6.2, each I(Z_{x_{i-m}}; Z_{u_{1:n}} \mid Z_{x_{1:i-m-1}}) can be replaced with I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}}), where 2m+1 \le i \le n-1.
With the definition of mutual information, we have:

I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}}) = H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}, Z_{u_{1:n}})   (B.33)
I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) = H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}, Z_{u_{n-2m:n}}).   (B.34)

With the chain rule of entropy, we have:

H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}})
= \sum_{t=n-m}^{n} [H(Z_{x_t} \mid Z_{x_{n-2m:t-1}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}})]   (B.35)
\le (m+1)(n-2m-1)k^2 \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.36)

With Corollary 4.5.3, inequality (B.36) can be obtained.
With the chain rule of entropy, we have:

H(Z_{x_{n-m:n}} \mid Z_{x_{n-2m:n-m-1}}, Z_{u_{n-2m:n}}) - H(Z_{x_{n-m:n}} \mid Z_{x_{1:n-m-1}}, Z_{u_{1:n}})
= \sum_{t=n-m}^{n} [H(Z_{x_t} \mid Z_{x_{n-2m:t-1}}, Z_{u_{n-2m:n}}) - H(Z_{x_t} \mid Z_{x_{1:t-1}}, Z_{u_{1:n}})]   (B.37)
\le (m+1)(n-2m-1)rk \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.38)

By a proof similar to that of Lemma B.2.3, inequality (B.38) can be obtained.
Applying (B.36) and (B.38) to (B.33) and (B.34), we have:

I(Z_{x_{n-m:n}}; Z_{u_{1:n}} \mid Z_{x_{1:n-m-1}}) - I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}}) = A_{n-m:n} - B_{n-m:n}   (B.39)

where

A_{n-m:n} \le (m+1)(n-2m-1)rk \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.40)
B_{n-m:n} \le (m+1)(n-2m-1)k^2 \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.41)
With the above results, (B.25) can be rewritten as:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) - I(Z_{x^{mi}_{1:n}}; Z_{u^{mi}_{1:n}})
= A_{1:m} + I(Z_{x_{1:m}}; Z_{u_{1:2m}})
        + \sum_{i=2m+1}^{n-1} [A_{i-m} - B_{i-m} + I(Z_{x_{i-m}}; Z_{u_{i-2m:i}} \mid Z_{x_{i-2m:i-m-1}})]
        + [A_{n-m:n} - B_{n-m:n} + I(Z_{x_{n-m:n}}; Z_{u_{n-2m:n}} \mid Z_{x_{n-2m:n-m-1}})]
  - \{A^{mi}_{1:m} + I(Z_{x^{mi}_{1:m}}; Z_{u^{mi}_{1:2m}})
        + \sum_{i=2m+1}^{n-1} [A^{mi}_{i-m} - B^{mi}_{i-m} + I(Z_{x^{mi}_{i-m}}; Z_{u^{mi}_{i-2m:i}} \mid Z_{x^{mi}_{i-2m:i-m-1}})]
        + [A^{mi}_{n-m:n} - B^{mi}_{n-m:n} + I(Z_{x^{mi}_{n-m:n}}; Z_{u^{mi}_{n-2m:n}} \mid Z_{x^{mi}_{n-2m:n-m-1}})]\}   (B.42)
= A_{1:m} + \sum_{i=2m+1}^{n-1} (A_{i-m} - B_{i-m}) + (A_{n-m:n} - B_{n-m:n})
  - [A^{mi}_{1:m} + \sum_{i=2m+1}^{n-1} (A^{mi}_{i-m} - B^{mi}_{i-m}) + (A^{mi}_{n-m:n} - B^{mi}_{n-m:n})] - \theta   (B.43)
\le A_{1:m} + \sum_{i=2m+1}^{n-1} A_{i-m} + A_{n-m:n} + \sum_{i=2m+1}^{n-1} B^{mi}_{i-m} + B^{mi}_{n-m:n}
  - [A^{mi}_{1:m} + \sum_{i=2m+1}^{n-1} A^{mi}_{i-m} + A^{mi}_{n-m:n} + \sum_{i=2m+1}^{n-1} B_{i-m} + B_{n-m:n}]   (B.44)
\le A_{1:m} + \sum_{i=2m+1}^{n-1} A_{i-m} + A_{n-m:n} + \sum_{i=2m+1}^{n-1} B^{mi}_{i-m} + B^{mi}_{n-m:n}   (B.45)
\le [m(n-2m)(r-k)k + (n-2m-1+m+1)(n-2m-1)rk + \frac{1}{2}(n-2m-1)(n-2m-2)k^2
        + (m+1)(n-2m-1)k^2] \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.46)
\le [m(n-2m)(r-k)k + (n-m)(n-2m)rk + \frac{1}{2}(n-2m)(n-2m-2)k^2
        + (m+1)(n-2m)k^2] \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.47)
= [m(n-2m)(r-k)k + (n-m)(n-2m)rk + \frac{1}{2}(n-2m)(n-2m)k^2
        + m(n-2m)k^2] \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.48)
= [mr + (n-m)r + \frac{1}{2}(n-2m)k](n-2m)k \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}   (B.49)
= [nr + \frac{1}{2}(n-2m)k](n-2m)k \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.   (B.50)
With results (B.31), (B.39) and Lemma 5.6.2, (B.42) can be obtained. Applying \theta in (B.42) yields (B.43). Since \theta \ge 0, inequality (B.44) can be obtained. With results (B.32), (B.40), (B.41) and Lemma 5.6.2, inequalities (B.45) and (B.46) can be obtained; in particular, (B.45) follows by dropping the bracketed sum, whose terms are all nonnegative. The remaining steps (B.47)-(B.50) follow from elementary algebra.
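The purely algebraic steps (B.47)-(B.50) can also be checked mechanically. The sketch below doubles each bracket to clear the 1/2 factor, so all arithmetic stays in exact integers, and compares the four expressions over a grid of values of n, r, k, m; the function names are ad hoc, chosen only for this check:

```python
def two_B47(n, r, k, m):
    """Twice the bracket of (B.47)."""
    c = n - 2 * m
    return 2*m*c*(r - k)*k + 2*(n - m)*c*r*k + c*(c - 2)*k*k + 2*(m + 1)*c*k*k

def two_B48(n, r, k, m):
    """Twice the bracket of (B.48)."""
    c = n - 2 * m
    return 2*m*c*(r - k)*k + 2*(n - m)*c*r*k + c*c*k*k + 2*m*c*k*k

def two_B49(n, r, k, m):
    """Twice the bracket of (B.49), already factored as [...](n-2m)k."""
    c = n - 2 * m
    return (2*m*r + 2*(n - m)*r + c*k) * c * k

def two_B50(n, r, k, m):
    """Twice the final bracket of (B.50): [2nr + (n-2m)k](n-2m)k."""
    c = n - 2 * m
    return (2*n*r + c*k) * c * k
```

Agreement of the four functions at enough integer points confirms that the bracket manipulations from (B.47) to (B.50) are identities, not further relaxations of the bound.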
Then, the value I(Z_{x_{1:n}}; Z_{u_{1:n}}) - I(Z_{x^{mi}_{1:n}}; Z_{u^{mi}_{1:n}}) can be bounded by:

I(Z_{x_{1:n}}; Z_{u_{1:n}}) - I(Z_{x^{mi}_{1:n}}; Z_{u^{mi}_{1:n}})
        \le [nr + \frac{1}{2}(n-2m)k](n-2m)k \log\{1 + \varepsilon^2 / (\sigma_n^2(\sigma_n^2 + \sigma_s^2))\}.

Therefore, Theorem 5.6.3 holds.
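To get a feel for the guarantee, the bound can be evaluated numerically. The sketch below assumes the definition \varepsilon = \sigma_s^2 \exp\{-(m+1)^2/(2\ell_1^2)\} used throughout this appendix, with \ell_1 the length-scale along the robots' direction of motion; the function name and the example parameter values are illustrative only:

```python
import math

def mi_gap_bound(n, r, k, m, sigma_s2, sigma_n2, ell1):
    """Right-hand side of the bound in Theorem 5.6.3:
    [nr + (n-2m)k/2](n-2m)k * log{1 + eps^2 / (sigma_n^2 (sigma_n^2 + sigma_s^2))}
    with eps = sigma_s^2 * exp(-(m+1)^2 / (2 ell1^2))."""
    eps = sigma_s2 * math.exp(-(m + 1) ** 2 / (2 * ell1 ** 2))
    c = n - 2 * m
    return (n * r + 0.5 * c * k) * c * k * math.log(
        1 + eps ** 2 / (sigma_n2 * (sigma_n2 + sigma_s2)))
```

Because \varepsilon decays as \exp\{-(m+1)^2/(2\ell_1^2)\} and the factor (n-2m)k also shrinks, the bound drops quickly as the window size m grows, which matches the trade-off discussed in Chapter 5: a larger m tightens the guarantee at the price of a longer planning time.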
Bibliography
[Austerlitz et al., 2007] F. Austerlitz, C. Dutech, P. E. Smouse, F. Davis, and V. L. Sork. Estimating anisotropic pollen dispersal: a case study in Quercus lobata. Heredity, 99:193–204, 2007.
[Batalin et al., 2004] M. A. Batalin, M. Rahimi, Y. Yu, D. Liu, A. Kansal, G. S. Sukhatme,
W. J. Kaiser, M. Hansen, G. J. Pottie, M. Srivastava, and D. Estrin. Call and response:
Experiments in sampling the environment. In Proc. SenSys, pages 25–38, 2004.
[Binney et al., 2010] J. Binney, A. Krause, and G. S. Sukhatme. Informative path planning
for an autonomous underwater vehicle. In Proc. ICRA, pages 4791–4796, 2010.
[Boisvert and Deutsch, 2011] J. B. Boisvert and C. V. Deutsch. Modeling locally varying
anisotropy of CO2 emissions in the United States. Stoch. Environ. Res. Risk Assess.,
25:1077–1084, 2011.
[Budrikaitė and Dučinskas, 2005] L. Budrikaitė and K. Dučinskas. Modelling of geometric anisotropic spatial variation. In Proc. 10th International Conference on Mathematical Modelling and Analysis, pages 361–366, 2005.
[Das and Kempe, 2008a] A. Das and D. Kempe. Algorithms for subset selection in linear
regression. In Proc. STOC, pages 45–54, 2008.
[Das and Kempe, 2008b] A. Das and D. Kempe. Sensor selection for minimizing worst-case
prediction error. In Proc. IPSN, pages 97–108, 2008.
[Franklin and Mills, 2007] R. B. Franklin and A. L. Mills. Statistical analysis of spatial structure in microbial communities. In R. B. Franklin and A. L. Mills, editors, The Spatial Distribution of Microbes in the Environment, pages 31–60. Springer, 2007.
[Garnett et al., 2010] R. Garnett, M. A. Osborne, and S. J. Roberts. Bayesian optimization
for sensor set selection. In Proc. IPSN, pages 209–219, 2010.
[Guestrin et al., 2005] C. Guestrin, A. Krause, and A. Singh. Near-optimal sensor placements in Gaussian processes. In Proc. ICML, pages 265–272, August 2005.
[Hosoda and Kawamura, 2005] K. Hosoda and H. Kawamura. Seasonal variation of space/time statistics of short-term sea surface temperature variability in the Kuroshio region. J. Oceanography, 61(4):709–720, 2005.
[Ko et al., 1995] C. Ko, J. Lee, and M. Queyranne. An exact algorithm for maximum entropy sampling. Operations Research, 43:684–691, 1995.
[Korf, 1990] R. E. Korf. Real-time heuristic search. Artificial Intelligence, 42(2–3):189–211, 1990.
[Krause et al., 2006] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal
sensor placements: maximizing information while minimizing communication cost. In
Proc. IPSN, pages 2–10, 2006.
[Low et al., 2007] K. H. Low, G. Gordon, J. M. Dolan, and P. Khosla. Adaptive sampling
for multi-robot wide-area exploration. In Proc. ICRA, pages 755–760, 2007.
[Low et al., 2008] K. H. Low, J. M. Dolan, and P. Khosla. Adaptive multi-robot wide-area
exploration and mapping. In Proc. AAMAS, pages 23–30, 2008.
[Low et al., 2009] K. H. Low, J. M. Dolan, and P. Khosla. Information-theoretic approach
to efficient adaptive path planning for mobile robotic environmental sensing. In Proc.
ICAPS, September 2009.
[Low et al., 2011] K. H. Low, J. M. Dolan, and P. Khosla. Active Markov information-theoretic path planning for robotic environmental sensing. In Proc. AAMAS, May 2011.
[Low, 2009] K. H. Low. Multi-Robot Adaptive Exploration and Mapping for Environmental
Sensing Applications. Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA, 2009.
[Lynch and McGillicuddy Jr., 2001] D. R. Lynch and D. J. McGillicuddy Jr. Objective
analysis for coastal regimes. Continental Shelf Research, 21:1299–1315, 2001.
[McBratney et al., 1981] A. B. McBratney, R. Webster, and T. M. Burgess. The design of
optimal sampling schemes for local estimation and mapping of regionalized variables – I:
Theory and method. Computers & Geosciences, 7(4):331–334, 1981.
[McGrath et al., 2004] D. McGrath, C. Zhang, and O. T. Carton. Geostatistical analyses
and hazard assessment on soil lead in Silvermines area, Ireland. Environmental Pollution,
127:239–248, 2004.
[Meliou et al., 2007] A. Meliou, A. Krause, C. Guestrin, and J. M. Hellerstein. Nonmyopic
informative path planning in spatio-temporal models. In Proc. AAAI, pages 602–607,
2007.
[Popa et al., 2006] D. O. Popa, M. F. Mysorewala, and F. L. Lewis. EKF-based adaptive
sampling with mobile robotic sensor nodes. In Proc. IROS, pages 2451–2456, 2006.
[Prudhomme and Reed, 1999] C. Prudhomme and D. W. Reed. Mapping extreme rainfall
in a mountainous region using geostatistical techniques: A case study in Scotland. Int.
J. Climatol., 19:1337–1356, 1999.
[Rabesiranana et al., 2009] N. Rabesiranana, M. Rasolonirina, A. F. Solonjara, and R. Andriambololona. Investigating the spatial anisotropy of soil radioactivity in the region of Vinaninkarena, Antsirabe - Madagascar. In Proc. 4th High-Energy Physics International Conference, 2009.
[Rahimi et al., 2003] M. Rahimi, R. Pon, W. J. Kaiser, G. S. Sukhatme, D. Estrin, and
M. Srivastava. Adaptive sampling for environmental robotics. In Proc. ICRA, pages
3537–3544, 2003.
[Rahimi et al., 2005] M. Rahimi, M. Hansen, W. J. Kaiser, G. S. Sukhatme, and D. Estrin.
Adaptive sampling for environmental field estimation using robotic sensors. In Proc.
IROS, pages 3692–3698, 2005.
[Rasmussen and Williams, 2006] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.
[Rivest et al., 2012] M. Rivest, D. Marcotte, and P. Pasquier. Sparse data integration for
the interpolation of concentration measurements using kriging in natural coordinates. J.
Hydrology, 416-417:72–82, 2012.
[Rudnick et al., 2004] D. L. Rudnick, R. E. Davis, C. C. Eriksen, D. Fratantoni, and M. J. Perry. Underwater gliders for ocean research. Mar. Technol. Soc. J., 38(2):73–84, 2004.
[Samal et al., 2011] A. R. Samal, R. R. Sengupta, and R. H. Fifarek. Modelling spatial anisotropy of gold concentration data using GIS-based interpolated maps and variogram analysis: Implications for structural control of mineralization. J. Earth Syst. Sci.,
120(4):583–593, 2011.
[Sánchez et al., 2011] J. M. C. Sánchez, D. F. Greene, and M. Quesada. A field test of inverse modeling of seed dispersal. Amer. J. Botany, 98(4):698–703, 2011.
[Shewry and Wynn, 1987] M. C. Shewry and H. P. Wynn. Maximum entropy sampling.
Journal of Applied Statistics, 14:165–170, 1987.
[Singh et al., 2006] A. Singh, R. Nowak, and P. Ramanathan. Active learning for adaptive
mobile sensing networks. In Proc. IPSN, pages 60–68, 2006.
[Singh et al., 2007] A. Singh, A. Krause, C. Guestrin, W. Kaiser, and M. Batalin. Efficient
planning of informative paths for multiple robots. In Proc. IJCAI, January 2007.
[Singh et al., 2009] A. Singh, A. Krause, and W. J. Kaiser. Nonmyopic adaptive informative
path planning for multiple robots. In Proc. IJCAI, pages 1843–1850, 2009.
[Ståhl et al., 2000] G. Ståhl, A. Ringvall, and T. Lämås. Guided transect sampling for assessing sparse populations. Forest Science, 46:108–115, 2000.
[Thompson and Wettergreen, 2008] D. R. Thompson and D. Wettergreen. Intelligent maps
for autonomous kilometer-scale science survey. In Proc. iSAIRAS, February 2008.
[Wackernagel, 2009] H. Wackernagel. Geostatistics for Gaussian processes. In Proc. NIPS
Workshop on Kernels for Multiple Outputs and Multi-Task Learning: Frequentist and
Bayesian Points of View, 2009.
[Ward and Jasieniuk, 2009] S. M. Ward and M. Jasieniuk. Review: Sampling weedy and
invasive plant populations for genetic diversity analysis. Weed Science, 57(6):593–602,
2009.
[Webster and Oliver, 2007] R. Webster and M. Oliver. Geostatistics for Environmental
Scientists. John Wiley and Sons, Inc., 2007.
[Wu et al., 2005] J. Wu, C. Zheng, and C. C. Chien. Cost-effective sampling network design
for contaminant plume monitoring under general hydrogeological conditions. Journal of
Contaminant Hydrology, 77:41–65, 2005.
[Xiao et al., 2004] X. Xiao, G. Gertner, G. Wang, and A. B. Anderson. Optimal sampling scheme for estimation landscape mapping of vegetation cover. Landscape Ecology,
20(4):375–387, 2004.
[Zhang and Sukhatme, 2007] B. Zhang and G. S. Sukhatme. Adaptive sampling for estimating a scalar field using a robotic boat and a sensor network. In Proc. ICRA, pages
3673–3680, 2007.
[Zhang et al., 2011] J. G. Zhang, H. S. Chen, Y. R. Su, X. L. Kong, W. Zhang, Y. Shi, H. B. Liang, and G. M. Shen. Spatial variability and patterns of surface soil moisture in a field plot of karst area in southwest China. Plant Soil Environ., 57(9):409–417, 2011.
[...]... multiple robots, a large sampling task can be completed easily and fast 18 Chapter 4 Maximum Entropy Path Planning In this chapter, we propose the MEPP (Maximum Entropy Path Planning) algorithm, which can find the paths with maximum entropy Before presenting our own work, we introduce the information- theoretic Multi- Robot Adaptive Sampling Problem (iMASP) Although the optimal paths can be theoretically... a path for single robot For a small sampling task, single robot is easy to coordinate and deploy However, it will be difficult for single robot to accomplish a large sampling task Instead, our work like those in [Singh et al., 2007; Low et al., 2007; Low et al., 2008; Low et al., 2009; Singh et al., 2009; Binney et al., 2010; Low et al., 2011] can generate multiple paths for multiple robots With multiple... Entropy Path Planning (MEPP) algorithm: A polynomialtime approximation algorithm, MEPP, is proposed to find the maximum entropy paths We also provide a theoretical performance guarantee on the sampling performance of the MEPP algorithm for a class of exploration tasks called transect sampling task • Formalization of Maximum Mutual Information Path Planning (M2 IPP) algorithm: For maximum mutual information. .. the robots will perform the transect sampling task So the travelling cost of each robot is the horizontal length of the field And the action space for each robot is limited Multiple robots will be applied to explore the field We assume that the number of robots will be less than the number of sampling locations in each column Our proposed algorithms will find the paths with maximum entropy and the paths... 
joint entropy of the optimal paths of the MEPP algorithm is close to the optimal paths of iMASP The following theorem bounds the entropy decrease between the 24 Chapter 4 Maximum Entropy Path Planning ∗ optimal paths xme 1:n of the MEPP algorithm and the optimal paths x1:n of iMASP: ∗ Theorem 4.5.4 Let xme 1:n be the optimal paths of the MEPP algorithm and x1:n be the optimal paths of iMASP Let ε 2 },... chapter, we propose another approximation algorithm, M2 IPP (Maximum Mutual Information Path Planning) , to find the paths with maximum mutual information Similar to maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time complexity will exponentially increase with the length of planning horizon In the previous chapter, we have proposed the MEPP algorithm with... above Secondly, these work did not consider the computational efficiency of planning In robotics community, the work of [Low et al., 2009] has defined the information- theoretic Multi- Robot Adaptive Sampling Problem (iMASP) However, for any environmental field, the time complexity of iMASP exponentially increases with the length of planning horizon To reduce the time complexity, the work of [Low et al.,... non-myopic algorithm, the MEPP, to find the paths with maximum entropy 2.3.2 Mutual Information Another metric, mutual information, is also proposed to measure the informativeness of observation paths Given observation paths P and unobserved part X \P, the mutual information between ZP and ZX \P is: I(ZP ; ZX \P ) = H(ZX \P ) − H(ZX \P |ZP ) (2.11) Based on the mutual information, the problem can be formalized... 
is the set of all possible paths in the field With (2.4) and (2.7), the mutual information for paths P can be evaluated in closed form In this thesis, we will present another non-myopic algorithm, the M2 IPP, to find the paths with maximum mutual information 14 Chapter 3 Related Work To monitor an environmental field, the robots need to sample locations which can give more information about the measurement... of robots, (d) using a large value of m In particular, for anisotropic fields, if the robots along the small correlated direction, the value of ε will be small As a result, we can use a small m which incur little planning time to bound the sampling performance 25 Chapter 5 Maximum Mutual Information Path Planning In this chapter, we propose another approximation algorithm, M2 IPP (Maximum Mutual Information ... al., 2011] can generate multiple paths for multiple robots With multiple robots, a large sampling task can be completed easily and fast 18 Chapter Maximum Entropy Path Planning In this chapter,... MEPP (Maximum Entropy Path Planning) algorithm, which can find the paths with maximum entropy Before presenting our own work, we introduce the information- theoretic Multi- Robot Adaptive Sampling... Mutual Information Path Planning) , to find the paths with maximum mutual information Similar to maximum entropy path planning, if we use the exhaustive algorithm to find the optimal paths, the time