Location Updates—Balancing Update Cost

CHAPTER 11 Modeling and Querying Current Movement

11.4 Location Updates—Balancing Update Cost

11.4.1 Background

The motion of spatial objects over time requires that updates of their current position and speed be transmitted to the database, both to provide up-to-date information for retrieval and query tasks and to keep the inherent imprecision in the database bounded. The main issue is when and how often these position updates should be made. Frequent updating may be expensive in terms of cost and performance overhead; infrequent updates result in outdated answers to position queries. Consequently, the location of a moving object is inherently imprecise, since the object location stored in the database, which we will call the database location, cannot always be identical to the actual location of the object. This holds regardless of the policy employed to update the database location of an object. Several location update policies may be applicable (e.g., the database location is updated every n clock ticks). In this section, we introduce so-called dead-reckoning policies. These update the database whenever the distance between the actual location of a moving object and its database location exceeds a given threshold th, say, 100 meters. Thus, this threshold determines and bounds the location imprecision. For a moving object m, a query “What is the current location of m?” will then be answered by the DBMS with “The current location of m is (x, y) with a deviation of at most 100 meters.” The remaining issue is how to determine the update threshold th in dead-reckoning policies.
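The dead-reckoning trigger described above can be sketched in a few lines. The function names `deviation` and `needs_update`, the planar Euclidean distance, and the use of meters are assumptions made for illustration only.

```python
import math

def deviation(actual, database):
    """Euclidean distance between the actual and the database location."""
    return math.hypot(actual[0] - database[0], actual[1] - database[1])

def needs_update(actual, database, th=100.0):
    """True when the deviation exceeds the threshold th (assumed in meters)."""
    return deviation(actual, database) > th

# Hypothetical positions; the database location is (50, 0).
print(needs_update((130.0, 40.0), (50.0, 0.0)))  # deviation is about 89.4 m
print(needs_update((150.0, 60.0), (50.0, 0.0)))  # deviation is about 116.6 m
```

With th = 100 meters, only the second position would trigger a location update.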

The feature of imprecision leads us to two related but different concepts: deviation and uncertainty. The deviation of a moving object m at a specific instant t is the distance between m's actual location at time t and its database location at time t. In our example, the deviation is the distance between the actual location of m and (x, y). The uncertainty of m at an instant t is the size of the area comprising all possible current positions of m. In our example, the uncertainty is the area of a circle with a radius of 100 meters. Both deviation and uncertainty incur a cost in terms of incorrect decision making. The deviation (uncertainty) cost is proportional to the size of the deviation (uncertainty). We will see that the ratio between the costs of an uncertainty unit and a deviation unit depends on the interpretation of an answer.

To be able to update the database location of a moving object, we need an appropriate localization mechanism. In moving objects applications, each moving object is usually equipped with a global positioning system (GPS) and can thus generate and transmit the updates by using a wireless network. This introduces a third cost factor: communication or transmission cost. Furthermore, we can recognize an obvious trade-off between communication and imprecision in the sense that the higher the communication cost, the lower the imprecision, and vice versa.

This leads to the issue of an information cost model in moving objects databases that balances imprecision and update cost. The model should also be able to cope with the situation where a moving object becomes disconnected and cannot send location updates.

11.4.2 The Information Cost of a Trip

The first issue we deal with relates to an information cost model for a trip taken by a moving object. We have seen that during a trip a moving object causes a deviation cost and an uncertainty cost, which both can be regarded as a penalty due to incorrect decision making. Moreover, a moving object causes update cost, since location update messages have to be sent to the database.

For a moving object the deviation cost depends both on the size of the deviation and the duration for which it lasts. The size of the deviation affects the decision-making process. The higher the deviation, the more difficult and imprecise it is to make a reliable decision based on the moving object's current position.

To see that the duration for which the deviation persists plays a role in calculating the cost, we assume that there is one query per time unit that retrieves the location of a moving object. If the deviation lasts for n time units, its cost will be n times the cost of the deviation lasting for a single time unit, because all n queries instead of only one have to pay the deviation penalty. Formally, for a given moving object, the cost of the deviation between a starting time $t_1$ and an ending time $t_2$ can be described by the deviation cost function $\mathrm{COST}_d(t_1, t_2)$ yielding a nonnegative number. Assuming that the penalty for each unit of deviation during a time unit is weighted by the constant 1, the deviation cost function can be defined as:

$$\mathrm{COST}_d(t_1, t_2) = \int_{t_1}^{t_2} d(t)\,dt$$

where d(t) describes the deviation as a function of time. We also denote this function as a uniform deviation cost function. It is the basis for all later descriptions of update policies. Of course, other deviation cost functions are conceivable. An example is the step deviation cost function. This function yields a penalty of 0 for each time unit in which the deviation falls below a given threshold th, and it yields a penalty of 1 otherwise.
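Both deviation cost functions can be approximated numerically. The following is a minimal sketch, assuming a midpoint-rule integration and a hypothetical deviation function d(t) = 2t; the function names and the `steps` parameter are illustrative and not taken from the text.

```python
def uniform_deviation_cost(d, t1, t2, steps=10_000):
    """Approximate the integral of d(t) over [t1, t2] with the midpoint rule."""
    h = (t2 - t1) / steps
    return sum(d(t1 + (i + 0.5) * h) for i in range(steps)) * h

def step_deviation_cost(d, t1, t2, th, steps=10_000):
    """Penalty 0 while d(t) is below th and 1 otherwise, integrated over [t1, t2]."""
    h = (t2 - t1) / steps
    return sum(1.0 for i in range(steps) if d(t1 + (i + 0.5) * h) >= th) * h

# Hypothetical deviation that grows by 2 distance units per time unit.
d = lambda t: 2.0 * t
print(uniform_deviation_cost(d, 0.0, 10.0))       # close to 100
print(step_deviation_cost(d, 0.0, 10.0, th=5.0))  # close to 7.5
```

The uniform cost grows with the size of the deviation, whereas the step cost only counts how long the deviation stays at or above th.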

The update cost $C_1$ covers the effort for transmitting a single location update message from a moving object to the database. It is difficult to determine it precisely, because it can differ from one moving object to another, or even vary for a single moving object during a trip (e.g., due to changing availability of resources such as bandwidth or computation). Of course, we have to measure the update cost in the same kind of unit as the deviation cost. The ratio between the update cost and the cost of a deviation unit per time unit is equal to $C_1$, since the latter cost factor is assumed to be 1. Put differently, reducing the deviation by 1 during a time unit is worth the cost of $1/C_1$ messages.

The uncertainty cost depends on the size of uncertainty and on the duration for which it lasts. A higher degree of uncertainty conveys less reliable information for answering a query. Formally, for a given moving object, the cost of the uncertainty between a starting time $t_1$ and an ending time $t_2$ can be described by the uncertainty cost function $\mathrm{COST}_u(t_1, t_2)$ yielding a nonnegative number. Let the uncertainty unit cost $C_2$ be the penalty for each uncertainty unit during a time unit. This implies that $C_2$ is defined as the ratio between the cost of an uncertainty unit and the cost of a deviation unit, since the latter cost is assumed to be equal to 1. Then, the uncertainty cost function $\mathrm{COST}_u(t_1, t_2)$ can be defined as:

$$\mathrm{COST}_u(t_1, t_2) = \int_{t_1}^{t_2} C_2\, u(t)\,dt$$

where u(t) is the value of the loc.uncertainty attribute (see Section 11.2.3) of the moving object as a function of time. We can now exert influence on the weighting and thus the importance of the uncertainty factor and the deviation factor. If for answering the query, “The current location of the moving object m is (x, y) with a deviation of at most u units,” the uncertainty aspect is to be stressed, $C_2$ should be set higher than 1, and lower than 1 otherwise. In a dead-reckoning update policy, each update message to the database determines a new uncertainty, which is not necessarily lower than the previous one. Therefore, an increase of communication reduces the deviation but not necessarily the uncertainty.

We are now in the position to define the information cost of a trip taken by a moving object. Let $t_1$ and $t_2$ be the times of two consecutive location update messages. Then, the information cost in the half-open interval $[t_1, t_2[$ is:

$$\mathrm{COST}_I([t_1, t_2[) = C_1 + \mathrm{COST}_d([t_1, t_2[) + \mathrm{COST}_u([t_1, t_2[)$$

The result contains the message cost at time $t_1$ but not at time $t_2$. Since each location update message writes the actual current position of the moving object in the database, the deviation is reduced to 0. The total information cost is calculated by summing up all $\mathrm{COST}_I([t_1, t_2[)$ values for every pair of consecutive update instants $t_1$ and $t_2$. Formally, let $t_1, t_2, \ldots, t_n$ be the instants of all update messages sent from a moving object. Let 0 be the time point when the trip started and $t_{n+1}$ be the time point when the trip ended. Then, the total information cost of a trip is:

$$\mathrm{COST}_I([0, t_{n+1}[) = \mathrm{COST}_d([0, t_1[) + \mathrm{COST}_u([0, t_1[) + \sum_{i=1}^{n} \mathrm{COST}_I([t_i, t_{i+1}[)$$
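The interval cost and the total trip cost can be evaluated numerically as follows. This is a sketch under stated assumptions: the helper names, the midpoint-rule integration, and the sample trip (updates at times 4 and 8, a deviation that resets to 0 at each update, a constant uncertainty of 4) are all hypothetical.

```python
def integrate(f, t1, t2, steps=2_000):
    """Midpoint-rule approximation of the integral of f over [t1, t2]."""
    h = (t2 - t1) / steps
    return sum(f(t1 + (i + 0.5) * h) for i in range(steps)) * h

def interval_cost(d, u, t1, t2, c1, c2, with_message=True):
    """COST_I([t1, t2[): message cost at t1 plus deviation and uncertainty cost."""
    cost_d = integrate(d, t1, t2)
    cost_u = c2 * integrate(u, t1, t2)
    return (c1 if with_message else 0.0) + cost_d + cost_u

def trip_cost(d, u, updates, t_end, c1, c2):
    """Total cost over [0, t_end[; no message cost is charged for [0, t_1[."""
    bounds = [0.0] + list(updates) + [t_end]
    total = 0.0
    for k in range(len(bounds) - 1):
        total += interval_cost(d, u, bounds[k], bounds[k + 1], c1, c2,
                               with_message=(k > 0))
    return total

# Hypothetical trip: deviation grows by 1 per time unit and resets at updates.
total = trip_cost(lambda t: t % 4.0, lambda t: 4.0, [4.0, 8.0], 10.0,
                  c1=5.0, c2=0.5)
print(total)  # approximately 48 = 18 (deviation) + 20 (uncertainty) + 10 (messages)
```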

11.4.3 Cost-Based Optimization for Dead-Reckoning Policies

Next, we consider the issue of information cost optimality. We know that the essential feature of all dead-reckoning update policies consists of the existence of a threshold th at any instant. This threshold is checked against the distance between the location of a moving object m and its database location. Therefore, both the DBMS and the moving object must have knowledge of th. When the deviation of m exceeds th, m sends a location update message to the database. This message contains the current location, the predicted speed, and the new deviation threshold K. The goal of dead-reckoning policies is to set K, which is stored by the DBMS in the loc.uncertainty subattribute, such that the total information cost is minimized.

The general strategy is the following: First, m predicts the future behavior and direction of the deviation. This prediction is used as a basis for computing the average cost per time unit between now and the next update as a function f of the new threshold K. Then, K is set to minimize f. The proposed method of optimizing K is not unique. The optimization is related to the average cost per time unit and not to the total cost between the two instants $t_1$ and $t_2$, because the total cost increases as the time interval until the next update increases. For the case that the deviation between two consecutive updates is described by a linear function of time, we can determine the optimal value K for loc.uncertainty.

Let $C_1$ denote the update cost and $C_2$ denote the uncertainty unit cost. We assume that $t_1$ and $t_2$ are the instants of two consecutive location updates, that the deviation d(t) between $t_1$ and $t_2$ is given by the linear function $a(t - t_1)$ with $t_1 \le t \le t_2$ and a positive constant a, and that loc.uncertainty is fixed at K between $t_1$ and $t_2$. The statement is then that the total information cost per time unit between $t_1$ and $t_2$ is minimal if $K = \sqrt{2aC_1 / (2C_2 + 1)}$. This can be shown as follows: We take the formula for computing the information cost in an interval $[t_1, t_2[$ and insert our assumptions. We obtain:

$$\begin{aligned}
\mathrm{COST}_I([t_1, t_2[) &= C_1 + \int_{t_1}^{t_2} a(t - t_1)\,dt + \int_{t_1}^{t_2} C_2 K\,dt \\
&= C_1 + 0.5\,a\,(t_2 - t_1)^2 + C_2 K (t_2 - t_1)
\end{aligned}$$

Let $f(t_2) = \mathrm{COST}_I([t_1, t_2[) / (t_2 - t_1)$ denote the average information cost per time unit between $t_1$ and $t_2$ for update time $t_2$. We know that $t_1$ and $t_2$ are two consecutive update times. Therefore, at $t_2$ the deviation exceeds the threshold loc.uncertainty so that $K = a(t_2 - t_1)$. We can now replace $t_2$ in $f(t_2)$ by $K/a + t_1$ and obtain $f(K) = aC_1/K + (0.5 + C_2)K$. Setting the derivative $f'(K) = -aC_1/K^2 + 0.5 + C_2$ to 0 shows that the minimum of f(K) is at $K = \sqrt{2aC_1 / (2C_2 + 1)}$.

What is the interpretation of this result? Assume that m is currently at instant $t_1$. This means that its deviation has exceeded the loc.uncertainty threshold. Therefore, m needs to compute a new value for loc.uncertainty and transmit it to the database. Further assume that m predicts a linear behavior of the deviation. Then, loc.uncertainty has to be assigned a value that will remain fixed until the next update. To minimize the information cost, the recommendation then is that m should set the threshold to $K = \sqrt{2aC_1 / (2C_2 + 1)}$.
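The closed-form threshold can be checked against the average cost function f(K) = aC1/K + (0.5 + C2)K. The sketch below uses hypothetical values for a, C1, and C2; the function names are illustrative.

```python
import math

def avg_cost(k, a, c1, c2):
    """f(K) = a*C1/K + (0.5 + C2)*K for a linearly growing deviation."""
    return a * c1 / k + (0.5 + c2) * k

def optimal_threshold(a, c1, c2):
    """Closed-form minimizer K = sqrt(2*a*C1 / (2*C2 + 1))."""
    return math.sqrt(2.0 * a * c1 / (2.0 * c2 + 1.0))

# Hypothetical parameters: slope 1, update cost 8, uncertainty unit cost 1.5.
a, c1, c2 = 1.0, 8.0, 1.5
k_star = optimal_threshold(a, c1, c2)
print(k_star, avg_cost(k_star, a, c1, c2))  # K = 2 with average cost 8
for k in (0.5 * k_star, 0.9 * k_star, 1.1 * k_star, 2.0 * k_star):
    print(k, avg_cost(k, a, c1, c2))        # every value is larger than f(K*)
```

A smaller K triggers updates too often (the aC1/K term dominates), while a larger K lets deviation and uncertainty accumulate.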

Finally, we try to detect disconnection of a moving object from the database, since a disconnected moving object cannot send location updates. In this case, we are interested in a dead-reckoning policy in which the loc.uncertainty threshold continuously decreases between updates. As an example of decrease, we consider a threshold loc.uncertainty decreasing fractionally and starting with a constant K. This means that during the first time unit after the location update u, the value of the threshold is K; during the second time unit after u the value is K/2; and during the i-th time unit after u the value is K/i, until the next update, which determines a new K. Assuming a linear behavior of the deviation with slope a, the deviation a·t meets the threshold K/t after $\sqrt{K/a}$ time units, and the total information cost per time unit between $t_1$ and $t_2$ is given by the function

$$f(K) = \frac{C_1 + 0.5\,K + C_2 K \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{\sqrt{K/a}}\right)}{\sqrt{K/a}}$$
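The average cost for the fractionally decreasing threshold can be evaluated directly. This is a minimal sketch, assuming that the number of terms in the harmonic sum is the interval length sqrt(K/a) truncated to an integer; the function name and the sample parameters are hypothetical.

```python
import math

def avg_cost_decreasing(k, a, c1, c2):
    """Average cost per time unit when the threshold is K/i in the i-th time unit."""
    n = math.sqrt(k / a)  # time units until the deviation a*t meets K/t
    # Assumption: the harmonic sum runs over the whole time units in the interval.
    harmonic = sum(1.0 / i for i in range(1, int(n) + 1))
    return (c1 + 0.5 * k + c2 * k * harmonic) / n

# Hypothetical parameters: slope 1, K = 16, so the next update is due after 4 units.
value = avg_cost_decreasing(16.0, a=1.0, c1=8.0, c2=0.5)
print(value)  # (8 + 8 + 0.5 * 16 * 25/12) / 4, about 8.17
```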

11.4.4 Dead-Reckoning Location Update Policies

A location update policy is a position update prescription or strategy for a moving object that determines when the moving object propagates its actual position to the database and what the update values are. We discuss here a few dead-reckoning location update policies that set the deviation bound (i.e., the threshold th) stored in the subattribute loc.uncertainty in a way so that the total information cost is minimized.

The first strategy is called the speed dead-reckoning (sdr) policy. At the beginning of a trip, the moving object m fixes an uncertainty threshold in an ad hoc manner and transmits it to the database into the loc.uncertainty subattribute. The threshold remains unchanged for the duration of the whole trip, and m updates the database whenever the deviation exceeds loc.uncertainty. The update information includes the current location and the current speed. A slightly more flexible variation of this concept is to take another kind of speed (e.g., the average speed since the last update, the average speed since the beginning of the trip, or a speed that is predicted based on terrain knowledge).
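The sdr policy can be illustrated with a small simulation. The sketch below rests on several assumptions that are not in the text: motion is one-dimensional along a route, time is discrete, and the database location is extrapolated from the last reported position and speed. The function name and the speed profile are hypothetical.

```python
def simulate_sdr(speeds, th):
    """Return the time units at which an sdr update is sent.

    speeds: actual speed of the object during each time unit (1-D route).
    th: fixed deviation threshold for the whole trip.
    """
    actual = 0.0
    db_pos, db_speed, db_time = 0.0, speeds[0], 0  # state known to the database
    updates = []
    for t, v in enumerate(speeds, start=1):
        actual += v                                   # position at end of unit t
        predicted = db_pos + db_speed * (t - db_time)  # database location
        if abs(actual - predicted) > th:
            # Update message: current location and current speed.
            db_pos, db_speed, db_time = actual, v, t
            updates.append(t)
    return updates

# Hypothetical trip: the object slows down after three time units, then speeds up.
print(simulate_sdr([10, 10, 10, 4, 4, 4, 4, 10, 10], th=10))  # [5, 9]
```

Updates are sent only when the reported speed stops describing the actual motion well enough, which is the core idea of dead reckoning.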

The adaptive dead-reckoning (adr) policy starts like the sdr policy, with an initial deviation threshold $th_1$ selected arbitrarily and sent to the database by m at the beginning of the trip. Then, m tracks the deviation and sends an update message to the database when the deviation exceeds $th_1$. The update consists of the current speed, the current location, and a new threshold $th_2$ stored in the loc.uncertainty attribute. The threshold $th_2$ is computed as follows: Let us assume that $t_1$ denotes the number of time units from the beginning of the trip until the deviation exceeds $th_1$ for the first time and that $I_1$ is the deviation cost (according to the formula in Section 11.4.2) during that interval. Let us assume further $a_1 = 2I_1 / t_1^2$. Then, $th_2 = \sqrt{2a_1C_1 / (2C_2 + 1)}$, where $C_1$ is the update cost and $C_2$ is the uncertainty unit cost. When the deviation reaches $th_2$, a similar update is sent. This time the threshold is $th_3 = \sqrt{2a_2C_1 / (2C_2 + 1)}$, where $a_2 = 2I_2 / t_2^2$, $I_2$ is the deviation cost from the first update to the second update, and $t_2$ is the number of time units elapsed since the first location update. That is, a difference between $a_1$ and $a_2$ results in a difference between $th_2$ and $th_3$. Further thresholds $th_i$ are computed in a similar way.
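The adr threshold computation amounts to estimating a slope from the observed deviation cost and inserting it into the closed-form optimum. A minimal sketch, with a hypothetical function name and sample values:

```python
import math

def adr_next_threshold(dev_cost, elapsed, c1, c2):
    """Next adr threshold from the deviation cost I_i observed over t_i time units."""
    slope = 2.0 * dev_cost / elapsed ** 2  # a_i = 2 * I_i / t_i^2
    return math.sqrt(2.0 * slope * c1 / (2.0 * c2 + 1.0))

# Hypothetical observation: deviation cost 50 accumulated over 10 time units,
# which corresponds to a slope of 1.
th2 = adr_next_threshold(dev_cost=50.0, elapsed=10.0, c1=8.0, c2=1.5)
print(th2)  # 2.0

# A steeper deviation in the next interval yields a larger threshold.
th3 = adr_next_threshold(dev_cost=50.0, elapsed=5.0, c1=8.0, c2=1.5)
print(th3)  # 4.0
```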

The main difference between the sdr policy and the adr policy is that the first policy pursues an ad hoc strategy for determining a threshold, while the latter policy is cost based. At each update instant $p_i$, the adr policy optimizes the information cost per time unit and assumes that the deviation following instant $p_i$ will behave according to the linear function $d(t) = 2tI_i / t_i^2$, where t is the number of time units after $p_i$, $t_i$ is the number of time units between the preceding update and the current one at time $p_i$, and $I_i$ is the deviation cost during the same time interval. This prediction of the future deviation can be explained as follows: adr approximates the current deviation from the time of the preceding update to time $p_i$ by a linear function with slope $2I_i / t_i^2$. At time $p_i$ this linear function has the same deviation cost (i.e., $I_i$) as the actual current deviation. Due to the locality principle, adr predicts that the deviation after the update at time $p_i$ will behave according to the same approximation function.

The last strategy we discuss is the disconnection detection dead-reckoning (dtdr) policy. This policy addresses the problem that updates may be missing not because the deviation stays below the uncertainty threshold, but because the moving object m is disconnected. At the beginning of the trip, m sends an initial, arbitrary deviation threshold $th_1$ to the database. The uncertainty threshold loc.uncertainty is set to a fractionally decreasing value starting with $th_1$ for the first time unit. During the second time unit, the uncertainty threshold is $th_1/2$ and so on. Then, m starts tracking the deviation. At time $t_1$, when the deviation reaches the current uncertainty threshold (i.e., $th_1/t_1$), m sends a location update message to the database. The update comprises the current speed, the current location, and a new threshold $th_2$ to be stored in the loc.uncertainty subattribute.

For computing $th_2$, we use the function

$$f(K) = \frac{C_1 + 0.5\,K + C_2 K \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{\sqrt{K/a_1}}\right)}{\sqrt{K/a_1}}$$

(see Section 11.4.3). Since f(K) uses the slope factor a of the future deviation, we first estimate this deviation. Let $I_1$ be the cost of the deviation since the beginning of the trip, and let $a_1 = 2I_1 / t_1^2$. The formula for f(K) does not have a closed form. Therefore, we approximate the sum $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{\sqrt{K/a_1}}$ by $1 + \ln\sqrt{K/a_1}$, since ln(n) is an approximation of the n-th harmonic number. Thus, the approximation function of f(K) is

$$g(K) = \frac{C_1 + 0.5\,K + C_2 K \left(1 + \ln\sqrt{K/a_1}\right)}{\sqrt{K/a_1}}$$

The derivative of g(K) is 0 when K is the solution of the equation $\ln(K) = d_1/K - d_2$ with $d_1 = 2C_1/C_2$ and $d_2 = 1/C_2 + 4 - \ln(a_1)$. By using the well-known Newton-Raphson method, we can find a numerical solution to this equation. The solution leads to the new threshold $th_2$, and m sets the uncertainty threshold loc.uncertainty to a fractionally decreasing value starting with $th_2$.
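The equation ln(K) = d1/K − d2 can be solved with a few Newton-Raphson iterations. The sketch below is one possible implementation under stated assumptions: the function names, the tolerance, the starting-point rule, and the sample parameters are hypothetical. It writes the equation as h(K) = ln(K) − d1/K + d2 = 0; because h is increasing and concave for K > 0, starting to the left of the root keeps every iterate positive.

```python
import math

def g(k, a1, c1, c2):
    """Approximate average cost: harmonic sum replaced by 1 + ln(sqrt(K/a1))."""
    s = math.sqrt(k / a1)
    return (c1 + 0.5 * k + c2 * k * (1.0 + math.log(s))) / s

def dtdr_threshold(a1, c1, c2, tol=1e-12, max_iter=100):
    """Solve ln(K) = d1/K - d2 for K with the Newton-Raphson method."""
    d1 = 2.0 * c1 / c2
    d2 = 1.0 / c2 + 4.0 - math.log(a1)
    h = lambda k: math.log(k) - d1 / k + d2
    dh = lambda k: 1.0 / k + d1 / k ** 2
    k = 1.0
    while h(k) > 0.0:  # move to the left of the root before iterating
        k /= 2.0
    for _ in range(max_iter):
        step = h(k) / dh(k)
        k -= step
        if abs(step) < tol * max(1.0, k):
            break
    return k

# Hypothetical parameters: slope 1, update cost 8, uncertainty unit cost 0.5.
k_new = dtdr_threshold(a1=1.0, c1=8.0, c2=0.5)
print(k_new)  # between 4 and 4.5
print(g(k_new, 1.0, 8.0, 0.5), g(0.9 * k_new, 1.0, 8.0, 0.5), g(1.1 * k_new, 1.0, 8.0, 0.5))
```

The last line compares g at the computed threshold with g at nearby values, which gives a quick sanity check that the root is a minimum of the approximate cost.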

After $t_2$ time units, the deviation exceeds the current uncertainty threshold, which is equal to $th_2/t_2$, and a location update containing $th_3$ is transmitted. The value $th_3$ is computed as previously but with a new slope $a_2 = 2I_2 / t_2^2$, where $I_2$ is the deviation cost during the previous $t_2$ time units. This process continues until the end of the trip: at each update instant, it determines the next optimal threshold by incorporating the constants $C_1$ and $C_2$ and the slope $a_i$ of the current deviation approximation function.

An interesting question now is which of the three discussed dead-reckoning location update policies causes the lowest information cost. This has been empirically investigated in a simulation test bed used to compare the information cost of the three policies on the assumption that the uncertainty threshold is arbitrary and fixed. The result of this comparison is that the adr policy is superior to the other policies, that is, it has the lowest information cost. Its information cost may even be up to six times lower than that of the sdr policy.

