Caching Management of Mobile DBMS

DOCUMENT INFORMATION

Basic information

Title: Caching Management of Mobile DBMS
Authors: Jenq-Foung Yao, Margaret H. Dunham
Institution: Georgia College & State University
Department: Mathematics and Computer Science
Type: thesis
City: Milledgeville
Pages: 36
Size: 486.5 KB

Content

Caching Management of Mobile DBMS

Jenq-Foung Yao
Department of Mathematics and Computer Science
Georgia College & State University
Milledgeville, GA 31061
Email: jfyao@mail.gcsu.edu
Phone: (912) 445-1626  Fax: (912) 445-2602

Margaret H. Dunham
Department of Computer Science and Engineering
Southern Methodist University
Dallas, TX 75275
Email: mhd@seas.smu.edu
Phone: (214) 768-3087  Fax: (214) 768-3085

Abstract

Unlike a traditional client-server network, a mobile computing environment has very limited bandwidth on a wireless link. Thus, one design goal of caching management in a mobile computing environment is to reduce the use of wireless links; this is the primary objective of this research. Quota data and private data mechanisms are used in our design so that an MU user is able to query and update data from the local DBMS without cache coherence problems. The effect of the two mechanisms is to increase the hit ratio. An agent on an MU, along with a program on a base station, handles the caching management, including prefetching/hoarding, cache use, cache replacement, and cache-miss handling. The simulation results clearly indicate that our approaches improve on the previous research.

Keywords: Caching, Mobile Computing, Mobile DBMS, Mobile Unit (MU), Database, Agent, User Profile, Validation Report (VR). The local DBMS contains cache data on an MU.

INTRODUCTION

For the past ten years, personal computer technology has been progressing at an astonishing rate. The size of a PC is becoming smaller, and the capacity of software and hardware functionality is increasing. Simultaneously, the technologies of cellular communications, satellite services, and wireless LANs are rapidly expanding. These state-of-the-art PC and wireless technologies have brought about a new breed of technology called mobile computing (MC). Several mobile computing examples have been discussed in [7] and [1]. Most people acknowledge that the mobile environment is an expansion of distributed
systems. Mobile units (MUs) and the interfacing devices (base stations that may interact with MUs) are added to the existing distributed systems (see Figure 1). This is a client/server network setting: the servers would be on fixed hosts or base stations, and the clients could be fixed hosts or mobile units. The mobile units are frequently disconnected for some periods because of the expensive wireless connection, bandwidth competition, and limited battery power. To allow users to access resources at all times, no matter which mode they are in, many research issues need to be dealt with. Data caching/replication on an MU is one of the important methods that can help resolve this problem.

PREVIOUS WORKS

In previous research, caching management has been handled at two different levels: the file system level and the DBMS level. Issues at the file system level have been addressed widely [11] [22] [12] [21] [17]. Some of the file-system-level approaches have produced real systems in daily use, such as Coda [22]. These research efforts at the file system level have some shortcomings. The major one is that all of them explicitly exclude a DBMS. In addition, they use optimistic replication control, which allows WRITE operations on different partitions (locations). Committing data in a timely fashion is not important in these systems: data are allowed to have several different versions on the different partitions, which are later integrated (and committed). In the academic environment of these approaches, users rarely write to the same file at the same time.

Most of the previous works on mobile DBMSs [1] [2] [13] [3] [25] concentrated on the time window "w" and the size of the Invalidation Report (IR). These studies reduced wireless link usage to a certain degree. However, they assumed read-only operation on the local cache, which is just one of the possibilities for using cache data on an MU. They also uplinked to the fixed network for the cache-miss data, which likewise is just one of the cache-miss handling possibilities.

Only a few researchers at the DBMS level have dealt with the update issue on MUs. Chan, Si, and Leong proposed a mobile caching mechanism based on an object-oriented paradigm [5]. Conceptually their approach is based on an idea similar to ours: cache the frequently accessed database items in MUs to improve the performance of database queries and the availability of database data items for query processing during disconnection. This is a concept called hot spot [2], which states that frequently accessed data are likely to be accessed again in the future. The other research that dealt with WRITE operations is in [24]. In that work, virtual resources are pre-allocated on an MU so that the MU has its own potential share of data. The research is based on a trucking distribution system where each truck is pre-assigned an amount of load; when a truck has actually loaded goods, it then reports the actual load to the database server. Only aggregate quantity data have been dealt with. The approach is very similar to research proposed by O'Neil [19], who proposed an escrow transactional method that pre-allocates some fixed portion of an aggregate quantity data item to a transaction prior to its commit. When the time comes to commit the transaction, there ought to be enough value of this data item available due to the pre-allocation. The whole mechanism of the approach in [19] takes place in a centralized DBMS.

OBJECTIVES FOR THE RESEARCH

The objectives of this research are to provide solutions for unaddressed issues in the DBMS area. These issues include how to handle WRITE operations on the local cache, different techniques to handle cache misses, how to deal with cache coherence, etc. We fully address these issues in the next two sections. Performance evaluations based on simulation results are then discussed in section six.

MODEL OF CACHING MANAGEMENT

A client-agent-server architecture is used in our model. The rationale is that we would like to build a mobile DBMS on top of existing DBMSs. The agent is an interface between a database server and a mobile client. All the functionality added to the existing DBMSs is built into the agent; that is, among other things, the agent handles all cache prefetching, cache update, cache-miss handling, and cache coherence. The agent comprises an MU agent on an MU and a VR handler on a base station. Data prefetching can be done either through a wired link or through a wireless link; however, the wired link is always preferred whenever the situation allows. The cache is stored on the local disk of an MU and organized in the format of relations. These relations are deemed part of the local RDBMS and can be queried via the local RDBMS.

4.1 Model Assumptions

Assumptions in our research are as follows:
- Only issues at the DBMS level are dealt with in this research.
- The environment is a typical client-server network. The server is on a fixed network; the client could be a mobile unit or a fixed host. We mainly deal with issues on a mobile unit.
- A user is able to use the MU for an extended time frame; therefore, data can be cached there for long periods.
- An MU has an independent local DBMS. This local DBMS and the database server on the fixed network both support the relational model. SQL on the MU can query both the private RDBMS and the RDBMS on the database server; an MU agent serves as a query interface among them.
- All the MUs are portable laptop computers, assumed to be as powerful as their desktop counterparts.
- Data prefetching can be done either through a wired link or through a wireless link; the wired link is always preferred whenever the situation allows. In addition, mass prefetching is preferable during low network traffic, such as overnight (see Section 4.5).
- The cache will be stored on the local disk
of an MU and organized in the format of relations; these relations are deemed part of the local RDBMS on the MU and can be queried via it.
- We assume that downlink and uplink channels have the same bandwidth capacity, for evaluation purposes.
- The granularity of the fetched data is a portion of a relation.

4.2 Caching Granularity

The caching granularity is one of the key factors in caching management systems. Most mobile systems at the operating system level use a file or a group of files (a cluster or replica) as the caching granularity. Using a file as the granularity is not an appropriate choice. At the DBMS level, the caching granularities are usually an attribute [5], an object [5], a page [4], a tuple [9], or a semantic region [6]. Attribute caching and tuple caching create undesirable overheads due to the large number of independent cached attributes/tuples. On the other hand, object caching, page caching, and semantic caching reduce these overheads by collecting data into groups of tuples. The static grouping in page caching and object caching lacks flexibility compared to attribute/tuple caching. Semantic caching provides flexibility, permitting dynamic grouping according to the present queries. However, how semantic caching can be implemented in detail is still unclear; for example, who is in charge of the cache update [6]?

We propose a portion of a relation as the caching granularity. This portion contains a group of tuples extracted from the original relation with the query operators SELECTION (σ) and PROJECTION (Π). We also preserve the primary key's attributes in a cache relation; therefore, our approach is not exactly like an updateable snapshot. We call our approach "Morsel Caching" because we cache a portion of a base relation as a cache relation, and we call this caching granularity a "cache relation." The cache relation is defined in Definition 2. One may view a cache relation as a special case of an updateable snapshot.

Definition 1. Let a database D = {Ri} be a set of base relations. For every base relation Ri, let Ai stand for the set of its attributes, and let Aik be the set of attributes included in the primary key of Ri. A user morsel, UM, is a tuple ⟨UA, Up, UC⟩, where UC = Π_UA(σ_Up(Rj)); Rj is one of the relations in D; UA ⊆ Aj and Ajk ⊆ UA; and Up = P1 ∨ P2 ∨ P3 ∨ … ∨ Pn, where each Pj is a conjunction of simple predicates, i.e., Pj = bj1 ∧ bj2 ∧ bj3 ∧ … ∧ bjl, with each bjt a simple predicate.

Definition 2. A Cache Relation, CR, is the smallest relation which contains a set of UC's from Definition 1, all associated with the same base relation.

The reason we do not include a join when forming a cache relation is to make cache-miss handling and update synchronization much easier. Note that we use a cache relation as the caching granularity in prefetching and in cache replacement. The cache update granularity, however, is only a subset of attributes of a tuple, owing to the fact that the update takes place via a wireless link whose bandwidth is limited. This differs from other approaches, which use the same granularity for prefetching, cache replacement, and cache update. Thus, our approach is much more flexible, in that the granularity is not fixed but dynamic for different occasions.

4.3 User Profile

Several previous approaches use a mechanism called a "hoarding profile." The hoarding profile lets users choose their preferred data to cache. This is the most direct and effective way to cache data that users need. The drawback is that it needs human involvement, and people may not know what they will really use; hence, only sophisticated users are able to provide the most effective cache data. A classic example is Coda [22], which uses "hoard profile" command scripts to update the "hoard database." Data are fetched to the cache based on the hoard database. Another example is Thor [10], which uses an object-oriented
query language to describe the hoarding profile; its applications are similar to Coda's.

We adapt this concept as part of our caching mechanism. In our model, data are fetched to the cache based on the data access frequency and a user hint. A user hint contains the projected needs of a user as specified in a user profile, and the relations that are presently used. The user profile is created prior to the very first prefetching. We let a user create his user profile by running a simple program which prompts the user to enter the information. This information includes the name of a cache relation, the name of the related base relation, the attributes of the primary key, the attributes of the cache relation, and a set of criteria that will be used to create the cache relation. The program then organizes the entered information in the format shown in Figure 2. A user profile is an input to another program (see Algorithm Extract-Cache-Relation); the output of that program is a set of cache relations. Hence, combining a user profile with the program extracts a portion of the base relations into cache relations. Alternatively, the DBA could perform these tasks for users. Should a user lack a user profile for whatever reason, the cache relations would be the base relations on the database server; that is, if a user chooses not to have a user profile, the cache relations would be the same as the base relations.

Algorithm Extract-Cache-Relation:

    Extract-Cache-Relation(user-profile) {
        /* This algorithm extracts data from the base relation
           and inserts it into the cache relation */
        For each entry of the user profile {
            if (cache_flag = 0) then {   /* the corresponding cache relation does not exist */
                create a cache relation;
                set cache_flag = 1;      /* mark the cache relation as cached */
            }
            For (each attribute j that is not in the cache relation) {
                add a column for attribute j to the cache relation;
            }
            extract data from the base relation and insert the data into the cache relation;
        }
    }
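The Extract-Cache-Relation step can be sketched in a few lines of Python, with SQLite standing in for both the database server and the MU's local DBMS. The relation, attribute names, and the profile-entry layout used here are illustrative assumptions, not the paper's; the sketch only shows how a profile entry drives a SELECTION/PROJECTION that preserves the primary key and lands in the local cache.

```python
import sqlite3

# Hypothetical base relation on the "server" side.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)")
server.executemany("INSERT INTO employee VALUES (?,?,?,?)",
                   [(1, "Ann", "CS", 50.0), (2, "Bob", "Math", 60.0), (3, "Cal", "CS", 70.0)])

def extract_cache_relation(server, entry):
    """Build one cache relation from a user-profile entry.

    entry: (cache_name, base_name, key_attrs, attrs, predicate) -- a guess
    at the profile format (the cache_flag parameter is left to the caller).
    """
    cache_name, base_name, key_attrs, attrs, pred = entry
    # Preserve the primary key's attributes, as Definition 1 requires.
    cols = list(dict.fromkeys(key_attrs + attrs))
    sql = f"SELECT {', '.join(cols)} FROM {base_name} WHERE {pred}"
    return cols, server.execute(sql).fetchall()

# Profile entry: cache the names of CS employees (plus the key emp_id).
cols, rows = extract_cache_relation(
    server, ("cs_emp", "employee", ["emp_id"], ["name"], "dept = 'CS'"))

local = sqlite3.connect(":memory:")  # the MU's local DBMS
local.execute(f"CREATE TABLE cs_emp ({', '.join(cols)})")
local.executemany(f"INSERT INTO cs_emp VALUES ({', '.join('?' * len(cols))})", rows)
print(local.execute("SELECT * FROM cs_emp").fetchall())  # [(1, 'Ann'), (3, 'Cal')]
```

The cache relation holds only the projected attributes and selected tuples, yet remains an ordinary relation that local SQL can query.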
Which base relations will be fetched in the prefetching stage is based on the relation access frequency. Cache relations are fetched onto an MU based on the user profile during the prefetching stage. A cache relation created with the user profile has a different name from the base relation; it is up to the user to name a cache relation in the user profile. If a cache relation is not created with a user profile, then the cache relation shares the same name with the original relation; in this case, a cache relation is like a replication of the base relation. The program (Extract-Cache-Relation) will be run against the user profile at the very first prefetching. This is a one-time deal.

Once the user profile has been created and used to extract cache relations, it can also be used to assist in handling cache misses. When a cache miss occurs, the MU agent can look up the user profile and trace back to the base relation. If there is no entry for the cache relation, the MU agent needs to create a new entry for the missing relation based on the cache-miss query. If the cache-miss query involves a join, each relation involved in the join will be an entry that needs to be created; that is, if three relations are involved in a join, three entries will be created. The new entry is created in a temporary user profile. The MU agent then runs the Extract-Cache-Relation program against the temporary user profile. Once this has been done, the temporary user profile is appended to the user profile. In the future, if the user decides to cache more relations, the procedure is similar to the cache-miss case. This user profile is kept on the MU that the user is using; from time to time, it is backed up to the database server.

Each entry could be used to create a user morsel (see Definition 1 in Section 4.2). There are six parameters for each entry. The first parameter is the cache flag bit: the ON bit (1) means the cache relation exists, and the OFF bit (0) means the cache relation does not exist. Initially, all the cache flags are set to the OFF (0) bit. The second parameter is the name of the base relation from which the data will be extracted. The third parameter is a name for the new cache relation; the name is the user's choice. The fourth parameter is the set of attributes of the primary key from the base relation. The fifth parameter is the set of attributes from the base relation that will be included in the cache relation. The last parameter is a set of criteria used to select a user morsel; these criteria are in the same format as the user morsel's criteria (Up) defined in Definition 1 in Section 4.2.

To extract and insert data into the new cache relation with SQL, the MU agent first submits an INSERT operation to the server to insert the data into a temporary base relation with the same name as the cache relation. Note that the data are SELECTed from the corresponding permanent relations and inserted into the temporary relation after it has been CREATEd. This relation is then moved to the local DBMS as a cache relation. Sometimes an existing cache relation needs to be modified, such as by adding columns for new attributes. This happens when a cache relation needs to be extended, as when another entry of the user profile attempts to create the same cache relation. We only allow one cache relation to be extracted from a base relation; thus, creating another cache relation from the same base relation is not allowed. The solution is to add the new attributes in this entry of the user profile to the existing cache relation. To add a column for a new attribute to an existing relation, use the ALTER command in SQL; after adding the new attributes, use the UPDATE command in SQL to fill in values for the new attributes of the cache relation.
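The ALTER-then-UPDATE extension step might look like the following sketch, again with SQLite standing in for the MU's local DBMS; the relation and attribute names are invented for illustration.

```python
import sqlite3

# Existing cache relation on the MU: the key emp_id plus the name attribute.
local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE cs_emp (emp_id INTEGER PRIMARY KEY, name TEXT)")
local.executemany("INSERT INTO cs_emp VALUES (?, ?)", [(1, "Ann"), (3, "Cal")])

# A later profile entry wants the salary attribute from the same base
# relation.  Instead of a second cache relation, extend the existing one:
local.execute("ALTER TABLE cs_emp ADD COLUMN salary REAL")

# Values for the new column, as fetched from the base relation (stubbed here).
fetched = {1: 50.0, 3: 70.0}   # emp_id -> salary
for emp_id, salary in fetched.items():
    # UPDATE fills in the new attribute, matching tuples by primary key.
    local.execute("UPDATE cs_emp SET salary = ? WHERE emp_id = ?", (salary, emp_id))

print(local.execute("SELECT * FROM cs_emp ORDER BY emp_id").fetchall())
# [(1, 'Ann', 50.0), (3, 'Cal', 70.0)]
```

Matching on the preserved primary key is what makes the UPDATE step safe: every cached tuple can be traced back to exactly one base tuple.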
Note that one base relation can only produce one cache relation, and more attributes can be added to the cache relation later. A cache relation will not be split into two or more relations, nor will it be coalesced with other relation(s). This is very different from semantic caching, in which a semantic region may be split into several semantic regions or coalesced with other semantic regions over a time frame [6]. In addition, the MU user queries these cache relations as his own private relations on the local DBMS; he is not aware of the morsels within a relation.

4.4 Cache Replacement Policy

The cache replacement policy is another factor that affects caching performance. This policy determines which part of the cache will be replaced when the cache is running out of space for new cache data. There are three established types of cache replacement policies, based on temporal locality, spatial locality, and semantic locality. Temporal locality is the property that data which have been used recently will be used again soon; a cache replacement policy using the temporal locality property may therefore replace the least recently used (LRU) data. Spatial locality is the property that data spatially close to recently used data are likely to be used in the near future; thus, a cache replacement policy using the spatial locality property would replace the data that are spatially farthest from the recently used data. The property of semantic locality is that the semantic region most similar to the region currently being accessed is the most likely to be used in the future; regions which are not related to the current queries, based on a semantic distance function, should be targets for replacement first.

We propose a new replacement policy which uses the property that less frequently used data are less likely to be used again (LFU). This proposal is based on our observation, Kuenning's empirical results [15], and the common belief in the "hot spot." This property is thus called frequency locality.
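The frequency-locality policy can be sketched as follows. The relation names and sizes (in abstract units) are illustrative assumptions; the sketch only shows the eviction rule: keep a lifetime access counter per cache relation and evict the least frequently used one when space runs out.

```python
# Minimal sketch of frequency-locality (LFU) replacement over whole
# cache relations, the caching granularity used for replacement above.

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity          # total space available, in units
        self.relations = {}               # name -> (size, access_count)

    def access(self, name):
        size, count = self.relations[name]
        self.relations[name] = (size, count + 1)

    def used(self):
        return sum(size for size, _ in self.relations.values())

    def insert(self, name, size):
        # Evict least frequently used relations until the new one fits.
        while self.used() + size > self.capacity and self.relations:
            victim = min(self.relations, key=lambda r: self.relations[r][1])
            del self.relations[victim]
        self.relations[name] = (size, 0)

cache = LFUCache(capacity=10)
cache.insert("orders", 4)
cache.insert("parts", 4)
for _ in range(3):
    cache.access("orders")
cache.insert("customers", 6)       # needs room: evicts "parts" (fewer accesses)
print(sorted(cache.relations))     # ['customers', 'orders']
```

Because the counter is a lifetime record rather than a recency stamp, a relation that was heavily used in the past survives a brief idle period, which matches the "hot spot" assumption.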
The replacement policy using frequency locality replaces the least frequently used cache relation with a new cache relation when the cache does not have enough space. A frequency function constantly records the access frequencies of all relations on the database server and the local DBMS. The relations of the local DBMS have different names from those of the base relations; however, we keep one counter for both. In any case, it is reasonable to assume that less frequently used data on a DBMS are less likely to be used again, because the access frequency history is a lifetime record. Therefore, applying the frequency locality property to the cache replacement policy is suitable.

4.5 Cache Update

There are two aspects of cache update. The first aspect is how to update the cache on an MU when data on the database server are updated. The second aspect is how the MU user writes on the cache, and how to deal with the resulting data consistency problem.

For the first aspect, we adapt the idea proposed in [1], in which an invalidation report (IR) is used to inform MUs which cache data are invalid. We use one of the three proposed models, the time stamp model. In addition, we add the modified data items to the report and change the file server type to stateful. We call this report a Validation Report (VR). The motivation for using a VR is that our cache relations are kept in the local DBMS for a long time, if not permanently. The base stations in our model keep track of the cached public relations within their wireless cell. Only the non-quota public data that have been modified within a certain time (say L seconds) and have been requested by an MU during registration are included in the VR. When an MU enters the cell of a base station, the MU has to register with that base station. The base station then asks the base station with which the MU was previously registered to hand over all the information about the MU, such as its VR. This step is required for hand-off to be completed. A data item in a VR includes the time stamp, relation name, primary key, and updated attributes, as shown below:

(Time Stamp, Relation Name,
Primary Key, Updated Attribute(1), Updated Attribute(2), …, Updated Attribute(n))

Thus, a data item in a VR is the updated portion of a tuple. Each MU has one VR at the base station. A VR is sent every L seconds to a specific MU. An MU waits for and checks the incoming VR before answering a normal query (see Figure 3). The broadcast time (the time stamp) of a VR is also included in the VR. When an MU has received its VR, it sends an acknowledgment containing the broadcast time of the VR to the base station to confirm the reception. The base station keeps all the modified data items in the VR since the last time it received an acknowledgment from the MU. Upon receiving a new acknowledgment, the base station compares each data item's timestamp in the VR with the broadcast time in the acknowledgment, and discards the data items in the VR that are older than the broadcast time in the acknowledgment.

For the second aspect, the MU user updates the cache itself, meaning the user writes on the cache. We propose two new ideas here to make cache writes possible without sacrificing the ACID properties. The first idea is categorizing relations into two types: public relations and private relations. When a query accesses a private relation owned by the user, the query can be answered immediately, whereas a query that accesses a public relation needs to wait for the next VR. This is owing to the fact that a private relation is only modified by that one user. The data dictionary on an MU contains the information about whether a relation is public or private. The second idea is the quota mechanism: the MU user can read and write the quota data. Both of these new ideas are elaborated in the following two subsections.
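The base station's VR bookkeeping described above can be sketched as follows. The VR item layout and field values are assumptions for illustration; the sketch only shows the pruning rule: on an acknowledgment carrying the broadcast time of the last VR the MU received, discard every item older than that time.

```python
# Each VR item is the updated portion of a tuple:
# (time_stamp, relation, primary_key, updated_attrs).

def prune_vr(vr_items, acked_broadcast_time):
    """Keep only items the MU has not yet confirmed receiving."""
    return [item for item in vr_items if item[0] >= acked_broadcast_time]

vr = [
    (100, "flight", 17, {"gate": "B2"}),
    (130, "flight", 17, {"gate": "C1"}),
    (160, "crew",   42, {"status": "on-duty"}),
]
# The MU acknowledges the VR broadcast at time 130; everything older
# is known to have reached the MU and can be dropped.
vr = prune_vr(vr, acked_broadcast_time=130)
print(vr)
# [(130, 'flight', 17, {'gate': 'C1'}), (160, 'crew', 42, {'status': 'on-duty'})]
```

Keeping the base station stateful in this way bounds the VR's size: it carries only the updates the MU has not yet confirmed.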
4.6 Public Relations vs. Private Relations

Previous research does not differentiate among types of relations. The most obvious differentiation is to separate public relations from private relations. The difference between the two is that a public relation is shared and can be modified by a group of people, whereas a private relation is solely owned and used by one user. Many such private relations exist in the academic environment; for example, the relations owned by students are private relations, solely for their personal use. Some other private relations are written by one user but can be read by a group of people; the owner grants the read rights. Public access to this type of private relation is read-only, and the owner is the only one who can make changes, so this type of relation is also categorized as a private relation. We define the private relation in Definition 3 and the public relation in Definition 4.

Definition 3. A Private Relation is a relation whose primary copy exists at the MU. Only the owner of the private relation can modify this relation.

Definition 4. A Public Relation is a relation whose primary copy exists at a database server. A portion of it may be cached at an MU. A group of authorized people can modify this relation.

Note that the database server knows about the ownership because the ownership information is part of the DBMS's security mechanism and is usually stored in the data dictionary. Therefore, the database server knows that an MU user is the only one who may update the private relations on the MU, and nobody else is able to modify the copy of the private relations on the database server. The private relations should be downloaded at the very first prefetching; this is a one-time deal. Some of them could be created on the MU. We assume that the primary copies of the private relations are at the MU. Before an MU user begins using the MU, he or she may work on the database server for some time (for instance, via a fixed host); thus, there may already be private relations on the database server. When the user switches to the MU, the private data on the database server may need to be downloaded to the MU's local DBMS (the cache). Because we assume that an MU is reliable, it is safe to keep the primary copy of the data on the MU. In addition, from time to time the user may copy the data back to the database server, in case the MU user wants to share the private data with other users; the sharing here should be READ only. The MU user may eventually switch back to using a fixed host; thus, a copy of the private data on the database server is necessary.

The differentiation of private relations and public relations has several advantages. First, the private relations can be updated at a mobile unit without worrying about the data consistency problem. Consequently, it prevents some uplink wireless traffic that would otherwise handle the WRITE operations at the database server; that is, the data consistency problems of normal WRITE operations need to be taken care of via communication on the wireless link, so a WRITE on a private relation avoids this traffic. Second, owing to the nature of private relations (a user can always write to a private relation on the MU), we can treat the WRITE operations on private relations as cache hits; thus, allowing WRITEs to private relations on an MU increases the hit ratio. Third, when a user on an MU submits a query to access a private relation, the query can be answered immediately, whereas a query that accesses a public relation needs to wait for the next VR to get the newest version of the data. The rationale is that the private relations are modified only by the user on the MU; therefore, the data version on the MU is always the newest one, accessing the private relations always returns the newest version, and there is no reason to wait for the next VR. Hence, it accelerates the response time. Thus, differentiating relations into two types, namely public relations and private relations, is worth the effort.
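One way an MU agent might act on the public/private distinction is sketched below. The data-dictionary layout and relation names are invented for illustration; the point is only the routing rule: private relations are answered at once from the local DBMS, public ones wait for the next VR.

```python
# Sketch of query routing by relation type on the MU.

data_dictionary = {
    "my_notes": {"type": "private"},
    "schedule": {"type": "public"},
}

def answer_query(relation, run_locally, wait_for_vr):
    """Answer immediately for private data; otherwise wait for the VR first."""
    if data_dictionary[relation]["type"] == "private":
        return run_locally(relation)          # always the newest version
    wait_for_vr()                             # apply pending invalidations first
    return run_locally(relation)

log = []
result = answer_query("my_notes",
                      run_locally=lambda r: f"rows of {r}",
                      wait_for_vr=lambda: log.append("waited"))
print(result, log)   # rows of my_notes []

result = answer_query("schedule",
                      run_locally=lambda r: f"rows of {r}",
                      wait_for_vr=lambda: log.append("waited"))
print(result, log)   # rows of schedule ['waited']
```

The private-relation path never touches the wireless link, which is exactly why such writes and reads count as cache hits.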
4.7 Quota Mechanism

If we wish to write on the public relations of the local DBMS, a complication arises because the public relations are shared by a group of users: the system must keep track of all operations on the public relations to ensure the ACID properties. To ensure data consistency, a locking mechanism is one solution in a pessimistic approach. However, the disadvantage of a locking mechanism is that a long lock prevents others from accessing the same data [16]. If mobile systems used a locking mechanism, such long-lock situations could happen quite often, because MUs are frequently disconnected for extended lengths of time. Our solution to the problem is to use the quota mechanism: a quota of data items is downloaded from the database server to the cache of an MU. The leftover quota may be returned to the server, or alternatively more quota may be downloaded from the server. Using this strategy, mobile clients have their own allowance of data to work on, preventing a long wait. The idea is quite simple, just like resource allocation: the database server allocates some data resources to the cache on an MU, and these data resources become delegations of the database server on MUs.

There are two previous works addressing this concept [19] [23]. In [19], the author proposed an escrow transactional method that pre-allocates some fixed portion of an aggregate quantity data item (see Definition 5) to a transaction prior to its commit. When the transaction commits, there will be enough value of this data item due to the pre-allocation. Only aggregate quantity data can be updated this way, and the whole mechanism takes place in a centralized DBMS; thus, it is not quite the same concept as the one we are addressing. The mechanism addressed in [23] is closer to our approach. The idea in that paper is to divide an aggregate quantity data item in a server into several fixed units. For instance, a data value of "20" could be divided into four data units of five each. Each unit is then allocated to a different client. Each client has full authority to handle the data unit given to her; the server may choose to keep one unit of data for herself. When a client does not have enough data to commit a transaction, the MU may request more data unit(s) from either the server or another client who holds the same data unit. These two approaches apply only to aggregate quantity data; the data that can be handled are still very limited.

We build on these ideas so that aggregate quantity data can be dynamically allocated to different MUs in data units of different sizes, called quotas (see Definition 7). Our approach also allows non-aggregate data (see Definition 6) to be a quota; however, only aggregate quantitative data can be divided into several units and allocated as quotas, and an MU must download a whole non-aggregate data item as a quota. These approaches significantly enhance the ideas proposed in the two previous works. Our approach allows any kind of data to use the quota mechanism, as long as the DBA of the DBMS defines the data items as "quota data" in the data dictionary. Our approach is also the first that can have different sizes of data unit. In addition, we are the first to propose use of the quota idea in a mobile DBMS environment.
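The quota mechanism for an aggregate quantity item can be sketched as follows. The numbers (a server value of 20, a quota of 8) and the class and method names are illustrative, not from the paper; the sketch only shows the allocate/commit-locally/return-or-refill cycle.

```python
# Sketch of the quota idea for an aggregate quantity item (e.g. stock
# on hand): the server hands out part of the value as a quota, and the
# MU commits local updates against it without a wireless round trip.

class QuotaServer:
    def __init__(self, total):
        self.remaining = total

    def allocate(self, amount):
        """Hand out up to `amount` of the aggregate value as a quota."""
        granted = min(amount, self.remaining)
        self.remaining -= granted
        return granted

server = QuotaServer(total=20)
mu_quota = server.allocate(8)        # MU downloads a quota of 8
assert (mu_quota, server.remaining) == (8, 12)

# The MU commits local decrements against its quota while disconnected.
mu_quota -= 5

# On reconnection the leftover quota may be returned to the server...
server.remaining += mu_quota
mu_quota = 0
# ...or more quota may be downloaded when the MU runs short.
mu_quota += server.allocate(4)
print(mu_quota, server.remaining)    # 4 11
```

Because the server never over-allocates, any local decrement within the quota is guaranteed to be committable, which is what lets quota writes count as cache hits.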
Definition 5. Aggregate quantity data can be computed with mathematical operators (such as addition and subtraction) in a database management system. The data type of aggregate quantity data is numerical only.

Definition 6. Non-aggregate data cannot be computed with mathematical operators in a database management system. The data types of non-aggregate data could be numerical, string, or character. Note that a numerical data item is not necessarily aggregate quantity data; for example, we do not compute social security numbers in a database management system.

For the preconditions of Definitions 7, 8, and 9, let D = {Ri} be a set of base relations and Dc = {CRi} be a set of cache relations, where CRi is the cache relation for Ri. Also, let CAi be the attributes of CRi and Ai be the attributes of Ri, with CAi ⊆ Ai. For every attribute ai ∈ CAi and ∀ tc ∈ CRi, ∃ t ∈ Ri such that tc(ai) = t(ai). The notation t(ai) represents the value of the attribute ai in the tuple t [18]. To improve performance by allowing updates at the MU, however, this may be relaxed. Given an aggregate quantity data item, we can cache part of the data value at the cache […]

[…] a fraction of the capacity of the downlink in the real-world situation. We assume that downlink and uplink channels have the same bandwidth capacity, for evaluation purposes. Our approaches show a significant improvement in hit ratio over the other approaches. Certainly, the performance of our approaches will be even better in the real-world situation, because a higher hit ratio means lower uplink traffic.

We examined the impact of different percentages of WRITE with different probabilities of private WRITE and quota public WRITE. A WRITE probability of 10% is the breaking point for our approaches (see Figure 5); that is, when both private WRITE and quota public WRITE are more than 10%, our approaches start to outperform the previous approaches. When the WRITE probability increases to 50% (or 90%), the breaking point becomes 5% (see Figures 6 and 7). In addition, when the WRITE probability increases with a high percentage of private WRITE and that
of quota public WRITE (90% or more), our approaches dramatically outperform the previous approaches (cases UR'-MResult', UR-MResult, and UR-MData). Note that our approaches always perform better than the previous approaches UR-MResult and UR-MData. The reason our approaches cannot perform better than case UR'-MResult' at low private WRITE and quota public WRITE is that case UR'-MResult' broadcasts IRs instead of VRs.

One last point we would like to discuss is the impact of granularity size. In cache update and cache-miss handling, our granularity is a set of tuples, where each tuple contains only a subset of attributes. Semantic caching's granularity (a semantic region) is a set of tuples, and the granularity of page caching is a page. Obviously, our granularity is the smallest, a page is the largest, and a semantic region is in between. Consequently, our performance is the best, semantic caching is second, and page caching is the worst. In the equation for TQ, the variable bd = n * granularity, where n is the number of granules. When the granularity is smaller, bd is smaller, and a smaller bd results in a larger TQ (throughput).

SUMMARY AND FUTURE WORK

In this research, we have designed and developed all the required algorithms for a mobile agent on an MU along with a program on a base station. The whole design aims at improving data caching/replication on a mobile unit, including, among other things, prefetching/hoarding, cache management, cache coherence, and cache replacement. The simulation results have shown that our approaches are far superior to the previous research. This is because we use a quota mechanism and categorize relations into private and public. These approaches enable a user to query private and quota data directly from the local DBMS on an MU without data coherence problems. In addition, our approaches significantly reduce usage of the valuable wireless link, which is the most limited resource
in a mobile computing environment. The previous research [1] [3] [13] [6] assumes a READ-only approach, which has been shown to be inefficient when the probabilities of private WRITE and quota public WRITE are high. The approaches in [24] and [5] allow WRITEs on the cache; however, these WRITE operations must be kept in sync with the database server's, which consumes a large portion of the valuable wireless link.

There are some possible extensions of this paper for future research. First, we would like to translate all the algorithms (they are not all listed in this paper; if interested, please refer to [26]) into a high-level language, preferably Java. Java is highly portable and excellent for building a front-end interface, such as a web page. The Java applets can talk with JDBC, and JDBC can then interact with ODBC, the interface to an RDBMS. Second, we would like to address the issue of how to generate VRs efficiently on the base station, including checking updated data against the database server. Lastly, how to handle hand-off efficiently is also an important issue that we would like to address in the future.

REFERENCES

[1] D. Barbara and T. Imielinski. Sleepers and Workaholics: Caching Strategies in Mobile Environments. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, pages 1-12, May 1994.
[2] D. Barbara and T. Imielinski. Sleepers and Workaholics: Caching Strategies in Mobile Environments. MOBIDATA: An Interactive Journal of Mobile Computing, 1(1), Nov. 1994.
[3] O. Bukhres and J. Jing. Performance Analysis of Adaptive Caching Algorithms in Mobile Environments. International Journal of Information Sciences (IJIS), North Holland, 1995.
[4] M. Carey, M. Franklin, and M. Zaharioudakis. Fine-Grained Sharing in Page Server Database Systems. In Proceedings of ACM SIGMOD Conference, 1994.
[5] B. Y. Chan, A. Si, and H.
V. Leong. Cache Management for Mobile Databases: Design and Evaluation. In Proceedings of the International Conference on Data Engineering, IEEE, pages 54-63, 1998.
[6] S. Dar, M. J. Franklin, B. T. Jonsson, D. Srivastava, and M. Tan. Semantic Data Caching and Replacement. In Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, pages 330-341, 1996.
[7] M. H. Dunham and A. Helal. Mobile Computing and Databases: Anything New? SIGMOD Record, 24(4), pages 5-9, Dec. 1995.
[8] M. J. Franklin, M. J. Carey, and M. Livny. Global Memory Management in Client-Server DBMS Architectures. In Proceedings of the International Conference on VLDB, pages 596-609, 1992.
[9] M. Franklin. Client Data Caching: A Foundation for High Performance Object Database Systems. Kluwer Academic Publishers, 1996.
[10] R. Gruber, F. Kaashoek, B. Liskov, and L. Shrira. Disconnected Operation in the Thor Object-Oriented Database System. In Proceedings of Workshop on Mobile Computing Systems and Applications, pages 51-56, IEEE, Dec. 1994.
[11] J. S. Heidemann, T. W. Page, R. G. Guy, and G. J. Popek. Primarily Disconnected Operation: Experience with Ficus. In Proceedings of the Second Workshop on the Management of Replicated Data, Nov. 1992.
[12] P. Honeyman, L. Huston, J. Rees, et al. The LITTLE WORK Project. In Proceedings of the 3rd Workshop on Workstations Operating Systems, IEEE, April 1992.
[13] J. Jing, A. Elmagarmid, A. Helal, and R. Alonso. Bit-Sequences: A New Cache Invalidation Method in Mobile Environments. Purdue University, Department of Computer Sciences, Technical Report CSD-TR-95-076, Dec. 1995.
[14] J. Jing, A. Elmagarmid, A. S. Helal, and R. Alonso. Bit-Sequences: An Adaptive Cache Invalidation Method in Mobile Client/Server Environments. Mobile Networks and Applications Journal, 2(2), pages 115-127, 1997.
[15] G. H. Kuenning, G. J. Popek, and P. L. Reiher. An Analysis of Trace Data for Predictive File Caching in Mobile Computing. University of California, Los Angeles, Technical Report CSD-940016, Apr.
1994. Also appeared in Proceedings of the 1994 Summer USENIX Conference.
[16] W. Kim, N. Ballou, J. F. Garza, and D. Woelk. A Distributed Object-Oriented Database System Supporting Shared and Private Databases. ACM Transactions on Information Systems, 9(1), pages 31-51, Jan. 1991.
[17] H. Lei and D. Duchamp. Transparent File Prefetching. Columbia University, Computer Science Department, Mar. 1995.
[18] D. Maier. The Theory of Relational Databases. Computer Science Press, 1983.
[19] P. E. O'Neil. The Escrow Transactional Method. ACM Transactions on Database Systems, 11(4), Dec. 1986.
[20] E. O'Neil, P. O'Neil, and G. Weikum. The LRU-K Page Replacement Algorithm for Database Disk Buffering. In Proceedings of the ACM SIGMOD, pages 297-306, 1993.
[21] R. H. Patterson and G. A. Gibson. A Status Report on Research in Transparent Informed Prefetching. Carnegie Mellon University, School of Computer Science, Technical Report CMU-CS-93-113, Feb. 1993.
[22] M. Satyanarayanan, J. J. Kistler, L. B. Mummert, M. R. Ebling, P. Kumar, and Q. Lu. Experience with Disconnected Operation in a Mobile Computing Environment. Carnegie Mellon University, School of Computer Science, Technical Report CMU-CS-93-168, June 1993. Also published in Proceedings of the 1993 USENIX Symposium on Mobile and Location-Independent Computing, Cambridge, MA, Aug. 1993.
[23] N. Soparkar and A. Silberschatz. Data-value Partitioning and Virtual Messages. PODS, pages 357-367, 1990.
[24] G. Walborn and P. K. Chrysanthis. PRO-MOTION: Management of Mobile Transactions. In Proceedings of the 11th ACM Annual Symposium on Applied Computing, Special Track on Database Technology, pages 101-108, San Jose, CA, Mar. 1997.
[25] K. L. Wu, P. S. Yu, and M. S. Chen. Energy-Efficient Caching for Wireless Mobile Computing. In Proceedings of the 12th International Conference on Data Engineering, pages 336-343, Feb. 1996.
[26] J.-F. Yao. Caching Management of Mobile DBMS on a Mobile Unit. Ph.D. Dissertation, Southern Methodist University, August 1998.

Table 1. Parameters used in the mathematical equations.

QL:      Total number of queries submitted from an MU in the time interval L; QL = QR + QW.
QR:      Number of READ queries in QL; QR = QL - QW.
QW:      Number of WRITE queries in QL; QW = QpubW + QprivW, and also QW = rW * QL.
rW:      The percentage of queries in QL that perform a WRITE.
QpubW:   Number of queries that perform a public WRITE; QpubW = QqpubW + QnqpubW, and QpubW = rpubW * QW.
rpubW:   The percentage of queries in QW that perform a public WRITE.
QqpubW:  Number of queries that write on quota public relations; QqpubW = rqpubW * QpubW.
rqpubW:  The percentage of queries in QpubW that perform a quota public WRITE.
QnqpubW: Number of queries that write on non-quota public relations.
QprivW:  Number of queries that write on private relations; QprivW = (1 - rpubW) * QW.
L:       VR broadcast interval.
B:       The bandwidth of the wireless network.
N:       Total number of MUs in the cell of a base station.
bq:      Size of a query in bits.
ba:      Size of an answer in bits; there are two types of ba: br and bd.
br:      Size of a query result in bits.
bd:      Size of a data item in bits, which is the cache update granularity.
TQ:      The potential maximum number of queries that the wireless link can handle in the interval L.
bVR:     Size of a VR in bits.
QC:      Total number of queries that can be served completely from the cache during the time interval L; QC = rR * h * QL.
rR:      The percentage of READ queries in QL; rR = 1 - rW.
H:       The pre-defined hit ratio.
h':      The adjusted hit ratio, including WRITE queries, in the different cases.

Table 2. Assumptions of the first simulation (Min / Likeliest / Max / Distribution).

bd:      64 / 128 / 256       Triangular
bq:      32 / 64 / 96         Triangular
bIR:     16 / 64 / 128        Triangular
br:      64 / 128 / 256       Triangular
bVR:     64 / 256 / 640       Triangular
H:       0.7 / 0.8 / 0.9      Triangular
rW:      0.1 / 0.2 / 0.5      Triangular
rqpubW:  0.3 / 0.5 / 0.8      Triangular
rpubW:   0.3 / 0.5 / 0.8      Triangular
N:       10 / 100 / 120       Poisson
L:       N/A / 10 / N/A       N/A
B:       N/A / 19200 / N/A    N/A
QL:      N/A / 100 / N/A      N/A

Table 3. Normalized throughput of all experiments. Cases: UR'-MResult', UR-MResult, UR-MData, UR/PrivW-MResult, UR/PrivW-MData, UR/Quota-MResult, UR/Quota-MData,
UR/PrivW/Quota-MResult, UR/PrivW/Quota-MData. Normalized throughput by case, Exp 1 through Exp 7 (UR'-MResult' is the baseline):

UR'-MResult':            1.00  1.00  1.00  1.00  1.00  1.00  1.00
UR-MResult:              0.86  0.86  0.86  0.86  0.86  0.86  0.86
UR-MData:                0.86  0.86  0.86  0.85  0.87  0.86  0.85
UR/PrivW-MResult:        1.27  1.70  1.69  3.03  1.10  1.18  1.65
UR/PrivW-MData:          1.27  1.70  1.68  2.98  1.11  1.18  1.63
UR/Quota-MResult:        1.18  1.36  1.62  1.23  1.09  1.13  1.34
UR/Quota-MData:          1.18  1.36  1.61  1.21  1.09  1.13  1.33
UR/PrivW/Quota-MResult:  1.70  3.50  5.65  8.51  1.28  1.47  3.17
UR/PrivW/Quota-MData:    1.70  3.49  5.62  8.36  1.29  1.46  3.13

Table 4. Adjusted hit ratios, h', for all cases, Exp 1 through Exp 7:

UR':             64%  16%  19%  19%  11%  37%  1.5%
UR:              64%  16%  19%  19%  11%  36%  1.5%
UR/PrivW:        73%  54%  54%  75%  24%  49%  44%
UR/Quota:        69%  39%  50%  35%  19%  44%  27%
UR/PrivW/Quota:  79%  76%  86%  91%  31%  57%  69%

Figure 1. Architecture for Mobile Systems (adapted from a figure in [7]).

Figure 2. User Profile. Each profile entry has the form
( cache_flag, base_relation_name[i], cache_relation_name[i],
  [Pattribute1, Pattribute2, …, Pattributem],
  [attribute1, attribute2, …, attributen],
  [criteria1, criteria2, …, criterial] )
with one such entry per cached relation.

Figure 3. Broadcasting Validation Report (adapted from [1]).

Figure 4. The Relationship among a Fixed Network, a Base Station, and an MU.

Figure 5. Impact of Private and Quota Public WRITE (write: 20%). [Chart: wireless link usage, in number of queries, vs. the percentage of private and quota WRITE among all WRITEs (0% to 100%), one curve per case.]

Figure 6. Impact of Private and Quota Public WRITE (write: 50%). [Chart: same axes and cases as Figure 5.]

Figure 7. Impact of Private and Quota Public WRITE (write: 90%). [Chart: same axes and cases as Figure 5.]
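To make the quota mechanism discussed above concrete, here is a minimal sketch; it is not the paper's implementation, and all class names and numbers are invented for illustration. The server escrows part of an aggregate quantity value to the MU as a quota, and the MU then serves quota public WRITEs from the local DBMS without touching the wireless link until the quota is exhausted.

```python
# Illustrative sketch of the quota idea for aggregate quantity data:
# the server escrows part of the aggregate value to the MU, and the MU
# applies WRITEs against its local quota with no uplink traffic.

class QuotaCache:
    def __init__(self, server_total, quota):
        assert 0 <= quota <= server_total
        self.local_quota = quota                   # amount escrowed to this MU
        self.server_total = server_total - quota   # remainder kept at the server

    def local_write(self, amount):
        """Try to satisfy a decrement from the local quota (no wireless use)."""
        if amount <= self.local_quota:
            self.local_quota -= amount
            return True
        return False   # quota exhausted: the MU must contact the server

c = QuotaCache(server_total=100, quota=20)
print(c.local_write(15))  # True: served entirely from the local cache
print(c.local_write(10))  # False: only 5 units of quota remain
```

Only the second WRITE generates uplink traffic, which is why a higher quota public WRITE probability favors this design in the experiments above.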

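The simulation assumptions in Table 2 can be reproduced with a small Monte Carlo draw: triangular (min, likeliest, max) distributions for most parameters, a Poisson draw for the number of MUs N, and fixed values for L, B, and QL. The sketch below shows only the parameter sampling, not the simulation itself; the Poisson helper is a standard Knuth-style sampler, added because Python's standard library does not provide one.

```python
import math
import random

# (min, likeliest, max) triples from Table 2; all use triangular distributions.
TRIANGULAR = {
    "bd": (64, 128, 256),   "bq": (32, 64, 96),        "bIR": (16, 64, 128),
    "br": (64, 128, 256),   "bVR": (64, 256, 640),     "H": (0.7, 0.8, 0.9),
    "rW": (0.1, 0.2, 0.5),  "rqpubW": (0.3, 0.5, 0.8), "rpubW": (0.3, 0.5, 0.8),
}
FIXED = {"L": 10, "B": 19200, "QL": 100}  # the "Likeliest" column; no distribution

def poisson(lam, rng=random):
    """Knuth-style Poisson sampler; adequate for Table 2's mean of 100."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def draw_parameters(rng=random):
    # random.triangular takes (low, high, mode), so reorder each triple.
    params = {name: rng.triangular(lo, hi, mode)
              for name, (lo, mode, hi) in TRIANGULAR.items()}
    params["N"] = poisson(100, rng)  # MUs in the cell: Poisson, likeliest 100
    params.update(FIXED)
    return params

print(draw_parameters(random.Random(1)))
```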

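Figure 2's user profile, with one entry per cached relation, maps naturally onto a small record type. The sketch below is hypothetical: the field names follow the figure, but the concrete types and the sample values are assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class ProfileEntry:
    """One user-profile entry, following Figure 2's tuple layout."""
    cache_flag: bool            # should this relation be cached on the MU?
    base_relation_name: str     # relation on the database server
    cache_relation_name: str    # its counterpart in the MU's local DBMS
    primary_attributes: list = field(default_factory=list)  # Pattribute1..m
    attributes: list = field(default_factory=list)          # attribute1..n (subset of base attrs)
    criteria: list = field(default_factory=list)            # criteria1..l (selection predicates)

# Hypothetical entry: cache part numbers and quantities for one warehouse.
entry = ProfileEntry(
    cache_flag=True,
    base_relation_name="inventory",
    cache_relation_name="c_inventory",
    primary_attributes=["part_no"],
    attributes=["part_no", "qty"],
    criteria=["warehouse = 'DAL'"],
)
print(entry.cache_relation_name)  # c_inventory
```

The agent on the MU would read a list of such entries to decide what to prefetch/hoard; the attribute list being a subset of the base relation's attributes is exactly the CAi ⊆ Ai condition from the definitions.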
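Finally, the granularity argument can be illustrated numerically. In the throughput discussion, bd = n * granularity; the sketch below assumes, purely for illustration, that TQ is the link budget L * B divided by the bits moved per query (bq + bd). That form and the granule sizes are assumptions for this sketch, not the paper's exact equation; the constants are loosely based on Table 2.

```python
# Illustration only: assume each answered query costs bq + bd bits on the link
# and the link can move L * B bits per interval, so TQ = L * B / (bq + bd),
# where bd = n * granularity (n granules per answer).
def tq(L, B, bq, n, granularity):
    bd = n * granularity
    return (L * B) // (bq + bd)

# Tuple subset < semantic region < page (granule sizes in bits are made up):
for name, g in [("tuple subset", 64), ("semantic region", 512), ("page", 4096)]:
    print(f"{name:15s} granularity={g:5d}  TQ={tq(L=10, B=19200, bq=64, n=4, granularity=g)}")
```

Smaller granularity means a smaller bd and therefore a larger TQ, matching the paper's ranking of tuple-subset caching over semantic caching over page caching.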