STRUCTURED CONCURRENCY CONTROL IN OBJECT ORIENTED DATABASES

Francisco Mariátegui

A Dissertation Presented to the Graduate Faculty of The School of Engineering and Applied Science of Southern Methodist University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy with a Major in Computer Science

By Francisco José Mariátegui
B.Sc., Honors, Naval Academy of Peru, 1974
Systems Engineering Specialization Degree, Honors, University of Lima, 1977
M.Sc. Computer Science, U.S. Naval Postgraduate School, 1979
M.Sc. Computer Systems Management, U.S. Naval Postgraduate School, 1979

May 13, 1989

COPYRIGHT © 1989 Francisco J. Mariategui
All Rights Reserved

Mariategui, Francisco J.
B.Sc. Naval Sciences, Naval Academy of Peru, 1974
System Engineering, University of Lima, 1977
M.Sc. Computer Science, U.S. Naval Postgraduate School, 1979
M.Sc. Computer Systems Management, U.S. Naval Postgraduate School, 1979

STRUCTURED CONCURRENCY CONTROL IN OBJECT ORIENTED DATABASES

Advisor: Dr. Margaret H. Eich
Doctor of Philosophy degree conferred August 12, 1989
Dissertation completed May 13, 1989

In the last few years a number of object-oriented database systems have appeared in the literature, most of which address specific areas such as office information systems (OIS), computer aided design (CAD), computer aided manufacturing (CAM), software engineering (SE), and artificial intelligence (AI). Unfortunately, hardly any one of them addresses the problem of concurrency control from the general-purpose database point of view. Due to the extreme differences in the types of transactions supported by these environments, the need for combining different concurrency control approaches has been recognized but never thoroughly investigated. A high level design of a Multi-Group Multi-Layer approach to concurrency control for object-oriented message-passing based databases is presented. The design follows a formal definition
of transaction. The concurrency control takes advantage of the structured nature of transactions to manage an on-line serializer. The serializer is specified as a set of filters. These filters are specifications of algorithms that ensure serializable histories. The concurrency control manages these histories by layers. Each layer, along with its corresponding filters, constitutes a different level of abstraction in concurrency control processing. Mutually exclusive groups of transactions being processed in parallel are assumed. The availability of a processor per group is also assumed. The performance is improved when this case of large granularity and limited interaction is applied. The decomposition of the histories into layers makes the problem more manageable and allows the principles of hierarchical design to be applied and the benefits of hierarchical thought to be utilized.

Summarizing, this research has led to the following results:
1) A first-cut definition of an Object-Oriented Data Model (OODM), which encompasses data structures, operations, and integrity constraints.
2) A transaction processing model for the OODM environment, which facilitates not only the definition of transactions but also the investigation of concurrency control.
3) A Multi-Group Multi-Layer concurrency control technique built on the OODM and transaction models, which allows the use of several different concurrency control techniques in parallel in the same environment.

TABLE OF CONTENTS

LIST OF FIGURES
ACKNOWLEDGEMENTS
CHAPTER 1 - INTRODUCTION
  1.1 THE PROBLEM
  1.2 THE APPROACH
  1.3 CONTRIBUTION
  1.4 SIGNIFICANCE
  1.5 THE CONCURRENCY CONTROL MANAGER
    1.5.1 PURPOSE
    1.5.2 CONCEPTS AND MEANS
    1.5.3 BENEFITS
    1.5.4 INTERFACE
  1.6 GENERAL OVERVIEW OF THE CCMM
  1.7 OUTLINE OF THE DISSERTATION
  2.1 INTRODUCTION
  2.2 OBJECT-ORIENTED DATABASES: AN OVERVIEW
    2.2.1 BACKGROUND
    2.2.2 DEFINITION OF TERMS
    2.2.3 DEFINITION OF PROPERTIES OF OODBS
  2.3 DATA MODELS
  2.4 AN OBJECT-ORIENTED DATA MODEL
    2.4.1 DATA STRUCTURE
    2.4.2 OPERATORS
    2.4.3 INTEGRITY RULES
    2.4.4 SUMMARY
  2.5 MODELING ABILITY OF THE OODM
  2.6 SUMMARY
CHAPTER 3 - UNIT OF CONSISTENCY
  3.1 INTRODUCTION
  3.2 PRELIMINARIES
  3.4 THE PAIR <GM, DM> AS A MODELING TOOL
  3.5 EXPANDING THE MODEL TO INCLUDE LEAVES
  3.6 THE COMPLETE STRUCTURE OF A TRANSACTION
  3.5 SUMMARY
CHAPTER 4 - THEORY OF EXECUTION AND SERIALIZABILITY
  4.1 INTRODUCTION
  4.2 PRELIMINARIES
  4.3 SERIALIZABILITY
  4.4 SERIALIZABILITY AND THE PAIR
  4.5 INFORMAL COMPRESSION OF TRANSACTION TREES
  4.6 FROM TRANSACTION TREES TO TRANSACTION HISTORIES
  4.7 SUMMARY
CHAPTER 5 - MULTI-LAYER CONCURRENCY: RATIONALE
  5.1 INTRODUCTION
  5.2 THE NEED FOR CHANGE
  5.3 REPRESENTATIVE CONCURRENCY CONTROL TECHNIQUES
  5.4 STRUCTURED CONCURRENCY CONTROL
  5.5 THE LAYERED APPROACH
  5.6 EXPANDING THE THEORY OF EXECUTION: GROUP HISTORY
  5.7 SUMMARY
CHAPTER 6 - MULTI-LAYER CONCURRENCY ARCHITECTURE
  6.1 INTRODUCTION
  6.2 BACKGROUND
  6.3 ACTIVE HISTORIES
  6.4 PREFIXES
  6.5 CONTENTS OF PREFIXES OF ACTIVE HISTORIES
    6.5.1 NOTATION FOR PREFIXES
    6.5.2 MEANING OF THE INDEXES OF HISTORIES
    6.5.3 CONTENTS OF HISTORIES
  6.6 HIERARCHY OF HISTORIES
  6.7 ELEVATOR FUNCTIONS
  6.8 NOTATION
  6.9 FILTERS
    6.9.1 COMPONENT HISTORIES FILTERS
      6.9.1.1 Filter F0rsws
      6.9.1.2 Filter F0ccg
    6.9.2 TRANSACTION HISTORIES FILTERS
      6.9.2.1 Filter F1rsws
      6.9.2.2 Filter F1ccg
    6.9.3 FOREST HISTORIES FILTERS
      6.9.3.1 Filter F2rsws
      6.9.3.2 Filter F2tcg
    6.9.4 GROUP HISTORY FILTERS

8.2.5 Early Evaluation of Inter-Group Conflicts

The Multi-Group Multi-Layer approach to concurrency "waits" until conflicts have reached the top of the hierarchy before detecting them and taking appropriate action. It is in this way that inconsistencies among groups are prevented. The explicit maintenance of the
conflict graphs is not only the way to guarantee serializable executions, but also the common denominator that keeps all the different techniques synchronized.

It may seem, at first glance, that detecting conflicts at the top level of the hierarchy is a "little" too late (actually, it is not). The underlying assumption of this work is that different types of transactions, applications, or environments must be supported in the same (logical) data bank. Furthermore, this data bank ought to provide flexible and adaptable concurrency control techniques so that each user can suit their needs (or lack thereof). It is more likely than not that different transactions will access different data due to their different nature, their different type of application, and, most importantly, their different environment. In other words, the probability of conflicts among groups should be low. But, then again, every concurrency control technique works better when the conflict probability is low.

In any event, and as the title of this subsection suggests, it would be interesting to learn about the potential benefits of early evaluation of conflicts. There are several early evaluation times: levels 0, 1, and 2 of the hierarchy (hH). If these early evaluations proved beneficial, the amount of space needed would become a problem. More importantly, global data structures would be needed, which would make the scheme problematic to implement in distributed systems.

8.2.6 Increase the Number of Processors

So far, the existence of only four processors was assumed, one per group, performing concurrency control processing. At the top of the hierarchy, and given the sequential nature of its filters, any one of the four processors may execute. The processing needed at each level of the hierarchy is simple. It consists mainly of cycle detection and updates to data structures containing read sets, write sets, and other bookkeeping
information. The potential for speedup using parallel algorithms and an appropriate number of processors is substantial. For example, Kruskal's parallel prefix algorithm [Kruskal, 1985], which could be interpreted as a variant of the precede and conflict relations, has a time complexity of O((log n / log(2n/p)) · (n/p)), where p is the number of processors and n is the number of elements.

The creation of a hardware version of the Concurrency Control Manager Module (CCMM) is possible. This version could take the form of a concurrency control board which, among other things (such as memory), would contain a set of processors acting on the hierarchy of histories. The speedup could be equivalent to several orders of magnitude when parallel processors are used instead of only one processor per group.

8.2.7 Pipeline the Cycle Algorithm

As is, the cycle algorithm is processed a step at a time, that is, one level of the hierarchy at a time. An improvement over the original high-level design could be achieved by pipelining the steps of the cycle algorithm. Given that the algorithm is described in terms of elevator functions, filters, and bookkeeping information, it would be relatively simple to pipeline the processing of different levels of the hierarchy. One would need to investigate what the collision vector of the whole scheme would be; in other words, which resources are used by each level and how they collide. Here, resources mean logical as well as physical structures (e.g., memory, CPUs, read sets, etc.).
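
As a rough illustration of the collision vector idea, the following Python sketch derives the forbidden initiation latencies from a reservation table, that is, a record of which resource is used at which step of one pass through the hierarchy. It is only a sketch under stated assumptions: the resource names and time steps in the example table are hypothetical and are not taken from this work.

```python
def forbidden_latencies(reservation_table):
    """Return the latencies at which two passes would use a resource at once.

    reservation_table maps a resource name to the set of time steps
    (0-based) at which a single pass uses that resource.
    """
    forbidden = set()
    for steps in reservation_table.values():
        ordered = sorted(steps)
        for i, earlier in enumerate(ordered):
            for later in ordered[i + 1:]:
                # A pass started (later - earlier) steps after another one
                # would need this resource at the same moment.
                forbidden.add(later - earlier)
    return forbidden


def collision_vector(reservation_table):
    """Return bits [c_1, ..., c_m]; c_k == 1 means latency k is forbidden."""
    forbidden = forbidden_latencies(reservation_table)
    if not forbidden:
        return []
    return [1 if k in forbidden else 0 for k in range(1, max(forbidden) + 1)]


# Hypothetical table: which structure each level touches is assumed here.
table = {
    "level0_filter": {0},
    "level1_filter": {1},
    "level2_filter": {2},
    "conflict_graph": {0, 2},  # shared structure used at steps 0 and 2
}
print(collision_vector(table))  # prints [0, 1]
```

In this made-up table a new pass could start one step after the previous one, but not two steps after it, because both passes would then need the shared conflict graph at the same time.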
By "collide" it is meant whether they are used simultaneously and, if they are, how they will interfere with each other. Once this "interference" is understood, the pipeline mechanism may be constructed and optimized, giving a further gain in performance. The gains from using pipelines can be quite impressive. As a reminder, consider that the so-called "supercomputers" are, for the most part, highly pipelined vector processing architectures.

8.2.8 Router Issues

It is assumed throughout this work that there exists a router that balances the load among the groups. The router, as mentioned in chapter 7, should be smart enough to assign each transaction to its "most appropriate" group and at the same time perform load balancing. These two minimum requirements could conflict. At this point there are no definitive answers to put forward concerning this problem, although there are some suggestions.

Basically, the router may be qualified by its level of achievement, i.e., how well it does its job. Also, the quality of the router may depend directly on the number of characteristics of transactions it is able to analyze. If complexity were directly correlated with capability, then the router would be more efficient the more complex it is (up to a certain point, of course). A router could be a simple round robin algorithm that assigns the first transaction to the first group, the second transaction to the second group, and so on. Or it could be a complex analyzer of a transaction or a group of transactions using techniques from artificial intelligence, statistics, optimization, etc. The definitive answer may lie somewhere in between. The router should be simple enough to do the job efficiently and complex enough to satisfy most of the requirements.

Another unanswered question about the router is where to situate it. In other words, is the router a new and separate module?
It was assumed throughout this work that this is the case, for purposes of modularizing the presentation and thought. The router could also be seen as a smart transaction manager, or even as a combination of a query processor and a transaction manager driven by an expert system. The two questions presented above must be answered before an attempt is made to study the impact that such a device will have when included in the DBMS.

REFERENCES

[Aho, 1974] Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D., The Design and Analysis of Computer Algorithms, Addison-Wesley, 1974.
[Aho, 1983] Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D., Data Structures and Algorithms, Addison-Wesley, 1983.
[Baase, 1978] Baase, S., Computer Algorithms, Introduction to Design and Analysis, Addison-Wesley, 1978.
[Baase, 1988] Baase, S., Computer Algorithms, Introduction to Design and Analysis, Second Edition, Addison-Wesley, 1988.
[Badrinath, 1987] Badrinath, B. and Ramamritham, K., "Semantics-based Concurrency: Beyond Commutativity," Proceedings of the Third International Conference on Data Engineering, 1987, pp. 304-311.
[Badrinath, 1988] Badrinath, B. and Ramamritham, K., "Synchronizing Transactions on Objects," IEEE Transactions on Computers, Vol. 37, No. 5, May 1988, pp. 541-547.
[Banerjee, 1987] Banerjee, J., Chou, H.T., Garza, J.F., Kim, W., Woelk, D., Ballou, N., and Kim, H.J., "Data Model Issues for Object-Oriented Applications," ACM Transactions on Office Information Systems, Vol. 5, No. 1, January 1987, pp. 3-26.
[Beech, 1987] Beech, D., "Groundwork for an Object Database Model," Research Directions in Object-Oriented Programming, MIT Press Series in Computer Systems, Cambridge, Massachusetts, 1987, pp. 317-355.
[Bernstein, 1981] Bernstein, P. and Goodman, N., "Concurrency Control in Distributed Database Systems," ACM Computing Surveys, July 1981, pp. 185-222.
[Bernstein, 1987] Bernstein, P., Hadzilacos, V., and Goodman, N., Concurrency Control and Recovery in Database Systems,
Addison-Wesley, 1987.
[Casanova, 1981] Casanova, M.A., The Concurrency Control Problem for Database Systems, Lecture Notes in Computer Science, Springer-Verlag, 1981.
[Cockshot, 1984] Cockshot, W.P., Atkinson, M.P., Chrisholm, J., Bailey, P.J., and Morrison, R., "Persistent Object Management System," Software-Practice and Experience, Vol. 14, 1984, pp. 49-71.
[Codd, 1970] Codd, E.F., "A Relational Model for Large Shared Data Banks," Communications of the ACM, Vol. 13, No. 8, June 1970, pp. 377-387.
[Codd, 1981] Codd, E.F., "Data Models in Database Management," ACM SIGMOD Record, Vol. 11, No. 2, February 1981.
[Codd, 1982] Codd, E.F., "Relational Database: A Practical Foundation for Productivity," Communications of the ACM, Vol. 25, No. 2, February 1982, pp. 109-117.
[Croft, 1985] Croft, W.B., "Task Management for an Intelligent Interface," Database Engineering, Vol. 8, No. 4, December 1985, pp. 8-13.
[Dahl, 1966] Dahl and Nygaard, Communications of the ACM, Vol. 9, pp. 671-678.
[Date, 1983] Date, C.J., An Introduction to Database Systems, Volume II, The Systems Programming Series, Addison-Wesley, 1984.
[Date, 1986] Date, C.J., An Introduction to Database Systems, Volume 1, fourth edition, Addison-Wesley Publishing Company, 1986.
[Dayal, 1986] Dayal, U. and Smith, J.M., "PROBE: A Knowledge-Oriented Database Management System," On Knowledge Base Management Systems, Springer-Verlag, 1986, pp. 227-257.
[Eich, 1988] Eich, M.H., "Graph Directed Locking," IEEE Transactions on Software Engineering, Vol. 14, No. 2, February 1988, pp. 133-140.
[ELAbbadi, 1988] ElAbbadi, M., "The Group Paradigm for Concurrency Control Protocols," Proceedings of the International Conference SIGMOD on Mgmt. of Data, Chicago, Illinois, June 1-3, 1988, pp. 126-134.
[Eswaran, 1976] Eswaran, K.P., Gray, J.N., Lorie, R.A., and Traiger, I.L., "The Notions of Consistency and Predicate Locks in a Database System," Comm. of the ACM, Vol. 19, No. 11, November 1976, pp. 624-633.
[Fishman, 1987] Fishman, D., Beech, D.,
Cate, H., Chou, E., Connors, T., Davis, J., Derret, N., Hoch, C., Kent, W., Lynbaek, P., Mahbod, B., Neimat, N., Ryan, T., and Shan, M., "Iris: An Object-Oriented Database Management System," ACM Transactions on Office Information Systems, Vol. 5, No. 1, January 1987, pp. 48-69.
[Garcia-Molina, 1987] Garcia-Molina, H. and Salem, K., "Sagas," Proceedings of the ACM International Conference on Management of Data, May 1987, pp. 249-259.
[Garey, 1979] Garey, M. and Johnson, D., Computers and Intractability, A Guide to the Theory of NP-Completeness, W.H. Freeman and Company, 1979.
[Goldberg, 1983] Goldberg, A. and Robson, D., Smalltalk-80: The Language and its Implementation, Reading, Massachusetts, Addison-Wesley, 1983.
[Gray, 1982] Gray, J., Lorie, R., Putzolu, G., and Traiger, I., "Granularity of Locking and Degrees of Consistency in a Shared Data Base," Modeling in Data Base Management Systems, G.M. Nijssen, ed., North Holland, 1976.
[Hadzilacos, 1986] Hadzilacos, T. and Yannakakis, M., "Deleting Completed Transactions," Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Data Base Systems, 1986, pp. 43-46.
[Harary, 1969] Harary, F., Graph Theory, Addison-Wesley, 1969.
[Horrowitz, 1978] Horrowitz, E. and Sahni, S., Fundamentals of Computer Algorithms, Computer Science Press, 1978, pp. 248-269.
[Knuth, 1975] Knuth, D., Fundamental Algorithms: The Art of Computer Programming, Volume 1, Addison-Wesley, Second Edition, 1975.
[Kohler, 1981] Kohler, W., "Survey of Techniques for Synchronization and Recovery in Decentralized Computer Systems," ACM Computing Surveys, June 1981, pp. 149-183.
[Kruskal, 1985] Kruskal, C., Rudolph, L., and Snir, M., "The Power of Parallel Prefix," IEEE Transactions on Computers, Vol. C-34, No. 10, 1985.
[Kung, 1981] Kung, H. and Robinson, J., "On Optimistic Methods for Concurrency Control," ACM Transactions on Database Systems, Vol. 6, No. 2, June 1981, pp. 213-226.
[Lieberman, 1981] Lieberman, H., "A Preview of Act 1," MIT Artificial Intelligence
Laboratory Memo No. 625.
[Lochovsky, 1985] Lochovsky, F., "Letter from the Editor," Database Engineering, Vol. 8, No. 4, December 1985, p. 1.
[Maier, 1983] Maier, D., The Theory of Relational Database, Computer Science Press, 1983.
[Maier, 1986] Maier, D. and Stein, J., "Development of an Object-Oriented DBMS," OOPSLA Proceedings, 1986, pp. 472-482.
[Maier, 1988] Maier, D., Penney, J., and Stein, J., "Is the Disk Half Full or Half Empty," to appear: Workshop on Persistent Object Systems, Appin, Scotland.
[Manthey, 1987] Manthey, J., The Characteristics of Parallel Algorithms, Scientific Computation Series, MIT Press, 1987, pp. 139-165.
[Mariategui, 1988] Mariategui, F., Eich, M., and Rafiqi, S., "The Object-Oriented Data Model Defined," SMU Technical Report 88-CSE-28, March 1988.
[McKenzie, 1987] McKenzie, E. and Snodgrass, R., "Extending the Relational Algebra to Support Transaction Time," Proceedings ACM SIGMOD Annual Conference, San Francisco, May 27-29, 1987, pp. 467-478.
[Moss, 1981] Moss, J., "Nested Transactions: An Approach to Reliable Distributed Computing," Ph.D. Thesis, M.I.T. Dept. of Elec. Eng. and Comp. Sci., 1981.
[Moss, 1985] Moss, J., Nested Transactions: An Approach to Reliable Distributed Computing, The MIT Press Series in Information Systems, 1985.
[Ontologic, 1986] Vbase Technical Overview, Version 1.0, March 6, 1987, 1986, Ontologic Inc.
[Papadimitriou, 1979] Papadimitriou, C., "The Serializability of Concurrent Database Updates," Journal of the Association for Computing Machinery, Vol. 26, No. 4, October 1979, pp. 631-653.
[Papadimitriou, 1986] Papadimitriou, C., The Theory of Database Concurrency Control, Computer Science Press, 1986.
[Peterson, 1985] Peterson, J. and Silberschatz, A., Operating System Concepts, second edition, Addison-Wesley Publishing Company, 1985.
[Serviologic, 1988] Servio Logic Development Corporation, Beaverton, Oregon, 1988.
[Segev, 1987] Segev, A. and Shoshani, A., "Logical Modeling of Temporal Data," Proceedings of ACM SIGMOD
Annual Conference, San Francisco, May 27-29, 1987, pp. 454-466.
[Shapiro, 1977] Shapiro, R.M. and Millstein, R.E., NSW Reliability Plan, Technical Report 7701-1411, Computer Associates, Wakefield, MA, June 1977.
[Shmueli, 1983] Shmueli, O., "Dynamic Cycle Detection," Information Processing Letters, 17:4, 1983.
[Silberschatz, 1982] Silberschatz, A. and Kedem, Z., "A Family of Locking Protocols for Database Systems that are Modeled by Directed Graphs," IEEE Transactions on Software Engineering, Vol. SE-8, No. 6, Nov. 1982.
[Stefik, 1986] Stefik, M. and Bobrow, D., "Object-Oriented Programming: Themes and Variations," AI Magazine, January 1986, pp. 40-62.
[Ullman, 1982] Ullman, J., Principles of Database Systems, second edition, Computer Science Press, 1982.
[Ullman, 1988] Ullman, J., Principles of Database and Knowledge-Base Systems, Volume 1, Computer Science Press, 1988.
[Weinreb, 1981] Weinreb, D. and Moon, D., "Lisp Machine Manual," Symbolics Inc.
[Woelk, 1986] Woelk, D., Kim, W., and Luther, W., "An Object-Oriented Approach to Multimedia Databases," Proceedings ACM SIGMOD Conference on the Management of Data, Washington D.C., May 1986.
[Zaniolo, 1986] Zaniolo, C., Hassan, A., Beech, D., Cammarata, S., Kerschberg, L., and Maier, D., "Object-Oriented Database Systems and Knowledge Systems," Proceedings 1st International Workshop on Expert Database Systems, 1986, pp. 293-305.
[Zhu, 1987] Zhu, J. and Maier, D., "Abstract Object in an Object-Oriented Data Model," Oregon Graduate Center, Department of C.S. and Engineering, Technical Report No. CS/E 87-015.