
Distributed Database Management Systems: Lecture 35

Distributed Database Management Systems: Lecture 35. The main topics covered in this chapter include: query optimization and fragmented queries; joins replaced by semijoins; three major QO algorithms; distributed query processing algorithms;...

In the previous lecture
• Query Optimization
• Centralized QO
  – Best access path
  – Join processing
• QO in a distributed environment

In this lecture
• Query Optimization
  – Fragmented queries
  – Joins replaced by semijoins
  – Three major QO algorithms

Semijoin-based Algorithms
• Reduce the cost of join queries
• Semijoin is …
• A join of two relations can be replaced by a semijoin (SJ) of one or both relations
• So R ⋈_A S can be replaced by:
  – (R ⋉_A S) ⋈_A S
  – R ⋈_A (S ⋉_A R)
  – (R ⋉_A S) ⋈_A (S ⋉_A R)
• Which one? We need to estimate the costs

• Same assumptions:
  – R at site 1, S at site 2
  – size(R) < size(S), so:
  – S' = π_A(S) → site 1
  – Site 1 computes R' = R ⋉_A S'
  – R' → site 2
  – Site 2 computes R' ⋈_A S
• Ignoring Tmsg, the semijoin is better if
  – size(π_A(S)) + size(R ⋉_A S) < size(R)
• Join is better if …
• Semijoin is better if …

• SJ with more than two tables will be more complex
• The semijoin approach can be applied to each individual join; consider EMP ⋈ ASG ⋈ PROJ
• EMP ⋈ ASG ⋈ PROJ = EMP' ⋈ ASG' ⋈ PROJ, where
  – EMP' = EMP ⋉ ASG and
  – ASG' = ASG ⋉ PROJ
• or rather EMP'' = EMP ⋉ (ASG ⋉ PROJ), which reduces EMP further

… query
• Most systems use single SJs to reduce relation size

Distributed Query Processing Algorithms
• Three main representative algorithms are
  – Distributed INGRES Algorithm
  – R* Algorithm
  – SDD-1 Algorithm

…

1- Ship PROJ to the site of ASG
2- Ship ASG to the site of PROJ
3- Fetch ASG tuples as needed for each tuple of PROJ
4- Move both to a third site
• Optimization involves costing each possibility
• That is it regarding the R* algorithm for distributed query optimization
• Let's review it

SDD-1 Algorithm
• System for Distributed Databases
• A non-commercial database
• Based on the Hill Climbing Algorithm
• No semijoins, no replication/fragmentation
• The cost of transferring the result from the final result site to the user site is not considered
• Can minimize either total time or response time
• Inputs include
  – Query graph
  – Locations of relations
  – Relations' statistics

1- Do the initial local processing
2- Select the initial best plan (ES0)
  – Calculate the cost of moving all relations to a single site
  – The plan with the least cost is ES0
3- Split ES0 into ES1 and ES2
  – ES1: send one of the relations to the other site; the relations are joined there
  – ES2: send the result back to the site in ES0
4- Replace ES0 with ES1 and ES2 when cost(ES1) + cost(local join) + cost(ES2) < cost(ES0)
5- Recursively apply steps 3 and 4 on ES1 and ES2, until no improvement

• Example
• "Find the salaries of engineers working on the CAD/CAM project"
• Involves EMP, PAY, PROJ and ASG

π_sal(PAY ⋈_title (EMP ⋈_eNo (ASG ⋈_pNo (σ_pName = 'CAD/CAM'(PROJ)))))

Relation  Size  Site
EMP       8     1
PAY       4     2
PROJ      1     3
ASG       10    4

• Assume Tmsg = 0 and TTR = 1
• Length of a tuple is 1, so size(R) = card(R)
• Considering only transfer costs
• Site 1
  – PAY → site 1 = 4
  – PROJ → site 1 = 1
  – ASG → site 1 = 10
  – Total = 15
• Cost for site 2 = 19
• Cost for site 3 = 22
• Cost for site 4 = 13
• So site 4 is our ES0
• Move all relations to site 4

Thanks
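The selection of the initial plan ES0 in the CAD/CAM example can be sketched in a few lines of Python. This is only an illustration, not the SDD-1 implementation: the relation sizes (EMP 8, PAY 4, PROJ 1, ASG 10, one relation per site) are assumed here because they are the values consistent with the totals 15, 19, 22 and 13 on the slides, and the names `RELATIONS`, `cost_to_gather_at` and `initial_plan` are hypothetical.

```python
# Relation sizes and sites for the CAD/CAM example (assumed values, chosen to
# match the slide totals). With a tuple length of 1, size(R) = card(R).
RELATIONS = {
    "EMP": {"size": 8, "site": 1},
    "PAY": {"size": 4, "site": 2},
    "PROJ": {"size": 1, "site": 3},
    "ASG": {"size": 10, "site": 4},
}

T_MSG = 0  # assumed fixed cost per message (only transfer costs are counted)
T_TR = 1   # assumed cost per unit of data transferred


def cost_to_gather_at(target_site, relations, t_msg=T_MSG, t_tr=T_TR):
    """Cost of shipping every relation not already at target_site to it."""
    return sum(
        t_msg + t_tr * info["size"]
        for info in relations.values()
        if info["site"] != target_site
    )


def initial_plan(relations):
    """Return (best_site, costs_by_site) for the initial plan ES0."""
    sites = sorted({info["site"] for info in relations.values()})
    costs = {site: cost_to_gather_at(site, relations) for site in sites}
    best_site = min(costs, key=costs.get)
    return best_site, costs


best_site, costs = initial_plan(RELATIONS)
print(costs)      # {1: 15, 2: 19, 3: 22, 4: 13}
print(best_site)  # 4
```

Under these assumptions the cheapest choice is to gather everything at the site holding ASG, the largest relation, since ASG then never has to be shipped. The later hill-climbing steps (splitting ES0 into ES1 and ES2) are not covered by this sketch.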
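The semijoin-versus-join comparison from the semijoin slides, size(π_A(S)) + size(R ⋉_A S) < size(R) with Tmsg ignored, can also be written as a small check. The function names and the sample sizes below are hypothetical and serve only to illustrate the inequality.

```python
def join_transfer_cost(size_r):
    """Join approach: ship all of R (the smaller relation) to the site of S."""
    return size_r


def semijoin_transfer_cost(size_proj_a_s, size_r_semijoin_s):
    """Semijoin approach: ship pi_A(S) to site 1, then R' = R semijoin S to site 2."""
    return size_proj_a_s + size_r_semijoin_s


def semijoin_is_better(size_r, size_proj_a_s, size_r_semijoin_s):
    """True when the semijoin plan transfers less data than shipping R whole."""
    return (
        semijoin_transfer_cost(size_proj_a_s, size_r_semijoin_s)
        < join_transfer_cost(size_r)
    )


# Hypothetical sizes: few tuples of R take part in the join.
selective = semijoin_is_better(size_r=1000, size_proj_a_s=100, size_r_semijoin_s=200)

# Hypothetical sizes: almost all tuples of R take part in the join.
unselective = semijoin_is_better(size_r=1000, size_proj_a_s=100, size_r_semijoin_s=950)

print(selective, unselective)  # True False
```

In the first case the semijoin plan moves 300 units instead of 1000; in the second it moves 1050, so shipping R directly is cheaper.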

Date posted: 05/07/2022, 13:42
