Applying HDP technology to optimize data storage at Lao Telecom Network Company, Attapeu Province, Lao PDR


UNIVERSITY OF EDUCATION (TRUONG DAI HOC SU PHAM)

VONGVILAI THIDSAMAI

APPLYING HDP TECHNOLOGY TO OPTIMIZE DATA STORAGE AT LAO TELECOM NETWORK COMPANY, ATTAPEU PROVINCE, LAO PDR

Major: Information Systems
Code: 84.80.104

Scientific supervisor: Dr. Nguyen Dinh Lau

Da Nang - 2023

DECLARATION

I declare that this is a research work that I carried out under the guidance of Dr. Nguyen Dinh Lau at the Department of Information Systems, Faculty of Information Technology, Da Nang University of Education. The data and results presented in the thesis are truthful and have not been published by any other author or in any other work.

Author

Vongvilai Thidsamai

ACKNOWLEDGMENTS

First of all, I would like to express my sincere and deep gratitude to my teacher, Dr. Nguyen Dinh Lau, who guided, encouraged, inspired, and advised me, and who gave me the best possible conditions from the start of the research until the completion of this thesis.

I sincerely thank the lecturers of the Faculty of Information Technology, Da Nang University of Education, and especially those of the Department of Information Systems, who taught me with dedication, gave me invaluable knowledge, and created the best conditions for me to complete this thesis.

I also sincerely thank my classmates in class K40.HTTT, who created every [...]

Thesis title: APPLYING HDP TECHNOLOGY TO OPTIMIZE DATA STORAGE AT LAO TELECOM NETWORK COMPANY, ATTAPEU PROVINCE, LAO PDR

Major: Information Systems

Summary: Information technology and telecommunications are among the main conditions that determine the development of the world economy. They deeply affect the way we live, study, and work, and the way the state communicates with its people. They also pose economic and social challenges to individuals, businesses, and communities everywhere on earth who seek higher efficiency and creativity. All of us are facing this opportunity and need to seize it.

Hortonworks Data Platform (HDP) is a fully open development and construction platform. HDP is designed to meet the big data processing needs of enterprises. HDP is flexible: it provides linear scalability for storage and computation across a range of access methods, batch and real-time, search and streaming. It includes a comprehensive set of data processing capabilities for enterprises, such as governance, integration, security, and operations.

These new technologies are applied to the storage and processing of big data. In the telecommunications industry, data is extremely large, especially data on calls, messages, and data usage behavior. The page gives two monthly volumes for the Lao Telecom network: about 2.5 TB to 3 TB, and about 1.5 TB to 2 TB. The volume depends on the number of subscribers of each network and on the related information that the exchanges record.

In this thesis, the comparative analysis of the old and new solutions shows that the new solution performs better. Importing data into the system is much faster than with the old solution. In addition, customer charge data keeps growing, possibly by 20-30% per year, so the old technology and the old solution would not meet practical needs. The new technology and the new solution can fully meet these requirements; the technology was created specifically for large databases and for processing big data in real time.

The thesis has theoretical value. It can serve as a reference for students of information systems who want to understand how to use HDP technology in practice. It contributes significantly to the storage and processing of data of the Lao Telecom network.

Keywords: data storage optimization; big data processing; HDP technology; telecommunications technology; information technology.

Confirmation of the supervisor

Author of the thesis

VONGVILAI THIDSAMAI

Name of thesis: APPLYING HDP TECHNOLOGY TO OPTIMIZE DATA STORAGE AT LAO TELECOM NETWORK COMPANY, ATTAPEU PROVINCE, LAO PDR.

Major: Information Systems

Abstract: Information technology and telecommunications are among the main factors determining the development of the global economy. They deeply affect the way we live, learn, and work, and the way the state communicates with the people. They also create economic and social challenges for individuals, businesses, and communities everywhere on the planet that seek higher efficiency and creativity. All of us are facing this opportunity and need to seize it.

Hortonworks Data Platform (HDP) is a fully open development and construction platform designed to meet the needs of enterprise big data processing. HDP is flexible, providing linear scalability for storage and computing across a range of access methods: batch and real-time, search and streaming. It includes a comprehensive set of data processing capabilities for enterprises, such as governance, integration, security, and operations.

Applying these new technologies to the storage and processing of big data is crucial. In the telecommunications industry, data is extremely large, especially data related to calls, messages, and data usage behavior. For Lao Telecom, this data amounts to about 2.5 TB to 3 TB per month, while for other telecom companies it is about 1.5 TB to 2 TB per month. The size of the data depends on the number of subscribers of each telecom company and on the related information that the exchanges record.

In the research project, the comparative analysis of the old solution and the new solution showed that the new solution performs better. Importing data into the system is much faster than with the old solution. In addition, customer billing data is increasing every year, possibly by 20-30%, so the old technology and solution would not meet practical requirements if they remained in use. With the new technology and solution, these requirements can be met in full. This technology was developed specifically for large databases and for processing big data in real time.

The topic is valuable in terms of theory. It can be used as a reference for students in the field of information systems to understand how to use HDP technology and apply it in practice. It provides a significant contribution to the storage and processing of data for Lao Telecom.

Keywords: data storage optimization; big data processing; HDP technology; telecommunications technology; information technology.

TABLE OF CONTENTS

DECLARATION
ACKNOWLEDGMENTS
LIST OF SYMBOLS AND ABBREVIATIONS
LIST OF TABLES
LIST OF FIGURES

INTRODUCTION
1 Reasons for choosing the topic
2 Research objectives
3 Research subject and scope
4 Research methods
5 Scientific and practical significance of the topic
6 Expected results
7 Thesis structure

CHAPTER 1 PROBLEM STATEMENT: THE BILLING PROBLEM CURRENTLY IN USE AT LAO TELECOM
1.1 Model, current state, and business processes of the Lao Billing system
1.1.1 Introduction to the Lao Billing system model
1.1.2 Billing business processes
1.2 Shortcomings of the system
1.3 Chapter summary

CHAPTER 2 ANALYSIS, SELECTION, AND DESIGN OF THE SOLUTION
2.1 The old solution model
2.1.1 Physical model of the current Billing system
2.1.2 Logical model of the current Billing system
2.2 New, entirely free Billing system solution model [10]
2.3 Solution model combining free and paid components
2.4 Comparison of the entirely free solution and the combined paid solution
2.5 Chapter summary

CHAPTER 3 APPLYING HORTONWORKS DATA PLATFORM TECHNOLOGY
3.1 Hortonworks Data Platform
3.8 Hadoop deployment options
3.9 Installing Hortonworks Sandbox on Windows using Oracle VirtualBox
3.9.1 Installing on Windows using Oracle VirtualBox
3.9.2 Import the Sandbox file: File -> Import Appliance
3.9.3 The Import Virtual Appliance window appears
3.9.4 Import Virtual Appliance
3.9.5 The appliance settings screen appears; more RAM than the default can be allocated to improve performance
3.9.6 The appliance is imported
3.9.7 Open the Sandbox
3.9.8 Wait for the virtual machine to boot
3.9.9 Use a browser on the host machine to open the URLs shown on the console
3.10 Chapter summary

CHAPTER 4 EXPERIMENTAL EVALUATION COMPARING THE OLD AND NEW SOLUTIONS
4.1 Logical model of the experimental system
4.2 Method of collecting experimental data
4.3 Analysis and comparison of experimental data between the two systems
4.4 Chapter summary

CONCLUSIONS AND RECOMMENDATIONS
REFERENCES

LIST OF SYMBOLS AND ABBREVIATIONS

KPI: Key Performance Indicator. An index for evaluating work; a measurement tool that reflects operational effectiveness.

Balancing

LIST OF TABLES

Number and title

2.1 Comparison of EBS features among service providers
2.2 Comparison of databases among service providers
- Detailed server information while the script is running
- Detailed data on the server during one experimental run
4.7 Experimental data on the old technology with a volume of 2G7

LIST OF FIGURES

2.4 Solution model combining paid and free components
3.1 Hadoop in a modern architecture model
3.2 Hortonworks Data Platform: Enterprise Hadoop
4.3 Formula and results for calculating the experimental sample size

1. Reasons for choosing the topic

Information technology plays an increasingly important role in economic and social development. Information technology and telecommunications are among the main driving forces shaping the face of the 21st century. In addition, information technology and telecommunications are among the conditions [...] we are facing this opportunity and need to seize it.

According to research by Lao Telecom Network Company, one of the five leading information technology companies in Laos, the information technology sector will contribute about USD 1.16 trillion to the GDP of the Asia-Pacific region, with annual growth of 0.8%. In 2017, about 6% of Asia-Pacific GDP came from information technology products and services delivered through digital technologies. Lao Telecom Network Company forecasts that this figure will rise to 60% of the region's GDP by 2021. Also according to Lao Telecom Network Company figures, about 84% of organizations and enterprises in the region have been and [...]

[...] when applying new technology; only then can it be assessed accurately whether the change is really effective.

However, the application of information technology to production in enterprises, and research on such application, remains slow. Many companies still use technologies that are decades old, even though technology changes daily, and nowhere more so than in information technology. Even companies that specialize in technology still lack basic research or core products that resonate in the market. This weakness keeps our country's information technology sector from developing as expected.

Starting from this situation, the author raises an issue that is not new but still [...] a remedial plan.

2. Research objectives

Apply these new technologies to the storage and processing of big data. In [...]

3. Research subject and scope

Research subject:
Methods of using HDP technology to process and store data.

Research scope:
CDR (call detail record) data of the Lao Telecom network.

4. Research methods

Study the theory of HDP technology.

Study the methods of using [...]

5. Scientific and practical significance of the topic

[...] to apply in practice.

In practical terms: the research results [...] greatly to the storage and processing of the Lao Telecom network's data.

6. Expected results

Theory:
- Understand the HDP methods for data processing.

Practice:
- Apply the HDP method to optimize the import of CDR data into the database.

7. Thesis structure

Chapter 1. Problem statement: the billing problem currently in use at Lao Telecom.

[...] presents the necessity and importance of reviewing the billing method currently applied at Lao Telecom.

[...] focuses on each aspect of the problem in order to propose solutions that are suitable and that meet the requirements of customers or users.

Chapter 3. Applying HORTONWORKS DATA technology.

This chapter introduces Hortonworks Data technology and studies its installation and application to solve the problem. Applying Hortonworks Data technology means using the Hortonworks Hadoop data platform to process and analyze big data. The technology is used to solve complex problems and to process large, varied, and complex data sets.

Chapter 4. Experimental evaluation comparing the old and new solutions.

Once the program is complete, the old and new solutions are evaluated and compared. The experimental evaluation measures the effectiveness of the new solution by comparing it with the old solution used previously. The aim is to determine whether the new solution is better, more efficient, or in need of improvement relative to the old one.

CHAPTER 1 PROBLEM STATEMENT: THE BILLING PROBLEM CURRENTLY IN USE AT LAO TELECOM

1.1 Model, current state, and business processes of the Lao Billing system

1.1.1 Introduction to the Lao Billing system model

Figure 1.1 Lao Billing system model.

+ The system has about 30 exchanges belonging to many different partners.

+ All information about calls, messages, and data access history is recorded by the exchanges.

+ Step 2: Check the format template of the CDR file:

- If the file does not match the format template, do not process it.
- If the file matches the format template:
  - Rename the file.
  - Move the file to another directory on the FTP server.

b Operation: importing data into the database.

+ Step 1: Scan the directory containing the downloaded files.

[...] + day + month + year

+ Step 2: Check the file format template.

- If the file does not match the format template, do not process it.
- If the file matches the format template, go to Step 3.

+ Step 3: Read the contents of the file.

+ Step 5: Insert into the database in batches.

- If the insert fails, write the details to the log, save the file to the Unrate directory, and go to Step 6.
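The import steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Lao Billing implementation: the file-name rule `CDR_NAME`, the three-column record layout, the `cdr` table, and the use of SQLite in place of the production database are all hypothetical, since the text does not give the real CDR template.

```python
import csv
import re
import shutil
import sqlite3
from pathlib import Path

# Hypothetical naming rule: <prefix>_<ddmmyyyy>.csv (the real template is not given).
CDR_NAME = re.compile(r"^[A-Za-z0-9]+_\d{8}\.csv$")
BATCH_SIZE = 1000


def import_cdr_files(inbox: Path, unrate: Path, conn: sqlite3.Connection) -> dict:
    """Scan `inbox`, validate each file name, and batch-insert its rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cdr (caller TEXT, callee TEXT, seconds INTEGER)"
    )
    stats = {"imported": 0, "skipped": 0, "failed": 0}
    for path in sorted(inbox.iterdir()):            # scan the download directory
        if not CDR_NAME.match(path.name):           # format check: skip non-matching files
            stats["skipped"] += 1
            continue
        with path.open(newline="") as handle:       # read the file contents
            rows = list(csv.reader(handle))
        try:
            with conn:                              # one transaction per file
                for start in range(0, len(rows), BATCH_SIZE):
                    conn.executemany(               # insert in batches
                        "INSERT INTO cdr VALUES (?, ?, ?)",
                        rows[start:start + BATCH_SIZE],
                    )
            stats["imported"] += len(rows)
        except sqlite3.Error as exc:
            # Failed insert: log it and park the file in the Unrate directory.
            print(f"insert failed for {path.name}: {exc}")
            unrate.mkdir(exist_ok=True)
            shutil.move(str(path), str(unrate / path.name))
            stats["failed"] += 1
    return stats
```

Wrapping each file in a single transaction means a failed batch rolls back the whole file, so a file moved to Unrate leaves no partial rows behind.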

+ Step 3: The system aggregates the data by the chosen criteria.

+ Step 4: Update the aggregated data into the summary table.

c Operation: aggregating hot charges.

[...] the system also aggregates the figures for these adjustments.

+ Step 5: Calculate the debts of subscribers and customer contracts.

+ Step 6: Billing staff prepare SQL statements to check the execution [...]

[...] then fill in the amount according to the debt analysis principle.

+ Step 3: Check again whether any further transactions have been pushed in.

[...] contract) to adjust. The basic search information includes:

- ID card or passport number, tax code.
- Number of the contract that contains the subscriber to adjust.
- Number of the subscriber to adjust.

+ Step 2: Enter the adjustment figures.

[...] the subscribers in the contract. These allocation rules may be:

- Allocate evenly among the subscribers in the contract.
- Allocate by percentage of the charges incurred.
- Allocate by percentage of the amount payable.

After the allocation rule is applied to each contract, the adjustment figure for each subscriber in that contract is calculated.
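The three allocation rules can be expressed as one proportional split with different weights. The sketch below is an assumption-laden illustration: the function names, the two-decimal rounding, and the choice to put the rounding remainder on the last subscriber are not taken from the thesis.

```python
from decimal import Decimal, ROUND_HALF_UP


def allocate(total: Decimal, weights: dict) -> dict:
    """Split `total` across subscribers in proportion to `weights` (2 decimals)."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    cent = Decimal("0.01")
    shares = {
        sub: (total * w / weight_sum).quantize(cent, rounding=ROUND_HALF_UP)
        for sub, w in weights.items()
    }
    # Assumed convention: the last subscriber absorbs any rounding remainder,
    # so the shares always add up to `total`.
    last = next(reversed(shares))
    shares[last] += total - sum(shares.values())
    return shares


def allocate_evenly(total: Decimal, subscribers: list) -> dict:
    """Rule 1: equal weights for every subscriber in the contract."""
    return allocate(total, {sub: Decimal(1) for sub in subscribers})
```

Rules 2 and 3 call `allocate` directly, passing each subscriber's incurred charges or amount payable as the weight.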

+ Step 5: Perform the adjustment for the subscriber.

[...] adjustment in order to close the books at the end of the period.

+ Step 8: Re-analyze the subscriber's debt.

[...] record the detailed debt entries.

Determine the adjustment amount for that subscriber (DCO).

Reduce that subscriber's previous end-of-period debt by an amount equal to the adjustment amount.

g Operation: verification.

+ Step 3: Check the data by direction (Bill Item); any direction that is wrong is checked further in Step 4. This step yields a list of mismatched directions, each of which is then checked in turn.

+ Step 4: Check the data of each mismatched direction by day. This step yields the list of mismatched days for that direction, and checking continues at a finer level.

+ Step 5: Check the aggregated data of the subscribers and produce the list of subscribers with mismatched charges.
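The drill-down in the verification steps (direction, then day, then subscriber) can be sketched as repeated comparisons at ever finer keys. The record fields (`item`, `day`, `subscriber`, `amount`) and the idea of comparing a detail data set against a summary data set are assumptions made for illustration; the text does not say which two sources are compared.

```python
from collections import defaultdict


def mismatched(detail, summary, key_fields):
    """Return the keys whose summed amounts differ between the two record lists."""
    def totals(records):
        acc = defaultdict(int)
        for rec in records:
            acc[tuple(rec[f] for f in key_fields)] += rec["amount"]
        return acc

    a, b = totals(detail), totals(summary)
    return sorted(k for k in a.keys() | b.keys() if a.get(k, 0) != b.get(k, 0))


def drill_down(detail, summary):
    """Narrow the mismatch from bill item, to day, to subscriber."""
    items = mismatched(detail, summary, ("item",))
    bad_items = {key[0] for key in items}
    detail = [r for r in detail if r["item"] in bad_items]
    summary = [r for r in summary if r["item"] in bad_items]

    days = mismatched(detail, summary, ("item", "day"))
    bad_days = set(days)
    detail = [r for r in detail if (r["item"], r["day"]) in bad_days]
    summary = [r for r in summary if (r["item"], r["day"]) in bad_days]

    subscribers = mismatched(detail, summary, ("item", "day", "subscriber"))
    return items, days, subscribers
```

Filtering the records after each level keeps the finer comparisons small, which matters when only a few directions are out of balance.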

h Operation: trial printing.

+ Step 1: Get the list of all customers that have a bill notice in the system.

+ Step 2: Sort the list in priority order by the criteria the user selects.

+ Step 3: Assign each customer a sequence number following the order from Step 2 and save it to the database. Along with the assignment, generate a barcode (as described in the THNV document) and a jobin code for the customer. Each customer belongs to a group that shares one jobin code. A jobin holds about 3000-4000 consecutive items and follows two rules: a jobin may not belong to two print groups, and a jobin may not belong to two collection teams. (The print groups form a catalog table that indicates which print group a given management form belongs to; for example, group 1 includes KNT and N1K, group 2 includes [...], group 3 includes KXD.)

j Operation: issuing bill notices.

+ Step 1: Import the rated data into the database.

+ Step 4: Trial-print for the customers just found who satisfy the conditions above.

+ Step 5: Check the information on the bill notice and the trial-printed charge details; if they are correct, Billing department staff sign to confirm.

1.2 Shortcomings of the system

With processing as complex as described above, the system's data is increasingly [...]

a Weaknesses of the system:

According to the KPI rules for the system, data on customers' calls at the latest [...]

[...] import the entire backlog of data accumulated during the incident.

The average monthly data volume in 2018 was about 1.5 TB, but in 2019 it was about 2 TB, corresponding to about 50 million subscribers. This detailed data grows because the number of subscribers increases every year and because customer usage also rises.
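As a worked illustration of these figures, the sketch below computes the year-over-year growth implied by 1.5 TB and 2 TB per month, and compounds a monthly volume forward at the 20-30% annual growth mentioned in the abstract. The function names are hypothetical and the projection assumes a constant growth rate, which the thesis does not claim.

```python
def growth_rate(previous_tb: float, current_tb: float) -> float:
    """Relative growth between two monthly volumes (0.25 means 25%)."""
    return (current_tb - previous_tb) / previous_tb


def project_monthly_volume(current_tb: float, annual_growth: float, years: int) -> float:
    """Compound `current_tb` by `annual_growth` for `years` years."""
    return current_tb * (1 + annual_growth) ** years


if __name__ == "__main__":
    print(f"2018 to 2019 growth: {growth_rate(1.5, 2.0):.1%}")
    for rate in (0.20, 0.30):
        volume = project_monthly_volume(2.0, rate, 3)
        print(f"after 3 years at {rate:.0%}: {volume:.2f} TB per month")
```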

CHAPTER 2 ANALYSIS, SELECTION, AND DESIGN OF THE SOLUTION

2.1 The old solution model.

2.1.1 Physical model of the current Billing system.

[Figure labels: CORE SWITCH-02, Load Balancing-01, Load Balancing-02, Switch DB-01, Switch App-02]

Figure 2.1 Physical model of the Billing system.

Core switch pair: the pair of switches that acts as the interface between the Billing system and other systems.

Load Balancing pair: distributes the load of connections coming from outside and divides [...]

b Server equipment group

The servers are divided into two main groups, the application group and the database group. The application servers have a lower configuration than the database servers but are more numerous.

c Storage equipment group

[...] on two identical disk cabinets so that each serves as a backup for the other.

2.1.2 Logical model of the current Billing system.

- Block interfacing with the stores;

Figure 2.2 Logical model of the Billing system.

a Charge-detail import block

[...] and information about customers' charges.

2.2 New, entirely free Billing system solution model [10]

[Figure labels: Apache Pig (Scripting), Apache Oozie]

a Import function block (ETL)

Red Hat Fuse: an open-source integration platform based on Apache Camel.

Speed: a single machine running Kafka can handle reads and writes of up to hundreds of megabytes per second from thousands of clients.

Scalability: Kafka is designed to be scaled out easily and transparently to users (that is, with no downtime while a new server node is added to the cluster). When Kafka runs on a cluster, the data stream is partitioned and delivered to the nodes in the cluster, so [...]

HDFS: the primary storage system used by Hadoop. It provides access [...]

MapReduce: a YARN-based system for processing large data sets in parallel. In the classic (pre-YARN) form of the framework there is a single master, the JobTracker, and one slave TaskTracker per cluster node; in YARN-based Hadoop these roles are taken over by the ResourceManager, the NodeManagers, and a per-application ApplicationMaster. The master manages resources, tracks resource consumption, schedules tasks on the slaves, monitors them, and re-executes failed tasks. The slave TaskTrackers execute the tasks assigned by the master and report task status so that the master can track them.
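The processing model itself (map, shuffle, reduce) can be illustrated in a few lines of plain Python. This is a single-process sketch of the idea only: it does not use Hadoop, and the record layout of (caller, seconds) pairs is an assumption chosen to resemble call records.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    caller, seconds = record
    yield caller, seconds


def reduce_phase(key, values):
    """Combine all values that share a key."""
    return key, sum(values)


def run_job(splits):
    """Run map, shuffle, and reduce over a list of input splits."""
    # Map: on a real cluster each split would be processed by a separate worker.
    mapped = chain.from_iterable(map_phase(rec) for split in splits for rec in split)
    # Shuffle: group the intermediate values by key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce: one call per key.
    return dict(reduce_phase(key, values) for key, values in groups.items())
```

For example, `run_job([[("A", 10), ("B", 5)], [("A", 20)]])` totals the call seconds per caller across both splits.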

Zookeeper: the means of coordinating all the elements of the Hadoop distributed system.

HBase: an open-source database built on the BigTable design described in the paper "Bigtable: A Distributed Storage System for Structured Data". HBase can store very large data: up to billions of rows, millions of columns, and petabytes of capacity. HBase is a typical NoSQL store, so HBase tables have no fixed schema and no relationships [...]

- all scripts run on the Hadoop cluster.

Hive: the data warehouse infrastructure for Hadoop. Its main task is to provide data summarization, querying, and analysis. It supports the analysis of large data sets stored in Hadoop's HDFS as well as on Amazon S3. A strength of Hive is that it supports access similar to [...]

[...] at once on the whole data set, without having to extract samples for trial computation. Spark's processing speed comes from carrying out the computation simultaneously on many different machines. In addition, the computation is performed in memory, that is, entirely in RAM.

2.3 Solution model combining free and paid components.

The purpose of this solution is to combine licensed technologies and open [...]

[Figure labels: Data Collector, Relational database management system, Pipeline, MemSQL Node]

[...] rule set; routing to a particular channel then depends entirely on this rule set.

Data transformation: the process of converting data from one format (for example a database file, an XML document, or an Excel spreadsheet) into another. Because enterprise data usually resides in different locations and in many formats, data transformation is needed so that the data can be linked together or used by other systems.

Transaction management: an activity in the application that ensures the result is deterministic and correct. During conversion [...]

[...] re-run the processing flow from the beginning.

a Queue block (cache)

The queue block is divided into three main parts: the input data block (Producer), the cache storage block (the topics), and the output data block (Consumer). The block can run as a cluster over several nodes to increase processing speed and provide redundancy. Each type of data is stored in topics; each topic has several partitions; and each node stores one or more partitions.

Producer: pushes data into one or more topics. The user can decide which messages (each message is one line of data) belong to the same partition by attaching a key string to the message. Otherwise, the producer assigns a random key and decides the destination of the message from the hash of the key.

Topic: a queue of messages (each message is one line of data) with a name chosen by the user. New messages from one or more producers [...]
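The key-to-partition rule described for the producer can be sketched as follows. This illustrates the idea only and is not Kafka's partitioner: `choose_partition` is a hypothetical helper, and `zlib.crc32` stands in for whatever hash the real producer applies to the key.

```python
import uuid
import zlib
from typing import Optional


def choose_partition(key: Optional[str], num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if key is None:
        # No key supplied: fall back to a random key, as the text describes.
        key = uuid.uuid4().hex
    # crc32 is a stand-in hash chosen for this sketch.
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

Because the partition depends only on the key and the partition count, all messages that carry the same key, for instance the same subscriber number, land in the same partition and keep their relative order there.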

[...] the data resides entirely in memory, so it is processed at very high speed. Pipeline can scale out on a cluster, so it offers high performance and redundancy.

MemSQL: an in-memory database (IMDB); this [...]

[...] time (window frame) is performed relatively frequently.

• JSON support: MemSQL supports JSON fairly well through a JSON data type. It can index an object inside a JSON value directly, rather than having to extract the data before indexing it as most other databases do. MemSQL also supports direct access to an object in JSON through DML, so it can query, filter, and convert some JSON data into strings or numbers and perform computations on them. It can even query nested objects in JSON.

• Geospatial data types: MemSQL supports geospatial data types fairly fully, with functions for geographic computations such as distance, intersection, [...]

• Snapshots on disk: MemSQL not only stores data in memory [...]

[...] restored intact from disk; this operation, when the database is of size [...]

2.4 Comparison of the entirely free solution and the combined paid solution.

[...] EBS service providers as follows:

Table 2.1 Comparison of EBS features among service providers

Mulesoft Oracle Microsoft Red Dell

Posted: 23/06/2024, 21:27
