
Big Data Presentation



Trang 1

Big Data

Group 1. Instructor: Dr. Nguyễn Đức Thái

Trang 2

Memory storage…

Computer Memory: 640K Ought to be Enough for Anyone

Trang 3

How much data?

• 7 billion people

• Google processes 100 PB/day; 3 million servers

• Facebook has 300 PB + 500 TB/day; 35% of world’s photos

• YouTube: 1000 PB video storage; 4 billion views/day

• Twitter processes 124 billion tweets/year

• SMS messages: 6.1T per year

• US cell calls: 2.2T minutes per year

• US credit cards: 1.4B cards; 20B transactions/year
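To put the daily figures above in perspective, the short sketch below converts three of them to per-second rates. It assumes decimal units (1 PB = 10^15 bytes) and a uniform rate across the day; the helper name `per_second` is illustrative and not part of the slides.

```python
# Convert per-day figures from the slide into per-second rates.
# Assumptions: decimal units and a perfectly uniform rate over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
GB = 10**9
TB = 10**12
PB = 10**15


def per_second(amount_per_day: float) -> float:
    """Return the average per-second rate for a per-day amount."""
    return amount_per_day / SECONDS_PER_DAY


google_bytes_per_s = per_second(100 * PB)    # Google: 100 PB/day
facebook_bytes_per_s = per_second(500 * TB)  # Facebook: 500 TB/day of new data
youtube_views_per_s = per_second(4e9)        # YouTube: 4 billion views/day

print(f"Google:   {google_bytes_per_s / TB:.2f} TB/s")
print(f"Facebook: {facebook_bytes_per_s / GB:.2f} GB/s")
print(f"YouTube:  {youtube_views_per_s:,.0f} views/s")
```

Under these assumptions the Google figure works out to a little over 1 TB every second, which is why a single machine cannot keep up and the later slides turn to distributed storage and processing.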

Trang 4

1 Big Data Overview

2 Big Data Technology Today

3 SQL vs NoSQL

4 Big Data Security

5 Big data trends

6 Demo with MongoDB & Ref docs

Trang 5

1 Big Data Overview (cont.)

“Big data is not a single technology but a combination of old and new technologies that helps companies gain actionable insight.”

(Big Data For Dummies, John Wiley & Sons, Inc.)

Trang 6

1 Big Data Overview (cont.)

Trang 7

Characteristics of Big Data

Trang 8

Sources of Big Data

Trang 9

Examining Big Data Types

Structured Data

Trang 10

Structured Data(…)

Computer- or machine-generated: Machine-generated data generally refers to data that is created by a machine without human intervention (sensor data, Web log data, point-of-sale data, financial data…).

Human-generated: This is data that humans, in interaction with computers, supply (input data, stream data, gaming-related data…).

Trang 11

Examining Big Data Types

Unstructured Data

Trang 12

Unstructured Data(…)

Unstructured data is everywhere.

Machine-generated unstructured data: satellite images, scientific data, photographs and video, radar

Trang 13

Managing different data types

Trang 14

Managing different data types

Integrating data types into a big data environment requires:

• Connectors: these enable you to pull data in from various big data sources.

• Metadata: the definitions, mappings, and other characteristics used to describe how to find, access, and use a company’s data (and software) components.
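The roles of connectors and metadata described above can be sketched in a few lines. Everything in this example is hypothetical: the source names (`pos_sales`, `web_logs`), the catalog layout, and the helper `load` are illustrations only and do not come from any particular product.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterator, List

# Hypothetical metadata catalog: describes where each source's fields are
# and which format (and therefore which connector) is needed to read it.
CATALOG: Dict[str, dict] = {
    "pos_sales": {"format": "csv", "fields": ["store", "amount"]},
    "web_logs": {"format": "jsonl", "fields": ["url", "status"]},
}


def csv_connector(raw: str) -> Iterator[dict]:
    """Connector for CSV text with a header row."""
    yield from csv.DictReader(io.StringIO(raw))


def jsonl_connector(raw: str) -> Iterator[dict]:
    """Connector for JSON Lines text (one JSON object per line)."""
    for line in raw.splitlines():
        if line.strip():
            yield json.loads(line)


CONNECTORS: Dict[str, Callable[[str], Iterator[dict]]] = {
    "csv": csv_connector,
    "jsonl": jsonl_connector,
}


def load(source: str, raw: str) -> List[dict]:
    """Use the metadata to pick a connector and keep only the declared fields."""
    meta = CATALOG[source]
    records = CONNECTORS[meta["format"]](raw)
    return [{field: rec.get(field) for field in meta["fields"]} for rec in records]


sales = load("pos_sales", "store,amount\nA,10\n")
logs = load("web_logs", '{"url": "/", "status": 200, "extra": 1}\n')
print(sales, logs)
```

The design point is that callers ask for a source by name; the metadata, not the caller, decides how that source is read and which fields it exposes.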

Trang 15

What will we do with Big Data?

Trang 16

How to store and handle Big Data?

Trang 17

2 Big Data Technology Today

Storage…NoSQL Database

Trang 18

2 Big Data Technology Today (cont.)

Processing

Trang 19

2 Big Data Technology Today (cont.)

• The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

Trang 20

2 Big Data Technology Today (cont.)

Instead of treating memory as a cache, why not treat it as a primary data store?

• Facebook keeps 80% of its data in memory (Stanford

Data Grid

Trang 21

2 Big Data Technology Today (cont.)

Transfer data:

Trang 22

2 Big Data Technology Today (cont.)

Hadoop: an open-source software framework from Apache

• Modeled on Google MapReduce and GFS (Google File System)

• Core components: HDFS and Map/Reduce
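The Map/Reduce programming model named above can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code: the function names are illustrative, and a real cluster would run the map and reduce phases in parallel on many nodes with the shuffle performed by the framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a small input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["big data", "Big insight"])
print(counts)
```

Because each map call sees only one line and each reduce call sees only one key, both phases can be spread across machines without shared state, which is what makes the model suitable for clusters.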

Trang 23

3 SQL vs NoSQL

Data storage

File

SQL DBMS

NoSQL

Trang 24

3 SQL vs NoSQL (…)

• A relational database is a set of tables containing data fitted into predefined categories.

• Each table contains one or more data categories in columns.

• Each row contains a unique instance of data for the categories defined by the columns.
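The three points above map directly onto a small table. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customer` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database, so the example leaves no file behind.
conn = sqlite3.connect(":memory:")

# The table's columns are the predefined categories.
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)

# Each row is one unique instance of data for those categories.
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "An", "Hanoi"), (2, "Binh", "Hue")],
)

rows = conn.execute("SELECT id, name, city FROM customer ORDER BY id").fetchall()
print(rows)

# The primary key enforces uniqueness: a second row with id 1 is rejected.
try:
    conn.execute("INSERT INTO customer (id, name, city) VALUES (1, 'Chi', 'Hue')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```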

Trang 25

3 SQL vs NoSQL (…)

A key-value store is a system that stores values indexed for retrieval by keys.

Some of the market leaders:

Riak, Amazon Dynamo, Voldemort
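As a minimal sketch of the key-value idea, the class below wraps a Python dictionary behind `put`, `get`, and `delete`. The class and method names are illustrative and are not the API of Riak, Dynamo, or Voldemort; those systems add distribution, replication, and persistence on top of the same basic contract.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store: opaque values looked up only by key."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        """Store or overwrite the value for a key."""
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        """Return the value for a key, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("user:42", b'{"name": "An"}')
print(store.get("user:42"))
print(store.delete("user:42"), store.get("user:42"))
```

Note that the store never inspects the value: there is no way to query by a field inside it, which is the main trade-off against the relational model.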

Trang 26

3 SQL vs NoSQL (…)

Column-oriented databases

Column-oriented databases contain one extendable column of closely related data.

Some of the market leaders:

HBase, Cassandra
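One way to picture the "extendable column of closely related data" is a row key that maps to named column families, each of which can grow new columns per row without a schema change. The sketch below is a simplified in-memory model under that assumption; the class name `ColumnFamilyTable` and its methods are hypothetical and are not the HBase or Cassandra API.

```python
from collections import defaultdict
from typing import Any, Dict


class ColumnFamilyTable:
    """Toy model: row key -> column family -> {column name: value}."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Dict[str, Any]]] = defaultdict(
            lambda: defaultdict(dict)
        )

    def put(self, row_key: str, family: str, column: str, value: Any) -> None:
        """Write one cell; new columns can be added to any row at any time."""
        self._rows[row_key][family][column] = value

    def get_family(self, row_key: str, family: str) -> Dict[str, Any]:
        """Read all closely related columns of one family for one row."""
        return dict(self._rows.get(row_key, {}).get(family, {}))


table = ColumnFamilyTable()
table.put("user1", "contact", "email", "an@example.com")
table.put("user1", "contact", "phone", "0123")
table.put("user2", "contact", "email", "binh@example.com")  # no phone column

print(table.get_family("user1", "contact"))
print(table.get_family("user2", "contact"))
```

Rows in the same family do not need the same set of columns, which is what distinguishes this layout from a fixed relational table.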

Trang 27

3 SQL vs NoSQL (…)

Document-based stores: These databases store and organize data as collections of documents, rather than as structured tables with uniform-sized fields for each record.

Some of the market leaders:

MongoDB, CouchDB, SimpleDB
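The contrast with uniform table rows can be shown with plain Python dictionaries standing in for documents. This sketch does not use a real database driver; the sample documents and the helper `find` are illustrative only, loosely imitating an equality query on top-level fields.

```python
from typing import Any, Dict, List

# A "collection" of documents: each one may have a different shape.
collection: List[Dict[str, Any]] = [
    {"_id": 1, "title": "Big Data", "tags": ["hadoop", "nosql"]},
    {"_id": 2, "title": "Demo", "author": {"name": "Group 1"}, "pages": 36},
]


def find(docs: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return the documents whose top-level fields equal all given criteria."""
    return [
        doc
        for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


print(find(collection, pages=36))
print(find(collection, title="Big Data"))
```

The first document has no `pages` field and the second has no `tags` field, yet both live in the same collection and can be queried together.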

Trang 28

3 SQL vs NoSQL (…)

SQL 2008 Data storage capacity

Trang 29

• files stores the file’s metadata. For details, see The files Collection.

Trang 30

3 SQL vs NoSQL (…)

• BSON Types

• The chunks Collection

• The files Collection
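The files and chunks collections mentioned on these slides appear to refer to MongoDB's GridFS, which stores a large file as one metadata document plus a sequence of binary chunks. The sketch below models that split in plain Python without a database; it assumes GridFS is what the slides mean, uses the GridFS field names `files_id`, `n`, `data`, `length`, and `chunkSize`, and the helpers `split_file` and `join_chunks` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

CHUNK_SIZE = 255 * 1024  # assumed default chunk size (255 KiB)


def split_file(
    file_id: Any, filename: str, data: bytes, chunk_size: int = CHUNK_SIZE
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Return one 'files' document (metadata) and the list of 'chunks' documents."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def join_chunks(chunks: List[Dict[str, Any]]) -> bytes:
    """Reassemble the original bytes by ordering chunks on their sequence number."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = bytes(range(256)) * 10  # 2,560 bytes of sample data
files_doc, chunks = split_file(1, "demo.bin", payload, chunk_size=1000)
print(files_doc["length"], len(chunks))
print(join_chunks(chunks) == payload)
```

Keeping metadata separate from the binary chunks means a file can be listed or looked up without reading its contents, and can exceed the size limit of a single document.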

Trang 31

3 SQL vs NoSQL (…)

Trang 32

4 Big Data Security

• Secure computations in distributed programming frameworks

• Security best practices for non-relational data stores

• Secure data storage and transaction logs

• Cryptographically enforced access control and secure communication

• Granular access control

• Real-time security/compliance monitoring

Trang 33

4 Big Data Security (…)

Technical recommendations for security

• Use Kerberos for node authentication

• Use file layer encryption

Trang 34

5 Big data trends

• Big data – of the people, by the people, for the people

• Big data and social computing

• Cloud computing

• Mobile Applications and HTML5

• Internet and big data

Trang 35

6 Demo with MongoDB & Ref docs

Ref docs:

• Judith Hurwitz, Alan Nugent, Dr. Fern Halper, and Marcia Kaufman: Big Data For Dummies. John Wiley & Sons, Inc., 2013.

• “Technology Trends for 2013”, prepared by Kaushal Amin, Chief Technology Officer, KMS Technology, Atlanta, GA, USA.

• Website: http://hadoop.apache.org/

Demo with MongoDB

Trang 36

Thank you!

Posted: 13/08/2016, 20:37
