

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY FACULTY OF COMPUTER SCIENCE AND ENGINEERING


Distributed Systems

Mini-project Report

On the topic of Distributed Computing and Volunteer Computing

Group 2

Tôn Huỳnh Long 2052153

Lê Hoàng Minh 2052595

Ho Chi Minh City, December 2023


1 Introduction

1.1 Inspiration

On December 7, 2018, the largest known prime number to date, 2^82,589,933 − 1, was discovered by volunteer Patrick Laroche as part of the Great Internet Mersenne Prime Search (GIMPS). GIMPS is a collaborative project established in 1996, in which internet users volunteer the computing power of their computers to help find very large Mersenne prime numbers. The project has discovered 17 Mersenne primes so far, 15 of which were the largest known prime at the time of their discovery.

The aggregate computing power of the project is estimated at over 4.7 PetaFLOPS, achieved through thousands of volunteer computers distributed worldwide. Folding@home is a similar distributed computing project in which volunteers dedicate their own computing power to simulating the folding and movement of proteins. The results of such simulations can then be used in laboratory experiments to support biomedical research efforts.

The project received heightened interest during the COVID-19 pandemic and, at its peak, reached a combined throughput of over 2 ExaFLOPS, marking the first time an exascale computing system had ever been achieved.

1.2 Project objectives

The aim of our project is to create a system in which multiple independent computers work together to find large prime numbers. The numbers to be tested for primality are 8192 bits in size.
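For illustration (our sketch, not part of the report's implementation), a candidate of this size can be generated in Python as a random odd integer with its top bit set:

import secrets

def random_candidate(bits: int = 8192) -> int:
    """Return a random odd integer that is exactly `bits` bits long."""
    n = secrets.randbits(bits)
    n |= 1 << (bits - 1)  # set the top bit so the number is a full 8192 bits
    n |= 1                # set the low bit so the number is odd
    return n

print(random_candidate().bit_length())  # 8192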

2.1 Distributed computing

2.1.1 Definition

Distributed computing refers to two or more computers networked together sharing the same computing work. The objective of distributed computing is to share the job between multiple computers. A distributed network is mainly heterogeneous in nature, in the sense that the processing nodes, network topology, communication medium, operating system, etc., may differ across networks that are widely distributed over the globe.


A distributed system can be characterized as a group of mostly autonomous nodes communicating over a communication network and having the following features:

2.1.1.1 No common physical clock

This plays an important role in introducing the element of "distribution" into a system and is responsible for the inherent asynchrony among the processors. In a distributed network, the nodes do not share a common physical clock.

2.1.1.2 No Shared Memory

This is an important aspect of message-passing communication among the nodes of a network. A common physical clock does not exist in this memory architecture either. However, it is still possible to provide the abstraction of a common address space via the distributed shared memory abstraction.

2.1.1.3 Geographical Separation

In a distributed computing system, the processors are geographically distributed, possibly over the globe. However, it is not essential for the processors to be on a wide-area network (WAN); a network/cluster of workstations (NOW/COW) on a LAN can also be considered a small distributed system. The NOW configuration has become popular because of the availability of low-cost, high-speed off-the-shelf processors. For example, the Google search engine is built on the NOW architecture.

2.1.1.4 Autonomy and Heterogeneity

The processors are autonomous in nature: they have independent memories and different configurations, and are usually not part of a dedicated system connected through any network, but cooperate with one another by offering services or solving a problem together.

2.1.2 Comparison with Parallel computing

Although there are many similarities between parallel and distributed computing, there are some key differences, particularly with respect to cost and time.

Parallel computing subdivides an application into tasks small enough to be executed concurrently, while distributed computing divides an application into tasks that can be executed at different sites using the available networks connecting them. In parallel computing, multiple processing elements exist within one machine, and every processing element is dedicated to the overall system at the same time; in distributed computing, a group of separate nodes, possibly different in nature, each contributes processing cycles to the overall system over a network.


Parallel computing needs expensive parallel hardware to coordinate many processors within the same machine, while distributed computing uses already available individual machines, which are cheap in today's market.

Another difference is that a shared memory architecture exists in parallel computing but not in distributed computing. The two architectures are shown below:

Figure 1: Parallel System architecture

Figure 2: Distributed System architecture

In the first figure, we see a typical shared-memory architecture in which four processors (the four CPU boxes in the diagram) can all access the same memory address space (that is, the Memory box). If our application were to use threads, those threads would be able to access exactly the same memory locations; this is the parallel computing case.


In the case shown in the second figure, however, the various concurrent tasks cannot normally access the same memory space, because some tasks run on one computer and others on another, physically separated, computer. Since these computers can talk to each other over the network, one could imagine writing a software layer (a middleware) that presents our application with a unified logical (as opposed to physical) memory space. Such middlewares do exist and implement what is known as a distributed shared-memory architecture.

In reality, a hybrid of the two systems is most likely to be used. Just as in a pure distributed-memory architecture, the computers communicate over the network; however, each computer may have multiple processors that share memory locally, as the figure below shows.

Figure 3: Shared Memory architecture

Each of these architectures has its own pros and cons. In the case of shared memory systems, sharing data across concurrent threads of a single executable is faster than using the network. In addition, having a single, uniform memory address space makes writing the code simpler.

2.1.3 Advantages of Distributed computing

• Inherently Distributed Computations: Applications that are distributed over the globe, such as money transfer in banking or flight reservations, which involve consensus among parties, are inherently distributed in nature.

• Resource Sharing: As the replication of resources at all sites is neither cost-effective nor practical for performance improvement, the resources are distributed across the system. It is also impractical to place all the resources at a single site, as this can significantly degrade performance.

• Access to Geographically Remote Data and Resources: In many instances data cannot be replicated at every site due to its large size, and it may also be risky to keep vital data at every site. For example, a banking system's data cannot be replicated everywhere due to its sensitivity, so it is instead stored on a central server which the branch offices access through remote login. Advances in mobile communication, through which the central server can be accessed, also require distributed protocols and middleware.

• Enhanced Reliability: Enhanced reliability is provided by a distributed system through its inherent potential for replicating resources. Furthermore, geographically distributed resources are unlikely to all crash or malfunction at the same time. Reliability involves several aspects, such as availability, integrity and fault tolerance.

• Increased Performance/Cost Ratio: The performance/cost ratio is improved by resource sharing and by accessing geographically remote data and resources. In fact, any job can be partitioned and distributed over a number of computers in a distributed system, rather than allocating the whole job to a parallel machine.

• Scalability: More nodes may be connected to the wide-area network without directly degrading communication performance.

• Modularity and Incremental Expandability: Heterogeneous processors running the same middleware algorithm can simply be added to the system without affecting performance, and existing nodes can easily be replaced by other nodes.

2.2 Scheduling tasks in Distributed Systems

Distributed systems are very powerful and helpful computer systems, known for solving tasks and problems in a feasible and fast way. The basic idea of distributed computing is to parallelize the computational requirements of the large number of current and emerging applications, which makes job scheduling and task scheduling in the available runtime environment an evolving concern. Scheduling therefore plays an important role in distributed computing.

Different types of scheduling are based on different criteria, such as static vs. dynamic environments, centralized vs. distributed control, etc. Common distributed computing scheduling criteria are listed as follows, although they may overlap and are not always clearly distinct from each other:

• Static scheduling: Jobs are pre-scheduled; all information about the available resources and the tasks in an application must be known in advance, and a task is assigned to a resource only once. This makes adaptation easier from the scheduler's perspective.

• Dynamic scheduling: More flexible than static scheduling: jobs become available for scheduling dynamically over time, without the scheduler having to determine their run times in advance. It is highly critical to include load balancing as a main factor in order to obtain a stable and efficient scheduling algorithm.


• Centralized scheduling: A centralized scheduler is responsible for making global decisions. Centralized scheduling brings ease of implementation, efficiency, and more control over and monitoring of resources. On the other hand, such a scheduler lacks scalability, fault tolerance and efficient performance, so it is not recommended for large-scale grids.

• Distributed/Decentralized scheduling: This type of scheduling is more realistic for real grids despite its weaker efficiency compared to centralized scheduling. Local schedulers manage and maintain the state of the job queues, as there is no central control entity.

• Co-operative scheduling: In this case the system has many schedulers, each responsible for a certain activity in the scheduling process, working towards a common system-wide goal based on cooperating procedures, given rules and the current system users.

• Preemptive scheduling: This criterion allows each job to be interrupted during execution; a job can be migrated to another resource, leaving its originally allocated resource available for other jobs. It is more helpful when constraints such as priority must be considered.

• Non-preemptive scheduling: Resources are not allowed to be re-allocated until the running, scheduled job has finished its execution.

• Immediate/Online scheduling: The scheduler schedules any newly arriving job as soon as it arrives, on the resources available at that moment, without waiting for the next time interval.

• Batch/Offline scheduling: The scheduler holds arriving jobs as groups of problems to be solved over successive time intervals, so that jobs can be mapped to suitable resources based on their characteristics.

2.2.1 Volunteer computing

Volunteer computing is an arrangement in which people (volunteers) provide computing resources to projects, which use the resources to do distributed computing and/or storage.

• Volunteers are typically members of the general public who own Internet-connected personal computers. Organizations such as schools and businesses may also volunteer the use of their computers.

• Projects are typically academic (university-based) and do scientific research. There are exceptions, however; for example, GIMPS and distributed.net (two major projects) are not academic.

Several aspects of the project/volunteer relationship are worth noting:

• Volunteers are effectively anonymous; although they may be required to register and supply an email address or other information, they are not linked to a real-world identity.

• Because of their anonymity, volunteers are not accountable to projects. If a volunteer misbehaves in some way (for example, by intentionally returning incorrect computational results), the project cannot prosecute or discipline the volunteer.

• Volunteers must trust projects in several ways:

— The volunteer trusts the project to provide applications that don't damage their computer or invade their privacy.

— The volunteer trusts that the project is truthful about what work is being done by its applications, and about how the resulting intellectual property will be used.

— The volunteer trusts the project to follow proper security practices, so that hackers cannot use the project as a vehicle for malicious activities.

The first volunteer computing project was GIMPS (Great Internet Mersenne Prime Search), which started in 1996. Other early projects include distributed.net, SETI@home, and Folding@home. Today there are over 30 active projects.

Volunteer computing is important for several reasons:

• Because of the huge number (over 1 billion) of PCs in the world, volunteer computing can supply more computing power to science than any other type of computing. This computing power enables scientific research that could not be done otherwise. This advantage will increase over time, because the laws of economics dictate that consumer products such as PCs and game consoles advance faster than more specialized products, and that there will be more of them.

• Volunteer computing power can't be bought; it must be earned. A research project that has limited funding but large public appeal can obtain huge computing power. In contrast, traditional supercomputers are extremely expensive and are available only for applications that can afford them (for example, nuclear weapon design and espionage).

• Volunteer computing encourages public interest in science, and gives the public a voice in determining the directions of scientific research.


3 Implementation

3.1 Method of determining primality

For small numbers, a simple division test would suffice to determine whether or not a number is prime. For very large numbers, however, iterating through every possible divisor is not feasible.

For such large numbers, probabilistic methods may be used. While these methods cannot definitively prove that a number is prime, they can guarantee that the number is prime with a certain (quantifiable) probability.

One such probabilistic method is the Miller-Rabin primality test, used in the implementation of algorithms such as the RSA cryptosystem. The test can prove that a number is composite (in other words, not prime), but cannot definitively determine that a number is prime.

With each iteration of the Miller-Rabin method, if a number "passes", then the probability of a false positive is proven to be at most 1/4. Therefore, after k iterations, the probability that n is in fact a prime number is at least 1 − (1/4)^k.
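As a worked example of this bound (our arithmetic, using the per-round figure above): one block of 20 rounds, or the full 200 rounds per candidate described in Section 3.2.1, gives

\[
\left(\tfrac{1}{4}\right)^{20} = 2^{-40} \approx 9.1 \times 10^{-13},
\qquad
\left(\tfrac{1}{4}\right)^{200} = 2^{-400} \approx 3.9 \times 10^{-121},
\]

so even a single block already makes a false positive extremely unlikely.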

In the Miller-Rabin primality test, for an odd number n:

1. Write n − 1 as 2^s · d, where s is a positive integer and d is odd.

2. Select an integer a between 2 and n − 1 inclusive.

3. If a^d ≢ 1 (mod n) and a^(2^r · d) ≢ −1 (mod n) for every r such that 0 ≤ r < s, then n is definitely not prime. Otherwise, n is possibly prime, with a probability of at least 3/4 per round.

4. Perform step 2 and step 3 k times with different values of a to reduce the probability of n not being prime down to an acceptable level.

Using repeated squaring, the running time of this algorithm is O(k log³ n), where n is the number tested for primality and k is the number of rounds performed; thus this is an efficient, polynomial-time algorithm.
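The test above translates directly into code. The following is a minimal Python sketch (ours, not the report's implementation); Python's built-in pow(a, d, n) performs the modular exponentiation by repeated squaring:

import random

def miller_rabin(n: int, k: int = 20) -> bool:
    """Return False if n is definitely composite, and True if n
    passes k rounds of the test (i.e. n is probably prime)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):              # quick checks for small factors
        if n % p == 0:
            return n == p
    d, s = n - 1, 0                     # step 1: write n - 1 = 2^s * d, d odd
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):                  # step 4: repeat with k different bases
        a = random.randrange(2, n - 1)  # step 2: choose a base a
        x = pow(a, d, n)                # a^d mod n by repeated squaring
        if x == 1 or x == n - 1:
            continue                    # this round passes
        for _ in range(s - 1):          # step 3: square up to s - 1 more times
            x = pow(x, 2, n)
            if x == n - 1:
                break                   # found a^(2^r * d) ≡ -1 (mod n)
        else:
            return False                # n is composite for certain
    return True                         # n is probably prime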

3.2 Distributing tasks in the Miller-Rabin test

In the Miller-Rabin test above, each suitable value of a has a chance of producing a false positive, so multiple rounds of the test with different values of a need to be performed to get a reliable result. In a cooperative network, n, s, d and different values of a could be assigned to participants so that step 3 can be performed separately and independently, and the result of each calculation can then be sent back to a database.


Over time, as more tests with different values of a are associated with a number n, we get an increasingly good idea of whether n is prime or not.

3.2.1 Architecture

In our design, there are 1000 candidates to be tested. Each will be tested against 200 different bases a, split across 10 blocks of 20 bases each.
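The report does not state how the 20 bases within a block are chosen; one possibility (our assumption, not the report's scheme) is to derive them deterministically from the block index, so that the server and every volunteer agree on the bases without ever transmitting them:

import random

BASES_PER_BLOCK = 20

def bases_for_block(block: int, n: int) -> list:
    """Hypothetical scheme: derive a block's 20 bases from its index
    alone, so that they are 'already known volunteer-side'."""
    rng = random.Random(block)  # deterministic: same block -> same bases
    return [rng.randrange(2, n - 1) for _ in range(BASES_PER_BLOCK)]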

Our server was developed using the Flask framework and is hosted on https://pythonanywhere.com.
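A minimal sketch of what such a server could look like is given below; the endpoint names, credential store and candidate list are our assumptions for illustration, not the report's actual code:

import random
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

USERS = {"volunteer1": "password1"}  # hypothetical credential store
CANDIDATES = [2**8191 + 2*i + 1 for i in range(1000)]  # placeholder odd 8192-bit numbers
NUM_BLOCKS = 10

def authenticated():
    return USERS.get(request.args.get("username")) == request.args.get("password")

@app.route("/task", methods=["GET"])
def get_task():
    if not authenticated():
        abort(401)
    cid = random.randrange(len(CANDIDATES))  # pick a random candidate...
    block = random.randrange(NUM_BLOCKS)     # ...and a random block of bases
    return jsonify({"candidate_id": cid,
                    "candidate": str(CANDIDATES[cid]),  # big integers travel as strings
                    "block": block})

@app.route("/result", methods=["POST"])
def post_result():
    if not authenticated():
        abort(401)
    # A real server would record the volunteer's results in a database here.
    print("received:", request.get_json())
    return "", 200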

Volunteer → Server: HTTP GET request, params = (username, password)
Server → Volunteer: HTTP 200 OK, data = JSON object
Volunteer → Server: HTTP POST request, params = (username, password), data = JSON object
Server → Volunteer: HTTP 200 OK

Figure 4: Communication between a volunteer and the server

The volunteer begins by submitting their username and password and requesting a task with an HTTP GET request.

If the credentials are accepted, the server chooses a random candidate and a random block, packages them in JSON format and returns the data to the volunteer.

The volunteer, upon receiving the candidate and block number (the bases within each block are already known volunteer-side), can then perform 20 rounds of the Miller-Rabin algorithm, as in the client sketch below.
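Putting the exchange of Figure 4 together, a volunteer client might look like the following sketch (same assumed endpoint names as the server sketch above; the block's bases are regenerated volunteer-side from the block index):

import random
import requests

SERVER = "https://example.pythonanywhere.com"  # placeholder host name
CREDS = {"username": "volunteer1", "password": "password1"}

def mr_round(n, a):
    """One Miller-Rabin round (step 3 above) for odd n > 3 with base a."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False

task = requests.get(f"{SERVER}/task", params=CREDS).json()  # request a task
n = int(task["candidate"])

rng = random.Random(task["block"])  # same deterministic base derivation as the server
results = {}
for _ in range(20):                 # 20 rounds: one block of bases
    a = rng.randrange(2, n - 1)
    results[str(a)] = mr_round(n, a)

requests.post(f"{SERVER}/result", params=CREDS,             # report the results
              json={"candidate_id": task["candidate_id"], "results": results})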
