
The midterm of probabilities and statistics research on encryption and decryption


DOCUMENT INFORMATION

Title: Encryption and Decryption
Author: Mai Bao Thach
Supervisor: Mr. Nguyen Quoc Binh
University: Ton Duc Thang University
Major: Probabilities and Statistics
Type: Midterm Research
Year of publication: 2021
City: Ho Chi Minh City
Pages: 46
File size: 4.34 MB


VIETNAM GENERAL CONFEDERATION OF LABOR

TON DUC THANG UNIVERSITY

FACULTY OF INFORMATION TECHNOLOGY

THE MIDTERM OF PROBABILITIES AND STATISTICS

RESEARCH ON ENCRYPTION AND DECRYPTION

Instructor: MR. NGUYEN QUOC BINH

Student: MAI BAO THACH - 520H0490

Class: 20H50304 Course: 24

HO CHI MINH CITY, 2021

After working for half a semester with the enthusiastic help and support of Mr. Nguyen Quoc Binh, I was able to complete this report as fully and effectively as I could. His teaching has given us students a great deal of knowledge as well as the skills needed in this specialized subject. Although a couple of months is quite short, that time has helped me approach the major step by step on a solid foundation, especially with the encouragement and help of seasoned lecturers.

I sincerely thank him.

REPORT COMPLETED AT TON DUC THANG UNIVERSITY

I hereby declare that this is my own report, written under the guidance of Mr. Nguyen Quoc Binh. The research contents and results in this topic are honest and have not been published in any form before. The data in the tables used for analysis, comments and evaluation were collected by the author from different sources, which are clearly stated in the reference section.

In addition, the project also uses a number of comments and assessments, as well as data from other authors, agencies and organizations, with citations and source annotations.

If any fraud is found, I take full responsibility for the content of my report. Ton Duc Thang University is not involved in any copyright violations caused by me during the implementation process (if any).

Ho Chi Minh City, 09 April 2022

Author (sign and write full name)

Mai Bao Thach

Confirmation section of the instructor

Ho Chi Minh City, day month year (sign and write full name)

Evaluation section of the lecturer marking the report

Ho Chi Minh City, day month year (sign and write full name)

TABLE OF CONTENTS

REPORT COMPLETED AT TON DUC THANG UNIVERSITY
TEACHER'S CONFIRMATION AND ASSESSMENT SECTION
SUMMARY
LIST OF DIAGRAMS, CHARTS AND TABLES
CHAPTER 1: INTRODUCTION
1.4 Asymmetric cryptosystem
CHAPTER 2: MONOALPHABETICAL SUBSTITUTION CIPHER
2.1 Definition of substitution cipher
2.3 Idea of solution and algorithm
2.4 Example and evaluation with analysis
3.3 Idea of solution and algorithm
3.4 Example and evaluation
CHAPTER 4: EXPERIMENTS ON PYTHON

AES: Advanced Encryption Standard

LIST OF DIAGRAMS, CHARTS AND TABLES

Picture 4 Example of cipher keys
Picture 5 Monoalphabetic Substitution Cipher Illustration
Picture 6 Frequency table
Picture 7 Values for Encryption
Picture 8 Source code of Encryption algorithm
Picture 9 Validation of letters
Picture 10 Algorithm
Picture 11 Importing testcases
Picture 12 Generated cipher keys 1
Picture 13 Testcase 2
Picture 16 Testcase
Picture 17 Generated cipher keys
Picture 18 Testcase
Picture 19 Generated cipher keys 4
Picture 20 Frequency table in Python
Picture 21 Import from input values file
Picture 22 Source code for Decryption algorithm
Picture 23 Code of checking existing
Picture 25 Code of replacing letter with frequency table
Picture 26 Testcase

The process operates like this:

Sender -> Plain text -> Encrypt -> Transfer -> Decrypt -> Authorized users

Moreover, encryption is crucial for any individual or organization that wants to prevent hackers from stealing sensitive information or code.

Here is an example:

When a bank wants to transmit someone's credit card and account numbers, that information needs to be encrypted in order to reduce the possibility of theft. The study of how to encrypt, together with its applications and experiments, is called cryptography.

How does it work?

Research shows that encryption strength depends heavily on the length of the security key. In the 20th century, developers first used 40-bit encryption, which has 2^40 possible keys, and later the 56-bit format. Today attackers are powerful enough to break these formats easily with a brute-force attack, which led to 128-bit keys becoming the standard length for encryption.
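As a rough illustration of why key length matters, the sketch below compares the number of possible keys for the 40-bit, 56-bit and 128-bit lengths mentioned above. The guessing rate is a hypothetical figure chosen only for this example, not a measurement of any real attacker, and the function names are assumptions as well.

```python
# Illustrative only: compare key-space sizes for the key lengths named above.
ASSUMED_GUESSES_PER_SECOND = 10**9  # hypothetical attacker speed (assumption)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def key_space(bits: int) -> int:
    """Number of distinct keys of the given bit length."""
    return 2 ** bits


def worst_case_years(bits: int, rate: int = ASSUMED_GUESSES_PER_SECOND) -> float:
    """Years needed to try every key at the assumed guessing rate."""
    return key_space(bits) / (rate * SECONDS_PER_YEAR)


for bits in (40, 56, 128):
    print(f"{bits}-bit: {key_space(bits)} keys, "
          f"about {worst_case_years(bits):.3g} years to try them all")
```

Each extra bit doubles the number of keys, so the jump from 56 to 128 bits multiplies the search effort by 2^72 rather than by a small constant.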

For instance, the Advanced Encryption Standard (AES) is a specification for data encryption established in 2001 by the U.S. National Institute of Standards and Technology (NIST). AES uses a 128-bit block size and key lengths of 128, 192, or 256 bits.

What is Decryption?

Decryption is the reverse of the encryption described above. It is part of a cyber security scheme that makes it difficult for hackers or thieves to steal information they are not allowed to read. It transforms the cipher text back into the original text, so that people who hold the decryption key can read and understand it with the appropriate tools. Encryption relies on coded functions to make the data unreadable, and whoever accesses the data must have the authorized tools to reach the plain data. This means decryption can be done manually or with a decoding application.

What are the types of Decryption?

In this section I introduce only one scheme, AES. AES is highly efficient in its 128-bit form, and it also uses 192-bit and 256-bit keys for heavy-duty encryption. AES is generally believed to be resistant to all attacks except brute force, which attempts to decipher messages by trying every possible combination of the 128-, 192-, or 256-bit key. In any case, cyber security experts expect AES to eventually be regarded as the de facto standard for data encryption in the private sector.

What is the advantage of Decryption?

There are many purposes for using decryption, but the main one is still smooth and secure management of an organization's data. This method takes cyber security to a new level of information protection, as it reduces the amount of confusion in reading and understanding the data.

What is the process of Decryption?

The data, in the form of cipher text, is delivered to the receiver. After that, it can be converted from seemingly random code back into the original form of the data by using the key.

The simplest way to model the probability of a system is through symmetry. For instance, the idea of a "fair" coin means there are two possible outcomes that are indistinguishable. Since each outcome is equally likely, the result is 50/50 heads or tails.

Similarly, a fair die has 6 possible outcomes that are all equally likely, so each has probability 1/6. The idea of symmetry is also behind random sampling. If we want to understand a population, we can take a number of random cases, and they tell us something about the whole. However, this is only true if the sample is random with respect to the properties we are measuring. That is, if we swapped individuals at random, we would be just as likely to measure them.

Another model is a spinner, similar to a roulette wheel. The model is that a fair spin is equally likely to land anywhere on the circumference of the circle. So by symmetry the probability of an outcome is proportional to the length of the arc it subtends on the circle. This is also proportional to the angle of the arc, which is proportional to the area of the sector.
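The spinner model above can be checked with a short simulation. This is a minimal sketch under assumed values: the colour labels, the arc angles and the function names are made up for the illustration and do not come from the report.

```python
import random


def arc_probabilities(arcs):
    """Probability of each outcome = its arc angle divided by the whole circle."""
    total = sum(arcs.values())
    return {label: angle / total for label, angle in arcs.items()}


def simulate_spins(arcs, spins, seed=0):
    """Estimate the same probabilities by spinning many times."""
    rng = random.Random(seed)
    labels = list(arcs)
    weights = [arcs[label] for label in labels]
    counts = dict.fromkeys(labels, 0)
    # random.choices picks each label with probability proportional to its weight
    for outcome in rng.choices(labels, weights=weights, k=spins):
        counts[outcome] += 1
    return {label: counts[label] / spins for label in labels}


# Hypothetical spinner: arc angles in degrees (assumed values)
spinner = {"red": 180, "blue": 90, "green": 90}
print(arc_probabilities(spinner))
print(simulate_spins(spinner, spins=10_000))
```

A fair die is the special case of six equal arcs of 60 degrees, which gives each face the probability 1/6.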

What is asymmetric cryptography?

Asymmetric cryptography is a form of cryptography that pairs a public key with a private key in order to encrypt and decrypt information and to prevent hackers from gaining access to sensitive data.

A public key is a cryptographic key used to encrypt data so that only the receiver can decrypt it with the matching private key (the private key is kept secret by the receiver and is never shared).

Many protocols depend on asymmetric cryptography, including TLS (Transport Layer Security) and SSL (Secure Sockets Layer), which make HTTPS possible.

The purpose of using asymmetric cryptography is to increase the protection of information. The technique does not require the private key to be published when encrypting, so we can protect our information and keep the data out of cybercriminals' reach.

How does asymmetric cryptography work?

Asymmetric cryptography is commonly used to authenticate data using digital signatures. A digital signature is a mathematical technique used to validate the authenticity and integrity of a message, software or digital document. It is the digital equivalent of a handwritten signature or stamped seal.

Based on asymmetric cryptography, digital signatures can provide assurances of evidence of the origin, identity and status of an electronic document, transaction or message, as well as acknowledge informed consent by the signer.

What are examples of asymmetric cryptography?

The RSA algorithm, the most widely used asymmetric algorithm, is embedded in SSL/TLS, which is used to provide secure communications over a computer network. RSA derives its security from the computational difficulty of factoring large integers that are the product of two large prime numbers.

Multiplying two large primes is easy, but the difficulty of determining the original numbers from the product (factoring) forms the basis of public-key cryptography security. The time it takes to factor the product of two sufficiently large primes is beyond the capabilities of most attackers.
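To make the RSA idea concrete, here is a toy sketch with deliberately tiny primes. The primes 61 and 53, the exponent 17 and the function names are assumptions chosen for readability; real keys use primes hundreds of digits long plus padding, so this must not be used for actual security. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
from math import gcd


def make_toy_keys(p, q, e):
    """Build a (public, private) key pair from two primes and a public exponent."""
    n = p * q                    # the modulus, easy to compute from p and q
    phi = (p - 1) * (q - 1)      # only computable if you know the factors of n
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with (p - 1) * (q - 1)")
    d = pow(e, -1, phi)          # private exponent: modular inverse of e
    return (n, e), (n, d)


def apply_key(number, key):
    """Raise a number to the key's exponent modulo n (encrypts, decrypts, signs)."""
    n, exponent = key
    return pow(number, exponent, n)


public_key, private_key = make_toy_keys(61, 53, 17)

cipher = apply_key(65, public_key)        # anyone can encrypt with the public key
plain = apply_key(cipher, private_key)    # only the private key recovers the message
print(cipher, plain)

signature = apply_key(65, private_key)    # signing uses the private key
print(apply_key(signature, public_key))   # verification uses the public key
```

An attacker who could factor n = 3233 back into 61 and 53 could recompute the private exponent, which is why the primes have to be very large in practice.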

[...] are moving to a minimum key length of 2048 bits.

[Figure: Encryption (used to protect sensitive information)]

CHAPTER 2: MONOALPHABETICAL SUBSTITUTION CIPHER

2.1 Definition of substitution cipher:

A substitution cipher is a type of encryption in which characters or units of text are replaced by others in order to encrypt a text sequence. Substitution ciphers are part of early cryptography, predating the development of computers, and are now relatively obsolete.

In a substitution cipher, a letter such as A or T is transposed into another letter, which effectively encrypts the sequence for a human reader. The problem is that simple substitution ciphers do not encrypt effectively against computer analysis: with the rise of the computer, substitution ciphers became relatively easy to break. Nonetheless, some of the ideas behind the substitution cipher live on. Some forms of modern encryption use a very large text set and a highly sophisticated substitution to encrypt information effectively.

2.2 State the problem:

Our requirement is to use the monoalphabetical substitution cipher algorithm to encode a plaintext into a script that hackers or thieves cannot easily read. The case we focus on is the English alphabet, with a text length of about 50 to 5000 words depending on the testcase.

Moreover, the condition is that we use distinct characters and replace each one separately: every letter is replaced by exactly one specified letter, with no exceptions.

The monoalphabetical substitution cipher algorithm is designed to do that. It receives the original text and transforms it into cipher text, that is, a code that not everyone can read and understand. Each letter in the plaintext is replaced by a fixed letter of the alphabet, which means we need to generate the cipher alphabet before executing the algorithm.

For example, if the generated cipher alphabet replaces 'a' with 'r', then every time 'a' appears in the plaintext it changes into 'r'. The cipher alphabet is generated again whenever a different script is encoded, so each new script gets its own permutation of the alphabet.

2.3 Idea of solution and algorithm:

My idea is to convert the letters to lowercase before replacing them, so that uppercase letters need no separate handling and every letter in the plaintext is treated the same way.

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

[second row: the cipher letter assigned to each plain letter above]

Picture 4 Example of cipher keys

The cipher alphabet is generated randomly whenever a new script is encoded (there are 26! possible cipher alphabets).

In the table above, each letter of the default alphabet is paired with one letter of the cipher alphabet and is replaced only by that letter for as long as we work on the same script. The idea is to store the default alphabet in a list and pair it with the cipher alphabet, so that each letter stays fixed to its key letter. A loop then replaces each letter of the text with its key letter in turn.
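The idea above can be sketched in a few lines of Python. This is a minimal illustration, not the report's own source code (which appears only as pictures in Chapter 4); the function names and the optional `seed` parameter are assumptions made for the example.

```python
import random
import string


def generate_cipher_alphabet(seed=None):
    """Pair each plain letter with one letter of a randomly shuffled alphabet."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)  # one of the 26! possible cipher alphabets
    return dict(zip(string.ascii_lowercase, shuffled))


def substitute(text, key):
    """Lowercase the text and replace each letter; other characters are kept."""
    return "".join(key.get(ch, ch) for ch in text.lower())


def invert_key(key):
    """Swap the direction of the key so it can be used for decoding."""
    return {cipher: plain for plain, cipher in key.items()}


key = generate_cipher_alphabet()
encoded = substitute("Not many people have mental imagery.", key)
decoded = substitute(encoded, invert_key(key))
print(encoded)
print(decoded)
```

Because the key is a dictionary lookup applied once per character, commas, dots, dashes and spaces pass through unchanged, and a new call to `generate_cipher_alphabet` gives a fresh permutation for the next script.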

2.4 Example and evaluation with analysis:

Given the IELTS reading passage below, encode it with a random cipher alphabet:

“Not many people have mental imagery as vibrant as Lauren or as blank as Niel. They are the two extremes of visualisation. Adam Zeman, a professor of cognitive and behavioural neurology, wants to compare the lives and experiences of people with aphantasia and its polar-opposite hyperphantasia. His team, based at the University of Exeter, coined the term aphantasia this year in a study in the journal Cortex.”

With the cipher alphabet fixed for each letter, it transforms into this:

“lih nvlp cuicsu rvtu nulhvs anvyugp vb taxgvlh vb svzgul ig vb xsvlo vb laus. hrup vgu hru hwi uehgunub id tabzvsabvhail. vmvn kunvl, v cgidubbig id qiylahatu vlm xurvtaizgvs luzgisiyp, wvlhb hi qincvgu hru satub vlm uecugaulqub id cuicsu wahr vcrvlhvbav vlm ahb cisvg-iccibahu rpcugcrvlhvbav. rab huvn, xvbum vh hru zlatugbahp id ueuhug, qialum hru hugn vcrvlhvbav hrab puvg al v bhzmp al hru jizglvs qighue.”

Note that all commas, dots and dashes are kept in their original form.

We can see that the monoalphabetical substitution cipher is very simple to use, because you only need to replace the letters one by one according to your generated key table (using only other letters, without any special symbols). This leads to serious security problems, because the cipher is very easy to break with several methods. One of them is frequency analysis, which counts how often each letter appears in order to build a table showing the frequency of each encoded letter. In English (and other languages) there is a huge variation in how often different letters appear: "e" is the most common, accounting for around 13% of all letters in a text, next is "t" at 9%, "a" at 8%, and so on.[1] To break the code, I simply count how often each letter appears in the ciphertext, and then I guess that the letter that appears most often is an "e", the second most frequent one is a "t", and so on. After doing this for some of the letters, it becomes possible to recognize words. If, for instance, "t?e" appears frequently, the ? is likely to be an "h". With each new letter I guess correctly, guessing the remaining ones becomes easier, and soon the code is broken.
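The counting step described above might look like the following sketch. The function name and the short sample string are assumptions for illustration; the sample is simply the opening of the cipher text from this section.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, share) pairs, most frequent first, ties in alphabetical order."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return []
    counts = Counter(letters)
    total = len(letters)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(letter, count / total) for letter, count in ranked]


# Opening words of the cipher text shown in this section
sample = "lih nvlp cuicsu rvtu nulhvs anvyugp vb taxgvlh vb svzgul"
for letter, share in letter_frequencies(sample)[:5]:
    print(f"{letter}: {share:.1%}")
```

On a passage this short the ranking is noisy, so the guesses only become reliable once the cipher text is reasonably long.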


CHAPTER 3: FREQUENCY ANALYSIS

3.1 Definition of Frequency Analysis:

In cryptography, frequency analysis, also known as frequency counting or counting letters, is the study of how frequently each letter appears in a cipher text. It is used to break some substitution ciphers.

Frequency analysis is based on the fact that, in any given stretch of written language, certain letters and combinations of letters occur with varying frequencies. It consists of counting the occurrences of each letter in a text. For example, in a passage of English the letters E, T, A and O are the most common, while Z, Q and X are used less often.

A 7.9%   B 1.4%   C 2.7%   D 4.1%   E 12.2%   F 2.1%   G 1.9%
H 5.9%   I 6.8%   J 0.2%   K 0.8%   L 3.9%   M 2.3%   N 6.5%
O 7.2%   P 1.?%   Q 0.1%   R 5.8%   S 6.1%   T 8.8%   U 2.7%
V 1.0%   W 2.3%   X 0.2%   Y 1.9%   Z ?

Picture 6 Frequency table

We can expect most samples of text written in English to have a similar distribution of letters. However, this is only true if the sample of text is long enough. A very short text may produce a completely different distribution.

When trying to decrypt a cipher text based on a substitution cipher, we can use frequency analysis to identify the most frequently occurring letters in the cipher text and hence make hypotheses about what these letters were encoded from (for example E, T, A, O, and so on). This helps us decrypt some of the letters in the text. We can then recognize patterns or words in the partly decrypted text to identify more substitutions.

3.2 State the problem:

Given a cipher text that we have to decrypt into the original text, the problem is to count the appearances of each letter and use the default frequency table to decode it.

However, suppose we replace a letter such as 'a' with 'e' and there is another 'e' already in the text. The cipher letter 'e' can then no longer be told apart from the newly written one, so it may not be replaced correctly and an error occurs. We therefore have to mark each position as replaced or not.

3.3 Idea of solution and algorithm:

The solution is to keep a visited list containing the indices or values that have already been replaced, so that we neither replace them twice nor miss those that have not been replaced yet. Besides that, we sort letters in ascending alphabetical order to make the result predictable in practical situations: when two letters have the same frequency, we decide based on the alphabet. For instance, if 'a' = 2 and 'h' = 2, then 'a' is assigned its replacement from the frequency table before 'h'.

So the algorithm is: count exactly how many times each letter appears, then sort the counts in descending order. The most frequent letter is replaced with 'e'.
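A minimal sketch of this algorithm is shown below. It is not the report's implementation: instead of a visited list it builds the output in a single pass over the original cipher text, which avoids replacing a letter twice in a different way. The English ranking string is an assumed ordering (published tables differ slightly), and the function names are hypothetical.

```python
from collections import Counter

# Assumed ranking of English letters, most frequent first (sources vary slightly)
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def guess_key(ciphertext):
    """Map cipher letters to English letters by frequency rank.

    Ties are broken alphabetically, so with 'a' = 2 and 'h' = 2,
    'a' receives the more frequent English letter.
    """
    counts = Counter(ch for ch in ciphertext.lower() if "a" <= ch <= "z")
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))


def decrypt_with_guess(ciphertext, key):
    """Replace each cipher letter exactly once; other characters are kept."""
    # Reading only from the original text means a letter written by an earlier
    # replacement is never mistaken for a cipher letter and replaced again.
    return "".join(key.get(ch, ch) for ch in ciphertext.lower())


cipher_sample = "aae hh-aa z!"
key_guess = guess_key(cipher_sample)
print(key_guess)
print(decrypt_with_guess(cipher_sample, key_guess))
```

The result is only a first guess: on real text the mapping usually needs manual correction for the less frequent letters, as described in Chapter 2.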

3.4 Example and evaluation:

Given the cipher passage below, which was encoded with a random cipher alphabet:

“caj uzfmq-19 tsiqyymu asn uspnjq ysih njfjdj qsysxjn cz tjztwj'n wmfjn mc sgegjyucjq sww sntjucn zg jemncjiuj: yjqmumij, juzizymun, siq tzwmcmun vmcazpce sih qzpoc, mc swnz migwpjiujq ntzden siq tdsucmuswwh sww bmiqn zg tahnmusw gsumwmemjn sn gmcijnn ujicjdn, xhyn, siq ntzde uwpon vjdj uwznjq cajdjgzdj, ysih

Date posted: 04/10/2024, 15:44
