
Final report for the Applied Informatics course: Processing and creating reports on Excel

MINISTRY OF EDUCATION AND TRAINING

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION, FACULTY FOR HIGH-QUALITY TRAINING

Student: Trần Nguyễn Hiền Vân

Ho Chi Minh City, 2028

Table of contents

Chapter 1: Searching for data

1.1 Create and retrieve data from world bank databank page

1.1.1 Select 1 Database out of 86 provided databases

1.1.2 Choose 5 favorite countries: Bhutan, Singapore, Japan, Nepal, Vietnam

1.1.3 Selection of 10 criteria in the clustered index, including Access to clean fuels and technologies for cooking (% of population)

1.1.4 Selection of survey time: from 2006 to 2020

Chapter 2: Processing and creating reports on Excel

2.1 Processing data from Excel files

2.2 Create dashboards and reports that match existing data

2.2.1 Create data reporting dashboard

2.2.2 Calculation of the remaining data (finding sum, max, min, average, vlookup, ...) is done in the Excel file "Data Analytics", including the content questions

2.2.3 Use Conditional Formatting to highlight (Excel file “Data Conditional Formatting”)

Chapter 3: Analyzing processed data on GOOGLE COLAB

3.1 Upload, connect Google drive with Google Colab

3.2 Run the commands to read the file with pandas, then tell the size of the tuple

3.3 Description of the data fields

3.4 Plot the distribution chart for at least 2 columns, comment

3.5 Draw at least 2 arbitrary graphs using seaborn, comment

Chapter 1: Searching for data

1.1 Create and retrieve data from world bank databank page. Go to page https://databank.worldbank.org/source/world-development-indicators

1.1.1 Select 1 Database out of 86 provided databases. I chose the topic: Sustainable Energy for All

1.1.2 Choose 5 favorite countries: Bhutan, Singapore, Japan, Nepal, Vietnam

1.1.3 Selection of 10 criteria in the clustered index:

• Access to clean fuels and technologies for cooking (% of population)
• Access to clean fuels and technologies for cooking, rural (% of rural population)
• Access to clean fuels and technologies for cooking, urban (% of urban population)
• Access to electricity (% of population)
• Access to electricity, rural (% of rural population)
• Access to electricity, urban (% of urban population)
• Adjusted savings: carbon dioxide damage (% of GNI)
• Adjusted savings: consumption of fixed capital (% of GNI)
• Adjusted net savings, including particulate emission damage (% of GNI)
• Adjusted net savings, excluding particulate emission damage (% of GNI)

Criteria and indicators chosen for the study

Qualitative and quantitative criteria:

I chose the topic of sustainable energy because Vietnam is a heavily polluted country and ranks 2nd in the world in cancer incidence caused by such pollutants. I therefore want to analyze the data and compare it with Vietnam's figures to show how developed countries manage clean energy sources, from which some interim solutions for Vietnam can be proposed.

- Bhutan is one of my chosen countries because it is regarded as one of the most desirable places in the world to live: it has fresh air, and I generally agree with the metrics it uses to manage waste.

Time criteria

Selected period: 2006 - 2020

- I want to analyze the longest available span up to the present. The purpose is to find how sustainable energy differs as a country adopts new technology, to measure the change from the past to the present, and to see how human activity has affected fossil fuel energy resources.

- I also wanted to challenge myself and make full use of my abilities by analyzing a longer period than the minimum of 10 years that the assignment requires.

Chapter 2: Processing and creating reports on Excel

From the DataBank data, after selecting the requirements and criteria, we export to an Excel file with raw data as follows:

2.1 Processing data from Excel files

Step 1: Use the TRIM function to remove extra spaces before and after the text in the COUNTRY column

Column A: cleaned copy of column B

Note: TRIM function syntax: in cell A1, enter =TRIM(B1)

Inference: the data in column A no longer has spaces before or after the text
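The same cleaning step can be sketched in pandas. This is only an illustration: the sample values are hypothetical, not the report's data. Excel's TRIM also collapses repeated spaces inside the text, so the sketch does both.

```python
import pandas as pd

# Hypothetical raw values standing in for the COUNTRY column (column B).
raw = pd.Series(["  Bhutan ", "Viet  Nam", " Japan"])

# Strip leading/trailing spaces, then collapse runs of internal spaces
# to a single space, which mirrors what Excel's TRIM does.
clean = raw.str.strip().str.replace(r" +", " ", regex=True)

print(clean.tolist())
```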

Step 2: Check for empty cells so that any missing data can be entered and formatted. Select the data area

Displayed result: no cells found

Inference: there are no blank cells, so no data needs to be entered or formatted

Step 3: Use the ISNUMBER function to check whether the data is stored as numbers or as text

If the cell value is in NUMBER format, ISNUMBER returns TRUE, which indicates that the format is correct

If the cell value is in TEXT format, ISNUMBER returns FALSE, which indicates that the format is not correct

Result: the ISNUMBER function returns TRUE for all cells, which indicates that the data is in NUMBER format as required
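A rough Python counterpart of this check, assuming a column that may mix real numbers with numeric-looking text (the values below are made up for illustration):

```python
import pandas as pd

# Hypothetical column mixing real numbers and text.
values = pd.Series([99.5, "100", 87, "n/a"], dtype=object)

# Like ISNUMBER: True only for values stored as numbers.
# Text that merely looks numeric ("100") counts as False.
is_number = values.map(
    lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
)

print(is_number.tolist())
```

If every entry is True, the column can safely be used in numeric calculations.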

Step 4: Color the error cells

Select the data area

Home -> Conditional Formatting -> New Rule -> Format only cells that contain -> under Format only cells with, select Errors -> choose a pink fill

Result: no cells in the data range are pink (pink would mark error values such as #N/A, #VALUE!, #REF!, #NULL!)

Inference: no cell is colored, so the data does not contain error cells

Step 5: Replace cells containing 0 with empty cells

Select File -> Options -> Advanced -> under Display options for this worksheet, check Show a zero in cells that have zero value

Result: there are 2 cells containing 0: H45 and I45

Step 6: Replace empty cells with the character “-”

Select Format Cells -> Number -> Accounting -> Symbol (None)

Result: empty cells are shown as “-”

Step 7: Filter with New Query

Select Data -> New Query -> From File -> From Excel Workbook -> select the Excel file -> a New Query table is displayed

Start filtering -> click the filter arrow in the column header -> Remove null values -> OK

The resulting data table has the following form

Remove null values from the remaining columns in the same way. Columns without null values remain unchanged

After filtering is complete, export the file

Select Close & Load -> Close & Load to load the result into an Excel worksheet
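For comparison, removing rows with null values can be sketched in pandas as follows. The column name comes from the indicator list, while the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing value.
df = pd.DataFrame({
    "Country": ["Bhutan", "Nepal", "Japan"],
    "Access to electricity (% of population)": [97.7, np.nan, 100.0],
})

# Drop every row that contains a null, then renumber the rows.
filtered = df.dropna().reset_index(drop=True)

print(filtered)
```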

From there, the data table obtained has the following form:

2.2 Create dashboards and reports that match existing data

2.2.1 Create data reporting dashboard

Question 1: Draw a column chart. From the Data file, create a pivot table from Access to clean fuels and technologies for cooking (% of population) and Access to clean fuels and technologies for cooking, rural (% of rural population)

2.2.2 Calculation of the remaining data (finding sum, max, min, average, vlookup, ...) is done in the Excel file "Data Analytics", which answers the following questions

1 Total access by country from 2006 - 2020

2 What is Bhutan's lowest access to clean fuels and technologies for cooking (% of population), and likewise for the other countries?

From there, find the country with the lowest rate among the countries

3 What is each country's highest access to clean fuels and technologies for cooking (% of population)?

4 What is each country's average access to clean fuels and technologies for cooking (% of population)?

5 Find the area codes, abbreviations, and serial numbers of the countries

6 What is the total access to clean fuels and technologies for cooking (% of population) of Bhutan, Vietnam, and Singapore in 2020?
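The calculations in questions 1 to 6 (SUM, MAX, MIN, AVERAGE, VLOOKUP) have close pandas equivalents. The sketch below uses hypothetical values and a shortened column name, `Access`, purely for illustration; it is not the workbook's actual data.

```python
import pandas as pd

# Hypothetical values, not the report's data.
df = pd.DataFrame({
    "Country": ["Bhutan", "Bhutan", "Nepal", "Nepal"],
    "Year": [2019, 2020, 2019, 2020],
    "Access": [80.0, 82.0, 30.0, 32.0],
})

# SUM, MAX, MIN and AVERAGE per country.
summary = df.groupby("Country")["Access"].agg(["sum", "max", "min", "mean"])

# Country whose minimum value is the lowest of all.
lowest = summary["min"].idxmin()

# VLOOKUP-style lookup: attach a country code from a small lookup table.
codes = pd.DataFrame({"Country": ["Bhutan", "Nepal"], "Code": ["BTN", "NPL"]})
merged = df.merge(codes, on="Country", how="left")

print(summary)
print(lowest)
```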

average and exponential smoothing in 2021

8 Use the exponential smoothing method with alpha = 0.2 to make the forecast for the same period as in question 1

9 Compare the results of questions 7 and 8. Which method gives better results?

The result in question 7 has a smaller MSE value than the result in question 8, so the method in question 7 gives the better forecast
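The comparison in question 9 can be sketched in Python. Two assumptions are made here: that question 7 used a simple moving average (its text is truncated in the source), and that the first smoothed forecast equals the first actual value, which is a common convention. The series is hypothetical.

```python
def moving_average(series, window):
    """Forecast for period t = mean of the previous `window` actual values."""
    return [
        sum(series[t - window:t]) / window
        for t in range(window, len(series) + 1)
    ]


def exponential_smoothing(series, alpha):
    """Forecasts with F[t+1] = alpha * A[t] + (1 - alpha) * F[t]."""
    forecasts = [series[0]]  # assumed start: first forecast = first actual
    for actual in series:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts  # the last element is the forecast for the next period


def mse(actuals, forecasts):
    """Mean squared error over paired actual and forecast values."""
    errors = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


series = [10.0, 12.0, 14.0, 16.0]  # hypothetical yearly values
window = 2

ma = moving_average(series, window)
es = exponential_smoothing(series, alpha=0.2)

# Compare both methods on the same periods (those the moving average covers).
mse_ma = mse(series[window:], ma[:-1])
mse_es = mse(series[window:], es[window:-1])

print("next-period forecasts:", ma[-1], es[-1])
print("MSE:", mse_ma, mse_es)
```

The method with the smaller MSE fits the historical data more closely, which is the criterion used in the comparison above.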

2.2.3 Use Conditional Formatting to highlight (Excel file “Data Conditional Formatting”)

1 Use Conditional Formatting to highlight Bhutan's cells with values greater than 20

2020

5 Use Conditional Formatting to highlight Bhutan's cells equal to 100% for 2006 - 2020
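Conditional Formatting rules such as "greater than 20" or "equal to 100" correspond to boolean masks in pandas. A minimal sketch with hypothetical Bhutan values indexed by year:

```python
import pandas as pd

# Hypothetical values for Bhutan, indexed by year.
bhutan = pd.Series([15.0, 25.0, 100.0], index=[2006, 2013, 2020])

# Each condition selects the cells a formatting rule would highlight.
above_20 = bhutan[bhutan > 20]
equal_100 = bhutan[bhutan == 100]

print(above_20)
print(equal_100)
```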

Chapter 3: Analyzing processed data on GOOGLE COLAB

3.1 Upload, connect Google Drive with Google Colab

After creating the Excel file in Chapter 1 and processing it in Chapter 2, upload it to Google Drive as a CSV file

Open Google Drive, create a folder, create a Google Colab file, and upload the file from your computer as file.csv

Generate code to connect Google Drive with Google Colab

Copy the link to the file and paste it into the code
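A minimal sketch of the connection step, assuming the CSV sits at the top level of the Drive (the report does not give the actual folder path). The `google.colab` module exists only inside Colab, so the sketch falls back to the current directory elsewhere.

```python
import os

try:
    # Available only inside Google Colab; mounting prompts for authorization.
    from google.colab import drive
    drive.mount("/content/drive")
    base_dir = "/content/drive/MyDrive"  # assumed location of the file
except ImportError:
    base_dir = "."  # fallback when running outside Colab

csv_path = os.path.join(base_dir, "file.csv")
print(csv_path)
```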

3.2 Run the commands to read the file with pandas, then report the size of the data set (the shape tuple)

The data set has 14 columns and 5 rows

It includes columns containing data about country and year, and criteria such as:

• Access to clean fuels and technologies for cooking (% of population)
• Access to clean fuels and technologies for cooking, rural (% of rural population)
• Access to clean fuels and technologies for cooking, urban (% of urban population)
• Access to electricity (% of population)
• Access to electricity, rural (% of rural population)
• Access to electricity, urban (% of urban population)
• Adjusted savings: carbon dioxide damage (% of GNI)
• Adjusted savings: consumption of fixed capital (% of GNI)
• Adjusted net savings, including particulate emission damage (% of GNI)
• Adjusted net savings, excluding particulate emission damage (% of GNI)
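Reading a CSV file and reporting its size could look like the sketch below. An in-memory string with made-up values stands in for file.csv so that the example is self-contained.

```python
import io

import pandas as pd

# Stand-in for file.csv; the values are hypothetical.
csv_text = (
    "Country,Year,Access to electricity (% of population)\n"
    "Bhutan,2019,99.0\n"
    "Nepal,2019,89.0\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# df.shape is a tuple: (number of rows, number of columns).
rows, cols = df.shape
print(rows, cols)
```

With the real file, `pd.read_csv` would take the path to file.csv on the mounted Drive instead of the in-memory string.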

3.3 Description of the data fields

From the above data we can see that there are many different columns and rows

They are also organized much more clearly, based on the 10 criteria outlined above
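One possible way to describe the data fields in pandas is to inspect the column types, summary statistics, and missing-value counts. The frame below is hypothetical, with a shortened column name `Access`.

```python
import pandas as pd

# Hypothetical frame standing in for the cleaned data set.
df = pd.DataFrame({
    "Country": ["Bhutan", "Nepal"],
    "Year": [2019, 2019],
    "Access": [99.0, 89.0],
})

print(df.dtypes)            # data type of each column
stats = df.describe()       # count, mean, std, min, quartiles, max (numeric columns)
missing = df.isna().sum()   # number of missing values per column

print(stats)
print(missing)
```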

3.4 Plot the distribution chart for at least 2 columns, comment

In general, with the raw data the input is simpler and the results are concise. In contrast, with the cleaned data the code is more complex, but the results are easy to read, and the chart is well balanced, with consistent colors

The data presented in chart format uses a harmonious color scheme, and the relative proportions are approximately the same

The chart has 5 columns showing access to clean fuels and technologies for 5 different countries. Singapore has the highest percentage at 100%, followed by Japan, Bhutan, and Vietnam, and Nepal has the lowest at 20%
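A distribution chart for two columns can be sketched with matplotlib histograms. The column names are shortened and the values are hypothetical; this is not the report's actual plotting code.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values for two indicators.
df = pd.DataFrame({
    "Clean fuels": [20.0, 45.0, 80.0, 100.0, 100.0],
    "Electricity": [90.0, 95.0, 99.0, 100.0, 100.0],
})

# One histogram per column, side by side.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, column in zip(axes, df.columns):
    ax.hist(df[column], bins=5)
    ax.set_title(column)
    ax.set_xlabel("% of population")
    ax.set_ylabel("Count")

fig.tight_layout()
```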

3.5 Draw at least 2 arbitrary graphs using seaborn, comment

The data is drawn as a box plot using the command “sns.boxplot”. Although the data about Access to electricity is fairly simple, it shows that the figures are in the range of 90 - 100%

The data is shown by 4 dotted lines from bottom to top. The countries differ over the years, but in general access to clean fuels and technologies has changed and grown over the years, as shown by the command “sns.lmplot”

The data is shown by 4 dotted lines from bottom to top. The countries differ over the years, but in general access to clean fuels and technologies has changed and grown unevenly over the years, as shown by the command “sns.lmplot”

From the command “pairplot” we get an overview of many different shapes, but most of them show small to medium differences

With the command “sns.heatmap” on the correlation matrix, we see that the criteria differ; most of the values range from 0.1 to 1
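The seaborn commands named above can be sketched as follows. The data frame is hypothetical, the column names are shortened, and the heatmap is assumed to be drawn on a correlation matrix.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical values for two countries over three years.
df = pd.DataFrame({
    "Country": ["Bhutan"] * 3 + ["Nepal"] * 3,
    "Year": [2018, 2019, 2020] * 2,
    "Clean fuels": [70.0, 75.0, 80.0, 25.0, 28.0, 32.0],
    "Electricity": [97.0, 99.0, 100.0, 90.0, 91.0, 93.0],
})

# Box plot: spread of electricity access per country.
ax_box = sns.boxplot(data=df, x="Country", y="Electricity")

# lmplot: trend of clean-fuel access over the years, one line per country.
grid = sns.lmplot(data=df, x="Year", y="Clean fuels", hue="Country", ci=None)

# Heatmap of the correlation matrix between the two indicators.
corr = df[["Clean fuels", "Electricity"]].corr()
plt.figure()
ax_heat = sns.heatmap(corr, annot=True, vmin=-1, vmax=1)
```

`sns.pairplot(df)` would add the pairwise scatter-plot overview mentioned above.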

Date posted: 15/04/2024, 19:01
