
Bachelor thesis in Computer Science: Development of a mobile application for price prediction of real estates


HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY

FACULTY OF COMPUTER SCIENCE AND ENGINEERING

BACHELOR THESIS

DEVELOPMENT OF A MOBILE APPLICATION FOR

PRICE PREDICTION OF REAL ESTATES

Major: Computer Science

Committee: Computer Science 2

Supervisor: Assoc. Prof. Dr. Quan Thanh Tho

Reviewer: Assoc. Prof. Dr. Bui Hoai Thang

FACULTY: Computer Science and Engineering

GRADUATION THESIS ASSIGNMENT

DEPARTMENT: Computer Science

Note: The student must attach this sheet to the first page of the thesis report.

FULL NAME: Phạm Tuấn Anh

Student ID: 1651006

1. Thesis title:

Development of a mobile application for price prediction of real estates

2. Tasks (requirements on content and initial data):

✔ Investigate background technologies and frameworks to build the mobile application.

✔ Analyze and design the desired mobile application.

✔ Research the theory of Linear Regression.

✔ Implement a price prediction AI model using the Linear Regression model.

✔ Implement a prototype.

3. Date of assignment:

4. Date of completion:

5. Supervisor's full name: Part supervised:

Assoc. Prof. Dr. Quản Thành Thơ

FOR THE FACULTY AND DEPARTMENT:

Approved by (preliminary grading):

Date (day, month, year):

THESIS DEFENSE GRADING SHEET

(For the supervisor/reviewer)

1. Student's full name: Phạm Tuấn Anh

Student ID: 1651006 Major (specialization): Computer Science

2. Topic: Development of a mobile application for price prediction of real estates

3. Supervisor/reviewer's full name: Assoc. Prof. Dr. Quản Thành Thơ

4. Overview of the thesis report:

- Number of hand-drawn drawings: Number of computer-generated drawings:

6. Main strengths of the thesis:

The student has successfully developed a mobile application that can process the collected data and visualize information in a meaningful way. The student has also employed an AI technique to predict prices of the real estates based on the historical data.

7. Main shortcomings of the thesis:

The thesis needs to be elaborated in many parts to provide more details and discussion about the technologies used.

8. Recommendation: Approved for defense □ Revise before defense □ Not approved for defense □

9. Three questions the student must answer before the committee:

a

b

c

10. Overall assessment (in words: excellent, good, average): Score: 7.3/10

Signature (with full name)

THESIS DEFENSE GRADING SHEET

(For the supervisor/reviewer)

1. Student's full name: Phạm Tuấn Anh

Student ID: 1651006 Major (specialization): Computer Science

2. Topic: Development of a mobile application for price prediction of real estates

3. Supervisor/reviewer's full name: Bùi Hoài Thắng

4. Overview of the thesis report:

- Number of hand-drawn drawings: Number of computer-generated drawings:

6. Main strengths of the thesis:

- Showed an understanding of some machine learning techniques based on Linear Regression, such as Simple Linear Regression, Multiple Linear Regression, and Polynomial Regression. Those techniques were used in predicting data, especially real estate prices.

- Designed and implemented a mobile application for users to predict real estate prices based on the location of the properties, such as City, District, Ward, Street.

7. Main shortcomings of the thesis:

8. Recommendation: Approved for defense X Revise before defense □ Not approved for defense □

9. Three questions the student must answer before the committee:

a

b

c

10. Overall assessment (in words: excellent, good, average): Average Score: 7.0/10

Signature (with full name)

Bùi Hoài Thắng

We declare that this thesis was carried out by ourselves under the guidance and supervision of Associate Prof. Dr. Quan Thanh Tho. The figures presented in this thesis for analysis and evaluation are the result of our own work. In addition, other figures from various resources used in this thesis are explicitly cited in the reference part. We will take full responsibility for any fraud detected in our thesis.

First of all, I would like to thank my supervisor, Associate Professor Doctor Quan Thanh Tho of the Faculty of Computer Science and Engineering at Ho Chi Minh City University of Technology, who instructed me in the knowledge and technologies applied to my topic. During the thesis period, he helped me learn new technology effectively and supported me with every small problem so that I could finish my work.

Besides, I am also extremely grateful to my family for providing me with unfailing support and continuous encouragement throughout my years of study. This accomplishment would never have been possible without them.

The real estate market in Vietnam is currently developing strongly and attracts many investors. When investing in or buying real estate, price is the factor that concerns investors the most. According to investors, searching and comparing real estate prices on many websites to make price predictions takes them a lot of time. Therefore, they desire to have a tool to solve this problem. In this thesis, we develop a mobile application that applies an artificial intelligence model to price prediction, and we propose directions for further development.

Declaration
Acknowledgement
Abstract
List of figures

Chapter 1 Introduction
1.1 Introduction to topic
1.2 General Objectives and Scope of topic

Chapter 2 Background
2.1 React Native
2.2 Node.js
2.3 Firebase
2.3.1 Firebase Authentication
2.3.2 Firebase Real-time Database
2.4 Python and supported libraries
2.5 PostgreSQL

Chapter 3 Price Prediction Model
3.1 Linear Regression
3.1.1 Linear Regression
3.1.2 Polynomial Features
3.1.3 Cost function
3.1.4 Model Evaluation
3.2 Data preparation
3.2.1 Data set using
3.2.2 Outliers Removal
3.3 Training Model

Chapter 4 Implementation
4.1 System Architecture
4.2 Use-case Diagram
4.3 Activity Diagram

Chapter 5 Conclusion
6.1 Achieved Results
6.2 Future work of the Thesis

BIBLIOGRAPHY

LIST OF FIGURES

3.1 Data set using
3.2 Describing Interquartile Range and Outliers
3.3 Training model with Simple Linear Regression
3.4 Training model with Polynomial Regression (degree of 6)
3.5 The best degree of predicting line in District 7 (2-degree model)
3.6 The best degree of predicting line in District 6 (3-degree model)
4.1 System Architecture
4.2 Use-case Diagram of the app
4.3 Register flow chart
4.4 Login flow chart
4.5 Log out flow chart
4.6 View profile flow chart
4.7 Visualize chart flow chart
4.8 Manage users flow chart


1.1 Introduction to topic

Real estate price is one of the vital factors affecting the investment decisions of real estate investors. Therefore, they need a forecasting model to help them predict the price of a particular property. With the fast development of artificial intelligence, AI models have been applied to price prediction.

In this thesis, we develop a mobile application incorporating an AI model, which is trained on collected data and then generates the prediction result, in order to give investors a useful tool when investing in or buying a property. A mobile application is convenient to carry when investors travel. When they reach the destination, they just choose the location and type of land in the app, and the system automatically generates a chart that describes the predicted price.

1.2 General Objectives and Scope of topic

The main aim of this topic is to develop a mobile application with the following main features:

- Register/Login/Logout: Before using our system, users need to sign up for a unique account. After that, users can log in to and log out of the system; they must be logged in to use it.

- Visualize chart: After users provide enough information, the application generates a chart based on that information.

- View profile: Users and admins can view their own profile information.

- Manage users: The administrator has the right to manage users. He/she can view the list of all users who have used the application and the detailed information of each person, and find specific users by name or email. In addition, the admin has permission to delete users.

- Display price-prediction colors on Google Maps: If the price of land has an upward trend, green is displayed on that area; otherwise red is displayed.
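The color rule in the last feature can be sketched in a few lines of Python. This is only an illustration: the function name `trend_color` and the way the trend is judged (last predicted price compared with the first) are assumptions, not the logic of the actual app.

```python
def trend_color(predicted_prices):
    """Return "green" for an upward price trend, otherwise "red".

    The trend is judged by comparing the last predicted price with the
    first one; this comparison rule is an assumption made for illustration.
    """
    if len(predicted_prices) < 2:
        raise ValueError("need at least two predicted prices")
    return "green" if predicted_prices[-1] > predicted_prices[0] else "red"


# Hypothetical predicted prices for one area (made-up values).
print(trend_color([52.0, 53.5, 55.1]))  # upward trend
print(trend_color([52.0, 51.0, 49.8]))  # not upward
```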

Since the amount of real estate data is very large, we limit our system as follows:

- The area covered is Ho Chi Minh City, Vietnam.

- We predict prices of land, not of whole real estate properties. Real estate usually consists of buildings and land. However, buildings on the same type of land have a wide range of characteristics, which lead to different prices for a property.

- The app should be deployable on Android devices.

- The app should handle at least 1000 users without any problems.

- The app's response time for any function should be less than 10 seconds.

- The app size is at most 200 MB.

Chapter 2

Background

In this chapter, we discuss the technologies used to build the application.

Contents

2.1 React Native
2.2 Node.js
2.3 Firebase
2.3.1 Firebase Authentication
2.3.2 Firebase Real-time Database
2.4 Python and supported libraries
2.5 PostgreSQL


2.1 React Native

React Native is an open-source framework released by Facebook in 2015. It is used for creating mobile apps for both the Android and iOS platforms in one common language, JavaScript. Because the same code base renders the UI on both platforms, React Native apps save development time.

2.2 Node.js

Modern apps have several requirements that cannot be met by the app itself, such as central data storage, communication routing, and user management. To provide such services, apps rely on an external software component known as the back-end. The back-end is executed on one or more remote servers, listens to network requests from the devices that run the app, and provides them with the services those requests require. With Node.js, the back-end is written almost entirely in JavaScript.

2.3 Firebase

2.3.1 Firebase Authentication

Most apps need to know the identity of a user. Knowing a user's identity allows an app to securely save user data in the cloud and provide the same personalized experience across all of the user's devices.

Firebase Authentication provides backend services and ready-made UI libraries to authenticate users to the app. It supports authentication using passwords, phone numbers, popular federated identity providers like Google, Facebook and Twitter, and more.

2.3.2 Firebase Real-time Database

The Firebase Real-time Database is a NoSQL database in which we can store data and sync it between our users in real time. Real-time syncing makes it easy for users to access their data from any device, web or mobile. The Real-time Database integrates with Firebase Authentication to provide simple and intuitive authentication.

2.4 Python and supported libraries

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Python supports modules and packages, which encourages program modularity and code reuse. Python is becoming increasingly popular in Machine Learning along with its frameworks and standard libraries. One of the reasons behind Python's increasing popularity is its wealth of libraries. Some of the libraries used for training the model are as follows:

- NumPy: NumPy is a Python library used for working with arrays. It is an open-source project and can be used freely. In Python, lists serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is much faster than traditional Python lists.

- SciPy: SciPy is a scientific computation library that uses NumPy underneath. It provides more utility functions for optimization, statistics and signal processing. SciPy optimizes and adds functions that are frequently used with NumPy.

- Matplotlib: Matplotlib is a plotting library used for 2D graphics. Matplotlib can be used in Python scripts, the Jupyter notebook, web application servers, and more.

- Pandas: Pandas is an open-source library made for working with relational or labeled data. It provides various data structures and operations for manipulating numerical data and time series. Pandas is fast and offers high performance and productivity.

- Scikit-learn: Scikit-learn is a library for machine learning in Python. It contains many efficient tools for machine learning and statistical modeling.

- Flask: Flask is a web application framework written in Python. It is a good choice for building an API for a machine learning service because it is easy to use and works with many Python libraries.

2.5 PostgreSQL

PostgreSQL is a powerful, open-source object-relational database system that uses and extends the SQL language, combined with many features that safely store and scale the most complicated data workloads. PostgreSQL runs on all major operating systems and is the open-source relational database of choice for many people and organizations. It comes with many features aimed at helping developers build applications.

Chapter 3

Price Prediction Model

In this chapter, we discuss the theory of the Linear Regression model, the processing of the data sets, and the building of the price prediction model based on the theory of Linear Regression.

Contents

3.1 Linear Regression
3.1.1 Linear Regression
3.1.2 Polynomial Features
3.1.3 Cost Function
3.1.4 Model Evaluation
3.2 Data preparation
3.2.1 Data set using
3.2.2 Outliers Removal
3.3 Training model


3.1 Linear Regression

3.1.1 Linear Regression

Linear Regression is a common and simple machine learning algorithm used for predictive analysis in statistics. In statistics, linear regression is a linear approach to modeling the relationship between an output (the dependent variable) and one or more inputs (the independent variables). Linear Regression models have been applied in many real-life areas to solve predictive problems. Here we apply a Linear Regression model to predict the future price of a particular piece of land based on the price of that property in the past. There are two types of Linear Regression: Simple Linear Regression and Multiple Linear Regression.

Simple Linear Regression is a linear model that has only one dependent variable (the output) and only one independent variable (the input). The Simple Linear model can be represented by the following equation:

$$y = b_0 + b_1 x$$

Where:

$b_0$ is called the intercept

$b_1$ is the coefficient of $x$, which is the input (independent) variable
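As a minimal sketch of how the intercept and the coefficient of a simple linear model can be estimated from data, the Python function below uses the closed-form least-squares solution. This is a swapped-in technique for illustration (the thesis itself refers to Gradient Descent and library support); the function name and the sample data are made up.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing the squared error of y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up data: month index -> land price (illustrative values only).
months = [1, 2, 3, 4, 5]
prices = [50.0, 52.0, 54.0, 56.0, 58.0]
b0, b1 = fit_simple_linear_regression(months, prices)
print(b0, b1)
print(b0 + b1 * 6)  # predicted price for month 6
```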

Multiple Linear Regression is also a linear model, with one target (dependent) variable and a set of independent variables $\{x_1, x_2, \dots, x_n\}$. Multiple Linear Regression can be seen as a generalization of Simple Linear Regression. It is described by the following equation:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
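Evaluating a multiple linear model for one sample amounts to a weighted sum of the inputs plus the intercept. The short Python sketch below shows this; the function name `predict`, the two example inputs, and all numeric values are hypothetical.

```python
def predict(intercept, coefficients, features):
    """Evaluate y = b0 + b1*x1 + ... + bn*xn for one sample."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical model with two inputs (e.g. month index and plot area).
b0 = 10.0
b = [2.0, 0.5]
print(predict(b0, b, [3.0, 100.0]))
```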

3.1.2 Polynomial Features

In Simple Linear Regression, the prediction line is a straight line that shows the relationship between the target variable and the input variable. The data used for training is often complicated, so a straight line no longer fits it well enough.

To solve this problem, we use another approach that better captures the important relationships between the input variables and the target variable. This approach uses Polynomial Features, which are new input features created by raising existing features to an exponent. The result is another type of Linear Regression called Polynomial Regression, which extends the linear model by adding extra features obtained by raising each of the original features to a power. It has the following form:

$$y = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n$$
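To make the idea of polynomial features concrete, the following Python sketch expands a single input into its powers and evaluates a polynomial model on it. The function names and the coefficient values are assumptions chosen for illustration; a library such as scikit-learn offers equivalent functionality.

```python
def polynomial_features(x, degree):
    """Return [x, x**2, ..., x**degree] for a single input value."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return [x ** k for k in range(1, degree + 1)]


def predict_polynomial(intercept, coefficients, x):
    """Evaluate y = b0 + b1*x + b2*x**2 + ... + bn*x**n."""
    features = polynomial_features(x, len(coefficients))
    return intercept + sum(b * f for b, f in zip(coefficients, features))


print(polynomial_features(2.0, 3))
print(predict_polynomial(1.0, [0.5, 0.25], 2.0))
```

Because the model stays linear in the coefficients, the same fitting procedure as for ordinary linear regression can be applied to the expanded features.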

3.1.3 Cost Function

J is the Cost Function

h(x) is the prediction function (the function that describes the relationship between the input variables and the target variable)

y is the real value in the data samples

To find the prediction function h(x) (the prediction line), we need to find the parameters that minimize the Cost Function. One method for finding this minimum is Gradient Descent, an iterative optimization algorithm for finding the minimum of a function.
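A minimal sketch of Gradient Descent for the simple linear model is given below. It assumes a mean-squared-error cost of the form J = 1/(2m) · Σ (h(x_i) − y_i)², which is a common choice but is an assumption here; the learning rate, iteration count, and data are likewise illustrative.

```python
def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000):
    """Fit y = b0 + b1 * x by batch gradient descent.

    Assumed cost (not taken from the thesis):
        J(b0, b1) = 1 / (2m) * sum((h(x_i) - y_i) ** 2)
    with h(x) = b0 + b1 * x and m the number of samples.
    """
    m = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to b0 and b1.
        grad_b0 = sum(errors) / m
        grad_b1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Step against the gradient to reduce the cost.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 1 + 2x
b0, b1 = gradient_descent(xs, ys)
print(round(b0, 3), round(b1, 3))
```

The learning rate has to be small enough for the updates to converge; too large a value makes the parameters diverge, while too small a value needs many more iterations.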

3.1.4 Model Evaluation

The main objective of Linear Regression is to find a prediction line that minimizes the prediction error over all the data points. Thus, we need metrics to evaluate the accuracy of the trained models. There are many evaluation metrics, but we focus mainly on two common ones: Root Mean Square Error (RMSE) and the R-squared score (R2).

Root Mean Square Error (RMSE)

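Root Mean Square Error (RMSE) is the square root of the average of the squared differences between the actual values and the predicted values in the data set:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Where:

$y_i$ is the actual value

$\hat{y}_i$ is the predicted value

$n$ is the number of data samples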
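The RMSE computation can be sketched in plain Python as follows. The function name `rmse` and the sample values are illustrative only and are not results from the thesis.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Illustrative values only, not results from the thesis.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
print(rmse(actual, predicted))
```

A lower RMSE means the predictions lie closer to the actual values, and the result is expressed in the same unit as the target variable.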

Date posted: 31/07/2024, 10:17
