
DOCUMENT INFORMATION

Basic information

Title: Database Systems for Boston House Price Prediction
Authors: Nguyễn Văn Hiếu, Đặng Anh Quốc, Nguyễn Đức Hùng, Trần Tuấn Minh
Supervisor: Nguyen Huy Anh, Lecturer
School: VIETNAM NATIONAL UNIVERSITY, HANOI INTERNATIONAL SCHOOL
Major: Database Systems
Type: Final Report
Year of publication: 2023
City: Hanoi
Pages: 18
File size: 1.52 MB


Group 9: Database Systems for Boston House Price prediction 11/01/2023

VIETNAM NATIONAL UNIVERSITY, HANOI INTERNATIONAL SCHOOL

---*****---

FINAL REPORT: DATABASE SYSTEMS

Topic: Database Systems for Boston House Price Prediction


Contributions of each group member:

Relational schema, retrieving the database, writing the 10 business questions

Relational schema, Insert real sample data

Inserting real sample data

Create physical database


Table of contents

I Introduction
II Data dictionary
III Analyzing and drawing the ERD
IV Relational Schema
V Build a database using SQL Server
VI Business Questions
VII Conclusion
VIII Reference


I Introduction:

Housing is one of the most basic human needs, along with food, water, and other necessities. As people's living standards have improved, demand for housing has increased rapidly. Housing markets have a favorable impact on a country's currency, which is a significant factor in the national economy. Numerous factors influence house sale prices, including the size of the property, its location, the materials used in construction, the age of the property, and the number of bedrooms and garages.

A house-price prediction model can benefit home purchasers, property investors, and home builders. It can provide them with information such as the valuation of current market house prices, which helps them determine house pricing. It can also help potential purchasers identify the features of a property that fit their budget.

In this project, we develop and evaluate the performance and predictive power of a model trained and tested on data collected from houses in Boston's suburbs. Once we get a good fit, we will use this model to predict the monetary value of a house located in the Boston area. A model like this would be very valuable for a real estate agent, who could make use of the information it provides on a daily basis.

II Data dictionary:

The dataset used in this project comes from the UCI Machine Learning Repository. The data was collected in 1978, and each of the 506 entries represents aggregate information about 13 features of homes from various suburbs of Boston. Additional data about buyers, sellers, and investors was generated by us from information collected on the internet. The 13 attributes are described below:


Attribute Information:

- CRIM: Per capita crime rate by town

- ZN: Proportion of residential land zoned for lots over 25,000 sq.ft

- INDUS: Proportion of non-retail business acres per town

- CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)

- NOX: Nitric Oxide concentration (parts per 10 million)

- RM: The average number of rooms per dwelling

- AGE: Proportion of owner-occupied units built before 1940

- DIS: Weighted distances to five Boston employment centers

- RAD: Index of accessibility to radial highways

- TAX: Full-value property-tax rate per 10,000 dollars

- PTRATIO: Pupil-teacher ratio by town

- LSTAT: % lower status of the population

- MEDV: Median value of owner-occupied homes in 1000 dollars
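
As an illustration of how one record with these attributes might be represented in code, here is a minimal Python sketch. The class name `MarketRecord` and the lower-case field names are assumptions made for this example and do not come from the report; the sample values are those of the first MarketData row inserted in Section V.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MarketRecord:
    """One record holding the 13 attributes listed above (names are illustrative)."""
    crim: float     # per capita crime rate by town
    zn: float       # proportion of residential land zoned for large lots
    indus: float    # proportion of non-retail business acres per town
    chas: int       # 1 if the tract bounds the Charles River, otherwise 0
    nox: float      # nitric oxide concentration
    rm: float       # average number of rooms per dwelling
    age: float      # proportion of owner-occupied units built before 1940
    dis: float      # weighted distances to five Boston employment centers
    rad: int        # index of accessibility to radial highways
    tax: int        # full-value property-tax rate
    ptratio: float  # pupil-teacher ratio by town
    lstat: float    # % lower status of the population
    medv: float     # median value of owner-occupied homes

    def __post_init__(self) -> None:
        # CHAS is a dummy variable, so only 0 and 1 are meaningful.
        if self.chas not in (0, 1):
            raise ValueError("CHAS must be 0 or 1")


# Values taken from the first MarketData INSERT statement in the report.
record = MarketRecord(0.00632, 18, 2.31, 0, 0.538, 6.575, 65.2, 4.09,
                      1, 296, 15.3, 4.98, 24)
print(len(fields(record)), record.medv)
```

Validating CHAS at construction time is one possible design choice; in the database itself the same constraint is expressed by the `bit` column type.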


III Analyzing and drawing the ERD

The entity relationship diagram (ERD) can be thought of as the database's design sketch. An ERD provides a visualization of the database design, so it serves the following functions:

- It supports the definition of information system requirements across the organization and assists users in planning how to organize data. It facilitates planning before the tables are built.

- It can be used as a document to help others comprehend the core of the database.

- It depicts the database's logical structure so that users can understand it.

- Once the relational database has been deployed, the ERD can still be used as a reference point if debugging is needed or a business process has to be re-established later.

Analyzing the entities:

+ Property: stores the address, number of floors, year of construction, and area of each property, together with the ID assigned to each property.

+ Person: stores the people who manage the properties (linked through the propertyID). Each person falls into one of three categories (Seller, Customer, Investor), identified by the ID of the Roles table.

+ Roles: stores the types of people (Seller, Customer, Investor).

+ Status: stores the status of a property (sold, on sale, fixing), identified by the ID of the Status table.

+ HousePrices: stores the sale date and original selling price of each property, linked via propertyID.

+ MarketData: provides information about the real estate market by address and date. Each record is identified by its primary key, MarketDataID.

+ Prediction: provides the predicted date and price of a property, linked via propertyID. The MarketData table influences the Prediction table via MarketDataID.

+ PropertyType: stores the classification of a property (Villa, Apartment, Cabin, Penthouse), identified by TypeID.

+ Interior: describes the interior of each property, linked via PropertyID.


IV Relational Schema:

The relational schema is generated from the ERD. It displays the tables that correspond to the entities and gives table designers in SQL Server a more detailed view of the table implementation.


V Build a database using SQL Server

1 Create Database:

- Create and use the database:

-- Create database
CREATE DATABASE BostonHousePricePredictionDB
GO

-- Use database
USE BostonHousePricePredictionDB
GO

- Create table Role

-- 1. Create table Role
CREATE TABLE Role (
    roleID int IDENTITY(1, 1) NOT NULL,
    roleName varchar(30) NOT NULL,
    CONSTRAINT [PK_Role] PRIMARY KEY CLUSTERED -- set roleID as PK
    (
        roleID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Status

-- 2. Create table Status
CREATE TABLE Status (
    statusID int IDENTITY(1, 1) NOT NULL,
    statusName varchar(30) NOT NULL,
    CONSTRAINT [PK_Status] PRIMARY KEY CLUSTERED -- set statusID as PK
    (
        statusID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table PropertyType

-- 3. Create table PropertyType
CREATE TABLE PropertyType (
    typeID int IDENTITY(1, 1) NOT NULL,
    typeName varchar(30) NOT NULL,
    CONSTRAINT [PK_PropertyType] PRIMARY KEY CLUSTERED -- set typeID as PK
    (
        typeID ASC -- key column lost at a page break in the source; restored from the column definition
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table MarketData

-- 4. Create table MarketData
CREATE TABLE MarketData (
    marketDataID int IDENTITY(1, 1) NOT NULL,
    date date NOT NULL,
    address varchar(100) NOT NULL,
    CRIM float NOT NULL,
    ZN float NOT NULL,
    INDUS float NOT NULL,
    CHAS bit NOT NULL,
    NOX float NOT NULL,
    RM float NOT NULL,
    AGE float NOT NULL,
    DIS float NOT NULL,
    RAD int NOT NULL,
    TAX int NOT NULL,
    PTRATIO float NOT NULL,
    LSTAT float NOT NULL,
    MEDV float NOT NULL,
    CONSTRAINT [PK_MarketData] PRIMARY KEY CLUSTERED -- set marketDataID as PK
    (
        marketDataID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

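- Create table Person

-- 5. Create table Person
CREATE TABLE Person (
    personID int IDENTITY(1, 1) NOT NULL,
    name varchar(50) NOT NULL,
    phone varchar(11) NOT NULL,
    address varchar(100) NOT NULL,
    gender bit NOT NULL,
    -- The lines between "gender" and "personID ASC" were lost at a page break in the source.
    -- The roleID column and the PK_Person constraint name below are assumptions, reconstructed
    -- from the five-value INSERT INTO Person statements and the naming used by the other tables.
    roleID int FOREIGN KEY REFERENCES Role(roleID) NOT NULL,
    CONSTRAINT [PK_Person] PRIMARY KEY CLUSTERED -- set personID as PK
    (
        personID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO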

- Create table Property

-- 6. Create table Property
CREATE TABLE Property (
    propertyID int IDENTITY(1, 1) NOT NULL,
    squareFootage decimal(10, 4) NOT NULL,
    floor int NOT NULL,
    yearBuilt date NOT NULL,
    address varchar(100) NOT NULL,
    saleDate date NOT NULL,
    salePrice decimal(10, 4) NOT NULL, -- precision and scale are missing in the source; (10, 4) is an assumption matching squareFootage
    typeID int FOREIGN KEY REFERENCES PropertyType(typeID) NOT NULL,
    statusID int FOREIGN KEY REFERENCES Status(statusID) NOT NULL,
    personID int FOREIGN KEY REFERENCES Person(personID) NOT NULL,
    CONSTRAINT [PK_Property] PRIMARY KEY CLUSTERED -- set propertyID as PK
    (
        propertyID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Interior

-- 7. Create table Interior
CREATE TABLE Interior (
    propertyID int IDENTITY(1, 1) NOT NULL,
    numbedRooms int NOT NULL,
    numBathrooms int NOT NULL,
    kitchen bit NOT NULL,
    pool bit NOT NULL,
    garden bit NOT NULL,
    garage bit NOT NULL,
    CONSTRAINT FK_Property FOREIGN KEY (propertyID) REFERENCES Property(propertyID), -- set propertyID as FK to Property(propertyID)
    CONSTRAINT [PK_Interior] PRIMARY KEY CLUSTERED -- set propertyID as PK
    (
        propertyID ASC -- key column lost at a page break in the source; restored from the column definition
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Prediction

-- 8. Create table Prediction
CREATE TABLE Prediction (
    propertyID int NOT NULL, -- part of the composite PK
    marketDataID int NOT NULL, -- part of the composite PK
    CONSTRAINT FK_MarketData FOREIGN KEY (marketDataID) REFERENCES MarketData(marketDataID), -- set marketDataID as FK to MarketData(marketDataID)
    CONSTRAINT FK_Prediction FOREIGN KEY (propertyID) REFERENCES Property(propertyID), -- set propertyID as FK to Property(propertyID)
    predictionDate date NOT NULL,
    predictionPrice decimal(10, 4) NOT NULL, -- precision and scale are missing in the source; (10, 4) is an assumption
    CONSTRAINT [PK_Prediction] PRIMARY KEY CLUSTERED -- set (propertyID, marketDataID) as PK
    (
        propertyID, marketDataID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO
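
The Prediction table uses a composite primary key, so each (propertyID, marketDataID) pair may appear only once. The following Python sketch illustrates that behavior. It uses the standard-library `sqlite3` module as a stand-in for SQL Server so that it is self-contained, which is an assumption of this example rather than the report's setup. The helper `try_insert` and the rejected duplicate row are hypothetical; the two accepted rows come from the report's INSERT statements, with dates rewritten in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Prediction (
        propertyID      INTEGER NOT NULL,
        marketDataID    INTEGER NOT NULL,
        predictionDate  TEXT    NOT NULL,
        predictionPrice REAL    NOT NULL,
        PRIMARY KEY (propertyID, marketDataID)  -- composite key
    )
    """
)


def try_insert(conn, row):
    """Insert one row; return False if the primary key rejects it."""
    try:
        conn.execute("INSERT INTO Prediction VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(conn, (1, 1, "2023-01-01", 9992.213))
second = try_insert(conn, (1, 2, "2023-03-01", 10000.324))
# Hypothetical row that reuses the key (1, 1) and should be rejected.
duplicate = try_insert(conn, (1, 1, "2023-02-01", 9500.0))
stored = conn.execute("SELECT COUNT(*) FROM Prediction").fetchone()[0]
print(first, second, duplicate, stored)
```

The same property can therefore have several predictions, as long as each one is tied to a different MarketData record.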

2 Insert data:

- INSERT INTO Role

INSERT INTO Role VALUES('Seller');
INSERT INTO Role VALUES('Customer');
INSERT INTO Role VALUES('Investor');
GO

- INSERT INTO Status

INSERT INTO Status VALUES('On Sale');
INSERT INTO Status VALUES('Sold');
INSERT INTO Status VALUES('Fixing');
GO

- INSERT INTO PropertyType

INSERT INTO PropertyType VALUES('Villa');
INSERT INTO PropertyType VALUES('Cabin');
INSERT INTO PropertyType VALUES('Apartment');
INSERT INTO PropertyType VALUES('Villa');
GO

- INSERT INTO Person

INSERT INTO Person VALUES('Andrew Garfield', '08172645781', '408 5th Ave, Brooklyn, United States', 0, 1);
INSERT INTO Person VALUES('Charlie Puth', '04716284625', '3548 S Jefferson St #52, Falls Church, United States', 0, 2);
INSERT INTO Person VALUES('Selena Gomez', '0751827458', '1455 S Lamb Blvd, Las Vegas, United States', 0, 3);

- INSERT INTO Property

INSERT INTO Property VALUES(5000, 2, '04/11/2018', '974 Blue Hill Avenue, Boston, United States', '01/09/2023', 3012, 1, 1, 1);
INSERT INTO Property VALUES(3670, 1, '06/20/2012', '521 Washington St, Boston, United States', '10/10/2019', 5921, 1, 1, 1);
INSERT INTO Property VALUES(2830, 3, '07/10/2015', '415 American Legion Hwy, Boston, United States', '01/07/2020', 5921, 2, 2, 1);

- INSERT INTO Interior

INSERT INTO Interior VALUES(3, 2, 1, 1, 0, 1);
INSERT INTO Interior VALUES(2, 2, 1, 0, 0, 0);
INSERT INTO Interior VALUES(3, 1, 1, 0, 1, 1);
INSERT INTO Interior VALUES(1, 1, 1, 1, 0, 0);
INSERT INTO Interior VALUES(2, 2, 1, 0, 1, 1);

- INSERT INTO MarketData

INSERT INTO MarketData VALUES('01/01/2023', 'Blue Hill Avenue, Boston, United States', 0.00632, 18, 2.31, 0, 0.538, 6.575, 65.2, 4.09, 1, 296, 15.3, 4.98, 24);
INSERT INTO MarketData VALUES('03/01/2023', 'Blue Hill Avenue, Boston, United States', 0.02731, 0, 7.07, 0, 0.469, 6.421, 78.9, 4.9671, 2, ...
-- [the rest of this statement and the start of the next one were lost at a page break in the source]
... United States', 0.02729, 0, 7.07, 0, 0.469, 7.185, 61.1, 4.9671, 2, 242, 17.8, 4.03, 34.7);

- INSERT INTO Prediction

INSERT INTO Prediction VALUES(1, 1, '01/01/2023', 9992.213);
INSERT INTO Prediction VALUES(1, 2, '03/01/2023', 10000.324);
INSERT INTO Prediction VALUES(1, 3, '06/01/2023', 8823.534);

VI Business Questions:

1. Display all houses in Boston.

SELECT *
FROM Property

Output: [result screenshot not reproduced]

2. Print the houses with the lowest sale price.

SELECT *
FROM Property
WHERE salePrice = (SELECT MIN(salePrice) FROM Property)

Output: [result screenshot not reproduced]

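3. Print the houses whose address contains "Adams".

-- The SELECT and FROM lines were lost at a page break in the source; restored by analogy with question 4
SELECT *
FROM Property
WHERE Property.address LIKE '%' + 'Adams' + '%'

Output: [result screenshot not reproduced]

4. Find the homes with the lowest sale price on Blue Hill Avenue.

SELECT *
FROM Property
WHERE Property.address LIKE '%' + 'Blue Hill Avenue' + '%'
ORDER BY Property.salePrice -- lowest sale price first

Output: [result screenshot not reproduced]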

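5. Find the 10 homes with the lowest predicted price in June 2023.

SELECT TOP 10 *
FROM Property
INNER JOIN Prediction ON Property.propertyID = Prediction.propertyID
WHERE YEAR(predictionDate) = 2023 AND MONTH(predictionDate) = 6 -- year filter added so that June of other years is excluded
ORDER BY predictionPrice

Output: [result screenshot not reproduced]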
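
A month number alone matches that month in every year, so a query for "June 2023" needs to constrain the year as well. The following Python sketch illustrates the difference. It uses the standard-library `sqlite3` module as a stand-in for SQL Server (an assumption made to keep the example self-contained), stores dates as ISO strings, and filters with a half-open date range. The function `lowest_predictions` and the 2024 row are hypothetical; the other rows come from the report's INSERT statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Prediction (propertyID INTEGER, marketDataID INTEGER, "
    "predictionDate TEXT, predictionPrice REAL)"
)
conn.executemany(
    "INSERT INTO Prediction VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2023-01-01", 9992.213),
        (1, 2, "2023-03-01", 10000.324),
        (1, 3, "2023-06-01", 8823.534),
        (2, 3, "2024-06-01", 7000.0),  # hypothetical row from another year
    ],
)


def lowest_predictions(conn, year, month, limit=10):
    """Return (propertyID, predictionPrice) for one calendar month, cheapest first."""
    if month == 12:
        next_year, next_month = year + 1, 1
    else:
        next_year, next_month = year, month + 1
    start = f"{year:04d}-{month:02d}-01"
    end = f"{next_year:04d}-{next_month:02d}-01"
    query = """
        SELECT propertyID, predictionPrice
        FROM Prediction
        WHERE predictionDate >= ? AND predictionDate < ?
        ORDER BY predictionPrice
        LIMIT ?
    """
    return conn.execute(query, (start, end, limit)).fetchall()


june_2023 = lowest_predictions(conn, 2023, 6)
# Filtering on the month alone also picks up the 2024 row.
any_june = conn.execute(
    "SELECT propertyID FROM Prediction "
    "WHERE strftime('%m', predictionDate) = '06' ORDER BY propertyID"
).fetchall()
print(june_2023, any_june)
```

Comparing against a date range rather than applying a function to the column is a design choice of this sketch; it also tends to let the database use an index on the date column.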

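6. Find the 10 homes with the lowest predicted prices between September and December 2023.

-- The SELECT line was lost at a page break in the source; restored by analogy with question 5
SELECT TOP 10 *
FROM Property
INNER JOIN Prediction ON Property.propertyID = Prediction.propertyID
WHERE YEAR(predictionDate) = 2023 AND MONTH(predictionDate) BETWEEN 9 AND 12 -- year filter added to match the question
ORDER BY predictionPrice

Output: [result screenshot not reproduced]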

7. Find homes by status.

CREATE PROCEDURE PropertyStatus @statusName nvarchar(30)
AS
SELECT *
FROM Property
INNER JOIN Status ON Property.statusID = Status.statusID
WHERE Status.statusName = @statusName
GO -- end the batch here so that the EXEC below is not compiled into the procedure body

EXEC PropertyStatus @statusName = 'On Sale'

Output: [result screenshot not reproduced]
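
The stored procedure passes the status name as a parameter instead of concatenating it into the query text. A rough equivalent from application code is a parameterized query, sketched below in Python. The standard-library `sqlite3` module stands in for SQL Server here (an assumption made so the example runs on its own), the tables are reduced to the columns the query needs, and the function name `properties_by_status` is hypothetical. The rows follow the report's INSERT statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Status (statusID INTEGER PRIMARY KEY, statusName TEXT NOT NULL);
CREATE TABLE Property (
    propertyID INTEGER PRIMARY KEY,
    address    TEXT NOT NULL,
    statusID   INTEGER NOT NULL REFERENCES Status(statusID)
);
INSERT INTO Status VALUES (1, 'On Sale'), (2, 'Sold'), (3, 'Fixing');
INSERT INTO Property VALUES
    (1, '974 Blue Hill Avenue, Boston, United States', 1),
    (2, '521 Washington St, Boston, United States', 1),
    (3, '415 American Legion Hwy, Boston, United States', 2);
""")


def properties_by_status(conn, status_name):
    """Return (propertyID, address) pairs for the given status name."""
    query = """
        SELECT Property.propertyID, Property.address
        FROM Property
        INNER JOIN Status ON Property.statusID = Status.statusID
        WHERE Status.statusName = ?   -- bound parameter, like @statusName
        ORDER BY Property.propertyID
    """
    return conn.execute(query, (status_name,)).fetchall()


on_sale = properties_by_status(conn, "On Sale")
print(on_sale)
```

Binding the value through a placeholder keeps user input out of the SQL text, which is the same protection the procedure parameter gives.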

8. Print the predicted prices of homes currently on sale in September.

SELECT *
FROM Property
INNER JOIN Status ON Status.statusID = Property.statusID
INNER JOIN Prediction ON Prediction.propertyID = Property.propertyID
WHERE Status.statusID = 1 AND MONTH(Prediction.predictionDate) = 9 -- statusID 1 is 'On Sale'

Output: [result screenshot not reproduced]

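9. Print the owners of more than two properties.

SELECT Person.name AS Name, COUNT(propertyID) AS NumberOfProperties
FROM Property
INNER JOIN Person ON Property.personID = Person.personID
GROUP BY Person.personID, Person.name -- group by ID as well, so that two people with the same name are not merged
HAVING COUNT(propertyID) > 2;

Output: [result screenshot not reproduced]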
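
Whether the owners are grouped by name or by personID matters as soon as two different people share a name. The Python sketch below illustrates this with the standard-library `sqlite3` module as a stand-in for SQL Server (an assumption made to keep the example self-contained). The data is hypothetical: persons 1 and 2 are deliberately given the same name, and the tables are reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (personID INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Property (
    propertyID INTEGER PRIMARY KEY,
    personID   INTEGER NOT NULL REFERENCES Person(personID)
);
-- Hypothetical data: persons 1 and 2 share a name but are different people.
INSERT INTO Person VALUES (1, 'Andrew Garfield'), (2, 'Andrew Garfield'), (3, 'Selena Gomez');
INSERT INTO Property VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
""")

BY_NAME = """
    SELECT Person.name, COUNT(Property.propertyID)
    FROM Property
    INNER JOIN Person ON Property.personID = Person.personID
    GROUP BY Person.name
    HAVING COUNT(Property.propertyID) > 2
    ORDER BY Person.name
"""

BY_ID = """
    SELECT Person.name, COUNT(Property.propertyID)
    FROM Property
    INNER JOIN Person ON Property.personID = Person.personID
    GROUP BY Person.personID, Person.name
    HAVING COUNT(Property.propertyID) > 2
    ORDER BY Person.name
"""

grouped_by_name = conn.execute(BY_NAME).fetchall()
grouped_by_id = conn.execute(BY_ID).fetchall()
print(grouped_by_name)
print(grouped_by_id)
```

Grouping by name adds the two namesakes' properties together, so the name passes the `> 2` filter even though neither person owns more than two properties. Grouping by personID counts each person separately.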

10. Use a trigger with a transaction rollback to suppress unwanted inserts.

CREATE TRIGGER CheckInsert
ON Person
FOR INSERT, UPDATE
AS
BEGIN
    -- Roll back the transaction that fired the trigger, so the insert or update is discarded
    ROLLBACK TRAN
    PRINT('CHECK INSERT CAREFULLY!!!')
END
GO

INSERT INTO Person VALUES('Melody Mark', '08172645723', '408 Adams St, Brooklyn, United States', 1, 2)
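
The trigger above rolls back every insert or update on Person. A trigger can also reject only the rows that fail a check, as the Python sketch below illustrates. It uses the standard-library `sqlite3` module as a stand-in for SQL Server, so the trigger is written in SQLite syntax (`BEFORE INSERT ... WHEN ... RAISE(ABORT, ...)`) rather than T-SQL. The validation rule (a phone number shorter than 10 characters is rejected), the helper `insert_person`, and the rejected row are all hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    personID INTEGER PRIMARY KEY AUTOINCREMENT,
    name     TEXT    NOT NULL,
    phone    TEXT    NOT NULL,
    address  TEXT    NOT NULL,
    gender   INTEGER NOT NULL,
    roleID   INTEGER NOT NULL
);
-- Hypothetical rule: abort the insert when the phone number is too short.
CREATE TRIGGER CheckInsert
BEFORE INSERT ON Person
WHEN length(NEW.phone) < 10
BEGIN
    SELECT RAISE(ABORT, 'CHECK INSERT CAREFULLY!!!');
END;
""")


def insert_person(conn, name, phone, address, gender, role_id):
    """Try to insert one person; return (succeeded, error message)."""
    try:
        conn.execute(
            "INSERT INTO Person (name, phone, address, gender, roleID) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, phone, address, gender, role_id),
        )
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)


accepted, _ = insert_person(
    conn, "Melody Mark", "08172645723",
    "408 Adams St, Brooklyn, United States", 1, 2,
)
# Hypothetical row with a phone number that violates the rule.
rejected_ok, message = insert_person(conn, "Test Person", "123", "hypothetical address", 0, 1)
row_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(accepted, rejected_ok, message, row_count)
```

For a simple per-row rule like this one, a CHECK constraint would be an alternative; a trigger becomes more useful when the check depends on other rows or tables.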

Date posted: 08/08/2024, 18:33
