
FINAL REPORT
Database Systems

Homes.com: Database Systems for Boston House Price Prediction

Lecturer: Cu Nguyen Giap
Class: INS 205502

Vu Thi Tra My
Nguyen Chi Nghia
Tran Thanh Nhan


Contributions of each group member:

Write report, Collect information, Business narratives, Relational schema, Retrieving the database, Make 10 questions

Relational schema, Retrieving the database, Insert real sample data, Make 10 questions

Slide

Business narratives, ERD

Create physical database, Inserting real sample data, Make 10 questions


1. About the Company:

Homes.com is the fastest growing home search website in the industry. That is because they have built a site that both agents and homebuyers love: a place where agents are empowered to grow their business and provide best-in-class service without giving away their commissions, and a place where homebuyers can connect directly with the agents who know the home best and work with the agent of their choice.

Its search, content, and advertising strategies are designed to bring millions of transaction-ready buyers and sellers to Homes.com, where they can find a great agent, or connect to their current one and collaborate during the entire process. Homes.com offers a full line of advertising products and online marketing services designed to help real estate professionals connect with interested buyers and sellers. If your goals include connecting with quality buyers and sellers searching for their next home and leveraging the right tools and services to grow your business, you have come to the right place. It has many resources to help you stay informed of what is happening in the industry, what is working for successful agents, and what tactics are leading to success in today's market.

2. Business Narratives

Housing is one of the most basic demands of human life, along with food, water, and other necessities. As people's living circumstances improved, demand for housing increased rapidly. Housing markets have a favorable impact on a country's currency, which is a significant factor in the national economy. Realizing this, we decided to work on house price prediction in Boston. A house-price prediction model can provide numerous benefits to home purchasers, property investors, and home builders. It can give them information such as the valuation of current market house prices, which will assist them in determining house pricing. Meanwhile, the model can assist potential purchasers in determining the features of a property that are appropriate for their budget.

In this project, we will develop and evaluate the performance and the predictive power of a model trained and tested on data collected from houses in Boston's suburbs. Once we get a good fit, we will use this model to predict the monetary value of a house located in the Boston area. A model like this would be very valuable for a real estate agent who could make use of the information provided.


II. Data dictionary:

The dataset used in this project comes from the UCI Machine Learning Repository. The data was collected in 1978, and each of the 506 entries represents aggregate information about 13 features of homes from various suburbs of Boston. Some data about buyers, sellers, and investors was generated by us from information collected on the internet. The 13 attributes are described below:

Attribute Information:

- CRIM: Per capita crime rate by town
- ZN: Proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS: Proportion of non-retail business acres per town
- CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX: Nitric oxides concentration (parts per 10 million)
- RM: Average number of rooms per dwelling
- AGE: Proportion of owner-occupied units built before 1940
- DIS: Weighted distances to five Boston employment centers
- RAD: Index of accessibility to radial highways
- TAX: Full-value property-tax rate per 10,000 dollars
- PTRATIO: Pupil-teacher ratio by town
- LSTAT: % lower status of the population
- MEDV: Median value of owner-occupied homes in 1000 dollars


III. Analyzing and drawing the ERD

The entity relationship diagram (ERD) can be thought of as the database's design sketch. The ERD provides a visualization for database design, hence it serves the following functions:

- It supports the definition of information system requirements across the organization and assists users in planning how to organize data. It facilitates planning before beginning to build the tables.

- It can be used as a document to help others comprehend the database's core.


- Once the relational database has been deployed, the ERD can still be used as a reference point if debugging is needed or business processes need to be re-established later.

Analyzing the entities:

+ Property: stores the address, number of floors, year of construction, and area of each property, together with the ID attached to each property.

+ Person: stores the people who manage properties through the propertyID. People are divided into 3 main categories (Seller, Customer, Investor) through the ID of the Role table.

+ Role: stores the types of people (Seller, Customer, Investor).

+ Status: stores the status of a property (sold, on sale, fixing), referenced through the ID of the Status table.

+ HousePrices: stores the sale date and original selling price of each property via the propertyID.

+ MarketData: provides information about the real estate market by address and date. Each record is identified by the primary key MarketDataID.

+ Prediction: provides the prediction date and predicted price of a property via the propertyID. The MarketData table influences the Prediction table via the MarketDataID.

+ PropertyType: stores the property's classification (Villa, Apartment, Cabin, Penthouse) via the TypeID.

+ Interior: shows the interior of each property based on the PropertyID.


IV. Relational Schema:

The relational schema is generated from the ERD. It displays the tables that correspond to the entities and gives table designers in SQL Server a more detailed view of the table implementation.


V. Build a database using SQL Server

- Create table Role

CREATE TABLE Role (
    roleID int IDENTITY(1,1) NOT NULL, -- set roleID as PK
    roleName varchar(30) NOT NULL,
    CONSTRAINT [PK_Role] PRIMARY KEY CLUSTERED -- set roleID as PK
    (
        roleID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Status

CREATE TABLE Status (
    statusID int IDENTITY(1,1) NOT NULL, -- set statusID as PK
    statusName varchar(30) NOT NULL,
    CONSTRAINT [PK_Status] PRIMARY KEY CLUSTERED -- set statusID as PK
    (
        statusID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table PropertyType

CREATE TABLE PropertyType (
    typeID int IDENTITY(1,1) NOT NULL, -- set typeID as PK
    typeName varchar(30) NOT NULL,
    CONSTRAINT [PK_PropertyType] PRIMARY KEY CLUSTERED -- set typeID as PK
    (
        typeID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO


- Create table MarketData

CREATE TABLE MarketData (
    marketDataID int IDENTITY(1,1) NOT NULL, -- set marketDataID as PK
    date date NOT NULL,
    address varchar(100) NOT NULL,
    CRIM float NOT NULL,
    ZN float NOT NULL,
    INDUS float NOT NULL,
    CHAS bit NOT NULL,
    NOX float NOT NULL,
    RM float NOT NULL,
    AGE float NOT NULL,
    DIS float NOT NULL,
    RAD int NOT NULL,
    TAX int NOT NULL,
    PTRATIO float NOT NULL,
    LSTAT float NOT NULL,
    MEDV float NOT NULL,
    CONSTRAINT [PK_MarketData] PRIMARY KEY CLUSTERED -- set marketDataID as PK
    (
        marketDataID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Person

CREATE TABLE Person (
    personID int IDENTITY(1,1) NOT NULL, -- set personID as PK
    name varchar(50) NOT NULL,
    phone varchar(11) NOT NULL,
    address varchar(100) NOT NULL,
    gender bit NOT NULL,
    roleID int NOT NULL FOREIGN KEY REFERENCES Role(roleID), -- set roleID as FK for table Role(roleID)
    CONSTRAINT [PK_Person] PRIMARY KEY CLUSTERED -- set personID as PK
    (
        personID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Property

CREATE TABLE Property (
    propertyID int IDENTITY(1,1) NOT NULL, -- set propertyID as PK
    squareFootage decimal(10, 2) NOT NULL, -- scale is missing in the source; 2 is an assumption
    floor int NOT NULL,
    yearBuilt date NOT NULL,
    address varchar(100) NOT NULL,
    saleDate date NOT NULL,
    salePrice decimal(18, 2) NOT NULL, -- precision and scale are missing in the source; (18, 2) is an assumption
    typeID int NOT NULL FOREIGN KEY REFERENCES PropertyType(typeID),
    statusID int NOT NULL FOREIGN KEY REFERENCES Status(statusID),
    personID int NOT NULL FOREIGN KEY REFERENCES Person(personID),
    CONSTRAINT [PK_Property] PRIMARY KEY CLUSTERED -- set propertyID as PK
    (
        propertyID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

- Create table Interior

CREATE TABLE Interior (
    propertyID int IDENTITY(1,1) NOT NULL, -- set propertyID as PK
    numbedRooms int NOT NULL,
    numBathrooms int NOT NULL,
    kitchen bit NOT NULL,
    pool bit NOT NULL,
    garden bit NOT NULL,
    garage bit NOT NULL,
    CONSTRAINT FK_Property FOREIGN KEY (propertyID) REFERENCES Property(propertyID), -- set propertyID as FK for table Property(propertyID)
    CONSTRAINT [PK_Interior] PRIMARY KEY CLUSTERED -- set propertyID as PK
    (
        propertyID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO


- Create table Prediction

CREATE TABLE Prediction (
    propertyID int NOT NULL, -- part of the composite PK
    marketDataID int NOT NULL, -- part of the composite PK
    CONSTRAINT FK_MarketData FOREIGN KEY (marketDataID) REFERENCES MarketData(marketDataID), -- set marketDataID as FK for table MarketData(marketDataID)
    CONSTRAINT FK_Prediction FOREIGN KEY (propertyID) REFERENCES Property(propertyID), -- set propertyID as FK for table Property(propertyID)
    predictionDate date NOT NULL,
    predictionPrice decimal(18, 3) NOT NULL, -- precision and scale are missing in the source; (18, 3) is an assumption based on the sample values
    CONSTRAINT [PK_Prediction] PRIMARY KEY CLUSTERED -- set (propertyID, marketDataID) as PK
    (
        propertyID, marketDataID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
);
GO

2. Insert data into the tables:

-INSERT INTO Role

INSERT INTO Role VALUES('Seller');

INSERT INTO Role VALUES('Customer');

INSERT INTO Role VALUES('Investor');

-INSERT INTO Status

INSERT INTO Status VALUES('On Sale');

INSERT INTO Status VALUES('Sold');

INSERT INTO Status VALUES('Fixing');

-INSERT INTO PropertyType

INSERT INTO PropertyType VALUES('Villa');

INSERT INTO PropertyType VALUES('Cabin');

INSERT INTO PropertyType VALUES('Apartment');


-INSERT INTO Person

INSERT INTO Person VALUES('Andrew Garfield', '08172645781', '408 5th Ave, Brooklyn, United States', 0, 1);

INSERT INTO Person VALUES('Charlie Puth', '04716284625', '3548 S Jefferson St #52, Falls Church, United States', 0, 2);

INSERT INTO Person VALUES('Selena Gomez', '0751827458', '1455 S Lamb Blvd, Las Vegas, United States', 0, 3);

-INSERT INTO Property

INSERT INTO Property VALUES(5000, 2, '04/11/2018', '974 Blue Hill Avenue, Boston, United States', '01/09/2023', 3012, 1, 1, 1);

INSERT INTO Property VALUES(3670, 1, '06/20/2012', '521 Washington St, Boston, United States', '10/10/2019', 5921, 1, 1, 1);

INSERT INTO Property VALUES(2830, 3, '07/10/2015', '415 American Legion Hwy, Boston, United States', '01/07/2020', 5921, 2, 2, 1);

-INSERT INTO Interior

INSERT INTO Interior VALUES(3, 2, 1, 1, 0, 1);

INSERT INTO Interior VALUES(2, 2, 1, 0, 0, 0);

INSERT INTO Interior VALUES(3, 1, 1, 0, 1, 1);

INSERT INTO Interior VALUES(1, 1, 1, 1, 0, 0);

INSERT INTO Interior VALUES(2, 2, 1, 0, 1, 1);

-INSERT INTO MarketData

INSERT INTO MarketData VALUES('01/01/2023', 'Blue Hill Avenue, Boston, United States', 0.00632, 18, 2.31, 0, 0.538, 6.575, 65.2, 4.09, 1, 296, 15.3, 4.98, 24);

INSERT INTO MarketData VALUES('03/01/2023', 'Blue Hill Avenue, Boston, United States', 0.02731, 0, 7.07, 0, 0.469, 6.421, 78.9, 4.9671, 2, 242, 17.8, 9.14, 21.6);

INSERT INTO MarketData VALUES('06/01/2023', 'Blue Hill Avenue, Boston, United States', 0.02729, 0, 7.07, 0, 0.469, 7.185, 61.1, 4.9671, 2, 242, 17.8, 4.03, 34.7);

-INSERT INTO Prediction

INSERT INTO Prediction VALUES(1, 1, '01/01/2023', 9992.213);

INSERT INTO Prediction VALUES(1, 2, '03/01/2023', 10000.324);

INSERT INTO Prediction VALUES(1, 3, '06/01/2023', 8823.534);
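The foreign keys declared in the tables above are what keep the sample data consistent: a row may only point at a parent row that exists. The following is a minimal sketch of that behaviour, not the report's SQL Server setup. It assumes Python's built-in sqlite3 module as a stand-in engine and uses reduced versions of the Role and Person tables (only the columns needed for the illustration). The name "No Such Role" and the role ID 99 are hypothetical test values.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
# SQLite only enforces foreign keys after this PRAGMA is switched on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Reduced versions of the Role and Person tables (subset of columns).
conn.executescript("""
CREATE TABLE Role (
    roleID   INTEGER PRIMARY KEY AUTOINCREMENT,
    roleName TEXT NOT NULL
);
CREATE TABLE Person (
    personID INTEGER PRIMARY KEY AUTOINCREMENT,
    name     TEXT NOT NULL,
    roleID   INTEGER NOT NULL REFERENCES Role(roleID)
);
INSERT INTO Role (roleName) VALUES ('Seller'), ('Customer'), ('Investor');
""")

# A person whose roleID exists in Role is accepted.
conn.execute("INSERT INTO Person (name, roleID) VALUES (?, ?)",
             ("Andrew Garfield", 1))

# A person pointing at a missing role (hypothetical ID 99) is rejected.
try:
    conn.execute("INSERT INTO Person (name, roleID) VALUES (?, ?)",
                 ("No Such Role", 99))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(fk_rejected, person_count)
```

The same idea applies to Property, Interior, and Prediction: parent rows (Role, Status, PropertyType, Person, MarketData) have to be inserted before the rows that reference them.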

VI. Business Questions:

1. Display all houses in Boston?

3. Print out the houses located in Adams?

SELECT * FROM Property


4. Find and print the homes on Blue Hill Avenue, starting with the lowest sale price?

SELECT * FROM Property
WHERE Property.address LIKE '%' + 'Blue Hill Avenue' + '%'
ORDER BY Property.salePrice
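The query for question 4 combines a substring match on the address with an ascending sort on the sale price. Below is a small sketch of the same pattern, assuming Python's sqlite3 module instead of SQL Server and a reduced Property table with only the address and price columns. The helper name homes_on_street is hypothetical; the three rows are the sample properties inserted earlier.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Property (
        propertyID INTEGER PRIMARY KEY,
        address    TEXT NOT NULL,
        salePrice  REAL NOT NULL
    )
""")

# The three sample properties from the report (address and sale price only).
rows = [
    ("974 Blue Hill Avenue, Boston, United States", 3012),
    ("521 Washington St, Boston, United States", 5921),
    ("415 American Legion Hwy, Boston, United States", 5921),
]
conn.executemany("INSERT INTO Property (address, salePrice) VALUES (?, ?)", rows)


def homes_on_street(street):
    """Return properties whose address contains `street`, cheapest first."""
    pattern = f"%{street}%"  # same idea as '%' + 'Blue Hill Avenue' + '%'
    return conn.execute(
        "SELECT propertyID, address, salePrice FROM Property "
        "WHERE address LIKE ? ORDER BY salePrice ASC",
        (pattern,),
    ).fetchall()


blue_hill = homes_on_street("Blue Hill Avenue")
boston_prices = [row[2] for row in homes_on_street("Boston")]
print(blue_hill)
print(boston_prices)
```

Passing the pattern as a bound parameter rather than concatenating it into the SQL string keeps the street name from being interpreted as SQL.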


5. Find the 10 homes with the lowest predicted price in June 2023?

SELECT TOP 10 * FROM Property
INNER JOIN Prediction ON Property.propertyID = Prediction.propertyID
WHERE MONTH(predictionDate) = 6 AND YEAR(predictionDate) = 2023
ORDER BY predictionPrice ASC


INNER JOIN Prediction ON Property.propertyID = Prediction.propertyID
WHERE MONTH(predictionDate) BETWEEN 9 AND
ORDER BY predictionPrice


7. Find homes by status?

CREATE PROCEDURE PropertyStatus @statusName nvarchar(30) AS
SELECT *
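The body of the PropertyStatus procedure is cut off in the source, so the following is only a sketch of what a lookup by status name could look like, not the authors' procedure. It assumes Python's sqlite3 module as a stand-in for SQL Server (SQLite has no stored procedures, so a Python function takes that role), reduced Status and Property tables, and the status assignments from the sample inserts. The function name property_status is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Status (
    statusID   INTEGER PRIMARY KEY,
    statusName TEXT NOT NULL
);
CREATE TABLE Property (
    propertyID INTEGER PRIMARY KEY,
    address    TEXT NOT NULL,
    statusID   INTEGER NOT NULL REFERENCES Status(statusID)
);
INSERT INTO Status (statusName) VALUES ('On Sale'), ('Sold'), ('Fixing');
INSERT INTO Property (address, statusID) VALUES
    ('974 Blue Hill Avenue, Boston, United States', 1),
    ('521 Washington St, Boston, United States', 1),
    ('415 American Legion Hwy, Boston, United States', 2);
""")


def property_status(status_name):
    """Return (propertyID, address) for every property with the given status."""
    return conn.execute(
        """
        SELECT Property.propertyID, Property.address
        FROM Property
        INNER JOIN Status ON Status.statusID = Property.statusID
        WHERE Status.statusName = ?
        ORDER BY Property.propertyID
        """,
        (status_name,),
    ).fetchall()


on_sale = property_status("On Sale")
sold = property_status("Sold")
fixing = property_status("Fixing")
print(on_sale, sold, fixing)
```

The status name plays the role of the @statusName parameter: callers pass a readable name such as 'Sold' and never need to know the numeric statusID.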

8. Print out the predicted prices of homes currently for sale in September?

SELECT * FROM Property
INNER JOIN Status ON Status.statusID = Property.statusID
INNER JOIN Prediction ON Prediction.propertyID = Property.propertyID
WHERE Status.statusID = 1 AND MONTH(Prediction.predictionDate) = 9


9. Print the owners of more than 2 properties?

SELECT Person.name AS Name, COUNT(Property.propertyID) AS NumberOfProperties
FROM Property
INNER JOIN Person ON Property.personID = Person.personID
GROUP BY Person.personID, Person.name
HAVING COUNT(Property.propertyID) > 2;
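Question 9 relies on GROUP BY to count properties per owner and on HAVING to filter on that count after grouping. Here is a minimal sketch of the pattern, assuming Python's sqlite3 module as a stand-in for SQL Server and reduced Person and Property tables. The data mirrors the sample inserts, where all three properties belong to person 1. Grouping by personID as well as name is a deliberate choice in this sketch so that two different people with the same name are not merged.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    personID INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE Property (
    propertyID INTEGER PRIMARY KEY,
    address    TEXT NOT NULL,
    personID   INTEGER NOT NULL REFERENCES Person(personID)
);
INSERT INTO Person (name) VALUES
    ('Andrew Garfield'), ('Charlie Puth'), ('Selena Gomez');
INSERT INTO Property (address, personID) VALUES
    ('974 Blue Hill Avenue, Boston, United States', 1),
    ('521 Washington St, Boston, United States', 1),
    ('415 American Legion Hwy, Boston, United States', 1);
""")

# Count properties per owner, then keep only owners with more than 2.
owners = conn.execute("""
    SELECT Person.name, COUNT(Property.propertyID) AS NumberOfProperties
    FROM Property
    INNER JOIN Person ON Property.personID = Person.personID
    GROUP BY Person.personID, Person.name
    HAVING COUNT(Property.propertyID) > 2
""").fetchall()
print(owners)
```

A WHERE clause could not express this filter, because WHERE is evaluated before the rows are grouped and counted.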

10. Use a trigger with a transaction rollback to suppress unwanted inserts?

CREATE TRIGGER CheckInsert ON Person
FOR INSERT, UPDATE
AS
    ROLLBACK TRAN
    PRINT('CHECK INSERT CAREFULLY!!!')
GO

INSERT INTO Person VALUES('Melody Mark', '08172645723', '408 Adams St, Brooklyn, United States', 1, 2)
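As written, the trigger above rolls back every insert or update on Person. In practice such a trigger usually checks a condition first. The sketch below illustrates that idea under stated assumptions: it uses Python's sqlite3 module rather than SQL Server, a reduced Person table, and a hypothetical rule (a phone number shorter than 10 characters is rejected) that does not come from the report. SQLite's RAISE(ABORT, ...) stands in for ROLLBACK TRAN plus PRINT.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    personID INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    phone    TEXT NOT NULL
);

-- Hypothetical rule: abort the insert when the phone number is too short.
CREATE TRIGGER CheckInsert
BEFORE INSERT ON Person
WHEN length(NEW.phone) < 10
BEGIN
    SELECT RAISE(ABORT, 'CHECK INSERT CAREFULLY!!!');
END;
""")


def try_insert(name, phone):
    """Insert a person; return False if the trigger aborts the statement."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Person (name, phone) VALUES (?, ?)",
                (name, phone),
            )
        return True
    except sqlite3.DatabaseError:
        return False


valid_ok = try_insert("Melody Mark", "08172645723")
short_phone_ok = try_insert("Bad Phone", "123")
stored = [row[0] for row in conn.execute("SELECT name FROM Person")]
print(valid_ok, short_phone_ok, stored)
```

Only the row that satisfies the rule is stored; the aborted insert leaves the table unchanged.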


The task of this database system is to help store data and information in a consistent manner, avoiding redundancy in each specific category. By building a database system, customers can easily look up house prices in Boston (quantity, condition, and so on).

In I, we summarize the importance of housing and how the housing market affects these important factors. In II, we generate the Boston housing data and bring it into the ERD in III. From the ERD we created the relational schema and then used SQL statements to build the database in SQL Server. Finally, we made up 10 questions that can be answered by retrieving information from the database.


The End

Date posted: 08/08/2024, 18:33
