
Final report elephantry


MINISTRY OF EDUCATION AND TRAINING
FPT UNIVERSITY

Capstone Project Document

Applied Recommender System and Big Data technology in an online recommender system service

Bamboo Team
Group Members:
  Nguyễn Quang Trung – SE03897
  Đào Huy Đạt – SE03781
  Nguyễn Á Đông – SE03494
  Nguyễn Mạnh Hiếu – SE03870
  Phạm Khắc Độ – SE03824
Supervisor: Ngô Tùng Sơn
Ext Supervisor: N/A
Capstone Project code: Elephantry

- Hanoi, August/2017 -

Table of Contents

Acknowledgements
Definitions and Acronyms
Chapter 1 INTRODUCTION
  1.1 Introduction
    1.1.1 Purpose
    1.1.2 Capstone project information
    1.1.3 People
  1.2 Background
    1.2.1 Recommendation and Related Works
    1.2.2 Market Research
  1.3 The current system
  1.4 Initial idea of project
Chapter 2 SOFTWARE PROJECT MANAGEMENT PLAN
  2.1 Problem Definition
    2.1.1 Name of this Capstone project
    2.1.2 Problem Abstract
    2.1.3 Project Overview
  2.2 Project Organization
    2.2.1 Software Process Model
    2.2.2 Roles and Responsibilities
    2.2.3 Tools and Techniques
  2.3 Project Management Plan
    2.3.1 Task Sheet: Assignments and Timetable
    2.3.2 Risk Management
    2.3.3 All Meeting Minutes
    2.3.4 Coding Convention
Chapter 3 SOFTWARE REQUIREMENTS SPECIFICATION (SRS)
  3.1 User Requirement Specification
    3.1.1 Purpose
    3.1.2 Scope
  3.2 Function Requirements
  3.3 Use case diagram
  3.4 System Features
  3.5 Use cases
    3.5.1 Common
    3.5.2 Admin/ Root Admin
    3.5.3 Customer
    3.5.4 Root Admin
    3.5.5 Guest
    3.5.6 Auto System
  3.6 Non-function Requirements
    3.6.1 Availability
    3.6.2 Portability
    3.6.3 Maintainability
    3.6.4 Usability
    3.6.5 Security
  3.7 Entity Relationship Diagram
  3.8 State Machine Diagram
Chapter 4 SOFTWARE DESIGN DESCRIPTION (SDD)
  4.1 Purpose
  4.2 Design Overview
    4.2.1 System Architecture
    4.2.2 The Elephantry system overview
    4.2.3 System Architecture Explanation
  4.3 System Architectural Design
    4.3.1 Common Architecture Design
    4.3.2 Database Diagram Design
    4.3.3 Database Detail Design
  4.4 Package Diagram
  4.5 Class Diagram
    4.5.1 Customer create new Recommendation
    4.5.2 Guest sign up
    4.5.3 Manage Recommendation
    4.5.4 View statistical information
  4.6 Sequence Diagram
  4.7 User Interface Design
    4.7.1 Customer Create New Recommendation
    4.7.2 Customer Manage Recommendation
    4.7.3 Customer View Analysis & Report
    4.7.4 Customer Send Feedback to Admin
    4.7.5 Customer Update Profile Information
    4.7.6 Customer Change Password
    4.7.7 Customer View Notification
    4.7.8 Sign up
    4.7.9 Sign In
    4.7.10 View Landing Page
    4.7.11 View Admin Dashboard
    4.7.12 Root Admin view queue information
    4.7.13 Root Admin setting queue
    4.7.14 Root Admin manage customer
    4.7.15 Root Admin manage admin
    4.7.16 Root Admin create new admin
    4.7.17 Root Admin view statistical information
    4.7.18 View Current Hadoop Information
    4.7.19 Root Admin View Feedback
Chapter 5 SOFTWARE TEST DOCUMENT (STD)
  5.1 Introduction
    5.1.1 Purpose
    5.1.2 Scope of testing
  5.2 Test Plan
    5.2.1 Testing Tools and Environment
    5.2.2 Resource and responsibilities
    5.2.3 Test Strategy
    5.2.4 Test Type
  5.3 Test Report
    5.3.1 Unit Testing
    5.3.2 Integration Testing
    5.3.3 System Testing
    5.3.4 Summary Test Report
Chapter 6 SOFTWARE USER’S MANUAL
  6.1 Installation Guide
    6.1.1 Installation Hadoop in Linux
    6.1.2 Installation Hadoop Multiple Node
    6.1.3 Deploy Elephantry System to Server
  6.2 User’s Guide
    6.2.1 Admin’s Guide
    6.2.2 Customer’s Guide
References

Acknowledgements

This project was completed thanks to the support of many people, and we would like to sincerely thank all of them.

We would first like to thank our thesis advisor, Mr. Ngo Tung Son of FPT University. He gave us the support and guidance that allowed us to complete the project on time. Despite his busy work schedule, he always steered us in the right direction whenever he thought we needed it.

We would also like to thank all the teachers and educators at FPT University. Your dedicated help and the knowledge you taught us were critical to the completion of this capstone project.

Finally, we must express our profound gratitude to our families and friends for providing us with unfailing support and continuous encouragement throughout our years of study and through the process of researching and writing this thesis. This accomplishment would not have been possible without them. Thank you.

Definitions and Acronyms

Acronym   Definition
FU        FPT University
SRS       Software Requirement Specification
PM        Project Manager
RC        Recommendation
CF        Collaborative Filtering
UC        Use Case
N/A       Not Available
URL       Uniform Resource Locator
HTTP      Hypertext Transfer Protocol
GUI       Graphical User Interface
UI        User Interface
OS        Operating System
CPU       Central Processing Unit
RAM       Random-Access Memory
IDE       Integrated Development Environment
API       Application Programming Interface
HDFS      Hadoop Distributed File System
TC        Test Case
SDK       Software Development Kit

Chapter 1 INTRODUCTION

1.1 Introduction

1.1.1 Purpose

This chapter provides an overview of the project information, reviews the existing system, and proposes ideas for improvement.

1.1.2 Capstone project information

Project name: Applied Recommender System and Big Data technology in an online recommender system service
Project code: Elephantry
Project group name: Bamboo Team
Product type: Web application
Timeline: From 8th May 2017 to 26th August 2017
Business type: Software as a service

1.1.3 People

Supervisors

Role         Full name      Phone         E-mail             Title
Supervisor   Ngô Tùng Sơn   01644576026   sonnt5@fe.fpt.vn   Lecturer

Table 1-1 Supervisor’s information

Team members

#   Full name            Student code   Phone        E-mail                       Role in group
1   Nguyễn Quang Trung   SE03897        0932599945   trungnqse03897@fpt.edu.vn    Leader
2   Đào Huy Đạt          SE03781        0978756134   datdhse03781@fpt.edu.vn      Member
3   Nguyễn Á Đông        SE03494        0169893099   dongnase03494@fpt.edu.vn     Member
4   Nguyễn Mạnh Hiếu     SE03870        0166302329   hieunmse03870@fpt.edu.vn     Member
5   Phạm Khắc Độ         SE03824        0915402033   dopkse03824@fpt.edu.vn       Member

Table 1-2 Team member’s information

1.2 Background

1.2.1 Recommendation and Related Works

Two terms are used throughout this project: “Recommendation” and “Recommendation System” (or “Recommender System”, which has the same meaning).

“Recommendation” is a process. It starts when a user submits raw data, the data passes through a data mining module, and the user receives the recommended data.

A “Recommendation System” or “Recommender System” is a system that helps users obtain recommendations from raw data.

With the increasing popularity of digital content consumption, recommender systems assist users in identifying items that may fulfill their wishes and needs, including items the user has no experience with. Recommender systems are usually classified into three categories [1]:

- Content-based recommendations: the user is recommended items similar to the ones the user preferred in the past.
- Collaborative recommendations: the user is recommended items that people with similar tastes and preferences liked in the past.
- Hybrid approaches: these methods combine collaborative and content-based methods.

The recommender system can be described as follows. C is the set of users (customers) and S is the set of items (products) in the recommender system. The function u(c, s) measures the expression of user c on item s, i.e.

$$u : C \times S \to R$$

where every element of R is a nonnegative integer or a real number within a certain range. For each user c ∈ C, the goal is to find the item s'_c ∈ S that maximizes the user’s expression:

$$\forall c \in C, \quad s'_c = \arg\max_{s \in S} u(c, s)$$

The major problem in recommender systems is that the user’s expression u is usually not defined on the whole of C × S but only on some subset of it. This means u needs to be extrapolated to the whole space C × S.

The
basic idea of collaborative filtering (CF) recommendation systems is that if users shared the same interests in the past, they may have similar tastes in the future [19]. The utility u(c, s) of item s for user c is estimated based on the utilities u(c_j, s) assigned to item s by users c_j ∈ C who are similar to c. Many collaborative systems have been developed in academia and industry. They are basically divided into two traditional approaches, User-Based and Item-Based:

- User-Based: Let sim(x, y) denote the similarity between user x and user y. Common measures used in recommender systems are Pearson’s correlation coefficient and cosine similarity. In the general case, the prediction accuracy of a recommender system was found not to be affected by the choice of similarity measure [21]. To define these measures, let

$$S_{x,y} = \{\, s \in S \mid r_{x,s} \neq \varnothing,\ r_{y,s} \neq \varnothing \,\}$$

be the set of all items rated by both users x and y, and let \bar{r}_x and \bar{r}_y be the average ratings of users x and y. To predict the rating of the active user c on item s, we use an aggregate of the ratings of other users who have rated item s (usually the N most similar users):

$$pred(c, s) = \bar{r}_c + \frac{\sum_{b \in N} sim(c, b) \cdot (r_{b,s} - \bar{r}_b)}{\sum_{b \in N} sim(c, b)}$$

- Item-Based: This approach also uses the rating matrix. However, the main idea of item-based algorithms is to compute predictions using the similarity between items instead of users. To compute the similarity between two items a and b, we can again use cosine similarity or any other similarity measure. The rating of user c for a product s is predicted as follows:

$$pred(c, s) = \frac{\sum_{i \in RatedItems(c)} sim(i, s) \cdot r_{c,i}}{\sum_{i \in RatedItems(c)} sim(i, s)}$$

In this project, we implement collaborative filtering for the demo because the recommender technology our team researched supports item-based collaborative filtering in detail. In addition, many applications use item-based CF or improvements of it in their recommender systems; for example, Amazon.com's recommendations are built on item-to-item CF [20].

1.2.2 Market Research

We’re already seeing that direct mail and newspaper circulars are playing a diminished role in retail marketing. Mass advertising will not disappear overnight, but its influence is certainly waning. Ads are shifting toward not just digitization but also personalization, powered by increasingly sophisticated algorithms and predictive models that analyze transaction data and digital-media trends (for example, what topics are hot on social networks). Already, 35 percent of what consumers purchase on Amazon and 75 percent of what they watch on Netflix come from product recommendations based on such algorithms [2].

Clear evidence of the importance of recommender systems in the personalization battle [3] is that Netflix offered a $1 million prize in an open competition to any research team that could improve the efficiency of its algorithms, because the company realized the importance of having the best recommendation engine. The recommendation system even saves Netflix $1 billion per year.

Figure 6-2 Configure optional features of the Linux Virtual Machine in Azure

Figure 6-3 Check and purchase the Linux Virtual Machine in Azure

Set up the inbound and outbound security rules of the Linux Virtual Machine. Inbound rules allow external requests to reach the server, and outbound rules allow requests from the server to go out to the internet.

Figure 6-4 Configure Linux Virtual Machine in Azure

Figure 6-5 Use PuTTY to access the Linux server from Windows

Setup Java

First, add Oracle's PPA, then update your package repository:

    $ sudo add-apt-repository ppa:webupd8team/java
    $ sudo apt-get update

Oracle Java 8 is the latest stable version of Java at the time of writing, and the recommended version to install. You can do so using the following command:

    $ sudo apt-get install oracle-java8-installer

Install Hadoop on Azure Linux Virtual Machine

The steps are the same as in 6.1.1 for a single node and 6.1.2 for multiple nodes.

Install Apache Tomcat

You have to install the Apache
Tomcat to serve the Elephantry web-based application. Installation guides for Apache Tomcat are easy to find on the internet, for example: https://www.digitalocean.com/community/tutorials/how-to-install-apache-tomcat-8-on-ubuntu-16-04

Install MySQL

You can install MySQL by following: https://www.digitalocean.com/community/tutorials/how-to-install-mysql-on-ubuntu-16-04

Deploy Elephantry Web-app to Apache Tomcat

Figure 6-6 Using Maven to build the web project into a WAR file

Figure 6-7 Upload the project WAR file and deploy it to the Tomcat server

If your WAR file is too large to be deployed to the Tomcat server, go to the web.xml of the manager application (for instance, it could be under /tomcat8/webapps/manager/WEB-INF/web.xml) and increase max-file-size and max-request-size:

    <multipart-config>
      <max-file-size>52428800</max-file-size>
      <max-request-size>52428800</max-request-size>
      <file-size-threshold>0</file-size-threshold>
    </multipart-config>

6.2 User’s Guide

6.2.1 Admin’s Guide

Admin Manage Queue

To manage the queue, the admin must sign in to the website and follow the steps below:

- In the admin dashboard, click the “Queue Recommendation” link in the left-side navigation bar. The queue management sub menu is displayed, containing Queue Information and Queue Setting.

Figure 6-8 Click on Queue management

- In Queue Information, you can change (swap) the position of a recommendation or delete it.

Figure 6-9 You can swap recommendation in queue

- In Queue Setting, you can configure the queue parameters: priority scale number (the queue algorithm's parameter), final queue max size, prepare queue size, running recommendation max size, and threshold number.

Figure 6-10 Queue setting

- You can also start, stop, pause, reset, or resume the queue when needed.

View Analysis & Report

To view analysis and reports, the admin must sign in to the website and follow the steps below:

- In the admin dashboard, click the “View Analysis & Report” link in the left-side navigation bar.

Figure 6-11 Click on View Analysis & Report

- In the statistic chart, you can view the statistical information in chart form.

Figure 6-12 Statistical Chart

6.2.2 Customer’s Guide

Sign Up Account

To sign up for an account on the website, users follow the steps below:

- Open a browser and enter http://elephantry-corp.com:8080/elephantry/. The landing page is displayed.
- On the homepage, click the “Sign Up” button in the navigation menu. The Elephantry application scrolls down to the package selection section.

Figure 6-13 Click Sign up button

Figure 6-14 Choose package

- You are redirected to the sign-up page. Fill in all mandatory fields and click Next.

Figure 6-15 Fill mandatory field in sign up page

- You have to confirm the payment to finish the sign-up process.

Sign In

To sign in to the website, users follow the steps below:

- Open a browser and enter http://elephantry-corp.com:8080/elephantry/. The landing page is displayed.
- On the homepage, click the “Sign In” button in the navigation menu. The Elephantry application redirects to the sign-in page.
- Fill in Account and Password to sign in.

Figure 6-16 Sign in page

Create Recommendation

To create a recommendation on the Elephantry System, the customer has to sign in and follow the steps below:

- In the customer portal, click the “Create Recommendation” link in the left-side navigation bar. The Create Recommendation page is displayed.

Figure 6-17 Create recommendation page

- The uploaded file must have the csv file extension, and each line must follow the format: userID,itemID,preference[,timestamp]. The “userID” datatype is long, the “itemID” datatype is long, the “preference” datatype is float, and “timestamp” is optional.

Figure 6-18 Datasets sample

- After your chosen file is validated, you must name the recommendation. You can also set some optional settings: tick the “Set timer for recommendation” checkbox to schedule the time at which the dataset will start being processed, set “Number of Item” to the number of items you want recommended for each user, and set “Threshold” to the threshold of the recommendation algorithm (advanced option).
- Click “Create Recommendation System”.

Manage Recommendation

To manage a recommendation on the Elephantry System, the customer has to sign in and follow the steps below:

- In the customer portal, click the “Manage Recommendation” link in the left-side navigation bar. The sub menu of
manage recommendation will be displayed, which includes: Waiting Recommendations, Submitted Recommendations, Running Recommendations, and Completed Recommendations. (A recommendation has one of four states in the Elephantry System: waiting to be submitted to the queue, submitted to the queue, running, or completed.) You can manage recommendations in these four sub menus.

Figure 6-19 Recommendation Management

References

[1] Adomavicius, G., and Tuzhilin, A., Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6): 734-749, 2005.
[2] Ian Mackenzie, Chris Meyer, Steve Noble, “How retailers can keep up with consumers”, October 2013.
[3] Shabana Arora, “Recommendation Engines: How Amazon and Netflix Are Winning the Personalization Battle”, Jun 2016.
[4] Grahamjenson, “List of Recommender Systems”, Jun 2017, https://github.com/grahamjenson/list_of_recommender_systems
[5] “What are the top recommendation engine providers for e-commerce in Asia”, Jun 2017, https://www.quora.com/What-are-the-top-recommendation-engine-providers-for-e-commerce-in-Asia#!n=24
[6] “What Is Apache Hadoop”, June 2017, http://hadoop.apache.org/
[7] Spring – MVC Framework, https://www.tutorialspoint.com/spring/spring_web_mvc_framework.htm
[8] 22 Web MVC Framework, Part VI The Web, https://docs.spring.io/spring/docs/current/spring-framework-reference/html/mvc.html
[9] What Is Apache Hadoop?, Welcome to Apache™ Hadoop®!, http://hadoop.apache.org/
[10] Linux, https://en.wikipedia.org/wiki/Linux
[11] https://vi.wikipedia.org/wiki/MySQL
[12] Thomas Tran and Robin Cohen, Hybrid Recommender Systems for Electronic Commerce, 2000.
[13] http://www.oracle.com/technetwork/java/codeconventions-150003.pdf
[14] Nguyễn Hùng Dũng, Nguyễn Thái Nghe, Tạp chí Khoa học Trường Đại học Cần Thơ, Phần A: Khoa học Tự nhiên, Công nghệ và Môi trường: 31 (2014): 36-51.
[15] Nguyễn Thái Nghe, Nguyễn Tấn Phong, Tạp chí Khoa học Trường Đại học Cần Thơ, Phần A: Khoa học Tự nhiên, Công nghệ và Môi trường: 34 (2014): 81-91.
[16] https://www.javatpoint.com/spring-mvc-tiles-example
[17] https://projects.spring.io/spring-security/
[18] Anders Abel, Test and Verification in Scrum, https://coding.abel.nu/2012/04/test-and-verification-in-scrum, 2012-04-12.
[19] Guanwen Yao and Lifeng Cai, User-Based and Item-Based Collaborative Filtering Recommendation Algorithms Design, https://cseweb.ucsd.edu/~jmcauley/cse255/reports/wi15/Guanwen%20Yao_Lifeng_Cai.pdf
[20] Greg Linden, Brent Smith, and Jeremy York, Amazon.com Recommendations: Item-to-Item Collaborative Filtering, January 2010.
[21] Lathia, N., Hailes, S., and Capra, L., The effect of correlation coefficients on communities of recommenders. In SAC ’08: Proceedings of the 2008 ACM Symposium on Applied Computing, pages 2000–2005, New York, NY, USA, 2008. ACM.
[22] Stephan Spiegel, A Hybrid Approach to Recommender Systems based on Matrix Factorization.
[23] http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

... application: Web application

We built the Elephantry recommendation system. Elephantry has two main parts: the Elephantry engine and the Elephantry web-based application. The Elephantry engine uses the Mahout library as... library [6]. The Elephantry web-based application connects to the Elephantry engine through a Java API using the Hadoop library. The Elephantry web-based application is responsible for controlling the connection between users and Elephantry, ... follow schedule plan. Communicate with members, make reports, hold meetings. Support the analysis and design of features. Execute tests. Make the Final Report. Algorithm research: responsible for choosing
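The item-based prediction formula from Section 1.2.1 can be sketched in a few lines of Python. This is a minimal illustration only, not the Elephantry engine (which, according to the report, relies on the Mahout library). The function names, the nested-dictionary rating layout, and the choice of cosine similarity over all ratings of each item are assumptions made for this sketch.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two items, each given as {user: rating}."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[u] * ratings_b[u] for u in common)
    norm_a = math.sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = math.sqrt(sum(r * r for r in ratings_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_item_based(ratings, user, target):
    """pred(c, s) = sum_i sim(i, s) * r_{c,i} / sum_i sim(i, s),
    where i ranges over the items already rated by the user.

    `ratings` is a hypothetical layout: {user: {item: rating}}.
    Returns None when no rated item is similar to the target.
    """
    # Re-index the rating matrix by item: item -> {user: rating}.
    by_item = {}
    for u, items in ratings.items():
        for item, r in items.items():
            by_item.setdefault(item, {})[u] = r

    target_ratings = by_item.get(target, {})
    numerator = 0.0
    denominator = 0.0
    for item, r in ratings[user].items():
        if item == target:
            continue  # only use the user's other rated items
        sim = cosine_similarity(by_item[item], target_ratings)
        numerator += sim * r
        denominator += sim
    if denominator == 0.0:
        return None
    return numerator / denominator
```

Because the prediction is a similarity-weighted average, with nonnegative ratings it always falls between the lowest and highest rating the user has already given. A production system would normally restrict the sum to the most similar items instead of all rated items.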

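The upload format described in the Customer’s Guide (one userID,itemID,preference[,timestamp] record per line) can be checked before a file is submitted. The sketch below shows one possible validation, not the validation Elephantry actually performs. The function name is hypothetical, and treating the optional timestamp as an integer is an assumption, since the guide does not state its type.

```python
def parse_rating_line(line):
    """Parse one 'userID,itemID,preference[,timestamp]' record.

    Returns (user_id, item_id, preference, timestamp), where timestamp
    is None when the optional fourth field is absent.
    Raises ValueError if the line does not match the format.
    """
    fields = line.strip().split(",")
    if len(fields) not in (3, 4):
        raise ValueError("expected 3 or 4 comma-separated fields: %r" % line)
    user_id = int(fields[0])        # "userID" is a long
    item_id = int(fields[1])        # "itemID" is a long
    preference = float(fields[2])   # "preference" is a float
    # Assumption: the optional timestamp is an integer value.
    timestamp = int(fields[3]) if len(fields) == 4 else None
    return user_id, item_id, preference, timestamp
```

Calling this function on every line of a dataset and collecting the lines that raise ValueError gives a simple report of malformed records.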
Date posted: 05/09/2019, 10:48
