Discovery Association Rules

Document information

Data modeling: a research project on discovering applications of data mining rules in market analysis. It uses association rules in data mining, with the Apriori and FP-growth algorithms, to find association rules in product information.

HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY

GROUP MEMBERS: Ngo Xuan Quy 20112017, Do Trong Huy 20111648

Hà Nội, December 2014

DATA MODELING REPORT
Association Rules

A. Definition

• For example, huge amounts of customer purchase data are collected daily at the checkout counters of grocery stores.

TID   Items
1     Bread, Milk
2     Bread, Diapers, Beer, Eggs
3     Milk, Diapers, Beer, Cola
4     Bread, Milk, Diapers, Beer
5     Bread, Milk, Diapers, Cola

In this table, each row corresponds to a transaction, which contains a unique identifier labeled TID and the set of items bought by a customer. Such information can be used to support a variety of business-related applications such as marketing promotions, inventory management, and customer relationship management.

[...]

... International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007)
[3] R. Agrawal, T. Imielinski, A. Swami, “Mining Association Rules between Sets of Items in Very Large Databases”, Proceedings of the ACM SIGMOD Conference on Management of Data, Washington, USA, May 1993, pp. 207-216
[4] R. Agrawal, R. Srikant, “Fast Algorithms for Mining Association Rules in Large Databases”, Proceedings of 20th...

... efficiency of both algorithms is evaluated based on the time taken to generate the association rules. From the experimental data presented, it can be concluded that the FP-growth algorithm performs better than the Apriori algorithm.

E. REFERENCES

[1] S. Chai, J. Yang, Y. Cheng, “The Research of Improved Apriori Algorithm for Mining Association Rules”, 2007 IEEE
[2] S. Chai, H. Wang, J. Qiu, “DFR: A New Improved Algorithm...
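As an illustration of how the market-basket table in section A is typically used, the following minimal Python sketch computes the two standard measures of an association rule, support and confidence, for the rule {Milk, Diapers} → {Beer}. The function names `support` and `confidence` are illustrative choices and are not taken from the report; "Beers" in TID 3 is assumed to be the same item as "Beer".

```python
# Market-basket transactions from the table in the excerpt
# ("Beers" in TID 3 is treated as the same item as "Beer").
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diapers", "Beer", "Eggs"},
    3: {"Milk", "Diapers", "Beer", "Cola"},
    4: {"Bread", "Milk", "Diapers", "Beer"},
    5: {"Bread", "Milk", "Diapers", "Cola"},
}


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent): supp(A and C together) / supp(A)."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)


# Rule {Milk, Diapers} -> {Beer}
rule_support = support({"Milk", "Diapers", "Beer"})
rule_confidence = confidence({"Milk", "Diapers"}, {"Beer"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Two of the five transactions (TID 3 and 4) contain all three items, and three transactions (TID 3, 4 and 5) contain both Milk and Diapers, so the rule has support 2/5 and confidence 2/3. A rule is usually called strong when both values reach user-chosen minimum thresholds.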
... of FP-growth Algorithm is an efficient and scalable method for mining the complete set of frequent patterns.

CONCLUSION

Association rules play a major role in many data mining applications that try to find interesting patterns in databases. To obtain these association rules, the frequent itemsets must first be generated. The most common algorithms used for this purpose are...

... Itemsets: {A}, {B}, {C}, {E}, {A, C}, {B, C}, {B, E}, {C, E}, {B, C, E}

Step 2: Generate strong association rules from the frequent itemsets

Lattice

Closed Itemset: none of its parents (immediate supersets) has the same support as the itemset.
Maximal Itemset: all parents of the itemset are infrequent.

Keep in mind:

II. FP-Growth

Allows frequent itemset discovery without candidate itemset generation.
Two-step approach:
• Step 1:...

... decreased, the execution time for both algorithms is decreased. For the 3627 instances of the supermarket data set, Apriori requires 47 seconds but FP-growth requires only 3 seconds to generate the association rules.

Figure 1

In Figure 1 above, the performance of Apriori is compared with that of FP-growth, based on time. For each algorithm, three data sets of different sizes were considered, with sizes of 3627,...

... FP-Tree

D. Comparison between Apriori and FP-Growth algorithms

METHODOLOGY

The two association rule mining algorithms were tested in WEKA version 3.6.1. WEKA is an open-source collection of data mining and machine learning algorithms, including data preprocessing, classification, clustering, and association rule extraction. The performance of Apriori and FP-growth was evaluated...
... time of Apriori and FP-growth for various confidence levels. When the confidence level is high, the time taken by both algorithms is also high. At a confidence level of 0.5, the time taken to generate the association rules is 15 seconds for Apriori and 1 second for FP-growth. Figure 2 shows the relationship between time and confidence. In this graph, the x-axis represents the time and the y-axis represents the confidence.
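The frequent-itemset list and the closed/maximal definitions in the excerpt can be reproduced with a small level-wise (Apriori-style) search. The sketch below is not the report's code or WEKA's implementation. The four-transaction database and the minimum support count of 2 are assumptions, chosen only because they yield exactly the itemsets {A}, {B}, {C}, {E}, {A, C}, {B, C}, {B, E}, {C, E}, {B, C, E} listed above; the preview does not show the original data. "Parents" is interpreted as immediate supersets.

```python
from itertools import combinations

# Hypothetical transactions (an assumption, not shown in the excerpt).
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]
MIN_COUNT = 2  # assumed minimum support count
ALL_ITEMS = {item for t in transactions for item in t}


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(min_count):
    """Return {frozenset: support count} for every frequent itemset."""
    level = {frozenset([item]) for item in ALL_ITEMS}
    frequent = {}
    k = 1
    while level:
        # Count the candidates of size k and keep the frequent ones.
        survivors = {c: n for c in level if (n := count(c)) >= min_count}
        frequent.update(survivors)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in survivors for b in survivors
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {c for c in candidates
                 if all(frozenset(s) in survivors
                        for s in combinations(c, k - 1))}
    return frequent


def immediate_supersets(itemset):
    """All itemsets obtained by adding exactly one more item."""
    return [itemset | {item} for item in ALL_ITEMS - itemset]


frequent = apriori(MIN_COUNT)

# Closed: no immediate superset has the same support count.
closed = {s for s, n in frequent.items()
          if all(count(p) != n for p in immediate_supersets(s))}

# Maximal: every immediate superset is infrequent.
maximal = {s for s in frequent
           if all(count(p) < MIN_COUNT for p in immediate_supersets(s))}

for name, group in [("frequent", frequent), ("closed", closed), ("maximal", maximal)]:
    print(name, sorted("".join(sorted(s)) for s in group))
```

Under these assumptions the closed itemsets are {C}, {A, C}, {B, E} and {B, C, E}, and the maximal itemsets are {A, C} and {B, C, E}. Every maximal itemset is also closed, which is why the maximal set is the more compact summary. FP-growth would produce the same frequent itemsets, but it builds an FP-tree instead of generating and counting candidates level by level.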

Date posted: 19/01/2015, 08:45

Table of contents

  • HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

  • A. Definition

    • Problem 1: Market basket transactions

    • B. Basics for Discovery Association Rule

    • Solution 

      • Step 1: Find all Frequent Itemsets

        • Frequent Itemsets

        • Step 2: Generate strong association rules from the frequent itemsets
