
FSKYMINE: A Faster Algorithm for Mining Skyline Frequent Utility Itemsets



Slide 1

FSKYMINE: A Faster Algorithm for Mining Skyline Frequent Utility Itemsets

Slide 2

Frequent itemsets mining (FIM)

High-utility itemsets mining (HUIM)

Skyline frequent utility itemsets mining (SFUIM)

Slide 3

Limitations of frequent itemsets mining (FIM):

1) It is difficult to specify the minSup value.

2) It ignores item utilities such as weight, unit profit, and quantity, although these aspects matter in practical problems.

High-utility itemsets mining (HUIM)

Overcomes the second limitation of FIM by using both the profits and the quantities of products in transactions to compute the actual utility values of itemsets.

The problem with both FIM and HUIM is that they require users to choose thresholds for minimum support and minimum utility. It is very difficult to choose an appropriate threshold.

Skyline frequent utility itemsets mining (SFUIM)

Slide 4

Skyline frequent utility itemsets mining (SFUIM)

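Slide 5

Utility of an item i_p in the transaction T_d

$u(i_p, T_d) = q(i_p, T_d) \times p(i_p)$

where q(i_p, T_d) is the quantity of i_p in T_d and p(i_p) is its unit profit. The utility u(X) of an itemset X is the sum of the utilities of its items over all transactions that contain X.

High Utility Itemset

An itemset X is called a high utility itemset iff

$u(X) \ge min\_utility$

E.g., with min_utility = 30,

{B}: 16 is a low utility itemset; {BD}: 30 is a high utility itemset

Item         A  B  C  D  E  F  G
Unit profit  5  2  1  2  3  1  1

Transactional Database

TID  Transaction
T1   (A,1) (C,1) (D,1)
T2   (A,2) (C,6) (E,2) (G,5)
T3   (A,1) (B,2) (C,1) (D,6) (E,1) (F,5)
T4   (B,4) (C,3) (D,3) (E,1)
T5   (B,2) (C,2) (E,1) (G,1)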
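As a minimal sketch of the two definitions on this slide, the following Python snippet computes item and itemset utilities on the running example. The names `PROFIT`, `DB`, `item_utility`, and `itemset_utility` are illustrative choices, not identifiers from the presentation; the unit profits and quantities are transcribed from the example tables in this deck.

```python
# Unit profits and transactions transcribed from the deck's running example.
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 1},
}


def item_utility(item, tid):
    """u(i_p, T_d) = q(i_p, T_d) * p(i_p)."""
    return DB[tid][item] * PROFIT[item]


def itemset_utility(itemset):
    """u(X): sum of item utilities over every transaction containing all of X."""
    items = set(itemset)
    return sum(
        item_utility(item, tid)
        for tid, row in DB.items()
        if items.issubset(row)  # the transaction must contain the whole itemset
        for item in items
    )


print(itemset_utility("B"))   # utility of {B}
print(itemset_utility("BD"))  # utility of {B, D}
```

With min_utility = 30, the two printed values can be compared against the threshold directly to reproduce the low-utility and high-utility cases named on the slide.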


Slide 8


E.g., u({AD}) = u({AD}, T1) + u({AD}, T3) = 7 + 17 = 24

High utility itemsets for min_utility = 30:

{BD}: 30, {ACE}: 31, {BCD}: 34, {BCE}: 31, {BDE}: 36, {BCDE}: 40, {ABCDEF}: 30

Transactional Database

TID  Transaction
T1   (A,1) (C,1) (D,1)
T2   (A,2) (C,6) (E,2) (G,5)
T3   (A,1) (B,2) (C,1) (D,6) (E,1) (F,5)
T4   (B,4) (C,3) (D,3) (E,1)
T5   (B,2) (C,2) (E,1) (G,1)
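The set of high utility itemsets for this small database can be checked by brute force. The sketch below is only a verification aid under the assumption that the unit profits are A=5, B=2, C=1, D=2, E=3, F=1, G=1 (as given in the deck's profit table); it enumerates every itemset that occurs in at least one transaction, which is feasible here but is not how a real HUIM algorithm works. The function names are hypothetical.

```python
from itertools import combinations

# Running example of the deck (unit profits and quantities).
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 1},
}


def utility(itemset):
    """Utility of an itemset summed over the transactions that contain it."""
    items = set(itemset)
    return sum(
        row[item] * PROFIT[item]
        for row in DB.values()
        if items.issubset(row)
        for item in items
    )


def high_utility_itemsets(min_utility):
    """Brute force: test every itemset occurring in at least one transaction."""
    candidates = set()
    for row in DB.values():
        items = sorted(row)
        for size in range(1, len(items) + 1):
            candidates.update(combinations(items, size))
    return {
        "".join(c): utility(c) for c in candidates if utility(c) >= min_utility
    }


HUIS = high_utility_itemsets(30)
for name in sorted(HUIS, key=lambda n: (len(n), n)):
    print(name, HUIS[name])
```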

Slide 9

An itemset X is said to dominate another itemset Y in D, denoted as X ≻ Y, iff f(X) ≥ f(Y) and u(X) ≥ u(Y), where f is the occurrence frequency (support) and u is the utility.

An itemset is a skyline frequent utility itemset (SFUI) iff it is not dominated by any other itemset in the database.

SFUIs of the example database:

{C}: sup = 5, util = 13
{C, E}: sup = 4, util = 27
{B, C, E}: sup = 3, util = 31
{B, C, D, E}: sup = 2, util = 40

For example, Util(A) = 5 + 10 + 5 = 20 and Sup(A) = 3, so {A} is dominated by {B, C, E} and is therefore not an SFUI.
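The dominance test can be illustrated with a brute-force skyline computation over the example database. This is a sketch for checking the definition on a tiny input, not the algorithm proposed in the presentation; the helper names are hypothetical, and the unit profits are assumed to be A=5, B=2, C=1, D=2, E=3, F=1, G=1 as in the deck's profit table.

```python
from itertools import combinations

PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 1},
}


def support_and_utility():
    """Map every itemset occurring in the database to (support, utility)."""
    stats = {}
    for row in DB.values():
        items = sorted(row)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                sup, util = stats.get(combo, (0, 0))
                util += sum(row[i] * PROFIT[i] for i in combo)
                stats[combo] = (sup + 1, util)
    return stats


def skyline(stats):
    """Keep itemsets not dominated by any other itemset (f >= and u >=)."""
    result = {}
    for x, (fx, ux) in stats.items():
        dominated = any(
            y != x and fy >= fx and uy >= ux for y, (fy, uy) in stats.items()
        )
        if not dominated:
            result["".join(x)] = (fx, ux)
    return result


SFUIS = skyline(support_and_utility())
print(SFUIS)
```

The quadratic comparison is acceptable only because the example has a few dozen itemsets; it is meant to make the definition concrete.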

Slide 10

SKYMINE2 [9]

Limitations: The algorithm performs numerous join operations on pairs of utility lists and generates numerous utility lists and potential SFUIs.

[6] Vikram Goyal, Ashish Sureka, and Dhaval Patel. Efficient skyline itemsets mining. In Proceedings of the Eighth International Conference on Computer Science & Software Engineering, pages 119–124. ACM, 2015.

[9] Jerry Chun-Wei Lin, Lu Yang, Philippe Fournier-Viger, Siddharth Dawar, Vikram Goyal, Ashish Sureka, and Bay Vo. A more efficient algorithm to mine skyline frequent-utility patterns. In International Conference on Genetic and Evolutionary Computing, pages 127–135. Springer, 2016.

Slide 11

Proposed Algorithm

FMSFUI (Faster Algorithm For Mining Skyline Frequent-Utility Itemsets)

We propose:

• a mechanism named remaining transaction-weighted utility co-occurrence of an item pair x, y in a database SD, denoted as rtwuc(x, y);

• a data structure named the extent utility list of an itemset in a database.

Slide 12

Transactional Database

TID  Transaction
T1   (A,1) (C,1) (D,1)
T2   (A,2) (C,6) (E,2) (G,5)
T3   (A,1) (B,2) (C,1) (D,6) (E,1) (F,5)
T4   (B,4) (C,3) (D,3) (E,1)
T5   (B,2) (C,2) (E,1) (G,1)

Revised Transactional Database

TID  Transaction
T3   (F,5) (D,6) (B,2) (A,1) (E,1) (C,1)
T4   (D,3) (B,4) (E,1) (C,3)
T5   (G,1) (B,2) (E,1) (C,2)
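The slide does not state how the revised database is ordered. One plausible reading, offered here only as an assumption, is that the items of each transaction are sorted by ascending transaction-weighted utility (TWU), as is common in utility-list based miners. The sketch below applies that rule with the unit profits A=5, B=2, C=1, D=2, E=3, F=1, G=1 from the deck's profit table, and it reproduces the three revised rows shown for T3, T4, and T5. The function names are hypothetical.

```python
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 1},
}


def transaction_utility(row):
    """Total utility of one transaction."""
    return sum(qty * PROFIT[item] for item, qty in row.items())


def twu_table(db):
    """TWU of an item: sum of the utilities of the transactions containing it."""
    twu = {}
    for row in db.values():
        tu = transaction_utility(row)
        for item in row:
            twu[item] = twu.get(item, 0) + tu
    return twu


def revise(db):
    """Assumed rule: sort the items of each transaction by ascending TWU."""
    twu = twu_table(db)
    return {
        tid: sorted(row.items(), key=lambda pair: twu[pair[0]])
        for tid, row in db.items()
    }


TWU = twu_table(DB)
REVISED = revise(DB)
for tid in ("T3", "T4", "T5"):
    print(tid, REVISED[tid])
```

All seven TWU values are distinct in this example, so the ordering needs no tie-breaking rule.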

Slide 13

Proposed Algorithm

FMSFUI (Faster Algorithm For Mining Skyline Frequent-Utility Itemsets)

The remaining transaction-weighted utility co-occurrence of an item pair x, y in a database SD is denoted as rtwuc(x, y). It is defined as the sum, over all transactions in the database that contain both x and y, of the remaining transaction-weighted utility co-occurrence of the pair in that transaction.

Calculate rtwuc(x, y)

Slide 14

Revised Transactional Database

TID  Transaction
T3   (F,5) (D,6) (B,2) (A,1) (E,1) (C,1)
T4   (D,3) (B,4) (E,1) (C,3)
T5   (G,1) (B,2) (E,1) (C,2)

Item         A  B  C  D  E  F  G
Unit profit  5  2  1  2  3  1  1

Slide 15

Trans(xy).itemSetutils = Trans(x).itemSetutils + Trans(x).itemUtils

Trans(xy).itemUtils = Trans(y).itemUtils

Trans(xy).rutils = Trans(y).rutils
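The three formulas above can be read as a join of two extent utility lists on their shared transactions. The sketch below follows that reading under stated assumptions: `itemset_util` is taken to be the utility of the prefix, `item_util` the utility of the last item, and `rutil` the remaining utility after the last item in the revised transaction. These interpretations, the `Entry` class, and the `join` function are illustrative and not confirmed by the slides. The sample lists for {D} and {B} assume the item order F, G, D, B, A, E, C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    tid: str
    itemset_util: int  # assumed: utility of the prefix in this transaction
    item_util: int     # assumed: utility of the last item in this transaction
    rutil: int         # assumed: utility of the items after the last item


def join(px, py):
    """Build the list of Pxy from the lists of Px and Py (x precedes y)."""
    py_by_tid = {entry.tid: entry for entry in py}
    joined = []
    for ex in px:
        ey = py_by_tid.get(ex.tid)
        if ey is None:
            continue  # the transaction does not contain both x and y
        joined.append(
            Entry(
                tid=ex.tid,
                itemset_util=ex.itemset_util + ex.item_util,
                item_util=ey.item_util,
                rutil=ey.rutil,
            )
        )
    return joined


# Hand-built single-item lists for D and B on the example database.
LIST_D = [Entry("T1", 0, 2, 6), Entry("T3", 0, 12, 13), Entry("T4", 0, 6, 14)]
LIST_B = [Entry("T3", 0, 4, 9), Entry("T4", 0, 8, 6), Entry("T5", 0, 4, 5)]

LIST_DB = join(LIST_D, LIST_B)
TOTAL = sum(e.itemset_util + e.item_util for e in LIST_DB)
print(LIST_DB, TOTAL)
```

Under these assumptions the joined list needs no separate prefix list, since each entry already carries the prefix utility, and the total matches u({BD}) = 30 from the earlier example.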

Slide 16

Proposed Algorithm

FMSFUI (Faster Algorithm For Mining Skyline Frequent-Utility Itemsets)

The maximal utility of the frequency value r is denoted as umax[r] and defined as the maximal utility of itemsets having the same frequency value r.

Given an itemset Px with occurrence frequency r: if the sum of the sumItemSetutils and sumItemUtils values of the extent utility list of Px is greater than or equal to umax[r], then Px is a potential skyline frequent-utility itemset.
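A minimal sketch of the umax check described above, assuming umax is kept as a mapping from frequency to the best utility seen so far and is updated whenever a larger utility is found for that frequency. The update rule and the names `is_potential_sfui`, `record`, and `scan` are assumptions for illustration; the slide only states the comparison itself. The sample values are the supports and utilities of {B}, {BC}, {BCE}, and {BE} in the example database.

```python
def is_potential_sfui(umax, r, utility):
    """True when the utility is at least the best utility recorded for r."""
    return utility >= umax.get(r, 0)


def record(umax, r, utility):
    """Assumed update: keep the maximal utility observed for frequency r."""
    if utility > umax.get(r, 0):
        umax[r] = utility


def scan(itemsets):
    """Process (name, frequency, utility) triples in order; return the names
    flagged as potential SFUIs together with the final umax table."""
    umax = {}
    flagged = []
    for name, r, utility in itemsets:
        if is_potential_sfui(umax, r, utility):
            flagged.append(name)
            record(umax, r, utility)
    return flagged, umax


FLAGGED, UMAX = scan([("B", 3, 16), ("BC", 3, 22), ("BCE", 3, 31), ("BE", 3, 25)])
print(FLAGGED, UMAX)
```

An itemset flagged early can later be superseded, which is why the check only yields potential SFUIs rather than final ones.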

Slide 17

Proposed Algorithm

FMSFUI (Faster Algorithm For Mining Skyline Frequent-Utility Itemsets)

Given itemsets Px and Py such that Px has occurrence frequency r and Py has occurrence frequency r1: if min(Px.sumitemsetutils, …) is less than umax(r) or umax(r1), then Pxy and all extensions of Pxy are not SFUIs.

Slide 18

Performance Evaluation

Compared Algorithms

FSKYMINE (PROPOSED ALGORITHM)

Platform for Experiment

Intel® Core 5 Quad Processor @ 2.30 GHz
8 GB memory
Implemented in Java
Running on Windows 10

Slide 19

Performance evaluation

Slide 20

In this paper, we proposed a fast algorithm, FSKYMINE, for efficiently mining skyline frequent utility itemsets.

We proposed a mechanism named remaining transaction-weighted utility co-occurrence of an item pair x, y in a database SD, denoted as rtwuc(x, y), and a data structure named the extent utility list of an itemset in a database. Based on these, we developed a pruning strategy to reduce the number of join operations when mining skyline frequent utility itemsets.

FSKYMINE significantly outperforms SKYMINE2.


Slide 21

Thanks for your attention

Hung Manh Nguyen

Le Quy Don Technical University

Hanoi, Vietnam

Anh Viet Phan

Le Quy Don Technical University

Hanoi, Vietnam

anhpv@mta.edu.vn

Lai Van Pham

Military Science and Technology Institute

Hanoi, Vietnam

garry@cinnamon.is

Uploaded: 08/11/2022, 14:02
