Evaluating effectiveness

Part of the document "Rút trích các cụm từ khóa dựa trên vai trò và đặc điểm của các cụm từ trong văn bản" (Keyphrase extraction based on the roles and features of phrases in text), pages 49 - 79

When testing SemiRank on the Wiki-20 dataset, this thesis observed that the constant added to reinforce the "keyphrase" role of the initial keyphrases guarantees that these initial keyphrases always score markedly higher than the remaining phrases once semantic evaluation has finished. This means they are always the elements most likely to be selected as keyphrases.

Determining the number of initial keyphrases

As mentioned in Section 4.2, the core phrase method has two parameters that need to be examined: the number of sentences used to extract the initial keyphrase set, and the number of initial keyphrases.

To determine how many sentences to take for each document, the 3 to 8 highest-weighted sentences are selected in turn. Because the number of sentences remaining after the three selection steps above differs from document to document, the specified number of sentences is only the maximum number taken.

In each case, these sentences are analyzed together with the title to select the initial keyphrase set. To determine a suitable number of initial keyphrases for each case, the number of initial keyphrases is varied in turn from 3 to 9. Combining the two parameters gives a total of 42 cases to test.
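The 42 cases follow directly from the two ranges given above (6 sentence settings times 7 keyphrase settings). A minimal sketch of enumerating that grid is shown here; the names `SENTENCE_COUNTS`, `KEYPHRASE_COUNTS` and `parameter_grid` are illustrative and do not come from the thesis.

```python
from itertools import product

# Ranges taken from the text: 3 to 8 top sentences, 3 to 9 initial keyphrases.
SENTENCE_COUNTS = range(3, 9)
KEYPHRASE_COUNTS = range(3, 10)


def parameter_grid():
    """Return every (max_sentences, n_initial_keyphrases) pair to be tested."""
    return list(product(SENTENCE_COUNTS, KEYPHRASE_COUNTS))


grid = parameter_grid()
print(len(grid))  # 6 sentence settings x 7 keyphrase settings
```

Each pair in `grid` corresponds to one experimental run, for example `(4, 7)` for at most 4 sentences and 7 initial keyphrases.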

Figure 5.1 shows the results obtained as the number of sentences varies from 3 to 8 and the number of initial keyphrases from 3 to 9. Although the number of sentences changes, the resulting performance does not differ markedly between the cases. With 5, 6, 7, or 8 selected sentences the performance is nearly identical, which shows that the newly added sentences do not change the initial keyphrase set much. The filtering step applied after sentence selection is based on the number of times a phrase is repeated (TF), so the phrases contributed by the newly added sentences did not have a higher TF than the phrases already selected. The best result is obtained when the maximum number of sentences is 4 and 7 phrases are chosen as the initial keyphrase set.
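The selection-then-filter behaviour described above can be illustrated with a small sketch. It assumes that sentence weights and candidate phrases are already available and that TF is counted over the selected sentences; the thesis may count TF differently, and the function name and input format are hypothetical.

```python
from collections import Counter


def initial_keyphrases(sentences, max_sentences, n_keyphrases):
    """Pick initial keyphrases from the highest-weighted sentences.

    sentences: list of (weight, [candidate phrases]) pairs (assumed input format).
    """
    # Keep at most `max_sentences` of the highest-weighted sentences.
    top = sorted(sentences, key=lambda s: s[0], reverse=True)[:max_sentences]
    # Count how often each candidate phrase occurs in the kept sentences (TF).
    tf = Counter(phrase for _, phrases in top for phrase in phrases)
    # Rank by TF, breaking ties alphabetically so the result is deterministic.
    ranked = sorted(tf, key=lambda p: (-tf[p], p))
    return ranked[:n_keyphrases]


example = [
    (0.9, ["keyphrase extraction", "semantic network"]),
    (0.8, ["keyphrase extraction", "title phrase"]),
    (0.7, ["semantic network", "keyphrase extraction"]),
    (0.2, ["random walk"]),
]
with_three = initial_keyphrases(example, max_sentences=3, n_keyphrases=2)
with_four = initial_keyphrases(example, max_sentences=4, n_keyphrases=2)
```

In this toy input the fourth sentence only adds a phrase with TF 1, so `with_three` and `with_four` are the same list, which mirrors the observation that extra sentences rarely change the initial keyphrase set.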

Figure 5.1. Performance obtained with the core phrase method as the number of sentences varies from 3 to 8 and the number of initial keyphrases varies from 3 to 9. GS3: 3 sentences are selected for extracting the initial keyphrases, and similarly for GS4 ... GS8. The x-axis is the number of initial keyphrases (3 to 9). The y-axis is the F1 value (plotted from 0 to 25) obtained for each GS and number of initial keyphrases.

Table 5.1 details the performance obtained when SemiRank uses the core phrase method with a maximum of 4 sentences, the setting for which 7 initial keyphrases gives the best result. Performance increases steadily until the number of initial keyphrases reaches 7. With 3 initial keyphrases, the keyphrase set obtained after semantic evaluation includes phrases from the document that were not in the initial set, and the performance is lower than that of SemiRank using titles. This shows that, with 3 initial keyphrases, the phrases in the title still stand out over the phrases in the selected sentences, even though their TF values may be smaller. With 7 initial keyphrases, SemiRank with the core phrase method outperforms the title-based version by 2.4% in F1 (21.9 versus 19.5).

| Setting | Initial keyphrases | P | R | F1 |
|---|---|---|---|---|
| SemiRank - Title | 2.35 | 20.9 | 18.7 | 19.5 |
| Core phrase method, 1 | 3 | 19.5 | 17.5 | 18.2 |
| Core phrase method, 2 | 4 | 21.1 | 19.0 | 19.7 |
| Core phrase method, 3 | 5 | 21.5 | 19.2 | 20.0 |
| Core phrase method, 4 | 6 | 22.5 | 20.2 | 21.0 |
| Core phrase method, 5 | 7 | 23.5 | 21.0 | 21.9 |
| Core phrase method, 6 | 8 | 22.7 | 20.3 | 21.2 |
| Core phrase method, 7 | 9 | 22.5 | 20.1 | 20.9 |

Table 5-1. Performance of SemiRank when using titles and when using the core phrase method. Results are for a maximum of 4 selected sentences. P: precision, R: recall.
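For readers who want to reproduce such figures, the sketch below computes precision, recall, and F1 per document and then averages them over documents. This averaging scheme is an assumption: the thesis does not state it in this section, although the reported F1 values are not exactly the harmonic mean of the reported P and R (for 20.9 and 18.7 that would be about 19.7 rather than 19.5), which would be consistent with per-document averaging. Exact string matching is also assumed; the function names are illustrative.

```python
def prf1(extracted, gold):
    """Precision, recall and F1 for one document, using exact phrase matching."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    p = correct / len(extracted) if extracted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def macro_average(pairs):
    """Average per-document P, R and F1 over (extracted, gold) pairs."""
    scores = [prf1(extracted, gold) for extracted, gold in pairs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


pairs = [
    (["a", "b"], ["a", "c"]),   # P = 0.5, R = 0.5
    (["x"], ["x", "y", "z"]),   # P = 1.0, R = 1/3
]
macro_p, macro_r, macro_f1 = macro_average(pairs)
```

In this toy example the averaged F1 is lower than the harmonic mean of the averaged P and R, the same kind of gap seen between the columns of the tables in this chapter.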

Table 5.2 lists the detailed performance of SemiRank when using the information feature method. The results increase steadily until the number of initial keyphrases reaches 8. As with the core phrase method, when the number of initial keyphrases is 3 the F1 obtained is lower than that of SemiRank using titles. This shows that the set of [...] initial keyphrases taken from the information feature method when the number of initial keyphrases is 3. Performance peaks when the number of initial keyphrases is 8, at which point the gain over SemiRank using titles is 4.9% in F1 (24.4 versus 19.5).

| Setting | Initial keyphrases | P | R | F1 |
|---|---|---|---|---|
| SemiRank - Title | 2.35 | 20.9 | 18.7 | 19.5 |
| Information feature method, 1 | 3 | 19.6 | 17.6 | 18.3 |
| Information feature method, 2 | 4 | 23.7 | 21.3 | 22.1 |
| Information feature method, 3 | 5 | 23.7 | 21.3 | 22.1 |
| Information feature method, 4 | 6 | 23.9 | 21.5 | 22.4 |
| Information feature method, 5 | 7 | 24.8 | 22.3 | 23.2 |
| Information feature method, 6 | 8 | 26.2 | 23.3 | 24.4 |
| Information feature method, 7 | 9 | 25.8 | 23.0 | 24 |

Table 5-2. Performance of SemiRank when using titles and when using the information feature method.

Effect of combining with semantic relations in SemiRank

To determine how the semantic factor affects performance, this thesis compares the initial keyphrase set with the keyphrase set obtained after running the PhraseRank algorithm. The 5 highest-weighted initial keyphrases of both the core phrase method and the information feature method are selected and treated as the final keyphrase set, and are then compared with the results the two methods achieve after semantic evaluation. The results are shown in Table 5.3. Compared with the initial keyphrase set, SemiRank raises performance by 1.9% with the core phrase method and by 2.3% with the information feature method. The initial keyphrase set of the information feature method therefore allows semantics to be exploited better than the initial keyphrase set of the core phrase method.

| Method | Keyphrase set | P | R | F1 |
|---|---|---|---|---|
| Core phrase method | Initial keyphrases | 21.5 | 19.2 | 20.0 |
| Core phrase method | After semantic evaluation | 23.5 | 21.0 | 21.9 (+1.9%) |
| Information feature method | Initial keyphrases | 23.7 | 21.3 | 22.1 |
| Information feature method | After semantic evaluation | 26.2 | 23.3 | 24.4 (+2.3%) |

Table 5-3. Performance of the initial keyphrase set compared with the keyphrase set after semantic evaluation.

Comparison with other methods

Here, this thesis compares the performance of SemiRank using the two proposed methods with two keyphrase extraction methods, KEA and KEA++, adapted for Wiki. KEA [23] is a keyphrase extraction method based on machine learning. KEA uses the features TF*IDF, FOC (first occurrence), and keyphraseness to build its machine learning model. KEA++ [13] is an improvement of KEA that uses two additional features to build the model: the length of the keyphrase and the number of links a phrase has with other phrases in the text. KEA++ omits the keyphraseness feature. Because the candidate phrases in KEA and KEA++ are not linked to Wikipedia articles, the study in [14] combined its own method of selecting candidate keyphrases through Wikipedia with the features of KEA and KEA++.
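As a rough illustration of two of the KEA features named above, the sketch below computes TF*IDF and first occurrence for a phrase. It follows the commonly cited definitions (phrase frequency normalised by document length times the negative base-2 log of the document-frequency ratio, and the fraction of the document that precedes the first appearance); these definitions, the tokenisation, and the function names are assumptions of this sketch rather than details given in the thesis, and keyphraseness is not covered.

```python
import math


def tf_idf(phrase_count, doc_length, doc_freq, n_docs):
    """TF*IDF of a phrase: (count / document length) * -log2(doc_freq / n_docs)."""
    tf = phrase_count / doc_length
    idf = -math.log2(doc_freq / n_docs)
    return tf * idf


def first_occurrence(tokens, phrase_tokens):
    """Fraction of the document preceding the phrase's first appearance.

    Returns None if the phrase does not occur in the token list.
    """
    n = len(phrase_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase_tokens:
            return i / len(tokens)
    return None


tokens = "keyphrase extraction selects phrases that summarize a document".split()
score = tf_idf(phrase_count=2, doc_length=100, doc_freq=5, n_docs=20)
position = first_occurrence(tokens, ["phrases"])
```

A phrase that appears early (small first-occurrence value) and is frequent in the document but rare in the collection (large TF*IDF) is a stronger keyphrase candidate under these features.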

[14] measures the performance of KEA and KEA++ using the consistency measure. However, since F1 and consistency give the same results, this thesis treats the values reported in [14] as F1-based results.

Table 5.4 shows the performance of SemiRank with the two proposed methods compared with the Wiki versions of KEA and KEA++. The performance figures for KEA and KEA++ are taken from [14]. SemiRank with the information feature method achieves the highest F1 in this comparison (24.4).

| Method | P | R | F1 |
|---|---|---|---|
| SemiRank - Title | 20.9 | 18.7 | 19.5 |
| KEA for Wiki | - | - | 20.2 |
| KEA++ for Wiki | - | - | 22.6 |
| Core phrase method (7 initial keyphrases) | 23.5 | 21.0 | 21.9 |
| Information feature method (8 initial keyphrases) | 26.2 | 23.3 | 24.4 |

Table 5-4. Performance of different keyphrase extraction methods on the Wiki-20 dataset.

Using the Walktrap clustering method

The modularity obtained with the algorithm of [17] is low, which means that the clustering quality is not high and that the grouping is likely to be largely random. This thesis therefore tried several other clustering methods implemented in igraph on the Wiki-20 dataset. Among them, the Walktrap method [20] gave the highest modularity values, in the range [0.19, 0.38] and mostly around 0.3.
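To make the modularity values above concrete, the following sketch computes weighted modularity for a given partition using the standard form Q = sum over communities of (L_c / m) - (d_c / 2m)^2, where m is the total edge weight, L_c the weight of edges inside community c, and d_c the summed strength of its nodes. It is a self-contained illustration, not the igraph routine used in the thesis; the function name and the edge-list format are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Weighted modularity of a partition of an undirected graph.

    edges: iterable of (u, v, weight); community: dict mapping node -> label.
    """
    total = 0.0                    # m: total edge weight
    strength = defaultdict(float)  # summed weight of the edges at each node
    inside = defaultdict(float)    # weight of edges with both ends in one community
    for u, v, w in edges:
        total += w
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    comm_strength = defaultdict(float)
    for node, k in strength.items():
        comm_strength[community[node]] += k
    return sum(inside[c] / total - (comm_strength[c] / (2 * total)) ** 2
               for c in comm_strength)


# Two triangles joined by a single edge, all weights equal to 1.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
         ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
         ("c", "d", 1)]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives a modularity of 5/14 (about 0.36), while putting every node in one group gives 0, which shows why values near 0 suggest a grouping no better than chance.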

| Setting | Initial keyphrases | Core phrase P | Core phrase R | Core phrase F1 | Information feature P | Information feature R | Information feature F1 |
|---|---|---|---|---|---|---|---|
| 1 | 3 | 19.6 | 17.6 | 18.3 | 20.7 | 18.5 | 19.3 |
| 2 | 4 | 21.8 | 19.6 | 20.4 | 22.5 | 22 | 22.9 |
| 3 | 5 | 21.5 | 19.2 | 20.0 | 23.7 | 21.3 | 22.2 |
| 4 | 6 | 21.9 | 19.7 | 20.5 | 24.7 | 22.2 | 23.2 |
| 5 | 7 | 21.5 | 19.3 | 20.1 | 25.3 | 22.7 | 23.6 |
| 6 | 8 | 20.9 | 18.7 | 19.5 | 25.1 | 22.4 | 23.4 |
| 7 | 9 | 19.3 | 17.3 | 18.0 | 23.5 | 21.2 | 22 |

Table 5-5. Performance obtained when using the Walktrap clustering algorithm.

The entire keyphrase extraction process was repeated with the Newman clustering method replaced by the Walktrap method [20]. Table 5.5 shows the performance of SemiRank for the core phrase method and the information feature method. The number of sentences selected in the core phrase [...] Walktrap is lower than with the Newman method. The core phrase method reaches 20.5% compared with 21.9% in the first experiment, and the information feature method reaches 23.6% compared with 24.4% in the initial experiment using the Newman method.

Chapter 6. CONCLUSION

Chapter 6 summarizes the results achieved by this thesis and outlines directions for future research.

6.1 Contributions

As presented above, this thesis proposed two methods for building the initial keyphrase set: the core phrase method and the method that uses the information features of keyphrases. The core phrase method looks for the sentences that express the main content of the document, while the information feature method combines the two features TF and FOC to assess the importance of the phrases in the document. Both methods improve the performance of SemiRank compared with using titles. Of the two, the information feature method proves better than the core phrase method both before and after semantic evaluation.

In this research, the thesis combined different features of keyphrases to extract the initial keyphrase set, and then re-evaluated their importance through the semantic relations among them. The thesis shows that combining other features with semantics improves the performance of extracting the set of keyphrases that represent a text.

6.2 Future work

This thesis assumed that the reinforcement of an initial keyphrase on itself does not affect the selection of the final keyphrases after the PhraseRank algorithm is run. Because this self-reinforcement value changes for each initial keyphrase and depends on its relations with the other phrases in the document, the work needs to be extended to examine [...]

This thesis has so far applied only a few keyphrase features to create the initial keyphrase set. Many other features could still be added to improve the quality of this set.

The two proposed methods, the core phrase method and the information feature method, could be integrated into other models that use an initial keyphrase set in order to study the performance of combining different features.

REFERENCES

1. Bordea, G., Buitelaar, P.: DERIUNLP: A Context Based Approach to Automatic Keyphrase Extraction. In: Proc. of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden, pp. 146-149 (2010)

2. D'Avanzo, E., Magnini, B., Vallin, A.: Keyphrase Extraction for Summarization Purposes: the LAKE System at DUC2004. In: Proc. of the Document Understanding Conference, Boston, USA (2004)

3. El-Beltagy, S., Rafea, A.: KP-Miner: Participation in SemEval-2. In: Proc. of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden, pp. 190-193 (2010)

4. Grineva, M., Grinev, M., Lizorkin, D.: Extracting Key Terms from Noisy and Multitheme Documents. In: Proc. of the 18th International Conference on World Wide Web, WWW '09, Madrid, Spain, pp. 661-670 (2009)

5. Hulth, A.: Combining Machine Learning and Natural Language Processing for Automatic Keyword Extraction. PhD Thesis, Stockholm University, Sweden (2004)

6. Jones, S., Mahoui, M.: Hierarchical Document Clustering Using Automatically Extracted Keyphrases. In: Proc. of the Third International Asian Conference on Digital Libraries, ICADL2000, Seoul, Korea, pp. 113-120 (2000)

7. Kim, S. N., Medelyan, O., Kan, M.-Y., Baldwin, T.: Semeval-2010 Task 5: Automatic Keyphrase Extraction from Scientific Articles. In: Proc. of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden, pp. 21-26 (2010)

8. Li, D., Li, S., Li, W., Wang, W., Qu, W.: A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases Through a Document Semantic Network. In: Proc. of the ACL 2010 Conference Short Papers, Uppsala, Sweden, pp. 296-300 (2010)

9. Li, Q., Wu, Y.-F., Bot, R., Chen, X.: Incorporating Document Keyphrases in Search Results. In: Proc. of the Tenth Americas Conference on Information Systems, AMCIS, New York, USA, p. 410 (2004)

10. Litvak, M., Last, M., Kandel, A.: DegExt: a Language-Independent Keyphrase Extractor. Journal of Ambient Intelligence and Humanized Computing, Vol. 4, Issue 3, pp. 377-387 (2013)

11. Lopez, P., Romary, L.: HUMB: Automatic Key Term Extraction from Scientific Articles in GROBID. In: Proc. of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden, pp. 248-251 (2010)

12. Markó, K., Hahn, U., Schulz, S., Daumke, P., Nohama, P.: Interlingual Indexing across Different Languages. In: Proc. of Computer-Assisted Information Retrieval, RIAO 2004, 7th International Conference, Avignon, France, pp. 82-99 (2004)

13. Medelyan, O.: Automatic Keyphrase Indexing with a Domain-Specific Thesaurus. Master Thesis, Albert-Ludwig University, Germany (2005)

14. Medelyan, O.: Human Competitive Automatic Topic Indexing. Ph.D. Thesis, University of Waikato, New Zealand (2009)

15. Mihalcea, R., Tarau, P.: TextRank: Bringing order into texts. In: Proc. of the Conference on Empirical Methods in Natural Language Processing, EMNLP '04, Barcelona, Spain, pp. 404-411 (2004)

16. Mihalcea, R., Csomai, A.: Wikify!: Linking Documents to Encyclopedic Knowledge. In: Proc. of the 16th ACM Conference on Information and Knowledge Management, CIKM '07, New York, USA, pp. 233-242 (2007)

17. Newman, M.: Analysis of Weighted Networks. Physical Review E, Vol. 70, 056131 (2004)

18. Nguyen, D., Luong, T.: WINGNUS: Keyphrase Extraction Utilizing Document Logical Structure. In: Proc. of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden, pp. 166-169 (2010)

19. Paynter, G., Cunningham, S., Witten, I.H.: Evaluating Extracted Phrases and Extending Thesauri. In: Proc. of the Third International Asian Conference on Digital Libraries, ICADL2000, Seoul, Korea, pp. 131-138 (2000)

20. Pons, P., Latapy, M.: Computing Communities in Large Networks Using Random Walks. Physical Review E, 0512106 (2005)

21. Rolling, L.: Indexing Consistency, Quality and Efficiency. Information Processing and Management, Vol. 17, No. 2, pp. 68-76 (1981)

22. Turdakov, D., Velikhov, P.: Semantic relatedness metric for Wikipedia concepts based on link analysis and its application to word sense disambiguation. In: Proc. of the Fifth Spring Young Researchers Colloquium on Databases and Information Systems, SYRCoDIS'2008 (2008)

23. Witten, I., Paynter, G., Frank, E., Gutwin, C., Nevill-Manning, C.: Kea: Practical automatic keyphrase extraction. In: Proc. of the fourth ACM conference on Digital libraries, Berkeley, CA, pp. 254-255 (1999)

24. You, W., Fontaine, D., Barthès, J.: An Automatic Keyphrase Extraction System for Scientific Documents. Knowledge and Information Systems, Vol. 34, pp. 691-724 (2012)

25. Zhou, D., Huang, J., Schölkopf, B.: Beyond Pairwise Classification and Clustering Using Hypergraphs. MPI Technical Report, No. 143, Tübingen

PUBLICATIONS

1) Nguyen, H.K., Cao, T.H.: Using Core Phrases and Information Features for Keyphrase Extraction. In: Proceedings of the Second Asian Conference on Information Systems, ACIS 2013, Phuket, Thailand, pp. (2013)

CURRICULUM VITAE SUMMARY

Full name: NGUYEN KIM HUYEN

Date of birth: 16/07/1983

Place of birth: Ho Chi Minh City

Address: 49/15 K4, Thong Nhat ward, Bien Hoa city, Dong Nai province.

EDUCATION

- 2001 to 2005: undergraduate student at Ho Chi Minh City University of Technology (Trường Đại Học Bách Khoa TP.HCM), Faculty of Computer Science & Engineering.

- 2011 to present: master's student at Ho Chi Minh City University of Technology, Faculty of Computer Science & Engineering, majoring in Computer Science.

WORK EXPERIENCE
