HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
MASTER’S GRADUATION THESIS
Improving Evolutionary Algorithm for Document Extractive Summarization
NGUYEN THI HOAI
hoai.nt202195m@sis.hust.edu.com
Major: Data Science and Artificial Intelligence
Thesis advisor: Dr Nguyen Thi Thu Trang
Institute: School of Information and Communication Technology
HA NOI, 2021

SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
************
CONFIRMATION OF MASTER'S THESIS REVISIONS
Full name of thesis author: Nguyễn Thị Hoài
Thesis title: Improving Evolutionary Algorithm for Document Extractive Summarization
Major: Data Science and Artificial Intelligence
Student ID: 20202195M
The author, the scientific supervisor, and the thesis examination committee confirm that the author has revised and supplemented the thesis according to the minutes of the committee meeting on 24/12/2021, with the following contents:
Chapter 2: Moved the Related works section from another chapter into this chapter (chapter title: Theoretical Background and Related works).
Chapter 4:
• Experimental results (page 38): added an explanation of how the sentence features for the fitness function and the parameters were chosen, and of why Recall is used as the measure for the two datasets DUC2001 and DUC2002 and F1 for CNN/Daily Mail.
• Added TextRank results for the two datasets DUC2001 and DUC2002.
Appendix: Described the GA-Features and PSO-Harmony Search studies more clearly.
After revision, the thesis consists of five chapters:
• Chapter 1: Introduction,
• Chapter 2: Theoretical background and overview of recent research,
• Chapter 3: Proposed hybrid PSO-GA model,
• Chapter 4: Experimentation,
• Chapter 5: Conclusion and future research.
Date: 2021
Thesis author: Nguyễn Thị Hoài
Supervisor: Dr Nguyễn Thị Thu Trang
CHAIR OF THE COMMITTEE

Graduation Thesis Assignment
Name: Nguyen Thi Hoai
Phone: +84 384 830 357
Email:
hoai.nt202195m@sis.hust.edu.vn
Class: 20BKHDL-E
Affiliation: Hanoi University of Science and Technology
I, Nguyen Thi Hoai, hereby warrant that the work and presentation in this thesis were performed by myself under the supervision of Dr Nguyen Thi Thu Trang. All results presented in this thesis are truthful and are not copied from any other work. All references in this thesis, including images, tables, figures, and quotes, are clearly and fully documented in the bibliography. I take full responsibility for any copied material that violates school regulations.
Hanoi, 25th November 2021
Author
Nguyen Thi Hoai

Acknowledgement
It is a genuine pleasure to express my deep thanks and gratitude to my advisor and guide, Dr Nguyen Thi Thu Trang. I am highly grateful to her for managing and supporting me in completing my thesis. Her unwavering enthusiasm for science kept me constantly engaged with my research, and her personal generosity helped make my time at HUST enjoyable. Furthermore, she taught me the methodology to carry out the study and to present the research as clearly as possible. It was a great privilege and honor to work and study under her guidance.
I would also like to thank Dr Bui Thi Mai Anh for her keen interest in every stage of my research. Her prompt inspiration, timely and kind suggestions, motivation, and dynamism enabled me to accomplish this task throughout my study period.
Most importantly, my sincere thanks and appreciation go to my family for being a constant source of love, care, and inspiration. They are such vital parts of my life that I cannot imagine a life without them.
Last but not least, I gratefully acknowledge my university and the people who willingly helped me with their abilities. I greatly appreciate the support received through the collaborative work undertaken with them.

Abstract
The task of text summarization is to generate main ideas that cover the content of the whole text with the aim of shortening
reading time. There are two approaches to summarization: extractive and abstractive. Numerous methods have been researched in this domain, among which heuristic algorithms offer a more effective and straightforward way than applying machine learning or deep learning. A genetic algorithm (GA) is a search heuristic inspired by Charles Darwin’s theory of natural evolution. It reflects the process of natural selection, in which the fittest individuals are selected for reproduction in order to produce the offspring of the next generation. However, a traditional GA used alone may suffer from weak local search capability and slow convergence. Another algorithm, Particle Swarm Optimization (PSO), is a population-based optimization technique inspired by the motion of bird flocks and schooling fish. The premature convergence of PSO is prevented by applying GA on a small population. In addition, GA helps PSO avoid becoming trapped in local optima. The goal of the thesis is to investigate the effectiveness of a hybrid method combining GA and PSO-based attribute selection in improving the performance of classification algorithms for the automatic text summarization task. To assess the effectiveness of the proposed algorithm, I conducted experiments on three common datasets: DUC2001 and DUC2002, which are typically used for extractive summarization, and CNN/Daily Mail. The experimental results show that the hybrid PSO-GA outperforms the compared unsupervised state-of-the-art works on all three ROUGE metrics for these datasets. The solution presented in this thesis was accepted at The 35th Pacific Asia Conference on Language, Information and Computation (PACLIC 35) in 2021.

Content
Abstract
Content
List of Figures
List of Tables
List of Equations
Acronyms
Chapter 1 Introduction
1.1 Motivation
1.2 Objective and scope
1.3 Structure of thesis
Chapter 2 Theoretical Background and Related works
2.1 Text summarization
2.1.1 Abstractive summarization
2.1.2 Extractive summarization
2.2 Genetic Algorithm
2.2.1 Initialization
2.2.2 Crossover
a) N-Point Crossover
b) Uniform Crossover
2.2.3 Mutation
2.3 Particle swarm optimization
2.3.1 Particle and swarm
2.3.2 Objective function
2.4 Term Frequency–Inverse Document Frequency
2.4.1 Term Frequency
2.4.2 Inverse Document Frequency
2.5 Cosine Similarity
2.6 Related works
Chapter 3 Proposed hybrid PSO-GA
3.1 Overview
3.2 General solution
3.3 Problem Representation
3.3.1 Document representation
3.3.2 Summary presentation
3.4 Proposed fitness function
3.4.1 Sentence features
a) Similarity to the topic sentence
b) Sentence length
c) Sentence position
d) Number of proper nouns
e) Coverage
3.4.2 Fitness function
3.5 Proposed strategies for operators
3.5.1 Initialization
3.5.2 Selection
3.5.3 Crossover
3.5.4 Mutation
3.5.5 Evaluation Convergence
3.5.6 Adaptive PSO strategy
Chapter 4 Experimentation
4.1 Evaluation metrics
4.1.1 Precision, recall and f-score
a) Recall
b) Precision
c) F-score
4.1.2 Rouge measure
a) Rouge
b) Rouge
c) Rouge L
4.2 Preparation for experiment
4.2.1 Datasets
4.2.2 Environment setup
4.3 Experimental result
Chapter 5 Conclusion and future works
5.1 Conclusion
5.2 Future works
References
Appendix

List of Figures
Figure 2.1 Two approaches to summarization
Figure 2.2 The basic structure of Genetic Algorithm [8]
Figure 3.1 The steps of the proposed hybrid PSO-GA for single documents
Figure 3.2 Velocity and Position Updating of adaptive PSO
Figure 3.3 Flow chart of the adaptive PSO
Figure 4.1 Rouge Scores (Recall) (%) on DUC2001
Figure 4.2 Rouge Scores (Recall) (%) on DUC2002
Figure 4.3 Rouge Scores (F1) (%) on CNN/Daily Mail

References
[1] Hahn, U., & Mani, I. (2000). The challenges of automatic summarization. Computer, 33(11), 29–36. https://doi.org/10.1109/2.881692
[2] Wong, K.-F., Wu, M., & Li, W. (2008). Extractive summarization using supervised and semi-supervised learning. Proceedings of the 22nd International Conference on Computational Linguistics - COLING ’08, 1, 985–992. https://doi.org/10.3115/1599081.1599205
[3] Liu, Y. (2019). Fine-tune BERT for Extractive Summarization. ArXiv:1903.10318 [Cs]. http://arxiv.org/abs/1903.10318
[4] Yang, L., Cai, X., Zhang, Y., & Shi, P. (2014). Enhancing sentence-level clustering with ranking-based clustering framework for theme-based summarization. Information Sciences, 260, 37–50. https://doi.org/10.1016/j.ins.2013.11.026
[5] Mihalcea, R. (2004). Graph-based ranking algorithms for sentence extraction, applied to text summarization. Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions, 20-es. https://doi.org/10.3115/1219044.1219064
[6] Batcha, N. K., & Zaki, Ahmed M. (2010). Algebraic reduction in automatic text summarization – the state of the art. International Conference on Computer and Communication Engineering (ICCCE’10), 1–6. https://doi.org/10.1109/ICCCE.2010.5556770
[7] Mendoza, M., Bonilla, S., Noguera, C., Cobos, C., & León, E. (2014). Extractive single-document summarization based on genetic operators and guided local search. Expert Systems with Applications, 41(9), 4158–4169. https://doi.org/10.1016/j.eswa.2013.12.042
[8] Anh, B. T. M., My, N. T., & Trang, N. T. T. (2019). Enhanced Genetic Algorithm for Single Document Extractive Summarization. Proceedings of the Tenth International Symposium on Information and Communication Technology - SoICT 2019, 370–376. https://doi.org/10.1145/3368926.3369729
[9] Foong, O.-M., & Oxley, A. (2011). A hybrid PSO model in Extractive Text Summarizer. 2011 IEEE Symposium on Computers & Informatics, 130–134. https://doi.org/10.1109/ISCI.2011.5958897
[10] Lin, C.-Y. (2004). ROUGE: A Package for Automatic Evaluation of Summaries. 74–81.
[11] Zhang, J., Zhao, Y., Saleh, M., & Liu, P. J. (2020). PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. ArXiv:1912.08777 [Cs]. http://arxiv.org/abs/1912.08777
[12] Meena, Y. K., & Gopalani, D. (2015). Evolutionary Algorithms for Extractive Automatic Text Summarization. Procedia Computer Science, 48, 244–249. https://doi.org/10.1016/j.procs.2015.04.177
[13] Thede, S. M. (2004). An Introduction to Genetic Algorithms. 20(1), 115–123.
[14] Eberhart, R., & Kennedy, J. (1995). A new optimizer using particle swarm theory. MHS’95 Proceedings of the Sixth International Symposium on Micro Machine and Human Science, 39–43. https://doi.org/10.1109/MHS.1995.494215
[15] Soliman, S. A.-H., & Mantawy, A.-A. H. (2012). Mathematical Optimization Techniques. In S. A.-H. Soliman & A.-A. H. Mantawy, Modern Optimization Techniques with Applications in Electric Power Systems (pp. 71–77). Springer New York. https://doi.org/10.1007/978-1-4614-1752-1_2
[16] Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
[17] Vázquez, E., Arnulfo García-Hernández, R., & Ledeneva, Y. (2018). Sentence features relevance for extractive text summarization using genetic algorithms. Journal of Intelligent & Fuzzy Systems, 35(1), 353–365. https://doi.org/10.3233/JIFS-169594
[18] Sivanandam, S. N., & Deepa, S. N. (2008). Introduction to genetic algorithms. Springer Publishing Company, Incorporated.
[19] Ardizzon, G., Cavazzini, G., & Pavesi, G. (2015). Adaptive acceleration coefficients for a new search diversification strategy in particle swarm optimization algorithms. Information Sciences, 299, 337–378. https://doi.org/10.1016/j.ins.2014.12.024
[20] Suanmali, L., Salim, N., & Binwahlan, M. S. (2009). Fuzzy Logic Based Method for Improving Text Summarization. 9(1), 175–179.
[21] Al-Abdallah, R. Z., & Al-Taani, A. T. (2017). Arabic Single-Document Text Summarization Using Particle Swarm Optimization Algorithm. Procedia Computer Science, 117, 30–37. https://doi.org/10.1016/j.procs.2017.10.091
[22] Mandal, S., Singh, G. K., & Pal, A. (2019). PSO-Based Text Summarization Approach Using Sentiment Analysis. In B. Iyer, S. L. Nalbalwar, & N. P. Pathak (Eds.), Computing, Communication and Signal Processing (Vol. 810, pp. 845–854). Springer Singapore. https://doi.org/10.1007/978-981-13-1513-8_86
[23] Aliguliyev, R. M. (2009). A new sentence similarity measure and sentence based extractive technique for automatic text summarization. Expert Systems with Applications, 36(4), 7764–7772. https://doi.org/10.1016/j.eswa.2008.11.022
[24] Hyma, J., Jhansi, Y., & Anuradha, S. (2010). A new hybridized approach of PSO & GA for document clustering. International Journal of Engineering Science and Technology, 2(5), 1221–1226.
[25] Cui, X., & Potok, T. E. (2005). Document clustering analysis based on hybrid PSO+K-means algorithm. Journal of Computer Sciences (special issue), 27, 33.
[26] Mohamed, M., & Oussalah, M. (2019). SRL-ESA-TextSum: A text summarization approach based on semantic role labeling and explicit semantic analysis. Information Processing & Management, 56(4), 1356–1372. https://doi.org/10.1016/j.ipm.2019.04.003
[27] Hermann, K. M., Kočiský, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., & Blunsom, P. (2015). Teaching Machines to Read and Comprehend. ArXiv:1506.03340 [Cs]. http://arxiv.org/abs/1506.03340
[28] Over, P., & Liggett, W. (2002). Introduction to DUC: An Intrinsic Evaluation of Generic News Text Summarization Systems. http://www-nlpir.nist.gov/projects/duc/pubs/2002slides/overview.02.pdf
[29] Wan, X. (2010). Towards a Unified Approach to Simultaneous Single-Document and Multi-Document Summarizations. Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), 1137–1145.
[30] Nallapati, R., Zhai, F., & Zhou, B. (2016). SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents. ArXiv:1611.04230 [Cs]. http://arxiv.org/abs/1611.04230
[31] Mihalcea, R., & Tarau, P. (2004). TextRank: Bringing Order into Texts. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing
(pp. 404–411). Association for Computational Linguistics. https://aclanthology.org/W04-3252
[32] Zhong, M., Liu, P., Chen, Y., Wang, D., Qiu, X., & Huang, X. (2020). Extractive Summarization as Text Matching. ArXiv:2004.08795 [Cs]. http://arxiv.org/abs/2004.08795

Appendix
A Details of previous research
I ran experiments with other state-of-the-art works on extractive single-document summarization, listed in section 4.3, to compare them with my proposed method. Below are the details and an overview of their models:

Table 0.1 Rouge Scores (Recall) (%) of the Hybrid PSO-GA compared with other research. The experiment was carried out on 309 test documents of the DUC2001 corpus.
Method                                   Rouge-1   Rouge-2   Rouge-L
Hybrid PSO-GA (my hybrid)                54.79     26.50     50.20
OnlyGA                                   52.78     24.39     48.21
OnlyPSO                                  53.02     24.71     48.47
Differential Evolution [23]              47.85     18.53     -
UnifiedRank [29]                         45.38     17.65     -
TextRank [31]                            44.50     18.67     -
GA-Features [8]                          50.26     21.63     -
MA-SingleDocSum (published) [7]          44.86     20.14     -
MA-SingleDocSum (re-implement)           44.17     20.68     38.67
PSO-Harmony Search (re-implement) [9]    40.19     13.06     36.52
Unassigned value in the source: 42.10 (a Rouge-L entry whose row is not recoverable from the flattened layout).

Table 0.2 Rouge Scores (Recall) (%) of the Hybrid PSO-GA compared with other research. The experiment was carried out on 567 test documents of the DUC2002 corpus.
Method                                   Rouge-1   Rouge-2   Rouge-L
Hybrid PSO-GA (my hybrid)                57.24     27.95     52.94
OnlyGA                                   55.26     26.26     50.37
OnlyPSO                                  54.25     25.84     49.57
Differential Evolution [23]              46.69     12.37     -
UnifiedRank [29]                         48.49     21.46     -
TextRank [31]                            48.79     21.52     44.81
GA-Features [8]                          52.01     23.25     44.57
MA-SingleDocSum (published) [7]          48.28     22.84     -
MA-SingleDocSum (re-implement)           48.33     20.60     41.93
PSO-Harmony Search (re-implement) [9]    42.85     15.45     38.89

Table 0.3 Rouge Scores (F1) (%) of the Hybrid PSO-GA compared with other research. The experiment was carried out on 11490 test documents of the CNN/Daily Mail corpus.
Method                                   Rouge-1   Rouge-2   Rouge-L
Hybrid PSO-GA (my hybrid)                38.21     16.26     34.65
OnlyGA                                   34.76     12.87     31.28
OnlyPSO                                  38.19     16.13     34.59
TextRank [31]                            34.32     15.68     33.54
MA-SingleDocSum (re-implement) [7]       33.63     13.91     31.98
SummaRuNNer (Supervised) [30]            39.60     16.20     35.30
MatchSum (Supervised) [32]               44.41     20.86     40.55

• Differential Evolution [23]: This method optimizes the allocation of sentences to groups by using differential evolution. In particular, the model is based on sentence clustering and extraction. The strategy is divided into two steps: sentences are first clustered, and then representative sentences are defined for each cluster. To optimize the objective functions, the authors created a discrete differential evolution algorithm.
• UnifiedRank [29]: This work proposes a unified graph-based approach for single-document and multi-document summarization. The mutual influences between the two tasks are taken into account by incorporating them into a graph-based model. The ranking scores of sentences for single-document summarization and the ranking scores of sentences for multi-document summarization can boost each other, and they can be obtained simultaneously in a unified graph-based ranking process.
• GA-Features [8]: This work evaluates the role of several sentence features in the fitness function and of the genetic operators in GA-based extractive summarization. The research demonstrates that the most critical features are sentence length and the number of proper nouns, followed by sentence position within the document and similarity to the topic sentence. In addition, the effectiveness of different crossover and mutation strategies is evaluated according to the ROUGE measures on the generated summaries. First, the uniform technique outperforms the one-point strategy in terms of convergence speed. Second, GA with the guided mutation mechanism converges faster than GA with the random one. Thus, the authors choose the uniform crossover and guided mutation mechanisms for the GA operators. This work was carried out on the CNN/Daily Mail corpus.
• MA-SingleDocSum [7]: This method is a combination of genetic operators and guided local search. I had to
re-implement this model to determine the ROUGE-L score. The memetic algorithm (MA) proposed in this research combines a population-based global search with a local search heuristic performed by each agent, i.e., it combines genetic evolution with the learning that individuals acquire during their existence. The authors evaluate the model on the DUC2001 and DUC2002 datasets.
• PSO-Harmony Search [9]: A hybrid Harmony PSO model in an Extractive Text Summarizer is tested and validated on a set of electronic documents downloaded from the internet. The hybrid consists of the harmony search (HS) and PSO algorithms. In the HS algorithm, three optimization operators, namely harmony memory (HM), harmony memory considering rate (HMCR), and pitch adjusting rate (PAR), were explored to investigate the performance of the text summarizer. The pitch adjustment step of HS further optimizes the weight assigned to each sentence by PSO. The adjustment is made using the HMCR and PAR parameters.
• TextRank [31]: The technique is a graph-based ranking algorithm. A graph is constructed to represent the text, with vertices representing words (or other text entities) interconnected by edges that encode meaningful relationships. The objective of the sentence extraction task is to score entire sentences and sort them from highest to lowest score. Accordingly, each sentence in the text is represented by a vertex in the graph. To establish connections (edges) between sentences, a similarity relationship is defined in which the relationship between two sentences can be viewed as a process of "recommendation": a sentence pointing to a concept in the text provides the reader with a "recommendation" to refer to other sentences in the text pointing to the same concepts, and thus a link can be established between any two sentences that share common content.
• SummaRuNNer [30]: Nallapati et al. present the SummaRuNNer model, which is a recurrent neural network-based
sequence model for extractive summarization. The authors use two recurrent neural networks to extract word-level representational features from documents. The sentence features are then extracted from the word-level features by using another two recurrent neural networks.
• MatchSum [32]: The research treats the extractive summarization task as a semantic text matching problem and proposes a novel summary-level framework to match the source document and candidate summaries in the semantic space. The authors use a Siamese-BERT architecture for computing the similarity between the source document and a candidate summary. Siamese-BERT utilizes a pre-trained BERT within a Siamese network structure to generate semantically relevant text embeddings that can be compared using cosine similarity. The best summary is the candidate that has the highest similarity to the source document among the collection of candidate summaries.

B Example summaries generated by the proposed model
The examples below are the summaries obtained from my hybrid PSO-GA on DUC2001, DUC2002 and CNN/Daily Mail. In the original thesis, the sentences that my model extracts from the input document into the summary are shown in red.

DUC2001
Input document: A coalition of members of Congress announced Wednesday that they plan to sue the Census Bureau in an effort to force the agency to delete illegal aliens from its count in 1990. Some 40 members of the House joined the Federation for American Immigration Reform in announcing that the suit would be filed Thursday in U.S. District Court in Pittsburgh, spokesmen said at a news conference here. The group contends that including the estimated million or more illegal aliens in the national head count, which is used to distribute seats in the House of Representatives, will cause unfair shifts of seats from one state to another. Census officials say they are required to count everyone by the U.S. Constitution, which does not mention citizenship but only instructs that the House apportionment be based on the ``whole number of persons'' residing in
the various states. That approach was upheld by a federal court in a similar suit, brought by the same immigration reform group, before the 1980 Census. Nonetheless, Dan Stein of the immigration reform federation contended that illegal aliens should not be allowed to be part of determining the political structure of the United States. Rep. Tom Ridge, R-Pa., said the Census Bureau should actually count everyone but that it should develop a method to determine how many people are illegally in the country, and them deduct that number from the figures used for reapportioning Congress. Rep. Jan Meyers, R-Kan., suggested including a question on the Census form asking whether respondents are U.S. citizerns.
Gold summary: Some 40 members of Congress have joined with the Federation for American Immigration Reform in announcing that a suit will be filed in an effort to stop the Census Bureau from counting the estimated million illegal aliens living in the United States. Rep. Tom Ridge, R-Pa., said everyone should be counted but the Census Bureau should develop a method to determine how many people are illegal and deduct that number from the figures used for reapportioning Congress. Rep. Jan Meyers, R-Kan., suggested that a question should be put on the Census form asking whether respondents are U.S. citizens.
Model summary: A coalition of members of Congress announced Wednesday that they plan to sue the Census Bureau in an effort to force the agency to delete illegal aliens from its count in 1990. Some 40 members of the House joined the Federation for American Immigration Reform in announcing that the suit would be filed Thursday in U.S. District Court in Pittsburgh, spokesmen said at a news conference here. The group contends that including the estimated million or more illegal aliens in the national head count, which is used to distribute seats in the House of Representatives, will cause unfair shifts of seats from one state to another. Census officials say they are required to count
everyone by the U.S. Constitution, which does not mention citizenship but only instructs that the House apportionment be based on the ``whole number of persons'' residing in the various states. Rep. Tom Ridge, R-Pa., said the Census Bureau should actually count everyone but that it should develop a method to determine how many people are illegally in the country, and them deduct that number from the figures used for reapportioning Congress.

DUC2002
Input document: WASHINGTON President Bush plans to name United Nations Ambassador Thomas Pickering as U.S. envoy to India, and appoint the current head of the foreign service to succeed him. The White House, according to a U.S. official, hasn't yet informed the Senate, which must confirm the nominations of Mr. Pickering and Foreign Service Director General Edward Perkins, a former U.S. envoy to South Africa. With the end of the Cold War, the U.N. is playing an increasingly important role in working to resolve regional disputes. It has led efforts to end civil wars in El Salvador and Cambodia, and it also helped monitor elections in Namibia. Last Friday, during an unprecedented U.N. Security Council meeting of heads of state, President Bush reaffirmed the U.N.'s leading role in peacemaking. The assignment in India could be an important one. India and its neighbor Pakistan remain embroiled in a long, bitter feud that has the potential of becoming one of the world's few remaining nuclear standoffs. Mr. Pickering was one of President Bush's first ambassadorial appointments. He served at the U.N. during the time the administration was involved in the delicate negotiations that led to the formation of the coalition against Iraq in the Persian Gulf War and the U.N. sanctions against Saddam Hussein's regime. But some senior State Department officials have been irritated by what they call Mr. Pickering's propensity for the media spotlight. During the 1980s, when the Reagan administration drew the line against communist-backed insurgences in Central
America, Mr. Pickering played an active role in that effort as U.S. envoy to El Salvador. Mr. Perkins was the U.S. ambassador to South Africa in 1986, when the U.S. imposed economic sanctions against that nation's white-minority government to pressure it to end apartheid. Mr. Perkins, who is black, traveled widely across South Africa during this period, making sure he was seen in black townships, attending church services and funerals, as well as in white communities. Mr. Pickering, reached over the weekend, refused to comment on his future plans, and Mr. Perkins couldn't be reached.
Gold summary: President Bush plans to name UN Ambassador Thomas Pickering as US envoy to India. The UN role in resolving regional disputes is growing in importance, and the feud between India and Pakistan could become the next nuclear standoff. Pickering served in the UN during the delicate negotiations that led to the coalition against Iraq, and served as a US envoy to El Salvador in the 1980s, during Reagan's opposition to communist-backed Central American insurgencies. He is sometimes criticized for liking media attention. Foreign Service Director Edward Perkins, US ambassador to South Africa in 1986 when the US imposed economic sanctions to end apartheid, is to succeed Pickering at the UN.
Model summary: WASHINGTON President Bush plans to name United Nations Ambassador Thomas Pickering as U.S. envoy to India, and appoint the current head of the foreign service to succeed him. The White House, according to a U.S. official, hasn't yet informed the Senate, which must confirm the nominations of Mr. Pickering and Foreign Service Director General Edward Perkins, a former U.S. envoy to South Africa. Last Friday, during an unprecedented U.N. Security Council meeting of heads of state, President Bush reaffirmed the U.N.'s leading role in peacemaking. During the 1980s, when the Reagan administration drew the line against communist-backed insurgences in Central America, Mr. Pickering played an active role in that
effort as U.S. envoy to El Salvador. Mr. Perkins was the U.S. ambassador to South Africa in 1986, when the U.S. imposed economic sanctions against that nation's white-minority government to pressure it to end apartheid.

CNN/DailyMail
Input document: Hundreds of thousands of football fans are facing travel misery this weekend because of the Easter rail shut down . Despite more than 30 Premiership and Championship games scheduled for the long weekend , large parts of the rail network are set to be closed for repair works . More than half the games are set to be affected by delays and disruption , forcing fans onto the roads . This weekend 's crunch Premiership fixture , Arsenal versus Liverpool in London , will be hit by the rail chaos . Labour has accused the Government of failing to learn from the Boxing Day chaos , when nearly a million football fans faced nightmare journeys to follow their team . Ministers were warned of the potential problems in advance Labour has claimed but they failed to scrutinise the planned level of maintenance work . Every single major artery on Britain 's railways was shut down . No trains ran between England and Scotland or Wales on the East Coast , West Coast or Great Western mainlines . The Midland , Cross Country and East Anglia were also shut . Despite the chaos the Government has refused to step in this weekend to stop large parts of the rail network shutting down again . The planned engineering works over the Easter weekend will hit all routes to and from London Euston -hitting the busy west coast services between the capital and Birmingham , Manchester and Liverpool . There will be no mainline trains at all from Paddington out towards Swindon , Bristol and Cardiff and further disruptions on lines from London Bridge . All trains out of London Euston - serving the main west coast rail line between the capital and Birmingham , Manchester and Liverpool - have been hit . Trains between Milton Keynes Central and Clapham Junction are also affected as well as those
traveling on the line between London Paddington and Greenford , Slough and Windsor & Eton Central , Maidenhead and Marlow , Twyford and Henley-on-Thames . All routes to and from Reading are also hit , with Charing Cross trains to Kent and routes to East Anglia from London Liverpool Street also affected . It means there will no trains for fans wanting to travel to the Reading versus Cardiff City match in Berkshire forcing supporters onto the roads on replacement bus services instead . There will also be no trains available for the thousands of Liverpool fans wanting to get home after the Arsenal versus Liverpool match on Saturday . Michael Dugher , the Shadow Transport Secretary , said accused ministers or being too ` out of touch ' with ordinary football fans to understand the chaos the rail shutdown will cause . Shadow transport minister Michael Dugher said the Government had not learned from past mistakes . He said : ` Just like on Boxing Day , hundreds of thousands of people will want to travel to see family and also to follow their football team on one of the busiest fixture lists of the season . ` Ministers were asleep on the job over the Christmas period and there is little to suggest they have learnt from their mistakes . ` We have seen no evidence that Ministers have adequately scrutinised the level of planned engineering works or ensured that the necessary contingency plans are in place . ` As usual , it will be hard-pressed passengers - who 've already seen their fares go up by more than 20 per cent on average since 2010 - who will suffer . ' Transport Secretary Patrick McLoughlin has admitted there will be ` frustrations ' on rail services this weekend and urged people to ` think twice ' before travelling . Mr McLoughlin said : ` There will be alternatives and we 've lifted almost all motorway roadworks to help . ` But if you are travelling between Friday and Monday night please check your journey first , it may be that you 'll think twice about how you travel . ` I 'm sorry if it
is more difficult - but my promise is that the work is essential and when it 's done the benefits will be worthwhile . ' A spokesman for Network Rail said over the past three months they have contacted both the FA and every individual Premiership club to ensure they 're aware about the disruption . A Conservative spokesman said the Easter weekend was a ` sensible ' time to carry out the repair work . He said : ` Thanks to this government 's successful management of our economy we have been able to commit # 38billion to upgrading our national rail infrastructure , after 13 years of Labour neglect . ` The # 100million of upgrades this Easter is the biggest set of works ever carried out over a long weekend . With passenger numbers down 40 per cent it is a sensible time to these works , they will unblock current bottlenecks and deliver a better railway for passengers . ` Labour need to get real , they can either support the long overdue upgrade of our national rail network or plunge it back into long term decline . ` On the 7th of May voters will have a clear choice between the Conservative Party securing our country 's long term future or a Labour party that will bring chaos . '
Gold summary: Over 30 Premiership and Championship games scheduled this weekend . More than half the games are set to be affected by delays and disruption . Engineering works affect routes to London Euston , hitting west coast line . Labour say ministers have failed to learn from the Boxing Day chaos .
Model summary: Hundreds of thousands of football fans are facing travel misery this weekend because of the Easter rail shut down . Despite more than 30 Premiership and Championship games scheduled for the long weekend , large parts of the rail network are set to be closed for repair works . This weekend 's crunch Premiership fixture , Arsenal versus Liverpool in London , will be hit by the rail chaos .
...
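
As a rough illustration of how a model summary such as the ones above can be scored against its gold summary, the sketch below computes ROUGE-N recall, precision and F1 from clipped n-gram overlap. It is a simplified sketch under stated assumptions, not the ROUGE package [10]: the function names `ngram_counts` and `rouge_n` are hypothetical, tokenization is plain lowercase whitespace splitting, and no stemming or stop-word handling is applied, so its numbers will not reproduce the scores in Tables 0.1 to 0.3.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(gold, candidate, n=1):
    """Return ROUGE-N recall, precision and F1 for two plain-text summaries.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    gold_counts = ngram_counts(gold.lower().split(), n)
    cand_counts = ngram_counts(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((gold_counts & cand_counts).values())
    gold_total = sum(gold_counts.values())
    cand_total = sum(cand_counts.values())
    recall = overlap / gold_total if gold_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    # One gold sentence and one model sentence from the CNN/DailyMail example.
    gold = "More than half the games are set to be affected by delays and disruption"
    model = "This weekend 's crunch Premiership fixture will be hit by the rail chaos"
    for n in (1, 2):
        print(n, rouge_n(gold, model, n))
```

Recall divides the overlap by the number of n-grams in the gold summary, which matches the measure reported for DUC2001 and DUC2002, while F1 balances recall against precision, as reported for CNN/Daily Mail.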