Medical document clustering using ontology-based term similarity measures

Scientific report: "Ensemble Document Clustering Using Weighted Hypergraph Generated by NMF" docx

Scientific report

... results using random initialization, and selected the cluster. We used the clustering toolkit CLUTO for clustering the hypergraph. Conclusion: This paper proposed a new ensemble document clustering ... the ensemble method using a standard hypergraph and the ensemble method using a weighted hypergraph. Our method achieved the best results. NMF: The NMF decomposes the term-document matrix to ... our ensemble method. Any number is available for each clustering. Experience shows that the ensemble clustering using k-means succeeds when each clustering has many clusters, and they are combined...
  • 4
  • 393
  • 0
Scientific report: "Multi-Document Summarization using Sentence-based Topic Models" docx

Scientific report

... proposed for document clustering and summarization by making use of both term-document matrix Y and term-sentence matrix B. The FGB model computes two matrices U and V by optimizing number of document ... model leads to better summarization results ... term-document matrix, term-sentence matrix, the number of latent topics, sentence-topic matrix, auxiliary document-topic matrix ... 1: Randomly initialize ... than LexPageRank. Note that FGB model makes use of both term-document and term-sentence matrices. Our BSTM model outperforms FGB since the document-topic allocation is marginalized out in BSTM and...
  • 4
  • 381
  • 0
Scientific report: "Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering" docx

Scientific report

... distinctions between document level and profile based cross document coreference. Document level CDC makes a simplifying assumption that a named entity (and its variants) in a document has one underlying ... We have presented a profile-based Cross Document Coreference (CDC) approach based on a novel fuzzy relational clustering algorithm KARC. In contrast to traditional hard clustering methods, KARC produces ... pointer to another entity). The profile based CDC method generates a partition of E, 2.2 CDC Using Fuzzy Relational Clustering 2.2.1 Preliminaries Traditionally, hard clustering algorithms (where uij...
  • 9
  • 207
  • 0
Scientific report: "Multilingual Document Clustering: an Heuristic Approach Based on Cognate Named Entities" docx

Scientific report

... 3.2 Clustering The algorithm for clustering multilingual documents based on cognate NEs is of heuristic nature It consists of main phases: (1) first clusters creation, (2) addition of remaining documents ... event); then, the documents are represented by a set of terms (keywords or named entity types) In addition, they use document frequency to select relevant features among the extracted terms Finally, ... multilingual clustering; the multilingual clustering takes input from the monolingual clusters The authors select different type of features depending on the clustering: for the monolingual clustering...
  • 8
  • 421
  • 0
Medical image analysis using statistical shape model based on subdivision surface wavelet

College - University

... topology of biological objects) based on the subdivision surface wavelet transform, termed Statistical Surface Wavelet Model (SSWM) And besides, a framework of using SSWM for model-guided segmentation ... appropriate mapping [41, 44] is determined by iteratively solving a constrained optimization problem based on the diffusion equation Next, to express a surface using spherical harmonics, the 20 ... caudate nucleus used for model-guided segmentation is built based on a training set using the proposed method 3.1 The Shape Representation Based on Subdivision Surface Wavelets In this section, we...
  • 122
  • 362
  • 0
Document clustering on target entities using persons and organizations

General

... Common Document Clustering Algorithms: Document Clustering algorithms attempt to identify groups of documents that are similar to each other more than the rest of the collection. Here each document ... follows. Section introduces related work and Section discusses named entity based, link-based, content-based and structure-based document features and presents the algorithm to identify DPs and seeds ... to perform clustering to deliver IDPs for the corresponding Target entities. PnO page clustering is a special case of web document clustering, which attempts to identify groups of documents that...
  • 90
  • 274
  • 0
Thermal error modelling of machine tools based on ANFIS with fuzzy c means clustering using a thermal imaging camera

General

... homepage: www.elsevier.com/locate/apm Thermal error modelling of machine tools based on ANFIS with fuzzy c-means clustering using a thermal imaging camera. Ali M Abdulshahed, Andrew P Longstaff, Simon ... machine tools using data obtained from a thermal imaging camera is introduced. Different groups of key temperature points were identified from thermal images using a novel schema based on a Grey ... c-means clustering, Grey system theory. Abstract: Thermal errors are often quoted as being the largest contributor to CNC machine tool errors, but they can be effectively reduced using error...
  • 17
  • 818
  • 0
Why are US firms using more short-term debt

Corporate finance

... negatively related to the term spread. The interpretation is that managers time the market and prefer to issue short-term debt when short-term interest rates are low compared with long-term rates. In contrast, ... Founding age, Taxes, Term spread, Short-term rate, Inflation, Real short-term rate, Default spread, Recession dummy, Bank stock index return, Government share. Definition: Ratio of long-term debt (DLTT) minus ... information asymmetry will issue short-term debt to avoid locking in their cost of financing with long-term debt because they expect to borrow at more favorable terms later. Consistent with the asymmetric...
  • 31
  • 586
  • 1
Scientific report: "Predicate Argument Structure Analysis using Transformation-based Learning" pdf

Scientific report

... Mitchell Marcus. 1995. Text chunking using transformation-based learning. In Proc. of the third workshop on very large corpora, pages 82–94. Dan Shen and Mirella Lapata. 2007. Using semantic roles to improve ... Conclusion: We performed experiments for Japanese predicate argument structure analysis using transformation-based learning and extracted rules that indicate the tendencies annotators have. We presented ... Transformation-based error-driven parsing. In Proc. of the Third International Workshop on Parsing Technologies. Time Loc 51.5 38.0 59.6 1.7 55.8 37.4 Eric Brill. 1995. Transformation-based error-driven...
  • 6
  • 496
  • 0
Scientific report: "Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity" pdf

Scientific report

... paper we use both monolingual syntax-based approaches and multilingual alignment-based approaches and compare their performance when using the same similarity measures and evaluation set ferent ... results from the two synonym extraction approaches based on distributional similarity: one using syntactic context and one using translational context based on word alignment and the combination of ... less than times. Distributional Similarity Based on Syntactic Relations: This section contains the description of the synonym extraction approach based on distributional similarity and syntactic relations...
  • 8
  • 516
  • 0
Scientific report: "Improving Pronoun Resolution Using Statistics-Based Semantic Compatibility Information" doc

Scientific report

... pronoun resolution systems on the same data set. Web-based feature vs. Corpus-based feature: The third column of the table lists the results using the web-based compatibility feature for neutral pronouns ... corpus-based semantic feature. However, the increase is not as large as using the web-based feature: Under the two learning models, the success rate of the best system with the corpus-based feature ... the utility of the statistics-based semantic feature is more salient under TC than under SC for N-Pron resolution: the best gains using the corpus-based and the web-based semantic features under...
  • 8
  • 377
  • 0
Scientific report: "AN EXTENDED LR PARSING ALGORITHM FOR GRAMMARS USING FEATURE-BASED SYNTACTIC CATEGORIES" pot

Scientific report

... Sag. 1987. Information-Based Syntax and Semantics, Vol. 1. CSLI Lecture Notes 13. Stanford: CSLI. Shieber, S. 1985. "Using Restriction to Extend Parsing Algorithms for Complex-Feature-Based Formalisms." 23rd ... ure-Based Categories: A Preliminary Modification. Fig. 3 is an example production using feature-based syntactic categories. The notations are adapted from Pollard and Sag (1987) and Shieber ... includes the item < v P ~ v NP >. The ACTION/GOTO table used in the above example can be constructed using the procedures given in Fig. 2 (adapted from Aho and Ullman (1987)). The procedure CLOSURE computes...
  • 6
  • 334
  • 0
Scientific report: "Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments" ppt

Scientific report

... Several measures are compared, including knowledge-based and corpus-based measures, with the best results being obtained with a corpus-based measure using Wikipedia combined with a “relevance feedback” ... subgraph-subgraph) matches. Of these, 36 are based upon the semantic similarity [0 3] of four subgraphs defined by Nx. All eight WordNet-based similarity measures listed in Section 3.3 plus the LSA ... scores obtained from semantic similarity measures. Following Mihalcea et al. (2006) and Mohler and Mihalcea (2009), we use eight knowledge-based measures of semantic similarity: shortest path [PATH],...
  • 11
  • 478
  • 0
Scientific report: "Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models" pptx

Scientific report

... corpus, existing cross-document coreference approaches could not be applied to this dataset. However, since a majority of related work consists of using clustering after defining a similarity function ... supervised clustering, and Haghighi and Klein (2010) use entity profiles to assist within-document coreference. Since many related methods use clustering, there are a number of distributed clustering ... evaluation of clustering with SubSquare (Bshouty and Long, 2010), a scalable, distributed clustering method. SubSquare takes as input a weighted graph with mentions as nodes and similarity between...
  • 11
  • 319
  • 0
Scientific report: "An Ontology–Based Approach for Key Phrase Extraction" docx

Scientific report

... semantic similarity concept in the ViO ontology as Step After that, the key phrase extracting process will go to phase • Step 3: The idea of the most specific category identification process based ... the semantic similarity concept for each concept t that is still unknown after phase 2, we traverse the ontology hierarchy from its root to find the best node We choose the semantic similarity ... the current node c while traversing, the similarity values between t and all children of c are calculated If the maximum of similarity values is less than similarity value between t and c, then...
  • 4
  • 429
  • 3
Scientific report: "Paraphrase Recognition Using Machine Learning to Combine Similarity Measures" ppt

Scientific report

... (s1 , s2 ), where fj (1 ≤ j ≤ 9) are the string similarity measures Finally, we locate the s1 with the best average similarity (over all similarity measures) to s2 , namely s1∗ : 10 S2 : Fewer than ... of nouns, the voice of verbs etc.; this increases the similarity of positive s3 , s3 pairs A common problem is that the string similarity measures may be misled by differences in the lengths of ... treating the two verbs as the same token during the calculation of the string similarity measures would yield a higher similarity The second method, called INIT + WN, treats words from S1 and S2...
  • 9
  • 402
  • 0
Scientific report: "Topic-Focused Multi-document Summarization Using an Approximate Oracle Score" doc

Scientific report

... query terms Table shows a list of query terms for our two illustrative topics The number of query terms extracted in this way ranged from a low of terms for document set d360f to 20 terms for document ... d324e 6.2 The second collection of terms we use to estimate P (t|τ ) are signature terms Signature terms are the terms that are more likely to occur in the document set than in the background ... give rise to query terms and the latter to signature terms 6.1 Signature Terms 6.3 An estimate of P (t|τ ) To estimate P (t|τ ), we view both the query terms and the signature terms as “samples”...
  • 8
  • 339
  • 0
Scientific report: "Enriching the Output of a Parser Using Memory-Based Learning" potx

Scientific report

... used simple pattern-based heuristics to detect conjuncts and mark all conjuncts as heads of a conjunction. After the conversion, every resulting dependency structure is modified deterministically: ... We then learned a mapping from the parser’s labels to those in the dependency corpus, using TiMBL, a memory-based classifier (Daelemans et al., 2003). The features used for the relabelling were similar ... its head, dependent, and label are correct. For traces, this corresponds to the evaluation using the head-based antecedent representation described in (Johnson, 2002), and for empty nodes without...
  • 8
  • 379
  • 0
Scientific report: "Generalised PP-Attachment Disambiguation using Corpus-based Linguistic Diagnostics" pot

Scientific report

... this discrimination is not amenable to a corpus-based treatment. In recent work, however, we succeed in distinguishing arguments from adjuncts using evidence extracted from a parsed corpus (Merlo ... linguistic diagnostics to determine whether a PP is an adjunct or an argument. We illustrate here those countable diagnostics that can be approximated statistically and estimated using corpus counts, ... disambiguating the noun-verb attachment reaches an accuracy of 80.2% (baseline 71.6% using only the preposition) using information about argumenthood and 77.2% if the decision tree induction is performed...
  • 8
  • 299
  • 0
Scientific report: "Word classification based on combined measures of distributional and semantic similarity" docx

Scientific report

... weight proportional to its distributional similarity to the test word ("distributional similarity weighting") The weight in the third version was determined according to Equation 3, whereby A ... distributional similarity values, "semantic similarity weighting") Figure describes the precision demonstrated by these three weighting possibilities on the BNC data (for "semantic similarity weighting", ... versions of KNN in terms of precision: (1) without weighting of neighbors ; (2) with weighting by their distributional similarity to the test word and (3) with weighting by their semantic similarity...
  • 4
  • 345
  • 0
