Pro Full-Text Search in SQL Server 2008


Pro Full-Text Search in SQL Server 2008
Michael Coles with Hilary Cotter

Copyright © 2009 by Michael Coles and Hilary Cotter

All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.

ISBN-13 (pbk): 978-1-4302-1594-3
ISBN-13 (electronic): 978-1-4302-1595-0

Printed and bound in the United States of America 9 8 7 6 5 4 3 2 1

Trademarked names may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, we use the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

Lead Editor: Jonathan Gennick
Technical Reviewer: Steve Jones
Editorial Board: Clay Andres, Steve Anglin, Mark Beckner, Ewan Buckingham, Tony Campbell, Gary Cornell, Jonathan Gennick, Michelle Lowman, Matthew Moodie, Jeffrey Pepper, Frank Pohlmann, Ben Renow-Clarke, Dominic Shakeshaft, Matt Wade, Tom Welsh
Project Manager: Denise Santoro Lincoln
Copy Editor: Benjamin Berg
Associate Production Director: Kari Brooks-Copony
Production Editor: Laura Esterman
Compositor/Artist: Octal Publishing, Inc.
Proofreader: Patrick Vincent
Indexer: Broccoli Information Management
Cover Designer: Kurt Krames
Manufacturing Director: Tom Debolski

Distributed to the book trade worldwide by Springer-Verlag New York, Inc., 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax 201-348-4505, e-mail orders-ny@springer-sbm.com, or visit http://www.springeronline.com.

For information on translations, please contact Apress directly at 2855 Telegraph Avenue, Suite 600, Berkeley, CA 94705. Phone 510-549-5930, fax 510-549-5939, e-mail info@apress.com, or visit http://www.apress.com.

Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Special Bulk Sales–eBook Licensing web page at http://www.apress.com/info/bulksales.

The information in this book is distributed on an "as is" basis, without warranty. Although every precaution has been taken in the preparation of this work, neither the author(s) nor Apress shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in this work.

The source code for this book is available to readers at http://www.apress.com.

For Devoné and Rebecca
- Michael

Contents

About the Authors
About the Technical Reviewer
Acknowledgments
Introduction

CHAPTER 1 SQL Server Full-Text Search
  Welcome to Full-Text Search
  History of SQL Server FTS
  Goals of Search
  Mechanics of Search
  iFTS Architecture
    Indexing Process
    Query Process
  Search Quality
    Measuring Quality
    Synonymy and Polysemy
  Summary

CHAPTER 2 Administration
  Initial Setup and Configuration
  Enabling Database Full-Text Support
  Creating Full-Text Catalogs
    The New Full-Text Catalog Wizard
    The CREATE FULLTEXT CATALOG Statement
  Upgrading Full-Text Catalogs
  Creating Full-Text Indexes
    The Full-Text Indexing Wizard
    The DocId Map
    The CREATE FULLTEXT INDEX Statement
  Full-Text Index Population
    Full Population
    Incremental Population
    Update Population
    Additional Index Population Options
    Catalog Rebuild and Reorganization
    Scheduling Populations
  Management
    Backups
    Logs
    SQL Profiler Events
    System Procedures
  Summary

CHAPTER 3 Basic and Advanced Queries
  iFTS Predicates and Functions
    FREETEXT and FREETEXTTABLE
    Adding a Language Specification
    Returning the Top N by RANK
  CONTAINS
    Phrase Searches
    Boolean Searches
    Prefix Searches
    Generational Searches
    Proximity Searches
    Weighted Searches
    CONTAINSTABLE Searches
  Advanced Search Topics
    Using XQuery contains() Function
    Column Rank-Multiplier Searches
    Taxonomy Search and Text Mining
  Summary

CHAPTER 4 Client Applications
  Hit Highlighting
    The Procedure
    Calling the Procedure
  Search Engine–Style Search
    Defining a Grammar
    Extended Backus-Naur Form
    Implementing the Grammar with Irony
    Generating the iFTS Query
    Converting a Google-Style Query
    Querying with the New Grammar
  Summary

CHAPTER 5 Multilingual Searching
  A Brief History of Written Language
  iFTS and Language Complexity
    Writing Symbols and Alphabets
    Bidirectional Writing and Capitalization
    Hyphenation and Compound Words
    Nonalphanumeric Characters and Accent Marks
    Token Position Context
    Generational Forms
    Gender
  Storing Multilingual Data
    Storing Plain Text
    Storing XML
    Storing HTML Documents
    Storing Microsoft Office Documents
    Storing Other Document Types
    Detecting Content Language
  Designing Tables to Store Multilingual Content
  Summary

CHAPTER 6 Indexing BLOBs
  LOB Data
  Character LOB Data
  XML LOB Data
  Binary LOB Data
  FILESTREAM BLOB Data
    Efficiency Advantages
    FILESTREAM Requirements
    T-SQL Access
    Storage Considerations
  OpenSqlFilestream API
  Summary

CHAPTER 7 Stoplists
  System Stoplists
  Creating Custom Stoplists
  Managing Stoplists
  Upgrading Noise Word Lists to Stoplists
  Stoplist Behavior
    Stoplists and Indexing
    Stoplists and Queries
  Summary

CHAPTER 8 Thesauruses
  Thesaurus Files
    Editing and Loading Thesaurus Files
    Expansion Sets
  Replacement Sets
  Global and Local Thesauruses
  A Practical Example
    Translation
    Word Bags
  Additional Considerations
    Accent and Case Sensitivity
    Nonrecursion
    Overlapping Rules
    Stoplists
    General Recommendations
  Summary

CHAPTER 9 iFTS Dynamic Management Views and Functions
  iFTS and Transparency
  DMVs and DMFs
    Looking Inside the Full-Text Index
    Parsing Text
    Accessing Full-Text Index Entries
    Retrieving Population Information
    Services and Memory Usage
  Catalog Views
    Listing Full-Text Catalogs
    Retrieving Full-Text Index Metadata
    Revealing Stoplists
    Viewing Supported Languages and Document Types
  Summary

CHAPTER 10 Filters
  Introducing Filters
    Standard Filters
    Third-Party Filters
    Custom Filters
  Custom Filter Development
    Filter Interfaces
    Custom Filter Design
    Filter Class Factory
    Filter Class
    Compiling and Installing the Filter
    Testing the Filter
  Gatherer and Protocol Handler
  Word Breakers and Stemmers
  Summary

CHAPTER 11 Advanced Search Techniques
  Spelling Suggestion and Correction
    Hamming Distance
    Spelling Suggestion Implementation
  Name Searching
  Phonetic Search
    Soundex
    NYSIIS
  String Similarity Metrics
    Longest Common Subsequence
    Edit Distance
    N-Grams
  Summary

APPENDIX A Glossary

APPENDIX B iFTS_Books Database
  Installing the Sample Database
  Installing the Phonetic Samples
  Sample Code

APPENDIX C Vector-Space Searches
  Documents As Vectors

INDEX
[...] looking for - Bono Vox, U2

Full-text search encompasses techniques for searching text-based data and documents. This is an increasingly important function of modern databases. SQL Server has had full-text search capability built into it since SQL Server 7.0. SQL Server 2008 integrated full-text search (iFTS) represents a significant improvement in full-text search functionality, a new level of full-text search...

... process: The SQL Server process contains both the SQL Server query processor, which compiles and executes SQL queries, and the full-text engine, which compiles and executes full-text queries. This tight integration of the SQL Server and full-text query processors in SQL Server 2008 is a significant improvement over prior versions of SQL Server full-text search, allowing SQL Server to generate far more...

... adds the indexable words to inverted index fragments. The last step of the indexing process is the master merge, which combines all of the index fragments into a single master full-text index. The indexing process in general and the master merge in particular can be resource- and I/O-intensive. Despite the intensity of the process, the indexing process doesn't block queries from occurring. Querying a full-text...

... degree in information technology and multiple Microsoft and other certifications. Michael has published dozens of technical articles online and in print magazines, including SQL Server Central, ASPToday, and SQL Server Standard. Michael is the author of the books Pro SQL Server 2008 XML (Apress, 2008) and Pro T-SQL 2008 Programmer's Guide (Apress, 2008), and he is a contributor to Accelerated SQL Server 2008...

... seen in FTS's dependence on components that implement Indexing Service's programming interfaces. For instance, in SQL Server, document-specific filters are tied to filename extensions. Though powerful for its day, the initial implementations of FTS in SQL Server 7.0 and 2000 proved to have certain limitations, including the following:

• The DBMS itself made storing, manipulating, searching, and retrieving...

... enterprise-class database management system (DBMS).

History of SQL Server FTS

Full-text search has been a part of SQL Server since version 7.0. The initial design of SQL Server full-text search provided for reuse of Microsoft Indexing Service components. Indexing Service is Microsoft's core product for indexing and searching files and documents in the file system. The idea was that FTS could easily reuse...

... interested in. Having all key words in an index returns hits substantially faster than looking through every document you're storing to find the user's search phrase. SQL Server uses an inverted index structure to store full-text index data. The inverted index structure is built by breaking searchable content into word-length tokens (a process known as tokenizing) and storing each word with relevant metadata in...

... of international character sets and multilingual searches. We also provide best practices around multilingual searching.

Chapter 6. SQL Server 2008 provides greater flexibility and more options for storing large object (LOB) data in your databases. Chapter 6 discusses the options available for storing, managing, and indexing LOB data in your database. In this chapter, we take a look at how SQL Server indexes...

This book is intended for SQL Server developers and DBAs who want to get the most out of SQL Server 2008 Integrated Full-Text Search (iFTS). To get the most out of this book, you should have a working knowledge of T-SQL, as most of the sample code in the book is written in SQL Server 2008 T-SQL. Sample code is also provided in C# and C++, where appropriate. Although knowledge of these programming languages...

... of SQL Server, the full-text index in SQL Server 2008 is stored in the database instead of the file system. We will discuss setup, configuration, and population of full-text indexes in detail in Chapter 2.

• Stoplist: The stoplist is simply a list of stopwords, or words that are considered useless for the purposes of full-text search. The indexer consults the stoplist during the indexing and querying process.
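
The excerpts above describe tokenizing content into an inverted index, skipping stopwords, and combining index fragments in a master merge. The Python sketch below illustrates those ideas only in outline; it is not SQL Server's implementation and not the book's sample code. Every name in it (`STOPLIST`, `tokenize`, `build_fragment`, `master_merge`, `search`) is hypothetical, the stoplist contents are invented for the example, and the choice to keep counting positions for stopwords is an assumption of this sketch.

```python
import re
from collections import defaultdict

# Hypothetical stoplist for illustration; not SQL Server's system stoplist.
STOPLIST = {"a", "an", "and", "the", "of", "in", "to", "is"}


def tokenize(text):
    """Break text into lowercase word tokens paired with 1-based positions."""
    matches = re.finditer(r"[A-Za-z0-9]+", text)
    return [(m.group(0).lower(), pos) for pos, m in enumerate(matches, start=1)]


def build_fragment(docs):
    """Build one inverted index fragment: word -> {doc_id: [positions]}.

    Stopwords are not stored, but they still occupy a position, so the
    positions of the surrounding words are unaffected (sketch assumption).
    """
    fragment = defaultdict(dict)
    for doc_id, text in docs.items():
        for word, pos in tokenize(text):
            if word in STOPLIST:
                continue
            fragment[word].setdefault(doc_id, []).append(pos)
    return fragment


def master_merge(fragments):
    """Combine several index fragments into a single master index."""
    master = defaultdict(dict)
    for fragment in fragments:
        for word, postings in fragment.items():
            for doc_id, positions in postings.items():
                master[word].setdefault(doc_id, []).extend(positions)
    return master


def search(index, word):
    """Return the sorted ids of documents that contain the word."""
    return sorted(index.get(word.lower(), {}))


# Two fragments, as if built by separate population passes.
frag1 = build_fragment({1: "The quick brown fox",
                        2: "A quick search of the index"})
frag2 = build_fragment({3: "Full-text search in the database"})
master = master_merge([frag1, frag2])

print(search(master, "search"))  # [2, 3]
print(search(master, "the"))     # [] because "the" is on the stoplist
```

A lookup here touches only the entry for the search word instead of scanning every document, which is the speed advantage the excerpt attributes to keeping key words in an index.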

Date posted: 07/03/2014, 18:20
