www.it-ebooks.info

SECOND EDITION

Learning SPARQL
Querying and Updating with SPARQL 1.1

Bob DuCharme

Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo

Learning SPARQL, Second Edition
by Bob DuCharme

Copyright © 2013 O’Reilly Media. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Simon St. Laurent and Meghan Blanchette
Production Editor: Kristen Borg
Proofreader: Amanda Kersey
Indexer: Bob DuCharme
Cover Designer: Randy Comer
Interior Designer: David Futato
Illustrator: Rebecca Demarest

August 2013: Second Edition.

Revision History for the Second Edition:
2013-06-27 First release

See http://oreilly.com/catalog/errata.csp?isbn=9781449371432 for release details.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Learning SPARQL, the image of an anglerfish and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
ISBN: 978-1-449-37143-2
[LSI]
1372271958

For my mom and dad, Linda and Bob Sr., who always supported any ambitious projects I attempted, even when I left college because my bandmates and I thought we were going to become big stars. (We didn’t.)

Table of Contents

Preface

1. Jumping Right In: Some Data and Some Queries
    The Data to Query
    Querying the Data
    More Realistic Data and Matching on Multiple Triples
    Searching for Strings
    What Could Go Wrong?
    Querying a Public Data Source
    Summary

2. The Semantic Web, RDF, and Linked Data (and SPARQL)
    What Exactly Is the “Semantic Web”?
    URLs, URIs, IRIs, and Namespaces
    The Resource Description Framework (RDF)
    Storing RDF in Files
    Storing RDF in Databases
    Data Typing
    Making RDF More Readable with Language Tags and Labels
    Blank Nodes and Why They’re Useful
    Named Graphs
    Reusing and Creating Vocabularies: RDF Schema and OWL
    Linked Data
    SPARQL’s Past, Present, and Future
    The SPARQL Specifications
    Summary

3. SPARQL Queries: A Deeper Dive
    More Readable Query Results
    Using the Labels Provided by DBpedia
    Getting Labels from Schemas and Ontologies
    Data That Might Not Be There
    Finding Data That Doesn’t Meet Certain Conditions
    Searching Further in the Data
    Searching with Blank Nodes
    Eliminating Redundant Output
    Combining Different Search Conditions
    FILTERing Data Based on Conditions
    Retrieving a Specific Number of Results
    Querying Named Graphs
    Queries in Your Queries
    Combining Values and Assigning Values to Variables
    Creating Tables of Values in Your Queries
    Sorting, Aggregating, Finding the Biggest and Smallest and
    Sorting Data
    Finding the Smallest, the Biggest, the Count, the Average
    Grouping Data and Finding Aggregate Values within Groups
    Querying a Remote SPARQL Service
    Federated Queries: Searching Multiple Datasets with One Query
    Summary

4. Copying, Creating, and Converting Data (and Finding Bad Data)
    Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT
    Copying Data
    Creating New Data
    Converting Data
    Finding Bad Data
    Defining Rules with SPARQL
    Generating Data About Broken Rules
    Using Existing SPARQL Rules Vocabularies
    Asking for a Description of a Resource
    Summary

5. Datatypes and Functions
    Datatypes and Queries
    Representing Strings
    Comparing Values and Doing Arithmetic
    Functions
    Program Logic Functions
    Node Type and Datatype Checking Functions
    Node Type Conversion Functions
    Datatype Conversion
    Checking, Adding, and Removing Spoken Language Tags
    String Functions
    Numeric Functions
    Date and Time Functions
    Hash Functions
    Extension Functions
    Summary

6. Updating Data with SPARQL
    Getting Started with Fuseki
    Adding Data to a Dataset
    Deleting Data
    Changing Existing Data
    Named Graphs
    Dropping Graphs
    Named Graph Syntax Shortcuts: WITH and USING
    Copying and Moving Entire Graphs
    Deleting and Replacing Triples in Named Graphs
    Summary

7. Query Efficiency and Debugging
    Efficiency Inside the WHERE Clause
    Reduce the Search Space
    OPTIONAL Is Very Optional
    Triple Pattern Order Matters
    FILTERs: Where and What
    Property Paths Can Be Expensive
    Efficiency Outside the WHERE Clause
    Debugging
    Manual Debugging
    SPARQL Algebra
    Debugging Tools
    Summary

8. Working with SPARQL Query Result Formats
    SPARQL Query Results XML Format
    Processing XML Query Results
    SPARQL Query Results JSON Format
    Processing JSON Query Results
    SPARQL Query Results CSV and TSV Formats
    Using CSV Query Results
    TSV Query Results
    Summary

9. RDF Schema, OWL, and Inferencing
    What Is Inferencing?
    Inferred Triples and Your Query
    More than RDFS, Less than Full OWL
    SPARQL and RDFS Inferencing
    SPARQL and OWL Inferencing
    Using SPARQL to Do Your Inferencing
    Querying Schemas
    Summary

10. Building Applications with SPARQL
    Applications and Triples
    Property Functions
    Model-Driven Development
    SPARQL and Web Application Development
    SPARQL Processors
    Standalone Processors
    Triplestore SPARQL Support
    Middleware SPARQL Support
    Public Endpoints, Private Endpoints
    SPARQL and HTTP
    GET a Graph of Triples
    PUT a Graph of Triples
    POST a Graph of Triples
    DELETE a Graph of Triples
    Summary

11. A SPARQL Cookbook
    Themes and Variations
    Exploring the Data
    How Do I Look at All the Data at Once?
    What Classes Are Declared?
    What Properties Are Declared?
    Which Classes Have Instances?
    What Properties Are Used?
    Which Classes Use a Particular Property?
    How Much Was a Given Property Used?
    How Much Was a Given Class Used?
    A Given Class Has Lots of Instances. What Are These Things?
    What Data Is Stored About a Class’s Instances?
    What Values Does a Given Property Have?
    A Certain Property’s Values Are Resources. What Data Do We Have About Them?

[...]

... documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Learning SPARQL, 2nd edition, by Bob DuCharme (O’Reilly). Copyright 2013 O’Reilly Media, 978-1-449-37143-2.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to ...

... identifiers, to store first and last names as property values, and to put the data values in their own separate http://learningsparql.com/ns/data# namespace, we get this set of sample data:

    # filename: ex012.ttl

    @prefix ab: <http://learningsparql.com/ns/addressbook#> .
    @prefix d:  <http://learningsparql.com/ns/data#> .

    d:i0432 ab:firstName "Richard" ...
    d:i0432 ab:lastName  ...
    d:i0432 ab:homeTel   ...
    d:i0432 ab:email     ...
... after the comment about the filename, is also a triple ending with a period. It tells us that the prefix “ab” will stand in for the URI http://learningsparql.com/ns/addressbook#, just as an XML document might tell us with the attribute setting xmlns:ab="http://learningsparql.com/ns/addressbook#". An RDF triple’s subject and predicate must each belong to a particular namespace in order to prevent confusion ...

... known as web addresses, are one kind of URI. A locator helps you find something, like a web page (for example, http://www.learningsparql.com/resources/index.html), and an identifier identifies something. So, for example, the unique identifier for Richard in my address book dataset is http://learningsparql.com/ns/addressbook#richard. A URI may look like a URL, and there may actually be a web page at that address, ...

... prefixed names. It’s essentially the same query, and gets the same answer from ARQ:

    # filename: ex006.rq

    SELECT ?craigEmail
    WHERE
    {
      <http://learningsparql.com/ns/addressbook#craig>
      <http://learningsparql.com/ns/addressbook#email>
      ?craigEmail
    }

The differences between this query and the first one demonstrate two things:

• You don’t need to use prefixes in your query, ...

... http://www.youtube.com/oreillymedia

Acknowledgments

For their excellent contributions to the first edition, I’d like to thank the book’s technical reviewers (Dean Allemang, Andy Seaborne, and Paul Gearon) and sample audience reviewers (Priscilla Walmsley, Eric Rochester, Peter DuCharme, and David Germano). For the second edition, I received many great suggestions from Rob Vesse, Gary King, Matthew Gibson, and ...

...

    ---------------------------------------------
    | person                                    |
    =============================================
    | <http://learningsparql.com/ns/data#i0432> |
    ---------------------------------------------

If I really want to know who called me, “http://learningsparql.com/ns/data#i0432” isn’t a very helpful answer. Although the ex008.rq query doesn’t return a very human-readable answer from the ex012.ttl ...
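The prefix mechanism described in these excerpts amounts to simple string substitution: a prefixed name such as ab:email is shorthand for the namespace URI followed by the local name. The following is a minimal Python sketch of that expansion, not code from the book. The two namespace URIs come from the excerpts; the `PREFIXES` table and the `expand` function are hypothetical names introduced only for illustration.

```python
# Namespace URIs quoted in the excerpts; the table and function names
# are illustrative assumptions, not part of any SPARQL library.
PREFIXES = {
    "ab": "http://learningsparql.com/ns/addressbook#",
    "d": "http://learningsparql.com/ns/data#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Return the full URI that a prefixed name such as ab:email stands for."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no declared prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("ab:email"))  # http://learningsparql.com/ns/addressbook#email
print(expand("d:i0432"))   # http://learningsparql.com/ns/data#i0432
```

This is why a query written with full URIs and one written with prefixed names can behave identically: once the prefixes are expanded, a processor sees the same URIs either way.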
... development or your queries more efficient.

A warning about a common problem or an easy trap to fall into.

Using Code Examples

You’ll find a ZIP file of all of this book’s sample code and data files at http://www.learningsparql.com, along with links to free SPARQL software and other resources.

This book is here to help you get your job done. In general, if this book includes code examples, you may use the code in ...

... tools that use this data model make it possible to expose diverse sets of data (including, as we’ll see, relational databases) with a common, standardized interface. Accessing this data doesn’t require learning new APIs because both open source and commercial software (including Oracle 11g and IBM’s DB2) are available with SPARQL support that lets you take advantage of these data sources. Because of this ...

... position are OK to match this triple pattern, the values that show up there get stored in the ?craigEmail variable so that we can use them elsewhere in the query:

    # filename: ex003.rq

    PREFIX ab: <http://learningsparql.com/ns/addressbook#>

    SELECT ?craigEmail
    WHERE
    { ab:craig ab:email ?craigEmail }

This particular query is doing this to ask for any ab:email values ...
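To make the idea of a variable in a triple pattern more concrete, here is a rough Python sketch of how a single pattern can be matched against a set of triples. It is only an illustration under stated assumptions, not how ARQ or any other SPARQL processor is implemented. The `match` function and the `TRIPLES` list are hypothetical, and the email and phone values are made up because the book's data file is not part of this excerpt.

```python
AB = "http://learningsparql.com/ns/addressbook#"

# Hypothetical triples: these values are invented for illustration and are
# not the contents of the book's sample data files.
TRIPLES = [
    (AB + "richard", AB + "email", "richard@example.org"),
    (AB + "craig", AB + "email", "craig@example.org"),
    (AB + "craig", AB + "email", "craig@example.com"),
    (AB + "craig", AB + "homeTel", "555-0100"),
]


def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern."""
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable accepts any value, but a repeated variable
                # must take the same value at every position.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                # A fixed term must match the triple exactly.
                break
        else:
            yield binding


# Same shape as the ex003.rq pattern: fixed subject and predicate,
# variable in the object position.
pattern = (AB + "craig", AB + "email", "?craigEmail")
emails = [b["?craigEmail"] for b in match(pattern, TRIPLES)]
print(emails)  # ['craig@example.org', 'craig@example.com']
```

The fixed subject and predicate narrow the search to Craig's email triples, and whatever appears in the object position of each matching triple is what the variable is bound to.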