Semantic Web takes big step forward

The Semantic Web got a critical boost Tuesday from the World Wide Web Consortium (W3C).

A term that has been tossed around for years, the Semantic Web is envisioned as a Web extension to make it easier to find and group information. The W3C gave that concept a push forward when it announced publication of SPARQL (pronounced "sparkle") query technology, a Semantic Web component designed to enable people to focus on what they want to know rather than on the database technology or data format used to store data.

The potential of the Semantic Web should not be underestimated. Software agents that scan the Web on behalf of users could affect even Google's ad-based business model, an analyst said.

SPARQL queries express high-level goals and are easier to extend to unanticipated data sources. The technology overcomes limitations of local searches and single formats, according to the W3C.

"[SPARQL is] the query language and protocol for the Semantic Web," said Lee Feigenbaum, chair of the W3C's Resource Description Framework (RDF) Data Access Working Group, which is responsible for SPARQL.
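To make the "query language and protocol" pairing concrete, here is a minimal Python sketch; it is not drawn from the W3C announcement. It writes a small SPARQL SELECT query and encodes it into an HTTP GET URL using the `query` parameter that the SPARQL Protocol defines for this purpose. The endpoint address, the FOAF-based query, and the helper names `build_query_url` and `extract_query` are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute the URL of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

# Illustrative query: ask for names described with the FOAF vocabulary.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE { ?person foaf:name ?name }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query text in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def extract_query(url: str) -> str:
    """Recover the query text from such a URL, as a receiving service would."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
```

The query itself says only what is wanted (names); nothing in it depends on how the service behind the endpoint stores its data.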

Already available in 14 known implementations, SPARQL is designed to be used at the scale of the Web to support queries over distributed data sources independent of format. It also can be used for mashing up Web 2.0 data.

The Semantic Web, the W3C said, is intended to enable sharing, merging and reusing of data globally. "The basic idea of the Semantic Web is [to] take the idea of the Web, which is effectively a linked set of documents around the world, and apply it to data," Feigenbaum said.
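A rough way to picture merging linked data is to treat each statement as a (subject, predicate, object) triple, the shape RDF uses. The Python sketch below is a simplification under that assumption, not an RDF or SPARQL implementation: the two sources, the `ex:` identifiers, and the functions `match` and `target_labels` are all made up for illustration.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Made-up statements standing in for two independently published sources.
source_a: Set[Triple] = {
    ("ex:drugX", "ex:targets", "ex:proteinY"),
    ("ex:drugX", "ex:label", "Drug X"),
}
source_b: Set[Triple] = {
    ("ex:proteinY", "ex:label", "Protein Y"),
    ("ex:proteinY", "ex:type", "ex:Protein"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def target_labels(triples: Iterable[Triple], drug: str) -> List[str]:
    """Follow drug -> target -> label links across whatever data is supplied."""
    data = list(triples)
    labels = []
    for _, _, target in match(data, s=drug, p="ex:targets"):
        for _, _, label in match(data, s=target, p="ex:label"):
            labels.append(label)
    return labels


# Because both sources share one statement shape, merging is a set union.
merged = source_a | source_b

if __name__ == "__main__":
    print(target_labels(merged, "ex:drugX"))
```

Neither source alone can answer the question, since one knows the drug's target and the other knows the target's label; the union of the two can.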

"One way to think about the Semantic Web is the Web as one big database," said W3C spokesman Ian Jacobs. A database, he said, enables querying and manipulation of data. More database-like Web sites are emerging, he said.

Comparing the Semantic Web to search sites such as Google, Jacobs noted that Google allows for searching through document text, essentially. The Semantic Web, meanwhile, allows for automation and combining of data, he said.

While the Semantic Web concept has been talked about for several years, Feigenbaum said momentum is building. He cited DBpedia, which extracts structured information from Wikipedia, as an example of a Web site based on the Semantic Web.

With the Semantic Web's ability to home in on just the information a user needs, companies like Google that rely on a Web search advertising model may have to reconsider their plans, said analyst Jonas Lamis, executive director of SciVestor Corp., an Austin-based research and advisory firm that focuses on emerging technology.

"They may need to rethink their business model, because if I have an agent that acts on my behalf and finds things that are interesting for me, it's not necessarily going to be reading Google ads to do that," Lamis said.

The goal of the Semantic Web is to serve as a giant set of databases that can be integrated, Jacobs said. The Semantic Web has seen a lot of uptake in the health care and life sciences, he said. The drug discovery and pharmaceutical fields can use it to take clinical results and learn from data, according to Jacobs.

Pharmaceutical company Eli Lilly and Co. is using Semantic Web technology for research.

"We're using it for our targeted assessment tools, which helps us to find out as much information or find out lots of information about drug targets of interest," said Susie Stephens, principal research scientist at Eli Lilly and chair of the W3C Semantic Web Education and Outreach Working Group. A drug target is a protein in the body that is to be modified with a particular drug.

"We use Semantic Web technologies to help us link to lots of information about the drug targets," she said.

The SPARQL specification works with other W3C Semantic Web technologies, including the following: RDF, for representing data; RDF Schema; Web Ontology Language (OWL), for building vocabularies; and Gleaning Resource Descriptions from Dialects of Languages (GRDDL), for automatic extraction of Semantic Web data from documents.

SPARQL also can use other W3C standards such as Web Services Description Language.

The W3C RDF Data Access Working Group has produced three SPARQL recommendations, which were issued Tuesday: the SPARQL Query Language for RDF, SPARQL Protocol for RDF, and SPARQL Query Results XML Format.
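The third of those documents defines how SELECT results are returned as XML. As a hedged illustration, the Python sketch below reads a small hand-written document in that format using only the standard library. The sample data and the function name `parse_results` are assumptions made for this example; the namespace URI is the one the results format uses.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample; a real service would return this over HTTP.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="name"/>
  </head>
  <results>
    <result>
      <binding name="name"><literal>Alice</literal></binding>
    </result>
    <result>
      <binding name="name"><literal>Bob</literal></binding>
    </result>
  </results>
</sparql>
"""


def parse_results(xml_text):
    """Return (variable names, list of {variable: value} rows)."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # Each binding wraps one child: <uri>, <literal> or <bnode>.
            row[binding.get("name")] = binding[0].text
        rows.append(row)
    return variables, rows


if __name__ == "__main__":
    names, table = parse_results(SAMPLE)
    print(names, table)
```

The sketch keeps only the text of each value; datatype and language attributes on literals are ignored for brevity.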

Participants in the working group include people from companies such as Agfa-Gevaert Group, Hewlett-Packard Co., IBM, Matsushita and Oracle Corp. The W3C released statements of support for SPARQL from numerous parties, including HP and Oracle.

"SPARQL is a key element for integrated information access across information silos and across business boundaries. HP customers can benefit from better information utilization by employing semantic Web technologies," said Jean-Luc Chatelain, CTO of HP Software Information Management, in the company's statement.

"HP's Jena Semantic Web framework has a complete implementation of query language, protocol, and result set processing," Chatelain said.

"As an active participant in this working group, Oracle believes the standardization of SPARQL will play an instrumental role in achieving the vision of the Semantic Web," said Don Deutsch, vice president of Standards Strategy at Oracle, in his company's statement.

This story, "Semantic Web takes big step forward," was originally published by InfoWorld.

Copyright © 2008 IDG Communications, Inc.

  