Semantic Web

The Semantic Web is a visionary project that aims to enhance the usability and usefulness of the Web by enabling computers to find, read, understand and use the content of Web documents to accomplish tasks via automated agents and Web-based services.

The vast amounts of computerized data contained on the World Wide Web (including the Deep Web) would appear to be the largest body of information ever assembled. Certainly, the Web is a uniquely valuable tool for both research and the dissemination of ideas and knowledge. But the fact remains that the Web has been remarkably resistant to direct, effective, efficient use by computers. Web pages are designed to be read by people, not machines; therefore, the meaning of the content must be inferred by people who look at Web pages, read HTML documents and view the labels of hyperlinks.

Tim Berners-Lee -- the Oxford University graduate who invented the Web in 1989, wrote the first Web browser and server in 1990 and currently directs the World Wide Web Consortium -- has a much grander vision for the Web of the future, which he calls the Semantic Web. The Semantic Web adds a metadata infrastructure of tags to define elements of information within Web pages, linking them so computers can extract meaning from widely separated data as easily as the Web currently links individual documents. The Semantic Web will make it possible for machines, as well as people, to find, read, understand and use data over the Web to accomplish useful tasks. The Semantic Web will extend, not replace, the Web as we know it today.

In some instances, we already use specialized software to work with carefully identified Web data, but this is the exception, not the rule. It takes people to surf the Web, shop online, make sense of search-engine results and decide which additional links to follow. The Semantic Web, once it becomes a functioning reality, will let a user launch an agent or process that will then proceed on its own, perhaps checking back with the user periodically as the work progresses.

The Internet was originally created as a way for researchers to easily exchange computer data with one another. People could transmit files using fundamental methods such as the File Transfer Protocol (FTP). Although data traveled across the Internet in the form of bits and bytes, the basic unit of meaning, as far as the computer systems were concerned, was the file.

That changed when the Web came into being. Berners-Lee built his Web around pages, which are documents written in HTML. A versatile language, HTML combines interactive forms, text and multimedia objects -- such as images and sound -- and it describes how these elements should be presented and what the overall page should look like. Unfortunately, HTML has a very limited ability to classify the blocks of text on a page, apart from the roles they play in a typical document's organization and in the chosen graphical layout.

As Web use grew, HTML's limitations led to the development of XML and XHTML, which began to offer mechanisms for adding meaning to Web pages. The Simple Object Access Protocol (SOAP) and Web services became a reality, making it easier for users and even automated processes to gather specific information or perform specialized functions across the Web. When the Semantic Web comes to fruition, software will be able to locate information within Web pages, thus breaking through the document level and accessing real data that it can use directly. In one sense, the Semantic Web will become a kind of global database.
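A rough sketch of what "adding meaning" can look like in practice, using Python's standard xml.etree.ElementTree module. The element names (catalog, item, name, price), the sample values and the prices_by_id helper are made up for illustration; they do not come from any real schema. Because the markup labels each value by what it is, a program can ask for a price directly instead of guessing from page layout:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names and values are illustrative assumptions.
XML_DOC = """
<catalog>
  <item id="A100">
    <name>Example gadget</name>
    <price currency="USD">4.50</price>
  </item>
  <item id="A200">
    <name>Example gizmo</name>
    <price currency="USD">12.00</price>
  </item>
</catalog>
"""


def prices_by_id(xml_text):
    """Map each item's id attribute to its price as a float."""
    root = ET.fromstring(xml_text.strip())
    return {
        item.get("id"): float(item.findtext("price"))
        for item in root.findall("item")
    }


print(prices_by_id(XML_DOC))
```

The lookup depends only on the names of the elements, not on where they would appear on a rendered page, which is the property the paragraph above describes.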

Making Machines Smarter

In an age when grandmothers and kindergartners use computers and surf the Web, it's sometimes hard to recall just how much direction or guidance a user has to give a computer to accomplish anything. Machines can't use partial information, they don't know what's inside an image or graphic, they're not much good at making analogies or combining information from different sources, and they don't have a big vocabulary.

We can easily use the Web to look up a Computerworld article or blog, locate a piece of music heard on the radio, buy a book, locate an eye doctor near our workplace or put out a question on a chat forum or bulletin board. But ask your computer to do the same thing, and it won't know where to start unless you give it a detailed, correctly spelled series of commands and responses in the proper sequence.

For example, using HTML and a Web browser, one can create and present a catalog page of items for sale. But HTML has no inherent capability to know that, say, Item No. JG1896 is an Acme widget with a retail price of $9.95. All HTML can specify is that the text "JG1896" should be positioned near the text "Acme widget" and "$9.95." HTML has no way to express or know that "Acme widget" is a kind of consumer product, that "$9.95" is a price, or that these pieces of information describe an item that is distinct from other items listed on the same page.
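The gap can be made concrete with a small sketch. Assuming a hypothetical vocabulary (the predicate names type, name and price, and the helpers describe and items_of_type, are invented here for illustration and are not part of any standard), the catalog entry becomes a set of subject-predicate-object statements that a program can query by meaning rather than by position on the page:

```python
from decimal import Decimal

# Hypothetical illustration: the catalog entry expressed as
# (subject, predicate, object) statements instead of page layout.
# The predicate names are assumptions, not a real vocabulary.
TRIPLES = [
    ("JG1896", "type", "ConsumerProduct"),
    ("JG1896", "name", "Acme widget"),
    ("JG1896", "price", Decimal("9.95")),
]


def describe(subject, triples):
    """Collect every statement about one subject into a dict."""
    return {pred: obj for subj, pred, obj in triples if subj == subject}


def items_of_type(kind, triples):
    """Return the subjects declared to be of the given type."""
    return [subj for subj, pred, obj in triples
            if pred == "type" and obj == kind]


item = describe("JG1896", TRIPLES)
print(item["name"], item["price"])
```

Here the statements say outright that JG1896 is a consumer product and that 9.95 is its price, which is exactly the information plain HTML leaves for a human reader to infer.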

The Semantic Web will address that by enabling computers and software to find, read, understand and use information contained inside Web documents to accomplish useful tasks via automated agents and Web-based services.

Kay is a Computerworld contributing writer in Worcester, Mass. You can contact him at

Copyright © 2006 IDG Communications, Inc.
