Ziff Davis Enterprise
Wednesday, May 30, 2007 1:12 PM/EST

Weaving the Semantic Web

Tim Berners-Lee

eWEEK's Emerging Technology Looks at the Semantic Web


Semantic Web Technology Gains Steam - eWEEK Labs tests out tools for building the Semantic Web


Using the Semantic Web in the Real World - A visual look at several real-world deployments of Semantic Web technology and how they compare to current methods


Podcast: The Challenges of the Semantic Web - In this podcast I speak to Tim Berners-Lee about the current status of the Semantic Web, the challenges it faces and its future. I also speak to Eric Miller of Zepheira and to Stephen Downes, a researcher at the National Research Council's Institute for Information Technology in Canada.

And more to come. Check back later this week as I'll be posting my full interview with Tim Berners-Lee.




In 1991 Tim Berners-Lee created the World Wide Web and forever changed business, education, and the way people interact. A few years after that, he began speaking about his next vision for the Web, one which would do for data what the original Web had done for unstructured content.

Berners-Lee called this new vision the Semantic Web. Put simply, the Semantic Web would make it possible to treat the entire Web as if it were a database. In the same way that a developer can query data in a standard database and build applications that use that data, people would be able to query data from across the entire web and build as-needed applications that pulled related but diverse data from multiple sources.

On the Semantic Web, it wouldn't be necessary to infer what something was about through the use of text searches and guesswork since the information would be specifically tagged and marked up to clearly say what it was. More importantly, the Semantic Web would make it possible to easily link to and find similar and related data.

However, it has taken many years to get all of the pieces of the Semantic Web into place. Several core pieces, including the querying language, only recently came close to being standardized by the World Wide Web Consortium, of which Berners-Lee is the director. To many it seemed as if the Semantic Web was an emerging technology that was taking an awfully long time to emerge.

The Semantic Web is finally taking shape. Businesses, sites and Web applications are beginning to define, link to, and create data models that take advantage of Semantic Web technologies to provide new types of functionality. Now is the time for businesses, developers and Web users to get ready: the Semantic Web is finally here.

eWEEK Labs interviewed the man himself, Tim Berners-Lee, to get his take on the status and outlook of the Semantic Web. We also spoke to Eric Miller, the longtime head of the Semantic Web initiative at the World Wide Web Consortium and the President of Zepheira, a company that helps businesses deploy and leverage Semantic Web technologies.

In order to shed light on the real world implications of the Semantic Web, we've also evaluated several examples of publicly accessible real-world implementations of Semantic Web technologies.

Finally, we've studied the challenges facing the Semantic Web, from security risks to hype cycles to proprietary data islands, and spoke to Stephen Downes, a researcher at the National Research Council's Institute for Information Technology in Canada, who believes that the Semantic Web will ultimately fail because of proprietary data protections.

With this information, we hope you'll gain a better understanding of what the Semantic Web is, where it stands now, its outlook for the future, and, most importantly, how it will impact your business.

Under Construction

To a large degree, the Semantic Web has been a work in progress for as long as the term has existed. Berners-Lee said, "The last ten years we've been building the foundation of the Semantic Web in the sense of building the data formats and building the ontology language and all the things related to them."

The Semantic Web relies on several key technologies to make content data-aware. The first is something that was part of the original Web, namely URIs (Uniform Resource Identifiers). Any time you use the Web you are using lots of URIs, as they are the core addressing method of the Web (every standard URL Web address is a type of URI). URIs are important for the Semantic Web because you must be able to address and identify data sources in order to access them, just as you would a website.

Even more core to the Semantic Web is RDF (Resource Description Framework), which was pretty much the first Semantic Web standard to be defined. RDF makes it possible to describe Web-based content so that it is understandable to machines. Good examples of RDF files are FOAF (Friend of a Friend) files, which are essentially Semantic Web files about people.

For instance, the FOAF file for "Jim Rapoza" makes it possible for a program to understand that there is a person whose name is Jim Rapoza, who has specific websites, business and educational affiliations, and friends. Most importantly, those friends have their own FOAF and RDF files, and the machine can follow those links. This is in itself a core aspect of the Semantic Web: data leads to other related and relevant data.
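To make the idea of following links between FOAF files concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the example.org URIs and the friends "Alice Example" and "Bob Example" are hypothetical, and the FOAF "files" are modeled as in-memory lists of (subject, predicate, object) triples rather than real RDF documents fetched from the Web. The foaf:name and foaf:knows terms come from the FOAF vocabulary.

```python
from collections import deque

FOAF = "http://xmlns.com/foaf/0.1/"  # FOAF vocabulary namespace

# Hypothetical person URIs, used only for illustration.
JIM = "http://example.org/people/jim"
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

# Each hypothetical FOAF "file" is a list of (subject, predicate, object) triples,
# keyed by the URI of the person it describes.
FOAF_FILES = {
    JIM: [(JIM, FOAF + "name", "Jim Rapoza"), (JIM, FOAF + "knows", ALICE)],
    ALICE: [(ALICE, FOAF + "name", "Alice Example"), (ALICE, FOAF + "knows", BOB)],
    BOB: [(BOB, FOAF + "name", "Bob Example")],
}


def crawl_names(start_uri, files):
    """Follow foaf:knows links breadth-first and collect each person's foaf:name."""
    names = {}
    seen = {start_uri}
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        for subject, predicate, obj in files.get(uri, []):
            if subject != uri:
                continue  # only use statements about the person this file describes
            if predicate == FOAF + "name":
                names[uri] = obj
            elif predicate == FOAF + "knows" and obj not in seen:
                seen.add(obj)  # a friend's URI leads to another FOAF file
                queue.append(obj)
    return names


if __name__ == "__main__":
    for uri, name in crawl_names(JIM, FOAF_FILES).items():
        print(uri, "->", name)
```

Starting from a single URI, the program reaches Bob only through Alice's file, which is the "data leads to other data" behavior described above. A real crawler would fetch and parse each file over HTTP instead of reading from a dictionary.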

For a while RDF was pretty much the only Semantic Web standard, and while this led to some interesting RDF implementations, the Semantic Web stayed stalled. Then the W3C released the Web Ontology Language, or OWL, which was especially important for business use, as the ability to define ontologies is key for categorizing and classifying groups of related data.

Still, the Semantic Web had a key weakness in that it had no querying language. As Berners-Lee said, "Imagine trying to develop relational databases without SQL." However, this was addressed with SPARQL, which brings SQL-like querying capabilities to RDF and the Semantic Web.
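As a rough illustration of what SQL-like querying over RDF means, the following Python sketch matches a list of triple patterns against a small set of triples, which is the basic idea behind a SPARQL WHERE clause. It is a toy matcher, not a SPARQL engine: the data, the ex: prefix, and the function names are all hypothetical, and real SPARQL supports far more (filters, optional patterns, typed literals and so on). The query it evaluates corresponds roughly to SELECT ?name WHERE { ex:jim foaf:knows ?friend . ?friend foaf:name ?name }.

```python
# Hypothetical data: (subject, predicate, object) triples using shorthand prefixes.
TRIPLES = [
    ("ex:jim", "foaf:name", "Jim Rapoza"),
    ("ex:jim", "foaf:knows", "ex:alice"),
    ("ex:jim", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice Example"),
    ("ex:bob", "foaf:name", "Bob Example"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None on a mismatch."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant that does not match this triple.
            return None
    return result


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


if __name__ == "__main__":
    patterns = [
        ("ex:jim", "foaf:knows", "?friend"),
        ("?friend", "foaf:name", "?name"),
    ]
    for row in query(patterns, TRIPLES):
        print(row["?name"])
```

The shared variable ?friend plays the role a join column plays in SQL: the second pattern only matches triples whose subject is someone Jim knows.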

So there's your alphabet soup of standards and technologies--but how are people using Semantic Web technologies and how do they differ from traditional Web resources? Really the best way to understand the Semantic Web is through examples.


Semantic Web in Action

There are lots of classic examples of Semantic Web technologies that can help with really thorny problems, such as life science applications that help researchers search, access and understand medicines and diseases that can be identified and referred to using multiple names. But there are also examples that apply to everyday Web usage.


DBpedia.org is a project that takes Semantic Web technology and applies it to the vast amount of data inside the popular Wikipedia.org Internet encyclopedia. Using DBpedia, it is possible to use SPARQL to query Wikipedia in a much more powerful way than is possible using standard search tools. For example, using the Wikipedia search engine to look for television sitcoms set in New York produces a pretty much useless set of results where only one sitcom even appears on the initial results page. However, using the Semantic Web-powered DBpedia, a fully accurate list of popular TV shows set in New York is returned, almost as if you had queried a SQL database rather than a website that had been made semantically aware.
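For readers curious what such a DBpedia query might look like in practice, here is a hedged Python sketch that builds the HTTP GET URL for a SPARQL request. It is not the query used in our tests: the class and property names in the query (dbo:TelevisionShow, dbo:location, dbr:New_York_City) are assumptions that may not match DBpedia's actual vocabulary, and the helper function names are hypothetical. The sketch only constructs the URL and does not contact the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# DBpedia's public SPARQL endpoint (assumed to be reachable at this address).
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query only: the class and property names are assumptions.
QUERY = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?show WHERE {
  ?show a dbo:TelevisionShow ;
        dbo:location dbr:New_York_City .
}
LIMIT 20
"""


def build_request_url(endpoint, sparql):
    """Encode a SPARQL query as the `query` parameter of an HTTP GET URL."""
    return endpoint + "?" + urlencode({"query": sparql})


def extract_query(url):
    """Recover the SPARQL text from a URL built by build_request_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    url = build_request_url(ENDPOINT, QUERY)
    print(url)
    # Sending the request is left out on purpose; fetching this URL with an HTTP
    # client would return the result set if the query matches the live data.
```

Passing the query text in a `query` parameter follows the SPARQL protocol for HTTP GET requests, so the same helper should work against other SPARQL endpoints by changing ENDPOINT.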


Another example is the much-hyped Joost online television service, which uses Semantic Web technology on the back end to help users better understand the relationships between particular pieces of content, which, in turn, helps users find the content that they most want.


Helping businesses overcome the hurdles to understanding and deploying Semantic Web technologies was one of the reasons why the Semantic Web Initiative's Eric Miller started the company Zepheira. Miller said, "There are lots of good standards and technologies out there but the gap between the standards and technologies was still quite large."


One key aspect that Miller has identified after deploying Semantic Web technologies in a wide variety of businesses is that most companies already have a great deal of rich semantic data in many existing systems, from mail applications to calendaring tools to databases to company LDAP directories. He said, "Enterprises are realizing that they have huge intellectual capital that they are not harnessing effectively."


Miller said that much of the work being done now in businesses revolves around freeing data from proprietary systems so that this data can be used in Semantic Web applications. He also said that more and more Semantic Web technologies are being used for traditional business integration. This is in line with comments from Berners-Lee, who told us that "The number one role of Semantic Web technologies is data integration across applications."


Adoption Roadblocks


While the potential of the Semantic Web is very high, there are definitely plenty of issues and potential gotchas that face this emerging technology. Since it is a web-based technology, the Semantic Web will be vulnerable to scammers and bad guys who will try to use the technology to their advantage. For example, just as there are phishing sites that try to look like other legitimate sites, it is possible that similar techniques will be used to trick users with false data that appears to come from a legitimate source.


Also, access control is an important issue for Semantic Web applications, especially in business implementations, where it will be important to make sure data doesn't go to people who don't have the right to see it. Berners-Lee said that this is an area of focus for the Semantic Web community and pointed to the Policy Aware Web project, which is working towards creating access control rules for emerging web technologies.


Another challenge facing the Semantic Web is hype. The Semantic Web has recently become in vogue for many vendors hoping to gain attention for their products, with some marketers already using the term Web 3.0 to describe Semantic Web products and technologies.


Typically what happens when a technology gets hyped is that lots of products start to claim that they are part of this new in-crowd, even if they really aren't. We've already received pitches of products claiming to be Semantic Web technologies that clearly have nothing to do with the Semantic Web. Often these types of hype cycles can actually slow down the progress of an emerging technology as they confuse potential customers and distract developers.


Berners-Lee said that there's one simple way to determine if a product is actually a Semantic Web technology: Look for the standards support. If the product doesn't support core standards like RDF, OWL or SPARQL, then it isn't a Semantic Web product.


However, to some observers the biggest challenge facing the Semantic Web isn't security or hype or standards support, it's greed. One argument is that businesses, software vendors and large commercial websites won't want to expose their data, that they'll develop their own proprietary formats in order to keep people on their products and sites.


This is the argument made by researcher Stephen Downes, who wrote a blog essay entitled "Why the Semantic Web Will Fail". In our interview with him, Downes said, "Companies first and foremost attempt to secure a monopoly over a particular format or a particular standard."


Downes, who has worked with Semantic Web and similar technologies in his work in online learning, pointed out that while technologies like RDF have been around for years, many large companies have avoided using them in projects where they should have made sense. And it isn't hard to see his point, from public sites like Flickr and even Google to corporate products like IBM's Lotus Connections, which has lots of semantic capabilities but doesn't use RDF or other Semantic Web technologies.


However, both Berners-Lee and Miller pointed out to us the many ways that proprietary data can be easily converted into Semantic Web data, noting, for example, that sites like Flickr are already machine readable. Also, Berners-Lee said that in order for sites and products to remain competitive they will have to make their proprietary data Semantic Web aware. He said that people won't give sites and companies their data (which is what makes most sites valuable) unless they can re-use it: "All of these sites, no matter how fancy they are, they are going to have to realize that the users will want their data back."


The Semantic Future


So what is the outlook for the Semantic Web? Will we continue to see lots of islands of proprietary semantic data that don't integrate well together? Or will we soon see the giant all-the-Web-as-a-database scenario, where the Semantic Web makes possible all kinds of new and exciting types of applications?


Our take is that the Semantic Web will eventually succeed, as it holds too many benefits for too many people to fall by the wayside. But it's also likely that the Semantic Web won't come about in the exact same way that many people envision. The lesson of the Web 2.0 technologies is that users are often surprising in the way they utilize new technologies. Miller told us that he had already seen businesses using Semantic Web technologies in interesting and unexpected ways.


But one thing is certain. The way that information is found, data is analyzed and web applications are built is going to change radically because of these new technologies. Businesses should start investigating these technologies and figuring out how to best leverage them in their infrastructure.


Or to use the words of Tim Berners-Lee, "It's time to get Semantic Web wise."



Comments (15)

The leader in NLP, Ontologies and the Semantic Web, check them out.

http://www.landcglobal.com

Paul Snyder :

I hope this is a typo in the article - "where it will be important to make sure data does go to people who don't have the right to see it."

Jim Rapoza :

Thanks for catching the typo Paul. Great thing about blog platforms is that it's easy to fix this kind of problem.

Very good post. Keep up the good work!

Thank you, James Rapozo, for writing an excellent article on the current state of roll out of the long-in-development Semantic Web.

Good Work!

By the way, who do you think are the best annotators, watching writers such as yourself, and posting insightful annotated surveys (Michael Vizard does one, who are the others, and what is their URL?)?

http://www.landcglobal.com/pages/tessi.php

Here is a link to one product for semantic processing that Sylvestro pointed to.

Thanks Jim for the well balanced and insightful perspective. Would like to hear more on how you think the market will evolve around this technology. Seems to me there is a huge opportunity on the consumer side.

Justin Kestelyn :

http://otnsemanticweb.oracle.com is another good example - use of RDF and OWL.

Brian Crook :

Here are a few more examples of enterprise vendors adopting the technology...

  1. IBM
  2. Microsoft
  3. webMethods


Excellent article. Well researched and informative. Keep up the good work.


This is a great article. We are definitely nearing a "tipping point" on semantic technologies. Tools are the key. Good tools are just starting to emerge.

Semantic Bridge Technologies (located in Austin, TX) is creating a tool set and the supporting infrastructure for the implementation of the Semantic Web. We are taking a very pragmatic approach. Our target audience is comprised of web designers and software engineers who build Internet and enterprise applications not theorists who study semantic structures. We are building a bridge, not an ivory tower.

One of the key aspects of the Semantic Bridge Project is the creation of the "Semantic Ontology Repository". This repository will be the nexus for managing ontologies (including microformats). It will be the official "hall of records" for the collaborative efforts made by virtual ontology groups. In essence, this repository will be the source for the organization and structure of knowledge, goods and services. The "Semantic Ontology Repository" will be established as a vendor neutral non-profit corporation.

In its simplest implementation, a web author or web designer will be able to use tools to interact with the "Semantic Ontology Repository" and bring semantic structure to the information he or she is creating. In its eventual application, the "Semantic Ontology Repository" will transform enterprise management systems.

We will develop standards for the fair and objective management of the repository and dynamic interactions with the repository. We intend to create a management system that will enable the organic development of ontologies. This is an incredibly grandiose vision - nothing less than managing the organization, structure and growth of all knowledge. This will most likely be the greatest collaborative endeavor in human history.

While existing lists of ontologies may seem overwhelming, the basic ontologies for e-commerce applications and most Internet sites will be quite manageable. It will be interesting to see how the statistics evolve, but our initial guess is that less than five per cent of the ontologies will be applied to more than ninety-five percent of semantic classifications on the Internet.

We recognize that, "..central control is stifling, and increasing the size and scope of such a system rapidly becomes unmanageable." We believe a non-bureaucratic approach that pushes control down to the level of the virtual ontology groups will result in an organic self-regulating system.

We also recognize that some organizations may wish to manage their own ontologies. For example, the ontology for molecular bio-chemistry might be maintained by a leading research university; specific ontologies for Business Process Management Systems (BPM) might be maintained by the system provider; organizations may wish to maintain their own private internal ontologies; etc. One of the most significant aspects of the Semantic Bridge Project will be the creation of an open-source ontology management framework that can be utilized by any organization. Where applicable, there will be a mapping of independently created ontologies to the Semantic Ontology Repository.

In Phase II of the Semantic Bridge Project we will create tools for mapping equivalent concepts between different ontologies.

We are very aware that a collaborative approach and the implementation of fair practices are essential to the realization of this vision. We wish to avoid the possibility of fragmentation (e.g., "The Google Ontology Repository", "The Microsoft Ontology Repository", etc.) as is seen with several competing Linux distributions. Our goal is to create a consortium where all members participate equitably.

We think we have patent rights that will enable us to enforce some degree of discipline amongst the major players.

As a point of clarification, the "Semantic Ontology Repository" is not a place for storing semantic triples (data in a semantic format). The repository is for storing meta-data which tools can access to create semantic triples. The repository is focused on the structure of knowledge, not data elements.

The creation of a dynamic and interactive "Semantic Ontology Repository", along with the tools that will allow web designers and software engineers to easily interact with this repository, will have a profound impact on the rapid deployment of the Semantic Web.

The technologies of the Semantic Bridge Project could truly transform the world.

For complete details regarding The Semantic Bridge Project, please visit our website: http://www.semanticbridgetechnologies.com

Please share your thoughts.

We hope you will consider participating in this endeavor. It is going to be an incredible intellectual adventure.

Sincerely,

Mike Duffy
CEO / CTO
Semantic Bridge Technologies
mduffy@austin.rr.com

Csaba Veres :

Interesting to see that this site offers RSS 2.0, which is the NON RDF version of RSS!

A cognitive approach for semantic web.

http://www.cortex-intelligence.com/tech

It may very well be that other players state that they are in the semantic space and they are not considered here as such. For example, it is not necessarily true that the currently proposed structures for the Semantic Web are the best. In fact we believe that they are not. The underlying theory and forms of expression still have to evolve in order to create a workable Semantic Web.
For example, our enterprise ThoughtExpress will launch this year something we call a semantic human interface that will allow people to express deep semantics to such an extent that one will be able to run financial services, social life, etc. in this space. Yet it has nothing to do with RDF, OWL, etc.
Pawel Lubczonok
