Co-written by John Cass and Aaron Strout. First posted on ReadWriteWeb on March 12, 2009.
Before we talk about why the semantic web is important, a little background on what it is may be helpful. Tim Berners-Lee, the man best known for his role in “inventing” the World Wide Web, is credited with coining the term “semantic web.” In fact, as early as 1999, Berners-Lee is quoted as saying:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.
Heady stuff, to say the least. An easier way to think about the semantic web is to boil it down to a few baseline concepts:
- The web as we know it is composed mainly of HTML documents, or web pages, as opposed to data repositories. Sure, mega-sites such as Wikipedia, Bigyellow.com, Amazon and YouTube sit on mountains of data, but by and large most sites have little to no real connectivity with each other.
- Because most web pages and websites were built for people (to browse and search) rather than machines (to crawl, collect, and interact with), there is very little “meta-data,” or information that actually describes the data on an HTML page. For instance, most HTML tells a web browser where to put text, images, and video on a page but beyond that doesn’t do a good job of categorizing the information required for search engine optimization.
- In that sense, search engines don’t actually understand what they read; they see only patterns or primitive contextual pairings of words. For instance, searching for “semantic web” will lead most search engines to scour billions of documents for those two words (preferably near each other) and then return results based on set SEO criteria. What they won’t return is a list of companies using semantic technologies, unless those companies’ websites scream it in the title, header, or body text.
- Until more sites are built in semantic-friendly formats such as XML, OWL, and RDF, intelligently collecting, compiling, and connecting the billions of web pages out there will be nearly impossible. This becomes increasingly problematic as more and more consumer-generated content (CGC) is created on blogs and social networks such as Facebook and LinkedIn.
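To make the contrast between matching words and understanding statements concrete, here is a minimal sketch in Python. It is not real RDF tooling: the `Triple` type, the company names, and the page text are all hypothetical, and the "facts" are simply subject-predicate-object statements of the kind RDF is designed to express.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One machine-readable statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts, written the way RDF-style metadata might state them.
TRIPLES = [
    Triple("ExampleCo", "type", "Company"),
    Triple("ExampleCo", "uses", "SemanticTechnology"),
    Triple("SampleCorp", "type", "Company"),
    Triple("SampleCorp", "uses", "RelationalDatabase"),
]

# Hypothetical page text: the only thing a keyword search gets to see.
PAGES = {
    "ExampleCo": "We help brands listen to their customers.",
    "SampleCorp": "Our blog covered the semantic web last week.",
}


def keyword_search(pages, phrase):
    """Return the pages whose text contains the phrase, ignoring case."""
    phrase = phrase.lower()
    return sorted(name for name, text in pages.items() if phrase in text.lower())


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that was supplied."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def companies_using(triples, technology):
    """Answer 'which companies use this technology?' from the facts alone."""
    companies = {t.subject for t in match(triples, predicate="type", obj="Company")}
    users = {t.subject for t in match(triples, predicate="uses", obj=technology)}
    return sorted(companies & users)


print(keyword_search(PAGES, "semantic web"))           # ['SampleCorp']
print(companies_using(TRIPLES, "SemanticTechnology"))  # ['ExampleCo']
```

The keyword search finds the company that merely mentions the phrase, while the query over structured statements finds the company that actually uses the technology, even though its page never says so.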
With this baseline, we can now dive into the two particular ways that the semantic web is beginning (and will continue) to help marketers like us. The first, natural-language search, is implicit in nature insofar as it will help companies consume, digest, and interpret terabytes of conversations. The second, content enhancement, is more explicit because it makes existing content more valuable by reaching out to the vast resources of data available on the web.
Consumer-generated content gives companies an opportunity to understand their customers’ concerns and conversations. Yet because so much content is out there, companies need filters to find the most relevant conversations. Natural-language processing can provide this function by automatically filtering and summarizing the compiled conversations so that they can be usefully analyzed.
Natural-language processing is the process of analyzing web content for meaning. In a typical pipeline built on sophisticated linguistic technologies, large volumes of content are first collected into a database. Then identifying information, such as the sources or authors of the content, is tagged. All of the data is standardized into one relational database. Lastly, key metrics are drawn from the raw data. The metrics might include the specific issues being discussed or the “sentiment” of a conversation (that is, whether it is favorable or not).
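The four steps just described (collect, tag, standardize, measure) can be sketched in a few lines of Python. This is a toy under loud assumptions, not how any vendor's product works: the posts, the issue list, and the positive and negative word lists are invented, and counting words from a lexicon is a crude stand-in for real linguistic analysis. Each post's overall score is also assigned to every issue it mentions, which a real system would refine.

```python
import re
import sqlite3

# Step 1, collect. Hypothetical posts as (source, author, text).
RAW_POSTS = [
    ("blog", "alice", "The battery life is great and I love the screen."),
    ("forum", "bob", "Terrible battery. The support team was slow."),
    ("blog", "carol", "Support was helpful, great experience."),
]

# Toy lexicons, assumed for illustration only.
ISSUES = {"battery", "screen", "support"}
POSITIVE = {"great", "love", "helpful"}
NEGATIVE = {"terrible", "slow"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(tokens):
    """Positive-word count minus negative-word count for one post."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def build_database(posts):
    """Steps 2 and 3: tag each post, then store it in one relational table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mentions "
        "(source TEXT, author TEXT, issue TEXT, sentiment INTEGER)"
    )
    for source, author, text in posts:
        tokens = tokenize(text)
        score = sentiment(tokens)
        # One row per distinct issue mentioned in the post.
        for issue in sorted(ISSUES.intersection(tokens)):
            db.execute(
                "INSERT INTO mentions VALUES (?, ?, ?, ?)",
                (source, author, issue, score),
            )
    return db


def issue_metrics(db):
    """Step 4: mention count and net sentiment for each issue."""
    rows = db.execute(
        "SELECT issue, COUNT(*), SUM(sentiment) "
        "FROM mentions GROUP BY issue ORDER BY issue"
    )
    return rows.fetchall()


db = build_database(RAW_POSTS)
for issue, mentions, net in issue_metrics(db):
    print(issue, mentions, net)
```

With these three sample posts, the battery and support issues each get two mentions that cancel out to a net sentiment of zero, while the screen gets one favorable mention.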
Semantic technology enables companies to understand the meaning of content and, hence, determine how people feel about their brand. Natural-language processing can help determine how much conversation is happening around an issue, the importance of that issue, and the growth rate of new issues. Natural-language processing can also help determine who is influential on a given issue and if a company’s marketing communications engage and resonate with customers.
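As a rough illustration of “how much conversation” and “growth rate,” the sketch below compares mention counts for each issue across two periods. The issue labels and weekly data are hypothetical, and a simple percent change is only one of many ways a monitoring tool might measure growth.

```python
from collections import Counter


def growth_rates(previous, current):
    """Percent change in mentions per issue; None marks a brand-new issue."""
    rates = {}
    for issue, count in current.items():
        before = previous.get(issue, 0)
        rates[issue] = None if before == 0 else (count - before) / before * 100
    return rates


# Hypothetical issue labels extracted from two weeks of conversations.
last_week = Counter(["battery", "battery", "support", "screen", "screen"])
this_week = Counter(["battery", "battery", "battery", "support", "pricing", "pricing"])

print(this_week.most_common(1))             # [('battery', 3)]
print(growth_rates(last_week, this_week))   # {'battery': 50.0, 'support': 0.0, 'pricing': None}
```

Here “pricing” shows up as a new issue with no earlier baseline, which is exactly the kind of emerging topic a marketer would want flagged.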
As companies become more sophisticated in their understanding of what it means to engage customers, they recognize that the entire company needs to be involved in engaging customers and the community online. Semantic web technology vendors have developed workflow processes that mirror the manual systems companies built to triage online opportunities. These workflow processes are, in effect, CRM tools. As a result, semantic technologies have moved from being just search and monitoring tools to being engagement tools that allow sophisticated response management across the enterprise.
Examples of companies that are exploring ways to help businesses tap into the power of true natural-language search are Visible Technologies, Radian6, Nielsen Buzzmetrics, Cymfony, and BuzzGain. (Disclosure: Aaron Strout serves on BuzzGain’s Advisory Board.)
While natural-language search helps companies interpret data and see deeper into the trends in the conversations of their customers and prospects, think of content enhancement as a way for companies to make their existing content more valuable. As “social marketing” (the practice of deeply engaging customers through content and social tools) becomes increasingly important, so too does finding ways of giving that content life and context.
Companies can pursue content enhancement in two primary ways. The first is to find out more about the explicit likes and dislikes of their customers — think favorite music, books, products, movies, activities — and then to find related pieces of content that are semantically tagged and bring them back for users to interact with. Companies like Twine (in private beta) promise to deliver on this concept.
The second way is to take existing content (think company blogs, press releases, product descriptions) and add in “semantically charged links.” If you created a blog post, podcast, or video a couple of months ago about the credit crisis, technology such as the kind provided by AdaptiveBlue can add suggested links to it after the fact.
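The idea of adding links after the fact can be sketched as a lookup against a topic index. To be clear, this is not how AdaptiveBlue or any other vendor does it: the `TOPIC_LINKS` index, its example.com URLs, and the sample post are hypothetical, and plain phrase matching stands in for real semantic tagging.

```python
import re

# Hypothetical topic index: phrase -> URL of a semantically tagged resource.
TOPIC_LINKS = {
    "credit crisis": "https://example.com/topics/credit-crisis",
    "semantic web": "https://example.com/topics/semantic-web",
}


def suggest_links(text, topic_links):
    """Return (phrase, url) pairs for every indexed topic the text mentions."""
    found = []
    for phrase, url in topic_links.items():
        # Match whole words only, ignoring case.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append((phrase, url))
    return found


old_post = (
    "Back in January we recorded a podcast about the Credit Crisis "
    "and small business lending."
)
print(suggest_links(old_post, TOPIC_LINKS))
# [('credit crisis', 'https://example.com/topics/credit-crisis')]
```

Because the index can keep growing after a post is published, the same old post can pick up new suggested links each time it is rescanned.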
As the treasure trove of consumer-generated content on the web gets richer, these types of semantic technology could go a long way (with the right filters and human oversight) towards helping companies better allocate scarce resources. Content will not only last longer but increase in value exponentially from the contributions of billions of other virtual contributors.
Semantic technology enables consumers and companies to find information that is difficult to discover using traditional search technology. Companies can use the results of this technology to improve their marketing intelligence and provide more relevant content to their customers.
With the cost of monitoring conversations and providing relevant value to consumers lowered, the stage is now set for the next development in semantic technology: building out a customer engagement infrastructure. Technology for finding relevant data may still be new, but its deployment is already driving the next step, which is mapping the engagement workflow to customers so that opportunities that appear on the web are brought to the people who can take advantage of them, whether marketers or consumers.
In essence, semantic technology will help marketers listen easily to the increasing volume of content, sort through the clutter, and find what’s relevant to companies and consumers.