Posts Tagged ‘Semantic Web’

Jim Hendler at the INSEMTIVE 2008 Workshop

October 27th, 2008

Along with a number of my colleagues, I’m currently attending the ISWC 2008 conference in Karlsruhe, Germany. Yesterday I attended the INSEMTIVE workshop (“Incentives for the Semantic Web”), which aimed to explore incentives for the creation of semantic web content, i.e. how to encourage the creation of more structured metadata. The workshop papers are available to browse online or you can download the complete proceedings. There was a real mix of papers, covering specific issues such as extraction of semantics from tagging, and identifying information needs of a community by analysing search patterns, through to position papers that attempted to highlight shortcomings in current semantic web applications that deter people from creating metadata.

I found the position papers most interesting, if only because they provided confirmation of something that I’ve been thinking for a while now: that people will (and do) create metadata when there are obvious and immediate benefits to them in doing so. No-one really consciously sits down to share or create metadata: they sit down to do a specific task and metadata drops out as a side-effect. For me this makes much of the problem highlighted by the workshop one of interaction design: how do we build good task-oriented user interfaces that encourage the creation of semantic web metadata, and how can we illustrate the benefits of semantic web technologies in an incremental fashion? In my opinion solving this will require close collaboration between semantic web researchers and developers, and interaction designers.

The end of the workshop was a discussion session chaired by Jim Hendler. Hendler chose to do a retrospective of some older presentations to explore how thinking has evolved (or not!) with respect to drivers towards the development of the semantic web.

Starting in 1999, Hendler showed some slides from DAML strategy talks that emphasised the need for a number of different areas to align before a real marketplace can be created for semantic web content and applications. These areas were tools, users, and languages (e.g. OWL). Hendler noted that the Semantic Web community had mistakenly focused too heavily on languages and not enough on the other areas. He also thought that “Web 2.0” had focused primarily on the users, to a lesser extent on the tools, and very little on the language aspects. Hendler thought that this alignment was now taking place.

Moving forward in time to show some slides from 2001-2002, Hendler introduced the idea that the development of the web itself will “force” the evolution of the semantic web, i.e. that internal pressures, such as the need to better manage and extract value from the massive amounts of online information, will require the semantic web to solve specific problems. Hendler observed that the web has demonstrated that people will do more work to share information with others than they will do to help themselves; i.e. people are lazy. When people want to, need to, or are rewarded for sharing information and content then they will work much harder than they would do to manage and organize information purely for their own uses. Hendler noted that there is a tendency to say “we’ll solve the data creation problem at the individual level, as solving it at a group level is harder to manage”, but a look at web history illustrates that the opposite is in fact the case.

Hendler also shared what he thought was the best piece of advice he’d been given by Tim Berners-Lee: start small but viral and you can change many things. Hendler’s slides characterized this as: “My friend sees it, wants one; My competitor sees it, needs one”.

Looking at slides from 2002, Hendler introduced the “Value proposition” supporting the creation of semantic web data & content, i.e. that there has to be some immediate return on the investment in creating metadata.

Hendler finished his retrospective with a slide from a 2008 talk that showed the range of commercial companies, government projects and vertical sectors that were now heavily engaged in the Semantic Web (I was happy to see Talis mentioned in the list!). In Hendler’s opinion there is growing excitement that the “next big thing” is going to come from the Semantic Web; not a “Google Killer”, but the next big revolutionary idea or service. The incentive here is the obvious one: money.

Hendler noted that there is a huge amount of data out there and that finding anything in the mess can be a win. So even a little semantics can make a difference here and could provide some competitive advantages. We don’t need perfect answers or solutions, just incremental improvements on what we have now.

I was also happy to see Hendler encourage researchers to “compete in the real world”. He noted that they have to work within the context of a real world that is moving very fast, and that they can’t really compete with the resources of commercial firms in creating semantic web applications and demonstrators; instead they should try to work within that context to demonstrate real value from the technology. Hendler encouraged them to focus on issues of scalability. Does the fundamental technology scale? Do the concepts and ideas scale to a real user base? As an illustration Hendler noted that he was working with a number of companies that were using some simple OWL constructs in order to add semantics to applications, but that none of them were using a formal reasoner, just “little pieces of procedural code that scale really well”.
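To make the “little pieces of procedural code” idea a bit more concrete, here is a minimal sketch in Python. It is not taken from Hendler’s talk or from any of the companies he mentioned: the class names and the `SUBCLASS_OF` table are invented for illustration, and the only “semantics” assumed is a subclass hierarchy of the kind an OWL ontology might declare. Instead of handing the ontology to a reasoner, the code simply walks the hierarchy.

```python
# Hypothetical class hierarchy: each class maps to its direct superclasses.
# In a real application this table would be loaded from an ontology file.
SUBCLASS_OF = {
    "Podcast": ["AudioRecording"],
    "AudioRecording": ["CreativeWork"],
    "BlogPost": ["CreativeWork"],
}


def superclasses(cls, hierarchy):
    """Return every class reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in hierarchy.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(cls, target, hierarchy):
    """True if cls is the target class or one of its (transitive) subclasses."""
    return cls == target or target in superclasses(cls, hierarchy)


if __name__ == "__main__":
    print(is_a("Podcast", "CreativeWork", SUBCLASS_OF))
    print(is_a("BlogPost", "AudioRecording", SUBCLASS_OF))
```

A lookup like this answers only one narrow kind of question, which is presumably the point: it trades the generality of a reasoner for something small enough to scale easily.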

Overall, an interesting workshop!

Paul Miller did a podcast with Jim Hendler back in March if you want to hear more about his thoughts on the Semantic Web.

ISWC 2008 Proceedings now available as Springer LNCS 5318

October 23rd, 2008

Springer has released the ISWC2008 Proceedings on their site.

The Semantic Web - ISWC 2008, Proceedings of the Seventh International Semantic Web Conference, A.P. Sheth, S. Staab, M. Dean, M. Paolucci, D. Maynard, T. Finin, and K. Thirunarayan (Eds.), Karlsruhe, Germany, 26-30 October 2008, LNCS Volume 5318, Springer Berlin/Heidelberg.

This book constitutes the refereed proceedings of the 7th International Semantic Web Conference, ISWC 2008, held in Karlsruhe, Germany, during October 26-30, 2008.

The volume contains 43 revised full research papers selected from a total of 261 submissions, of which an additional 3 papers were referred to the semantic Web in-use track; 11 papers selected from 26 submissions to the semantic Web in-use track; and 7 papers and 12 posters accepted out of 39 submissions to the doctoral consortium.

The topics covered in the research track are ontology engineering; data management; software and service engineering; non-standard reasoning with ontologies; semantic retrieval; OWL; ontology alignment; description logics; user interfaces; Web data and knowledge; semantic Web services; semantic social networks; and rules and relatedness. The semantic Web in-use track covers knowledge management; business applications; applications from home to space; and services and infrastructure.

All the better to hear us with….

October 18th, 2008
[Embedded SpringWidgets RSS feed widget: Nodalities » Podcast, “Talking with Talis - Nodalities From Semantic Web to Web of Data”]

Talis watchers will be well aware of the significant number of podcasts that my colleagues and I produce here at Talis.  Apart from the semantic web focused podcasts here on Nodalities, there are the more library focused ones on our sister blog Panlibus, education focused ones on the Project Xiphos Blog, and of course the Library 2.0 and Semantic Web gangs.

Keeping up with these streams of podcast output can be a bit of a challenge, so we have taken a few steps to make things easier.

Firstly, on Panlibus, Nodalities, and Xiphos we have created a ‘podcast’ category. By selecting podcast in the category selector, you can view only the podcast postings for that blog.

Next we have implemented a feed aggregator which brings all the Talis podcasting output into a single feed under the Talking with Talis brand. The displayed version of this feed is not as elegant as each dedicated blog feed, but all the information is there and it is a great place to select the aggregated feed for your favourite RSS reader.
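For the curious, the basic idea behind an aggregator like this can be sketched in a few lines of Python. This is only an illustration, not the code we actually run: the two sample feeds, their titles and dates, and the `parse_items` and `aggregate` functions are all made up for the example, and a real aggregator would fetch the feeds over HTTP rather than read them from strings.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny, made-up RSS 2.0 feeds standing in for the per-blog podcast feeds.
NODALITIES_FEED = """<rss version="2.0"><channel><title>Nodalities</title>
<item><title>Semantic Web talk</title><pubDate>Mon, 13 Oct 2008 09:00:00 +0000</pubDate></item>
<item><title>Linked Data chat</title><pubDate>Fri, 17 Oct 2008 09:00:00 +0000</pubDate></item>
</channel></rss>"""

PANLIBUS_FEED = """<rss version="2.0"><channel><title>Panlibus</title>
<item><title>Library 2.0 Gang</title><pubDate>Wed, 15 Oct 2008 09:00:00 +0000</pubDate></item>
</channel></rss>"""


def parse_items(feed_xml):
    """Extract source name, title and publication date from each feed item."""
    channel = ET.fromstring(feed_xml).find("channel")
    source = channel.findtext("title")
    items = []
    for item in channel.findall("item"):
        items.append({
            "source": source,
            "title": item.findtext("title"),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def aggregate(feeds):
    """Merge the items of several feeds into one list, newest first."""
    merged = [item for feed in feeds for item in parse_items(feed)]
    return sorted(merged, key=lambda item: item["published"], reverse=True)


if __name__ == "__main__":
    for entry in aggregate([NODALITIES_FEED, PANLIBUS_FEED]):
        print(entry["published"].date(), entry["source"], entry["title"])
```

Sorting by publication date is what turns several separate streams into one chronological feed; keeping the source name on each item is what lets a reader still tell the blogs apart.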

iTunes is a tool that many use to track, download and listen to podcasts. The Talking with Talis iTunes feed has now been updated to include all of our podcasts. If you don’t already have this free feed set up in your iTunes, click here to do it now.

Last but not least, you will have noticed the RSS feed widget at the top of this blog post.  This widget is freely available from SpringWidgets for you to add to your favourite environment such as Pageflakes, Facebook, Wordpress, iGoogle and many others including [after a small software download] your PC desktop.

I have set up these widgets for the following podcast feeds – to get one in your environment follow the link and then click the ‘Click Here to Get the Code!’ link.

SpringWidgets have loads more in their Widget Gallery that have been created by their community, and I must give credit to one of their number, Minerva, who created the first Panlibus podcasts feed.

RDFa, SearchMonkey, Drupal (and SIOC)

October 16th, 2008

It’s been an exciting week in terms of developments and announcements for the Semantic Web and search.

Firstly, Yahoo! SearchMonkey has published a list of recommended vocabularies for developers of SearchMonkey applications, including FOAF, SIOC, DC, vCard, vCalendar, hReview, GoodRelations, DBpedia and Freebase. SIOC is recommended for “blogs, discussion forums, Q&A sites”. See the video below for a nice overview of SearchMonkey.

Secondly, Drupal creator Dries Buytaert wrote a very interesting and encouraging post yesterday entitled “Drupal, the semantic web and search” in which he says:

“On a social networking site built with Drupal, [semantic technology] opens up the possibility to do all sorts of deep social searches - searching by types and levels of relationships while simultaneously filtering by other criteria. I was talking with David Peterson the other day about this, and if Drupal core supported FOAF and SIOC out of the box, you could search within your network of friends or colleagues. This would be a fundamentally new way to take advantage of your network or significantly increase the relevance of certain searches. I can has semweb in Drupal core?”
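The kind of “search within your network” query Dries describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Drupal does or would implement it: the people, posts and triples are invented, `foaf:knows` is treated as a one-directional link, posts are attached directly to people via `sioc:has_creator` for simplicity, and the `network` and `posts_by_network` functions are hypothetical helpers.

```python
from collections import deque

# Invented data: who knows whom (FOAF) and who created which post (SIOC).
TRIPLES = [
    ("alice", "foaf:knows", "bob"),
    ("bob", "foaf:knows", "carol"),
    ("carol", "foaf:knows", "dave"),
    ("post1", "sioc:has_creator", "bob"),
    ("post2", "sioc:has_creator", "carol"),
    ("post3", "sioc:has_creator", "dave"),
]


def network(person, triples, max_hops):
    """Breadth-first walk over foaf:knows links, up to max_hops away.

    Returns a dict mapping each reachable person to their distance in hops.
    """
    reached = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        if reached[current] == max_hops:
            continue
        for subject, predicate, obj in triples:
            if predicate == "foaf:knows" and subject == current and obj not in reached:
                reached[obj] = reached[current] + 1
                queue.append(obj)
    del reached[person]  # the searcher is not part of their own network
    return reached


def posts_by_network(person, triples, max_hops):
    """Return the posts created by anyone within max_hops of person."""
    people = network(person, triples, max_hops)
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "sioc:has_creator" and obj in people
    )


if __name__ == "__main__":
    print(network("alice", TRIPLES, 2))
    print(posts_by_network("alice", TRIPLES, 2))
```

The `max_hops` parameter stands in for the “levels of relationships” in the quote: widening it from friends to friends-of-friends changes which posts count as being inside your network.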

Thirdly, RDFa just became a W3C recommendation! Congratulations to all involved…

Oh, and finally, SIOC is now OWL-DL compliant. This change was motivated by the SWANSIOC initiative in the W3C HCLSIG (Semantic Web for Health Care and Life Sciences Interest Group), which uses the Science Collaboration Framework based on Drupal.

This Week’s Semantic Web, Burningbird style

October 15th, 2008

Sorry I’m a little tardy with this. Last time here I asked for volunteers to give their own take on TWSW. Shelley Powers stepped up to the plate, and the content of her post is below. In this context Shelley’s best known for writing the first book on RDF, over five years ago.

Brian Manley also took the bait, and he’s started publishing The Week In Linked Data - TWILD (more like the style of TWSW but with better descriptions and a pronounceable acronym).

This post will likely push a lot of things below the bottom of the page, so I’d better link to Paul’s recent podcasts to keep him sweet.
Over to Shelley:

I decided to add a slight twist to my own version of This Week’s Semantic Web, focusing not only on the stories, but how I found them. After all, the real purpose of the semantic web technologies is to make information easier to find. How are we, in the semantic web community, doing in this regard?

To start, I subscribe to various feeds including Planet RDF, as a way to keep up with most of the semantic web news. This week, the stories from Planet RDF that caught my eye were the following:

  • Tom Heath wrote How Will We Interact with the Web of Data for IEEE. In his article, Tom proposes that the web homepage, as we know it today, is dead. In its place we’ll have connected pieces of data, pulled together via RDF records (tuples), which are then used to generate the human readable content. So one could have weblog, browser, feeds, friend feeds, and other online “islands of data”, Flickr and other photos, videos on YouTube, etc.—all annotated with metadata and brought together, mechanistically, because of the metadata annotation. It’s interesting, and we already have some of this with various widget-enabled devices, but I’m not sure that most people are “geek enough” to make this a truly viable option. Not yet.
  • Bob DuCharme wrote a follow-up piece to his Learning more about SPARQL, related to forming SPARQL queries against DBPedia, the site dedicated to making Wikipedia information queriable. No, that’s not a word…but it should be. Bob’s example is important for two reasons. The first, and the most obvious, reason is that it, of course, demonstrates SPARQL against a published source—hopefully spurring on other efforts. More important, though, in my opinion, is that Bob is publishing his explorations, his learning experiences, not necessarily a finished, “Ta da!” work. We need more journals of discovery in the semantic web world.

I don’t only get my semantic web information from the Planet RDF feed. I find other entries on this topic, now and again, in other feeds. For instance, I wrote about two other items this week and I’ll repeat links to both because I feel they represent the semantic world “in the wild”.

  • A List Apart featured an article titled Understanding Progressive Enhancement, which discussed the concept of building one’s website from the inside out—focusing on the properly semantically annotated content, first, before tossing in the pretties. I think this article complements some of the discussion about minimal design that was such a popular topic a few months back. The article not only focuses our attention back on the content, and hence the real purpose for the web site, it also drives home that we need to start doing a better job, semantically speaking, with our use of page markup. Speaking of markup…
  • Tina Holmboe’s XHTML—myths and realities is both an important, and timely, look at XHTML, the importance of XHTML for the semantic world (RDFa), and the future of XHTML. It’s timely because it serves to remind us that we now have two divergent markup paths under the W3C leadership—paths that do not share a common model or focus, which seems to me to act counter to the ultimate goal of a truly semantic web.

In my quest for this week’s semantic web goodies, I also searched in Google on “Semantic Web” and then focused on News, not Web, in order to filter items down to recent events. With this approach, I found the following items to pass along:

  • Paul Miller at ZDNet writes Does the Semantic web matter? He believes it does, a view offered up simply and elegantly. What the semantic web isn’t, though, according to Paul, is a goose to be punched and pummeled by the elitist and the avaricious until forced to deliver up the golden egg. To wit:
    Continuing landgrabs by startups that seek to attract, trap and exploit eyeballs stand unashamedly on the shoulders of Semantic Web promise whilst running counter to its basic tenets of linking and openness. On the other hand, companies ‘just’ doing perfectly reasonable - and valuable - things with the meanings of words, phrases and documents latch on to the Semantic Web’s buzz, whilst being all about Semantics and not at all about the Web.

    New entrants, hopefully building viable and useful businesses upon the Semantic Web’s ideas, are pilloried by stalwarts of the ‘community,’ because the reality of their business model does not permit a whole-hearted embracing of the entire Semantic Web stack from Day One. Intellectual purity clashes with pragmatism and reality on a daily basis. Well-meaning guidelines and best practices morph in the minds of too many to become laws, ‘truths’, and rods with which to beat outsiders. Visions of Orwellian pigs fill my brain, and I don’t like what I see as they rise up onto two feet and gaze disdainfully around.

  • Speaking of punching geese, oh look, Ask.com is back. It’s got mad semantic skillz. So I put Ask.com to the test, and asked it “How can I learn more about SPARQL”, and it responded with, “Did you mean, ‘How can I learn more about sparkle’?”. I paused a moment, and said sure, show me that one. Ummm, Swarovski crystal jewelry. Pretty sparkles. To be fair, before following this sparkly tangent, Ask.com did return the first of Bob DuCharme’s posts, mentioned above. In fact, it returned exactly the same result list as Google and Yahoo, when I asked them the same question.
  • Though not exactly “this week”, ReadWriteWeb writes a mean semantic web post, now and again, and had one last week subtitled, “Show me the Money!”—and wasn’t that a great movie moment? I digress, though. The RWW post focuses on a new report by a Semantic web entrepreneur on semantic web companies making money, but just at the moment when I clicked through to read the report, I got distracted by the flock of migrating geese overhead. I must pursue the report at a later time. What I found interesting, though, was the ReadWriteWeb Semantic Web Log search and…ah geez, there goes another flock, circling overhead.

There were other sources I searched for information about the semantic web for this week, but the results were less than optimum. For instance, I searched on “semanticweb” in delicious, but the results show the items that were posted to delicious this week, not necessarily published this week. The problem is that while many services such as delicious have a way to tag items with terms like “semanticweb”, the metadata annotation is limited: it doesn’t include information such as when the posted item was first published, nor does it allow you to search on the same. Most of the “semantics” are flat, simple, and two-dimensional, i.e. keyword-value pairs.

I next went in the opposite direction, looking for just published items, and then sought to filter on the semantic web. For instance, no other source is better for up-to-date discovery of minutiae than Twitter. However, as far as I can see, there is no way to search on a specific topic in Twitter. You can look for people, but other subject material search is extremely limited. If you don’t know that Twitter user Kingsley Idehen exists, and posts frequently on semantic web related items, you may not discover a graph of linked data sources or an animation related to RDF as middleware.

I then turned to the Big Cheese, the Head Semantic Web honcho, Twine, and the twine related to the Semantic Web. Eureka! I finded the Semantic Web! Of course, on closer look, most of the items also could be found on Planet RDF. Still, meat that is both fresh, and relevant. I’ll just pick out a few for my version of This Week’s Semantic Web.

  • Seven OWL 2 Drafts Published at the W3C. OWL 2 is an extension of OWL, the Web Ontology Language. No, don’t try to fit the acronym. OWL is not necessarily directly important to thee and me. OWL is important, though, for designing systems that would understand exactly what I mean when I ask, “How can I learn more about SPARQL”, and that will return the definitive sources meeting my question, without being dependent on either language processing or obscure page ranking algorithms.
  • Speaking of SPARQL, another item in the twine was SPARQL Update, a submission to the W3C describing a way to use SPARQL to update graphs (semantically linked data stores). Interesting, considering that SPARQL means SPARQL Protocol and RDF Query Language. What works in one direction must work in all directions, eh? Reminds me a little of HTML5 and JSON—the Swiss Army knives of technology.

And so ends my tenure for This Week’s Semantic Web, Burningbird style. What I discovered in the process of building my list was that we’re not close to the semantic web we seek. Without knowing about the people, such as Bob, Kingsley, or Danny, or the topic-focused resources such as Planet RDF or Twine, I would have had a much more difficult time finding out what is happening, this week, in the semantic web. However, among the results I did find are new technologies, new specifications, new efforts that assure us that though the semantic web doesn’t exist today, it surely will someday.

Surely. Someday.

[Image: flock of snow geese]

Has the Semantic Web Industry become a reality yet?

October 10th, 2008

Well, no. Or maybe not quite. But an innocent reader might have gathered this from the title of David Provost’s recent publication which promisingly read “On the Cusp. Global Review of the Semantic Web Industry.”

Provost’s review is a nice and readable attempt at evangelizing semantic technologies and their adoption by the industry. It seeks to spread the news outside of the echo chambers and avoids any community jargon and cryptic acronyms irrelevant to strategic decision makers. He really deserves great credit here.

But in the end Provost’s description of a “Semantic Web Industry” is reductionist. By analysing only the commercial availability of technology provided by vendors, the bigger picture of the industry gets blurred. He misses the point when it comes to analysing the actual demand for semantic applications, even though such demand could easily be identified, e.g. enabling cost-efficient interoperability and reusability of data. So Provost gets stuck in a supply-driven view of the semantic web industry. And as we have learned from history, supply-driven markets, and technology markets in particular, are extremely vulnerable. Hence concluding that the Semantic Web Industry is on the cusp might seem a little “misworded”.

What might be a nice addition for a follow-up study is to look at the commercialization strategies of semantic web technologies and their capitalization logic as a network good. Further, it might be worth looking at the value chain of a semantic industry in which vendors play just one (but nonetheless important) role, and at the regulatory aspects involved in rolling out semantic web based business models, e.g. concept advertising. Here you might very soon be confronted with antitrust and competition issues, taking into account recent decisions of the German High Court about the bidding on key words in Google AdSense.

Imagine what it would mean if you could bid on concepts and secure them for your private purposes?

(Cordial thanks to Jana Herwig who edited this post for me!)

Danny Ayers: “The Semantic Web is the path of least resistance”

October 2nd, 2008

The Web of Data Practitioners Days are approaching - giving me the opportunity to do an advance interview with Danny Ayers, Semantic Web evangelist, Community Platform manager at Talis, Web of Things everything (I think). I’d just like to extract two or three points here - you can read the whole interview on our website. First something that’s noteworthy to me as it says something about the patterns of technological evolution in general:

Looking back a few years, I don’t think many people working on the Web could have predicted the remarkable rise of blogging, the revival of DHTML and ancient Internet Explorer tricks such as Ajax, online social networks, Wikis, the whole Web 2.0 thing. It’s worth noting that these developments have been consistent with Tim Berners-Lee’s vision of the Web as a system in which people are the key component.

Shifting to the Semantic Web perspective, for a long time I have believed this approach is on track simply because it offers improvements to the Web for which there are no obvious alternative techniques. Personally, I was relatively late to realise what those improvements really were - moving from a Web of Documents to a more general Web of Data. Expressed like that, and looking at existing Web architecture, the Semantic Web is the path of least resistance.

Remember? AJAX, when it cropped up and caused a big buzz in 2005, was nothing new, it was just a new term for an old thing, i.e. the Internet Explorer tricks Danny mentions (see also A Brief History of AJAX: “Browser asynchronous hacks have been possible since 1996, when Internet Explorer introduced the IFRAME tag, passing through a number of techniques such as pixel gifs, Netscape layers, Microsoft Remote Scripting, Java/JavaScript gateways, stylesheet hacks, image/cookies, and most recently the XMLHttpRequest.”)

Sometimes it takes a while until someone (society, industry, what have you) starts to notice that this or that, something, could actually be useful. Sometimes technologies that everybody thinks are silly become a huge success - think text messages!

And sometimes you have a great (piece of) technology and it just never really catches on, and if that is the case, then mostly because some forces in the market (trusts, monopolies, corporations who force you to use their software/technology and at a ridiculous price, people who would do anything they can to undo the natural laws of the digital world) won’t let it happen. What happened to Video 2000 and Betamax? Nixed by JVC’s licensing strategies for VHS. Just wanted to make this point before moving on to the next quote. Danny:

Regarding possible obstacles, there are many ways the Web could suffer, probably most dangerous being interventions from national governments or commercial interests, tilting the table on which we build these systems - such as software patents and threats to net neutrality. The Web works because it’s more or less the same to everyone, everywhere.

So if you think that the Web should continue to be the same to everyone, everywhere, if you would like to liaise with other people interested in the SemWeb and the Web of Data, but most importantly, if you do not know a whole lot about the SemWeb yet but would like to learn more, then please do come and attend the Web of Data Practitioners Days in Vienna, Oct 22-23.

It is going to start with a “Web of Data 101”, i.e. a low-threshold introduction to Semantic Technology in the context of the Web, given by Keith Alexander (Talis, UK) and Yves Raimond (Queen Mary University of London, UK). Here is the full program - please note that there is also a registration deadline (6 Oct 2008!).

Intro to the Semantic Web

September 21st, 2008

A short introduction to the semantic web. All source material is on the Digital Bazaar wiki. Feel free to translate it on its Dotsub page.

My ants won’t join your storm, I’ve already set them free

September 15th, 2008

So, Antstorm. After Appscout’s report that they had “never seen a service that brings social bookmarking and semantic search together the way AntStorm does,” and as I have this little project of listing all available semantic search engines, I thought I might as well check it out (yes, the list is in perpetual need of an update - those cementic developers are nearly too fast to keep up with).

Actually, Antstorm has very little to do with the semantic web, and a lot with the social web - but expect no folksonomies. First thing to do for you at Antstorm: They’re asking you to import your bookmarks; ideally, you would already have them in neatly arranged folders, labeled appropriately, and then Antstorm would convert these folders into what they call “trails”, which other users can follow. You can keep trails private, of course, but it doesn’t seem as if you can also keep selected bookmarks within these trails private. Hmpf.

And hey, wait: Is there anybody in the age of del.icio.us who still keeps her bookmarks on a computer? I don’t, except the ones that I need half a dozen or more times a day, e.g. the login to the corporate CMS or webmail, and these are not the links that anybody outside of my work context could benefit from. Importing bookmarks from del.icio.us is, however, not part of the AntStorm package - what you can do is to automatically add new links to del.icio.us as well by checking a box “Add to delicious” - but as you cannot add tags to a link on AntStorm, I wonder of what use an untagged bookmark could be on del.icio.us?

Things might get a little more interesting if you decide to add links to a group as well: A group on AntStorm is a community of editors who collaboratively manage trails related to the interests of their group. Any group member can suggest new links - the group decides by voting for or against it whether these will be added or not. Collaborative filtering, alright - I wonder, however, how many users you’d have to have in a group a.k.a. microniche to receive results that matter.

I failed to find out what the appeal of AntStorm could be - as all my bookmarks are either on del.icio.us, Bibsonomy (imported from del.icio.us) or CiteUlike (for all things academic), I don’t have any browser bookmarks left to get me started on AntStorm. AntStorm’s sales copy - “Have you ever needed a bookmark and realized it was on some other computer? Or have you ever wanted to save a bookmark, but you weren’t on your primary computer?” - would have convinced me in 2004, but I’ve already unleashed all my bookmarks. What they call trails looks all too suspiciously like yet another, difficult to manage folder structure to me. Of course I am biased, but I just don’t see how a collaborative link suggestion tool could work without tagging - or maybe I just didn’t find it?

Anybody with a few stationary bookmarks left - please set them free on AntStorm, maybe you’ll find out what they’re really good for. I clicked around a bit and skim-viewed their How-to-Video (9 min 18 sec!).

They promise that a share of the earnings generated by users will go to charity, and that’s always a good thing. Also, their logo is cute (even if not web 2.0 shiny) and I quite like the idea of a storm of ants.

Uncategorized

The Wild vs The Orderly: Folksonomies and Semantics (TRIPLE-I 2008)

September 4th, 2008

This second day of TRIPLE-I 2008 was my personal folksonomy day, even though the theme was already set yesterday, with Andreas Hotho’s invited talk about “Extracting Semantics from Folksonomies” which was the opening lecture of the workshop “Knowledge acquisition from the Social Web.”

Andreas Hotho is directing the Bibsonomy project at Kassel University’s Knowledge and Data Engineering research group; Bibsonomy is a social bookmark and publication sharing system catering especially for researchers who, next to bookmarking, also wish to manage publications. Next to other interesting things, Bibsonomy supports the import of bookmarks from del.icio.us, Firefox bookmarks and local BibTeX files. Being a project led by a university’s computer science department, Bibsonomy is at the same time the result, the object and a stimulus for research in the area of tagging and folksonomies. Andreas describes this double appeal of folksonomies to both ordinary people and researchers in a 12 seconds vlog post:


Andreas Hotho’s statement about folksonomies and research (see www.bibsonomy.org) on 12seconds.tv

One of the outcomes of the research into folksonomies is FolkRank, a search algorithm that exploits the structure of folksonomies; the name reveals that it was inspired by PageRank, but as the graph of folksonomy structures does not correspond to the web graph, some adaptations had to be made. The specifics of these adaptations can be found in an online article by Andreas and his colleagues: “FolkRank: A Ranking Algorithm for Folksonomies” (PDF, 268 KB).

Andreas Hotho’s talk more specifically addressed the search for methods to identify tags which describe the same concept (or a more specific / a more general concept respectively) within a folksonomy. He suggested two approaches:

  1. Applying measures directly to folksonomy statistics, which allows each tag to be described as a vector; e.g. co-occurrence frequency and FolkRank could serve as similarity measures (with these two having a tendency towards high-frequency tags), or a cosine measure (which is more likely to produce “siblings”)
  2. Looking up tags in an external thesaurus/vocabulary (for instance achieving semantic grounding by mapping a tag and its most similar tags to WordNet synsets)
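To make the first approach a little more concrete, here is a minimal sketch in Python. It is not the code used by the Bibsonomy team: the toy posts, the function names and the choice of raw co-occurrence counts as vector components are all assumptions made for illustration. It describes each tag by the tags it co-occurs with and then compares two tags by the cosine of their vectors.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each set holds the tags assigned to one bookmark.
posts = [
    {"python", "programming", "tutorial"},
    {"python", "programming"},
    {"java", "programming", "tutorial"},
    {"java", "programming"},
    {"semanticweb", "rdf"},
]

def cooccurrence_vectors(posts):
    """Describe every tag as a vector of co-occurrence counts with other tags."""
    vectors = defaultdict(Counter)
    for tags in posts:
        for a, b in combinations(sorted(tags), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = cooccurrence_vectors(posts)

# Raw co-occurrence: "python" is closest to the high-frequency tag "programming"
# and never appears together with "java".
print(vectors["python"]["programming"], vectors["python"]["java"])

# Cosine: "python" and "java" share the same context, so they come out as siblings.
print(round(cosine(vectors["python"], vectors["java"]), 3))
print(round(cosine(vectors["python"], vectors["programming"]), 3))
```

In this toy data "python" and "java" never occur on the same bookmark, yet their context vectors are identical, which is the “sibling” effect mentioned in the talk; plain co-occurrence instead points to the frequent tag "programming".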

Future areas of interest within folksonomy research Andreas proposed were trend detection, tag recommendation, detecting spam (a major challenge!), logsonomies (i.e. the structure of search engine query log files) and learning synsets, hierarchies, and structures of folksonomies. Andreas Hotho can be contacted via his homepage, if you have any further questions regarding Bibsonomy, FolkRank or this present piece of research.

Another presentation dedicated to folksonomies - and the presentation that won my personal presentation design award - was “Seeding, Weeding, Fertilizing - Different Tag Gardening Activities for Folksonomy Maintenance and Enrichment” by Katrin Weller and Isabella Peters, both from the Dept. of Information Science at Heinrich Heine University in Düsseldorf. The entire presentation was designed to match the CI of Tagcare, a tag gardening tool that is hopefully going to go online soon.

The term “Tag Gardening” was borrowed from James Governor who wrote in a 2006 blogpost:

“Like plants or animals, tags evolve in an emergent fashion, open to hybridisation. Stewardship can help grow and put roots down.

Helping the darwinian process is tag gardening.

Tag gardening is about taking tags in the wild and tending to them, or identifying a wild tag that will do well in your south facing IT garden. I am talking about domestication here.

Just like there are professional bloggers i am pretty sure some parties will emerge that get paid for their abilities.”

I seriously hope that the latter is going to come true, even though I have the feeling that most providers will continue to consider user input and effort pro bono work!

Katrin Weller’s intro (Isabella Peters had excused herself) focused on the well-known problems with tags and folksonomies, e.g.:

  • spelling variants, synonyms, abbreviations, different natural languages
  • adhoc or personal functions of tags other than content description (e.g. “toread”, “@Henry”, “nicepic”)
  • flatness of tag clouds which allows for browsing by popularity, but not by semantic interrelations

She further distinguished three levels where tag or tag cloud improvement becomes relevant:

  • single document vs document collection level
  • single user vs collaborative level
  • intra- and cross-platform level (e.g. different tagging conventions, tag separation with comma or blank space, etc)

To push the gardening metaphor even further, Katrin presented their ideas of weeding, seeding, fertilizing etc.:

Weeding
The weeds in this case are “bad” tags like spam or misspelled tags (weed: any plant that crowds out cultivated plants)
Aim: enhancing recall and a consistent indexing vocabulary
Achieved by: type-ahead functionality, editing functionalities, natural language processing, user guidelines for indexing and retrieval, nomination of authorized users as gardeners

Seeding
Seeding in folksonomies means to expand frequently used tags by more specific tags (called “baby tags” or “seedlings” by Katrin Weller; seedling: young plant or tree grown from a seed)

Landscaping
The idea of landscaping here means to create “flower beds” through identifying species of tags, e.g. by similarity.
Aim: enhancing precision and expressiveness

Fertilizing
Fertilizing in this context means to combine folksonomies with other knowledge organization systems (KOS): thesauri, controlled vocabularies, ontologies, etc. (fertilizer: any substance such as manure or a mixture of nitrates used to make soil more fertile). Fertilizing might work both ways, Katrin suggested: a folksonomy might be fertilized with the semantic structure of a KOS, or a KOS enhanced by terms from a folksonomy.
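As an illustration of what automated weeding could look like, here is a small Python sketch. It is not TagCare’s method (the presentation did not go into implementation details); the tag counts, the thresholds and the function name `weed` are assumptions. It lower-cases tags and then folds rare tags into a well-established tag when the two are nearly identical in spelling, using `difflib.get_close_matches` from the standard library.

```python
from collections import Counter
from difflib import get_close_matches

def weed(tag_counts, min_count=3, cutoff=0.85):
    """Merge case variants and likely misspellings into established tags.

    min_count and cutoff are hypothetical thresholds chosen for this example.
    """
    # Step 1: normalise case and surrounding whitespace.
    normalised = Counter()
    for tag, n in tag_counts.items():
        normalised[tag.strip().lower()] += n

    # Step 2: tags used often enough count as "cultivated plants".
    established = [t for t, n in normalised.items() if n >= min_count]

    # Step 3: fold rare tags into a very similar established tag, if any.
    cleaned = Counter()
    for tag, n in normalised.items():
        if n >= min_count:
            cleaned[tag] += n
            continue
        match = get_close_matches(tag, established, n=1, cutoff=cutoff)
        cleaned[match[0] if match else tag] += n
    return cleaned

# Hypothetical tag cloud with a case variant, a misspelling and a personal tag.
raw = Counter({"semanticweb": 10, "SemanticWeb": 2, "semantcweb": 1,
               "folksonomy": 5, "toread": 1})
cleaned = weed(raw)
print(cleaned)
```

Personal tags such as “toread” survive this step because they are not close to any established tag; deciding what to do with them is a job for the human gardeners mentioned above.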

And finally TagCare: The ambitious plan is to have a system that allows users to import tag clouds from Flickr, del.icio.us and Bibsonomy, cleanse out dissimilarities between tags, add hierarchical structure to the tag clouds, view tag statistics and probably also use community features, such as calibrating one’s tags with those of the chief gardener or activating collaborative spam elimination. It is going to be a free service, and if you want to be notified when it goes live, you might want to send an email to Katrin.

This full-service proposal for tag gardening does of course sound brilliant - yet is it going to be feasible, on a technical level? In the post-presentation discussion, somebody mentioned Faviki, which relies on DBpedia concepts to solidify the tag cloud. It didn’t exactly seem as though the TagCare team had already thought along these (semantic web) lines, even though this perfectly corresponded to their ‘Fertilizing’ idea. But if TagCare solely relies on good human gardeners, how long will it take until they have gained a big enough community to stimulate someone’s altruism? The idea of tag gardening of course is beautiful, and I am curious to learn more about the technology it is going to use.

Other folksonomy and tag related presentations that I was unable to attend or am unable to describe now, after the 10th hour of my 2nd day at TRIPLE-I, with a band performing folklore music involving yodeling and probably Schuhplattler right outside of this room:

  • Quality Metrics for Tags of Broad Folksonomies (Celine Van Damme, Martin Hepp, Tanguy Coenen, University of Brussels, Universität der Bundeswehr München)
  • Providing Multi Source Tag Recommendations in a Social Resource Sharing Platform (Martin Memmel, Michael Kockler, Rafael Schirru, German Research Centre for Artificial Intelligence DFKI)
  • Semantic Tagging and Inference in Online Communities (Yildirim Ahmet, Üsküdarli Suzan, Boğaziçi University)
  • Using Visual Features to Improve Tag Suggestions in Image Sharing Sites (Mathias Lux, Oge Marques, Arthur Pitman, Klagenfurt University)
  • Harnessing Wikipedia for Smart Tags Clustering (Maria Grineva, Maxim Grinev, Denis Turdakov, Pavel Velikhov, Russian Academy of Sciences)

Please leave a comment if you think that any of the above needs correction.

EDIT: I got the chance to record another 12 seconds definition (and am thinking of setting up a video glossary for the Semantic Web now): Rolf Sint from Salzburg Research explains what folksonomies are and why folksonomies and ontologies go together well in 12 seconds!


Rolf Sint explains folksonomies and their relation to ontologies on 12seconds.tv

Uncategorized

The 5 Types of Semantic Applications: Part 1 - Semantic Search

June 3rd, 2008

Last week I spoke at the TechCocktail Conference in Chicago on Understanding the Semantic Web. It was a successful event and my session was well received, accomplishing the goal of communicating a general understanding in a simple way.

Over the coming days I’ll summarize the elements from one section of the presentation: The 5 Types of Semantic Applications.

The semantic web is an emerging market that’s the focus of great interest, hype, and confusion. In an attempt to clarify what’s happening in the space it’s helpful to examine the five basic application types that exist and how companies are entering the market using either the top-down or bottom-up approach to semantics.

Part 1 - Semantic Search: (more…)

English

Semantic Tech Conference Round Up

May 30th, 2008

Last week I was at the Semantic tech conference in San Jose. It was an exciting event that exceeded my expectations in many ways. First of all, the conference had a really great vibe. People from different parts of the planet converged to talk about their work and passion - Semantic Web.

From the conversations during lunch to the keynotes there was a fluid exchange of intelligent ideas; people were genuinely interested in the space and focused on understanding how semantic technologies benefit us today and where they are headed. There was a consensus that many technologies are nearly ready or ready for prime time and that 2008 is the first year when the semantic web is coming out of stealth mode.

To get a flavor of the conversations and topics covered during this conference, I suggest that you review the 4 posts that I’ve written on ReadWriteWeb:

In addition to these 4 posts, I’ve also written a post on Semantic Search. I highly recommend this post to you as well; it is the result of a lot of struggle to crystallize in my mind what is going on in that space.

Finally, as with any great conference, it was a pleasure to meet up with people that you work with remotely. I had the great pleasure of talking to Paul Miller, Tom Tague and Greg Boutin from Semantic Web Gang. We’ve done several podcasts together and it was great to see the people behind the voices and avatars :)

I also had an opportunity to speak on the Rising Stars of Semantic Web panel along with Barney Pell, CEO/CTO of Powerset, Nova Spivack, CEO of Radar Networks, Ian Davis, CTO of Talis, and Tom Gruber from a stealth company. Both during the panel and the press conference that followed, I kept thinking about the incredible amount of energy, brain power and enthusiasm that these folks bring to the space. In my book passion is the #1 recipe for success, so I was excited about the prospects of the space at large and what each of these individual companies is going to contribute.

For additional coverage of the conference, please see the excellent round-up by Daniela Barbosa.

English

Understanding Semantic Web: TechCocktail Session

May 27th, 2008

Early tomorrow I head to Chicago for the TechCocktail conference. The speaker list that Frank and Eric have amassed is impressive and the sessions are all stellar. Make sure to register if you haven’t already.

My session’s titled Understanding the Semantic Web and takes a top-down (;)) look at everything an entrepreneur needs to know about the space. For semantic ninjas that regularly read the BlueBlog it may rehash some familiar ground but for those not deeply involved in the space it will be a great 30 min discussion.

Here’s a short summary of what you can expect.

What is the semantic web: for a term that has a simple definition the phrase ‘semantic web’ is generally difficult to describe and confusing to understand. It shouldn’t be.

Two approaches to semantic web: the two popular approaches to realizing the potential of semantic web are: the bottom-up approach, which annotates information to describe concepts, terms, and relationships; and the top-down approach, which leverages existing information to infer meaning.

Applications: the real value of the semantic web is not in the technology that creates it but in the applications that the technology enables. From semantic search to contextual browsing we’ll explore what’s being done today and where applications are going.

Making a business of semantic applications: the semantic web is an emerging market but what about the business opportunities?

As I said, it should be a great conversation and my goal is to stimulate discussion and debate that carries on throughout the event and into the evening. See you soon Chicago.

English

Semantic web for the working ontologist

May 21st, 2008