Archive

Posts Tagged ‘Java’

Schedule Tuesday, 28.10

October 27th, 2008

Tuesday Oct 28

8.45 Opening Ceremony (Johannes Brahms hall)
9.00-10.00 Keynote 1: Multimedia Semantic Web (Johannes Brahms hall)
Ramesh Jain, UCI
10.00-10.30 Coffee Break
10.30-12.30 Research 1: Ontology Engineering (Johann Peter Hebel hall)
Chair: Enrico Motta

  • Involving Domain Experts in Authoring OWL Ontologies
    Vania Dimitrova, Ronald Denaux, Glen Hart, Catherine Dolbear, Ian Holt, and Anthony Cohn
  • Supporting Collaborative Ontology Development in Protégé
    Tania Tudorache, Natasha Noy, Samson Tu, and Mark Musen
  • Identifying Potentially Important Concepts and Relations in an Ontology
    Gang Wu, Juanzi Li, Ling Feng, and Kehong Wang
  • RoundTrip Ontology Authoring
    Brian Davis, Ahmad Iqbal, Adam Funk, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham, and Siegfried Handschuh
10.30-12.30 Research 2: Data Management (Johannes Brahms hall)
Chair: Kunal Verma

  • NSPARQL: A Navigational Language for RDF
    Jorge Pérez, Marcelo Arenas, and Claudio Gutierrez
  • An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario
    Michael Schmidt, Thomas Hornung, Norbert Kuechlin, Georg Lausen, and Christoph Pinkel
  • Anytime Query Answering in RDF through Evolutionary Algorithms
    Eyal Oren, Christophe Gueret, and Stefan Schlobach
  • The Expressive Power of SPARQL
    Renzo Angles and Claudio Gutierrez
10.30-12.30 In Use: Services and Infrastructure (Alfred Mombert hall)
Chair: Massimo Paolucci

  • A Process Catalog for Workflow Generation
    Michael Wolverton, David Martin, Ian Harrison, and Jerome Thomere
  • Inference Web in Action: Lightweight Use of the Proof Markup Language
    Paulo Pinheiro da Silva, Deborah McGuinness, Nicholas Del Rio, and Li Ding
  • Supporting Ontology-based Dynamic Property and Classification in WebSphere Metadata Server
    Shengping Liu, Yang Yang, Guo Tong Xie, Chen Wang, Feng Cao, Cassio Santos, Robert Schloss, Yue Pan, Kevin Shank, and John Colgrave
  • Towards a Multimedia Content Marketplace Implementation Based on Triplespaces
    David de Francisco Marcos, Lyndon Nixon, and Germán Toro del Valle
12.30-14.00 Lunch break
14.00-15.30 Research 1: Software and Service Engineering (Johann Peter Hebel hall)

  • Integrating Object-Oriented and Ontological Representations: A Case Study in Java and OWL
    Colin Puleston, Bijan Parsia, James Cunningham, and Alan Rector
  • Extracting Semantic Constraint from Description Text for Semantic Web Service Discovery
    Dengping Wei, Ting Wang, Yaodong Chen, and Ji Wang
  • Enhancing Semantic Web Services with Inheritance
    Simon Ferndriger, Abraham Bernstein, Jin Song Dong, Yuzhang Feng, Yuan-Fang Li, and Jane Hunter
14.00-15.30 Research 2: Panel “An OWL 2 Far?” (Johannes Brahms hall)
moderated by Peter F. Patel-Schneider
Panelists: Stefan Decker, Michel Dumontier, Tim Finin, Ian Horrocks
The definition of OWL, the ontology language underlying the Semantic Web, is based on formal representation methods. This provides benefits, in that tools have a firm definition of what they are supposed to do, but can have problems, due to difficulty or expense of building tools or mismatch with needs. The panel will discuss whether the general idea of designing standard Semantic Web languages with steadily increasing power (e.g., the progression from RDF to RDFS to OWL to OWL 2 to …) all based on formal methods is the right way to support the Semantic Web. What level of expressive power does the Semantic Web need? How should standard Semantic Web languages be designed? Does the Semantic Web even need formality?
14.00-15.30 In Use: Business Applications (Alfred Mombert hall)

  • Requirements Analysis Tool: A Tool for Automatically Analyzing Software Requirements Documents
    Kunal Verma and Alex Kass
  • OntoNaviERP: Ontology-supported Navigation in ERP Software Documentation
    Martin Hepp and Andreas Wechselberger
  • Market Blended Insight: modeling propensity to buy with the Semantic Web
    Manuel Salvadores, Landong Zuo, SM Hazzaz Imtiaz, John Darlington, Nicholas Gibbins, and Nigel Shadbolt
15.30-16.00 Coffee break
16.00-17.30 Research 1: Non-standard Reasoning with Ontologies (Johann Peter Hebel hall)
Chair: Paolo Bouquet

  • Using Semantic Distances for Reasoning with Inconsistent Ontologies
    Zhisheng Huang and Frank van Harmelen
  • Statistical Learning for Inductive Query Answering on OWL Ontologies
    Nicola Fanizzi, Claudia d’Amato and Floriana Esposito
  • Optimization and Evaluation of Reasoning in Probabilistic Description Logic: Towards a Systematic Approach
    Pavel Klinov and Bijan Parsia
16.00-17.30 Research 2: Semantic Retrieval (Johannes Brahms hall)
Chair: Dunja Mladenic

  • CANCELLED (no event 16:00-16:30): Modeling Documents by Combining Semantic Concepts with Unsupervised Statistical Learning
    Chaitanya Chemudugunta, America Holloway, Padhraic Smyth and Mark Steyvers
  • Comparing ontology distances: preliminary results
    Jérôme David and Jérôme Euzenat
  • Folksonomy-based collabulary learning
    Leandro Balby Marinho, Krisztian Buza and Lars Schmidt-Thieme
16.00-17.30 In Use: Applications from Home to Space (Alfred Mombert hall)
Chair: Deepali Khushraj

  • DogOnt – Ontology Modeling for Intelligent Domotic Environments
    Dario Bonino and Fulvio Corno
  • Introducing IYOUIT
    Sebastian Boehm, Johan Koolwaaij, Marko Luther, Betrand Souville, Matthias Wagner, and Martin Wibbels
  • A Semantic Data Grid for Satellite Mission Quality Analysis
    Reuben Wright, Manuel Sánchez-Gestido, Asunción Gómez-Pérez, Maria S. Perez, Rafael González Cabero, and Oscar Corcho
18.30-20.30 Poster Session and Welcome Reception (Foyer)


Pellet 2.0 RC1 Release Announcement

October 27th, 2008

As we announced earlier today on the Pellet mailing list, Pellet 2.0 RC1 is now available for download. It’s been a while since the last public release of Pellet, and we have been working hard on this release and are all very happy about it. The list of changes is extensive and, in our biased view, exciting, both for Pellet and the ongoing viability of OWL.

Dual Licensing

Starting with this release, the licensing terms for Pellet are changing. Pellet will now be available under dual licensing terms. If you’re using Pellet in open source projects, it’s available under AGPL Version 3 terms. However, if the AGPL Version 3.0 license is incompatible with your use of Pellet, alternative license terms are available from Clark & Parsia LLC. You can find more details about licensing here.

Technical Changes

There are various improvements and additions in this new release. We worked on performance improvements over several months, achieving speed ups for the reasoning process and reducing memory consumption too. Several new features have also been added:

OWL2 support. The OWL 1.1 support in Pellet 1.5.2 has now been updated to support OWL2 as described in the latest W3C Working Draft. Reasoning support for the new OWL2 constructs has been improved and has been more robustly tested. There is also a specialized classifier optimized for the OWL2-EL profile. The OWL2-EL classifier provides both better speed and improved memory usage for the classification task when compared to the default classifier. In testing we used it to classify ontologies with half a billion classes (i.e., around one billion RDF triples) on a commodity laptop.

Incremental classification. Pellet 2.0 also provides a new classifier implementation that can update classification results upon ontology changes. The incremental classifier uses the Pellet classifier to compute the initial class hierarchy, but when the ontology is updated (through addition or removal of axioms) only relevant parts of the hierarchy are recomputed. Relevant parts of the class hierarchy are found using the ontology modules automatically extracted by Pellet. For some use cases, this scheme can provide significant performance improvements, including more or less real-time classification of NCI Thesaurus.

Ontology modularity. There is now a stand-alone ontology module extractor that can be used to extract a subset of an ontology relevant for a given set of terms. The locality-based modularity concept is used to ensure the logical completeness of the extracted module. The module extracted from a large ontology will typically be much smaller, making it easier to understand (for humans) and process (for tools).

SPARQL-DL query engine. A new query engine that can answer SPARQL-DL queries debuts in this release. This query engine can answer mixed ABox/TBox queries and supports some special query predicates, e.g. a special predicate to retrieve direct subclasses rather than all subclasses. The SPARQL-DL engine is also tightly integrated with Jena ARQ such that ARQ handles the SPARQL algebra for complex constructs like OPTIONAL, UNION, FILTER, while Pellet’s SPARQL-DL engine answers the Basic Graph Patterns (BGP).
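To make the difference between direct and all subclasses concrete, here is a minimal sketch in plain Python. It does not use Pellet or SPARQL-DL at all; the toy hierarchy and the function names `all_subclasses` and `direct_subclasses` are assumptions for illustration, and a real reasoner would work on the inferred hierarchy rather than on asserted parent links.

```python
# Toy class hierarchy: child -> set of asserted parents (hypothetical data).
PARENTS = {
    "Dog": {"Mammal"},
    "Cat": {"Mammal"},
    "Mammal": {"Animal"},
    "Animal": set(),
}


def all_subclasses(cls, parents=PARENTS):
    """Every class that reaches `cls` by following parent links."""
    result = set()
    changed = True
    while changed:
        changed = False
        for child, direct_parents in parents.items():
            reaches = cls in direct_parents or bool(direct_parents & result)
            if child not in result and reaches:
                result.add(child)
                changed = True
    return result


def direct_subclasses(cls, parents=PARENTS):
    """Subclasses of `cls` that are not below another subclass of `cls`."""
    subs = all_subclasses(cls, parents)
    return {
        c for c in subs
        if not any(c in all_subclasses(d, parents) for d in subs)
    }
```

For the class `Animal`, the first function returns the whole subtree, while the second keeps only the classes immediately below it.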

SWRL rule support. SWRL rule support in Pellet has been improved too. Some known performance issues were fixed and SWRL coverage was significantly extended. Pellet now supports all the built-in functions from the SWRL specification with the exception of rdf:List related functions.

Command-line Interface. Last but not least, the Pellet command-line interface (CLI) has gotten a complete makeover. We redesigned the CLI on the model of Subversion’s CLI. The Pellet CLI provides various subcommands that can be used for consistency checking, classification, realization, querying, entailment checking, inference extraction, explanation, and extraction of modules. The Pellint tool has also been integrated into the Pellet CLI.

One feature I like most about the new CLI is that it accepts multiple ontologies as input, so you don’t have to merge them manually (or create one ontology that imports them all). If you have multiple ontologies in one directory, you can even use wildcards (actually: full Java regular expressions) to load all of them at once.
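The practical difference between a shell wildcard and a regular expression is easy to trip over, so here is a small sketch of selecting ontology files by regular expression. It uses Python's `re` module rather than Java's, the file names are made up, and whether Pellet matches the whole name or only part of it is an assumption here (the sketch requires a full match).

```python
import re


def select_ontologies(file_names, pattern):
    """Return the file names that match the regular expression in full."""
    regex = re.compile(pattern)
    return [name for name in file_names if regex.fullmatch(name)]


# Hypothetical directory listing.
files = ["pizza.owl", "wine.owl", "notes.txt", "wine-backup.owl.bak"]

# The shell wildcard *.owl corresponds to the regular expression .*\.owl
selected = select_ontologies(files, r".*\.owl")
```

Note that a bare `*.owl` is not a valid regular expression, since `*` has nothing to repeat; the dot and the escaped dot are both needed.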

For a more complete description of changes in Pellet 2.0 RC1, see CHANGES.txt in the distribution; or check out the Trac tickets for this release milestone.

Keep in mind that this is a release candidate: it is close to a final 2.0 release, but you might encounter some minor issues while using it. If that happens, please let us know through the pellet-users mailing list and we’ll sort those issues out for the final release.


Multimedia in the Web of Data - Annotating and Interlinking Photos, Music, Multimedia [WOD-PD]

October 23rd, 2008

The Web of Data Practitioners Days concluded with the session on Multimedia in the Web of Data, the first part of which was led by Ansgar Scherp (University of Koblenz-Landau, Germany).

Multimedia content, as Ansgar pointed out, is hardly annotated, badly organized, and hardly ever looked at again - just think of the 300-something pics you might take on an average weekend getaway, and which you never touch again. Annotating multimedia content requires a lot of work and dedication - but most of the time, these pictures eventually disappear in the “digital shoe box” that is your photo management software.

The most obvious remedy is to annotate content as early as possible, ideally when creating the content, ideally already on your portable camera (formerly known as: mobile phone :) Ansgar suggested providing incentives to encourage picture annotation - professionals could, for instance, receive a higher financial reward if they deliver pictures that are already annotated. And of course there are ‘Games with a purpose’ such as Google Image Labeler, where players tag images in pairs, with and against each other, and are rewarded with the entertainment factor of the game.

The slide below shows what has happened (or will happen) to the process of creating photo books in the digital age and the age of mashups:

Ansgar Scherp's slides

After all, this is the age of the social semantic web, so why not try and (re-)use the content, structure and contexts that other users have already created on the web? Content augmentation, for the scope that Ansgar is concerned with, consists in the reuse of content and structures (e.g. from sources such as Flickr and Wikipedia, Geonames) made possible through the definition of rules, e.g.:

  • If there are two or fewer pictures on a page*
  • then automatically augment the page with additional photos using location information.

* Page here means a page in the album you are currently working on - you probably took a picture of yourself and your friend in Paris, and even though you went to the Centre Pompidou, you forgot to actually take a pic of the building itself - well, let the web be your library!
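The rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Ansgar's system is implemented: the `Page` class, the `min_photos` target and the `fake_lookup` function (a stand-in for a web source such as Flickr) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    location: str
    photos: list = field(default_factory=list)


def augment(page, find_photos_near, min_photos=3):
    """If the page has two or fewer photos, top it up with photos for its location."""
    if len(page.photos) <= 2:
        needed = min_photos - len(page.photos)
        page.photos.extend(find_photos_near(page.location)[:needed])
    return page


def fake_lookup(location):
    # Stand-in for a real web lookup; returns made-up file names.
    return [f"{location}-web-{i}.jpg" for i in range(1, 4)]


page = augment(Page("Paris", ["me-and-friend.jpg"]), fake_lookup)
```

A page that already holds three or more photos is returned unchanged.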

So the goal is clear: develop a procedure for applying automatic content augmentation in the creation of good photo books.

But what makes a ‘good’ photo book anyway? Here are some of the results of a structural analysis of real, human-created photobooks conducted at CeWe Color:

  • % of photos with faces: 36%
  • Number of album pages: 16.96
  • Photos per page: 6.69
  • Text fields per page: 1.45
  • % of pages with text: 87%

Many rules can be derived from the structural analysis and applied in turn to the creation of photobooks, e.g. rules like this one:

  • If the text is located in the upper third of a page
  • if the font size is equal to or larger than 16 points
  • if the number of words is less than 10
  • if there is no caption on the page that has a bigger font size
  • then this page is the title
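Written as code, the title rule might look like the sketch below. The `TextField` structure is hypothetical, and "located in the upper third" is interpreted here as the field's vertical position being less than one third of the page height, which is an assumption about what the rule means.

```python
from dataclasses import dataclass


@dataclass
class TextField:
    text: str
    font_size: float  # in points
    top: float        # vertical position: 0.0 = top of page, 1.0 = bottom


def is_title_page(fields):
    """True if some text field meets all four conditions of the rule."""
    if not fields:
        return False
    largest = max(f.font_size for f in fields)
    for f in fields:
        in_upper_third = f.top < 1 / 3
        big_enough = f.font_size >= 16
        short_enough = len(f.text.split()) < 10
        nothing_bigger = f.font_size >= largest
        if in_upper_third and big_enough and short_enough and nothing_bigger:
            return True
    return False
```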

Ansgar recommended xSmart, which he described as a “context-driven authoring tool for page-based multimedia presentations.”

Ansgar’s presentation was followed by two more: one by Yves Raimond on Interlinking Music on the Web of Data, and one on Interlinking Multimedia - in spite of better intentions, I did not manage to cover these two in detail, but at least I gathered the links to relevant resources from all three sessions…

Links for Ansgar Scherp’s session

Links for Yves Raimond’s session

Links for Michael Hausenblas’ session

  • InterlinkingMultimedia.info - a wiki dedicated to Interlinking multimedia (iM), “a light-weight bottom-up approach to interlink multimedia content on the Web of Data”.
  • Rammx - RDFa-deployed Multimedia Metadata
  • CaMiCatzee - multimedia interlinking concept demonstrator.

Last but not least: Ansgar Scherp allowed us a sneak peek of SemaPlorer, a Large-scale Semantic Faceted Browsing Application for Multimedia Data that is going to be revealed on Dec 2, 2008, at the BOEMIE (Bootstrapping Ontology Evolution with Multimedia Information Extraction) workshop in Koblenz. Here is an abstract:

Navigating large media repositories is a tedious task, because it requires frequent search for the `right’ keywords, as searching and browsing do not consider the semantics of multimedia data. To resolve this issue, we have developed the SemaPlorer application. SemaPlorer facilitates easy usage of Flickr data by allowing for faceted browsing taking into account semantic background knowledge harvested from sources such as DBpedia, GeoNames, WordNet and personal FOAF files. The inclusion of such background knowledge, however, puts a heavy load on the repository infrastructure that cannot be handled by off-the-shelf software. Therefore, we have developed SemaPlorer’s storage infrastructure based on Amazon’s Elastic Computing Cloud (EC2) and Simple Storage Service. We apply NetworkedGraphs as additional layer on top of EC2, performing as a large, federated data infrastructure for semantically heterogeneous data sources from within and outside of the cloud. Therefore, SemaPlorer is scalable with respect to the amount of distributed components working together as well as the number of triples managed overall.
Steffen Staab, Information Systems and Semantic Web (ISWeb), University of Koblenz-Landau, Germany

Thank you, thank you, thank you, it was a lovely event with an unusually high amount of processable input!



NISO/CLIR/RLG - Technical Metadata Elements for Images Workshop (18-19th April 1999)

October 15th, 2008

Maybe this is a record for delayed blogging. Nine and a half years late, here’s a writeup I found (on a corpsed hard-drive) of an image metadata workshop held by NISO, in Washington. I wrote it up for the JISC JIDI project at ILRT, who funded my trip. I’m sure they won’t mind it being shared here. Eric Miller and Paul Miller were at the workshop too; I remember working on the old Dublin Core RDF spec with them. NISO’s Workshop Report is also available from their site.

In April 1999, NISO organised a two-day invitational meeting whose aim was to gather requirements for technical metadata for digital images. The meeting addressed architectural issues and began the work of defining and categorising actual data elements for representing technical information about images. A report has recently been published on the NISO web service providing preliminary overviews of the agreed elements, including a comprehensive overview [report] of the two-day meeting. This report for the JIDI group does not attempt to reproduce this work, but instead reports on a number of issues relating to the aims of JISC-funded services such as JIDI and TASI.

The workshop wrestled with a number of problems familiar to many from the non-imaging metadata community. In particular, it reaffirmed the conclusions drawn at the end of the earlier Dublin Core and Images workshop, namely that the requirements (and challenges) of image-oriented metadata applications were in many respects close to those of the bibliographic and text-oriented areas. Where digital imaging applications had specific requirements, these were in many cases related to the high mobility of digital image objects, and the frequency with which these objects were subjected to transformations (eg. format conversion, editing).

A major concern which arose many times during the meeting was that of metadata loss through transformation. Many major software tools (eg. Photoshop) destroy or scramble embedded metadata. Conversion and editing of digital image objects tends, with currently available software, to damage embedded metadata. Given the high mobility of digital images, it is nevertheless appealing to explore models whereby metadata can travel with the objects from application to application. The meeting explored a number of frameworks whereby this could be achieved, including the use of a ‘container format’ which might encapsulate both an image and accompanying metadata. The Java ‘JAR’ archive approach, which combines multiple files plus a metadata ‘manifest’ into a single portable object, was raised as a possible approach. A variant of this model was also discussed which used XML/RDF manifest files within a .ZIP or .JAR container to bundle both image data and metadata into a single transportable object. Although there was some interest in these approaches, no consensus was reached on the appropriate way forward, nor on whether this was a specifically image-related challenge or a general problem for the industry which might benefit from a generalised solution.
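For readers who want to picture the container idea, here is a rough sketch using Python's standard `zipfile` module. It is not a format that was agreed at the workshop: the member names `image.tif` and `META-INF/manifest.rdf`, the manifest content and the function names are all invented for illustration.

```python
import io
import zipfile

# A tiny RDF/XML manifest using Dublin Core terms (illustrative content only).
MANIFEST = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="image.tif">
    <dc:title>Example scan</dc:title>
    <dc:format>image/tiff</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def bundle(image_bytes, manifest_xml):
    """Pack image data and its metadata manifest into one ZIP container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as container:
        container.writestr("image.tif", image_bytes)
        container.writestr("META-INF/manifest.rdf", manifest_xml)
    return buffer.getvalue()


def unbundle(container_bytes):
    """Read the image data and the manifest back out of the container."""
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as container:
        image = container.read("image.tif")
        manifest = container.read("META-INF/manifest.rdf").decode("utf-8")
    return image, manifest
```

The point of the design is that the metadata only survives if tools pass the whole container along, rather than unpacking the image and discarding the rest.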

There was some consensus that digital signatures would need to be deployed over both image metadata and the image data itself, for applications (eg. JIDI collections) which require some degree of quality assurance regarding the transformational processes that image collections have undergone. For digital signature technology to be applied to such content a canonicalisation algorithm is necessary. RDF and XML were raised as possible solutions in this area, although it was noted that the Signed XML initiative within W3C was not yet underway, and that the RDF working groups had deferred work on canonicalisation of RDF metadata until the issues had been more fully explored by the Signed XML group. Although digital signature technology can be applied to any content for which a text-based representation can be derived, the issue of canonicalisation remains. Broadly, this means that applications which use digital signatures to make trust decisions, and which transform images, will ultimately benefit from a higher level of abstraction concerning ‘what it is that has been signed’. Current technology allows applications to think of simple textual files (eg. containing metadata) as being verifiable using digital signatures over that content. Future work will allow applications to instead make trust decisions relating to the logical ‘assertions’ represented in the files. The implication here is that trust-based metadata applications are at an early stage, but that we can anticipate within perhaps 2 years there being greater infrastructural support for applications which reason usefully about embedded metadata, eg. concerning the provenance, intellectual property rights and transformational history of some image object.
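Why canonicalisation matters can be shown with a toy example. The sketch below is an assumption-laden simplification: a SHA-256 digest stands in for a real signature, metadata is modelled as plain (subject, predicate, object) tuples, and the `canonicalise` function only sorts and de-duplicates, whereas genuine RDF canonicalisation has to deal with much harder cases such as blank nodes.

```python
import hashlib


def canonicalise(triples):
    """Serialise a set of statements the same way regardless of input order."""
    lines = sorted("\t".join(triple) for triple in set(triples))
    return "\n".join(lines).encode("utf-8")


def digest(data):
    """Digest of the bytes; a real signature would be computed over this."""
    return hashlib.sha256(data).hexdigest()


# The same two statements, written down in a different order.
a = [("img1", "dc:creator", "Alice"), ("img1", "dc:format", "image/tiff")]
b = list(reversed(a))

naive_a = repr(a).encode("utf-8")
naive_b = repr(b).encode("utf-8")
```

The naive serialisations produce different digests even though `a` and `b` say the same thing, while the canonical forms agree. That is the sense in which a signature should cover the assertions rather than one particular file layout.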

Why dig this up? Partly because I think a lot of practical work fed into the RDF design and its adoption eg. in Dublin Core, and this history is poorly documented. Also to kick myself for making predictions (“anticipate within perhaps 2 years”, phooey :) and because I’m looking again at signed RDF, largely in a FOAF/SPARQL context. Also I’ve just added a ‘foaf4lib’ category to the blog, as an offering to the code4lib aggregator and to accompany the new foaf4lib mailing list we have in the FOAF project.


Akshay Java on Mining Social Media Communities and Content

October 14th, 2008

Akshay Java will defend his dissertation, Mining Social Media Communities and Content, at 10:30am this Thursday in ITE 325. Here’s the abstract.

Social Media is changing the way we find information, share knowledge and communicate with each other. The important factor contributing to the growth of these technologies is the ability to easily produce “user-generated content”. Blogs, Twitter, Wikipedia, Flickr and YouTube are just a few examples of Web 2.0 tools that are drastically changing the Internet landscape today. These platforms allow users to produce, annotate and share information with their social network. Their combined content accounts for nearly four to five times that of edited text being produced each day on the Web. Given the vast amount of user-generated content and easy access to the underlying social graph, we can now begin to understand the nature of online communication and collaboration in social applications. This thesis presents a systematic study of the social media landscape through the combined analysis of its special properties, structure and content.

First, we have developed techniques to effectively mine content from the blogosphere. The BlogVox opinion retrieval system is a large-scale blog indexing and content analysis engine. For a given query term, the system retrieves and ranks blog posts expressing sentiments (either positive or negative) towards the query terms. We evaluate the system on a large, standard corpus of blogs with available human-verified relevance assessments for opinions. Further, we have developed a framework to index and semantically analyze syndicated feeds from news websites. This system semantically analyzes news stories and builds a rich fact repository of knowledge extracted from real-time feeds.

Communities are an essential element of social media systems and detecting their structure and membership is critical in several real-world applications. Many algorithms for community detection are computationally expensive and generally do not scale well for large networks. In this work we present an approach that benefits from the scale-free distribution of node degrees to extract communities efficiently. Social media sites frequently allow users to provide additional meta-data about the shared resources, usually in the form of tags or folksonomies. We have developed a new community detection algorithm that can combine information from tags and the structural information obtained from the graphs to detect communities. We demonstrate how structure and content analysis in social media can benefit from the availability of rich meta-data and special properties.

Finally, we study social media systems from the user perspective. We present an analysis of how a large population of users subscribes to and organizes the blog feeds that they read. It has revealed several interesting properties and characteristics of the way we consume information. With this understanding, we describe how social data can be leveraged for collaborative filtering, feed recommendation and clustering. Recent years have seen a number of new social tools emerge. Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. We present our observations of the microblogging phenomenon and user intentions by studying the content, topological and geographical properties of such communities.

The course of this study spans an interesting period in the Web’s history. Social media is connecting people and building online communities by bridging the gap between content production and consumption. Through our research, we have highlighted how social media data can be leveraged to find sentiments, extract knowledge and identify communities. Ultimately, this helps us understand how we communicate and interact in online, social systems.


BarCamp Proposals: Factolex, Social Enhanced Search

October 6th, 2008

Hello Monday! I am a bit tired today as I did not really have a weekend but spent it in a rather intellectually stimulating fashion, attending BarCamp Vienna held on the premises of HP in the 12th district. My head is still buzzing from all the input!

Originally, the plan had been to have a marketing-themed BarCamp, but thanks to the bottom-up approach towards scheduling typical for BarCamps, that didn’t quite come to pass (greatly appreciated also that this wasn’t enforced by the organizers, thank you!). Two of the sessions I attended are relevant to the Social Semantic Web:

One was held by Alexander Kirk about the latest improvements in Factolex, a collaborative, micro-content encyclopedia based on facts; I hear that Factolex will receive further semantic enhancements in the near future, so I’ll write a longer blog post about it then. One feature Alex showed and which impressed me considerably was the distributed way in which one can add further facts to Factolex now: On any webpage, highlight a word or phrase (e.g. “President of the European commission”) and then click on the bookmarklet. Factolex is automatically going to check whether it knows the term already and either creates a new one or adds a fact to an existing term. The source will be added automatically - pretty nifty!
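The create-or-extend behaviour of the bookmarklet can be pictured with a few lines of Python. This is a guess at the logic from what was shown in the session, not Factolex's actual code: the in-memory `lexicon` dictionary, the case-insensitive term lookup and the `add_fact` function are all assumptions.

```python
def add_fact(lexicon, term, fact, source):
    """Add a fact to a term, creating the term first if it is unknown.

    Returns True if a new term was created, False if an existing one was extended.
    """
    key = term.strip().lower()
    created = key not in lexicon
    lexicon.setdefault(key, []).append({"fact": fact, "source": source})
    return created


lexicon = {}
first = add_fact(
    lexicon,
    "President of the European Commission",
    "Heads the executive body of the EU",
    "http://example.org/page-1",
)
second = add_fact(
    lexicon,
    "president of the european commission",
    "Is proposed by the European Council",
    "http://example.org/page-2",
)
```

Here `first` is true because the term is new, and `second` is false because the fact is attached to the existing term, with the source page recorded each time.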

Another project that does not yet have a name and that is currently in stealth mode was presented by Christian Zeidler: Social Enhanced Search on del.icio.us. The project addresses a well known del.icio.us problem: You can search your bookmarks, i.e. search the tags and possibly definitions you might have added - yet all too often this only leads to the problem that your search query does not match the tags you once assigned. Being able to search the full text of the saved page would improve the scenario considerably - and this is exactly the approach Christian’s project takes.

To begin with, he built his own search index using Lucene, an open source, full-featured text search engine library written in Java. Of course it doesn’t crawl the whole web - just the pages you have added to your del.icio.us account. Instead of building one index for every user, Christian decided to have one large search index which also takes away the troubles of double indexation - the current index, based on 800 pages, doesn’t exceed a size of 3MB, which seems rather reasonable.
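The idea of one shared index instead of one per user can be sketched without Lucene at all. The toy inverted index below is my own illustration, not Christian's implementation: the `SharedIndex` class and its method names are invented, and tokenisation is just lower-casing and splitting on whitespace.

```python
from collections import defaultdict


class SharedIndex:
    """One inverted index for all users; each page is indexed only once."""

    def __init__(self):
        self.postings = defaultdict(set)   # word -> URLs containing it
        self.bookmarks = defaultdict(set)  # user -> URLs the user bookmarked
        self.indexed = set()               # URLs already in the index

    def add_bookmark(self, user, url, text):
        """Record the bookmark; return True only if the page was newly indexed."""
        self.bookmarks[user].add(url)
        if url in self.indexed:
            return False
        for word in text.lower().split():
            self.postings[word].add(url)
        self.indexed.add(url)
        return True

    def search(self, user, query):
        """URLs containing every query word, restricted to the user's bookmarks."""
        words = query.lower().split()
        if not words:
            return set()
        hits = set.intersection(*(self.postings.get(w, set()) for w in words))
        return hits & self.bookmarks[user]
```

When two users bookmark the same page, the second call only adds the URL to that user's bookmark set, which is where the saving over per-user indexes comes from.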

Apart from your own bookmarks, the plan is to also allow searching the bookmarks of your friends on del.icio.us, giving your search perspective. How many friends do you have on Facebook, how many on del.icio.us? It’s about half a dozen on del.icio.us for me, so I guess that “friendship” here really stands for particular topics and interests - this social perspective thing might actually work for enhanced searches, I think.

What other means are there to weight and rank search results? Somebody raised the issue of customization, i.e. let the user define which weight he’d like to give the results of which friend. I completely agree with Christian when he said he doesn’t believe people want customization, as conscious, user-initiated customization efforts are often (considered) too high. Instead, the system must learn from the data, e.g. prefer the results of friends whose results you use the most often.
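Learning the weights from usage rather than asking the user could look roughly like this. It is only a sketch of the principle: the click log, the proportional weighting and the function names `learn_weights` and `rank` are assumptions, and a real system would surely combine this with text relevance.

```python
from collections import Counter


def learn_weights(click_log):
    """click_log: names of the friends whose bookmarks the user opened."""
    counts = Counter(click_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {friend: n / total for friend, n in counts.items()}


def rank(results, weights):
    """results: (url, friend) pairs; most-used friends first, ties keep their order."""
    return sorted(results, key=lambda r: weights.get(r[1], 0.0), reverse=True)


weights = learn_weights(["eva", "eva", "tom"])
ranked = rank(
    [("http://example.org/1", "tom"),
     ("http://example.org/2", "eva"),
     ("http://example.org/3", "zed")],
    weights,
)
```

Results from a friend the user has never clicked on get weight zero and sink to the bottom, with no settings dialog involved.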

Another useful feature that is already in place is that you can add any RSS feed to your search index as well - this is indeed very neat. And finally, in addition and as a point of reference, the prototype displayed the Lucene-based results in one column, and Yahoo! Search BOSS results in another column. Not surprisingly, the Search BOSS results were rather general, and the Lucene-based results rather specific - and that specificity is what you’d expect from searching your own bookmarks.



Danny Ayers: “The Semantic Web is the path of least resistance”

October 2nd, 2008

The Web of Data Practitioners Days are approaching - giving me the opportunity to do an advance interview with Danny Ayers, Semantic Web evangelist, Community Platform manager at Talis, Web of Things everything (I think). I’d just like to extract two or three points here - you can read the whole interview on our website. First something that’s noteworthy to me as it says something about the patterns of technological evolution in general:

Looking back a few years, I don’t think many people working on the Web could have predicted the remarkable rise of blogging, the revival of DHTML and ancient Internet Explorer tricks such as Ajax, online social networks, Wikis, the whole Web 2.0 thing. It’s worth noting that these developments have been consistent with Tim Berners-Lee’s vision of the Web as a system in which people are the key component.

Shifting to the Semantic Web perspective, for a long time I have believed this approach is on track simply because it offers improvements to the Web for which there are no obvious alternative techniques. Personally, I was relatively late to realise what those improvements really were - moving from a Web of Documents to a more general Web of Data. Expressed like that, and looking at existing Web architecture, the Semantic Web is the path of least resistance.

Remember? AJAX, when it cropped up and caused a big buzz in 2005, was nothing new, it was just a new term for an old thing, i.e. the Internet Explorer tricks Danny mentions (see also A Brief History of AJAX: “Browser asynchronous hacks have been possible since 1996, when Internet Explorer introduced the IFRAME tag, passing through a number of techniques such as pixel gifs, Netscape layers, Microsoft Remote Scripting, Java/JavaScript gateways, stylesheet hacks, image/cookies, and most recently the XMLHttpRequest.”)

Sometimes it takes a while until someone (society, industry, what have you) starts to notice that this or that, something, could actually be useful. Sometimes technologies that everybody thinks are silly become a huge success - think text messages!

And sometimes you have a great (piece of) technology and it just never really catches on, and if that is the case, then mostly because some forces in the market (trusts, monopolies, corporations who force you to use their software/technology and at ridiculous price, people who would do anyhing they can to undo the natural laws of the digital world) won’t let it happen. What happend to Video 2000 and Betamax? Nixed by JVC’s licensing strategies for VHS. Just wanted to make this point before moving on to the next quote. Danny:

Regarding possible obstacles, there are many ways the Web could suffer, probably most dangerous being interventions from national governments or commercial interests, tilting the table on which we build these systems - such as software patents and threats to net neutrality. The Web works because it’s more or less the same to everyone, everywhere.

So if you think that the Web should continue to be the same to everyone, everywhere, if you would like to liaise with other people interested in the SemWeb and the Web of Data, but most importantly, if you do not know a whole lot about the SemWeb yet but would like to learn more, then please come and do attend the Web of Data Practitioners Days in Vienna, Oct 22-23.

It is going to start with a “Web of Data 101”, i.e. a low-threshold introduction to Semantic Technology in the context of the Web, given by Keith Alexander (Talis, UK) and Yves Raimond (Queen Mary University of London, UK). Here is the full program - please mind that there is also a deadline for registration (6 Oct 2008!).

English

DAPD special issue on Data Management in Social Media

September 23rd, 2008

We (Tim Finin, Anupam Joshi, and Akshay Java) are editing a special issue of Distributed and Parallel Databases on Data Management in Social Media. Manuscripts must be submitted by January 15, 2009 and should not exceed 25 pages in length. Authors will be notified by April 15 and camera-ready copy will be due May 15. Submit papers online specifying article type S.I.: Data Management for the Social Web. For more information, see the call for papers (pdf) or contact Anupam Joshi at joshi@cs.umbc.edu.

Social Media tools like blogs, wikis and social networking sites are providing new opportunities for us to connect and interact with each other. Many social theories that could once be researched only by conducting expensive surveys can now be studied and modeled due to the easy availability of large-scale social annotations and explicit descriptions of social relationships online. The rate at which blogs, videos, bookmarks and other user-generated content are growing presents several interesting research and data management questions. The opportunity to mine social media content for opinion and sentiment analysis and for trend identification has several applications in Web search, personalization, business intelligence and national security. This special issue of the International Journal of Distributed and Parallel Databases invites original research contributions on data management in social media. Topics include but are not restricted to the following.

  • community detection and evolution in social media
  • recommendation systems
  • search in social media
  • event detection, trend identification and tracking in social media
  • influence, trust and reputation in social media
  • opinion/sentiment analysis, polarity identification
  • feed distillation and ranking blogs
  • mining microblogging and real time data
  • folksonomy, tag semantics, clustering and usage
  • advertising models for the social web
  • indexing social media content, index freshness
  • visualizing social network data
  • spam detection, social network spam and profile spam

English

Extending Google: First Look at SemantiFind

September 23rd, 2008

Just stumbled upon SemantiFind via T3N, and then upon the review on ReadWriteWeb from last Thursday.

What’s it about? SemantiFind is an IE and FF browser plug-in that extends Google’s search functionality, most notably through a typeahead feature that allows you to refine your search before hitting ‘enter’. ReadWriteWeb wasn’t too impressed though:

Unfortunately, SemantiFind is one of those tools that’s good in theory, but not so good in practice. When performing some test searches, results were not as precise as they should have been. For example, in the above-mentioned search for “Georgia,” a search for the U.S. state returned Google results for the country as well.

Ambiguities due to homonyms such as Georgia vs Georgia, or Java vs Java are among the faves of people who are trying to pitch a semantic tool to you - but I really wonder whether the effects of homonyms aren’t highly overrated? How often do people really search for these, and in particular search for these without context, i.e. further search terms such as in ‘Georgia Tech’, ‘Georgia war’, ‘Java Coffee’ or ‘Java bugs’?

I must say I was quite impressed by the choice of search terms offered, and if you (like me) are easy prey for the serendipity effect, then SemantiFind can please and distract you endlessly. Here is a preview of what appears if you enter ‘serendipity’ - please note the preview of possible descriptions and definitions which you get on the Google homepage with the plugin:

Once you pick a term it turns into a kind of button (just slightly annoying: you cannot edit a term after it’s turned into a button, but would have to delete the whole thing and type again if you want to change your search query):

And then, what happens? On the search results page, you see results filtered by SemantiFind’s user-generated, user-approved labels on top of the other search results - which irritated me at first as it comes across as a search engine within the search engine. Admittedly: I’d rather sift through 13 results than through 10,900,000 search results (even though I never make it to the end of Google’s search list anyway; does anybody?) - but does the article about trees doing their best work with thermostats at 70° really deserve the second rank in SemantiFind’s list of recommended search results?

So while I agree with RWW that this “just goes to show why search engines that rely on people to filter the results might not work. Human error shouldn’t be a factor in web searches”, I am still quite fond of the suggestions and definition previews. I would probably use SemantiFind regularly if they allowed me to configure the plugin in such a way that I’d get the suggestions on the input page, but not the recommended results on the results page.

What’s the source of these results anyway? SemantiFind’s recommended results seem to rely entirely on input generated by users - to add input, you need to install their toolbar and start adding labels to websites; if a website has been labeled before, you can confirm or reject existing labels. What’s nice: a label recommender (presumably the same one that’s used for search queries) reduces ambiguity. What’s curious: You can also browse the pages you have already labeled in what they call your “catalogue” - which makes the service even more reminiscent of a bookmarking service, and which makes me wonder whether one shouldn’t link this with a del.icio.us/Mr.Wong/Bibsonomy/Faviki account (Faviki would probably be the best, considering their tag recommendations are based on DBpedia, and considering that Faviki just made it past the 1 million tags mark).

Questions that remain: I’d really like to know how they maintain their list of suggested labels - ambiguity, typos, plural forms, i.e. the usual folksonomy issues, must be a big challenge. Also, I’d like to know where they get their definitions in the preview from - from Google? Or are these user-generated as well? There must, after all, be some use for the “request a new definition” form?

Too bad they don’t have a blog to which one could send a trackback, and there is nothing much on their company page either.

English

Working Ontologist at Java One

July 15th, 2008

The JavaOne presentations are out!


I don't think you have to sign up to see them. They did a pretty good job of looping the soundtrack back to the video capture of the screen. There are a few places where the animation is pretty tightly coupled to the speech where you can see the looping errors, but for the most part, it is pretty good.


You can see my presentation about the Working Ontologist here.

I haven't managed to find the transcript of the panel session yet.

Uncategorized

Managing XACML Policies: A DL Perspective

July 14th, 2008

Recently, we have been working on XACML-DL, an approach originally developed by Vlad Kolovski et al. based on his PhD. I’ve spent some time developing an API for the XACML-DL services, i.e. policy verification, policy subsumption, and policy redundancy, with the primary aim of improving usability.

In a nutshell, XACML-DL is a translation of a subset of XACML 2.0 to OWL DL, i.e. policy sets and policies are translated into DL concept descriptions. This allows us to use standard DL reasoning services over XACML policies, which brings some useful features. The XACML-DL application is primarily intended to be a support tool for policy management, e.g. policy creation, maintenance, evolution (however, it is not limited to this use).

On the road towards our goal of replicating the XACML InterOp Demo with XACML-DL, we encountered a number of problems. First, Sun’s XACML Implementation (CVS) had to be tweaked to support XACML 2.0, which is required for the InterOp Demo. Second, the XACML-DL mapping does not fully cover XACML 2.0 and unfortunately does not even cover all of the language subset used in the InterOp Demo. Setting aside some straightforward elements, e.g. Environment, the XACML 2.0 Condition element requires some theoretical work to find a mapping to DL. Nevertheless, it is worth talking about the XACML-DL application as its features may be useful in certain environments.

Verifying XACML Policies

So let’s talk a bit about policy verification. The XACML architecture includes a Policy Decision Point (PDP) which essentially takes a XACML (access) Request, evaluates it over policy sets, and returns a XACML Decision — permit and deny are two possible outcomes.

Shallow Testing

In policy management, the idea of evaluating XACML requests can be seen as Unit Testing for policy sets; for example, to (automatically) check system security. The obvious thing to do is “shallow testing”; write tests against policies which include the desired outcome — permit or deny — and let the testing framework tell you whether changes in policies or policy sets have changed the behavior of the policies when faced with access requests.

The analogies with Unit Testing are obvious and intended. It’s a reasonable thing for policy developers to do to verify that changes to policy don’t have unintended consequences.
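To make the Unit Testing analogy concrete, here is a minimal sketch in Python. It is not the XACML-DL API and not a real PDP: the `Rule` class, the first-applicable `evaluate` function, and the role/action attributes are all hypothetical stand-ins, assumed only for illustration. Each test pairs a request with the decision we expect, and the runner reports any request whose actual decision differs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, heavily simplified stand-in for a XACML rule."""
    role: str
    action: str
    effect: str  # "Permit" or "Deny"


def evaluate(rules, request, default="Deny"):
    """Toy PDP: the first rule matching role and action decides."""
    for rule in rules:
        if rule.role == request["role"] and rule.action == request["action"]:
            return rule.effect
    return default


POLICY = [
    Rule("auditor", "read", "Permit"),
    Rule("clerk", "read", "Permit"),
    Rule("clerk", "write", "Permit"),
]

# Each shallow test is a request plus the decision we expect for it.
TESTS = [
    ({"role": "auditor", "action": "read"}, "Permit"),
    ({"role": "auditor", "action": "write"}, "Deny"),
    ({"role": "clerk", "action": "write"}, "Permit"),
]


def run_shallow_tests(rules, tests):
    """Return (request, expected, actual) for every test that fails."""
    failures = []
    for request, expected in tests:
        actual = evaluate(rules, request)
        if actual != expected:
            failures.append((request, expected, actual))
    return failures
```

Running `run_shallow_tests(POLICY, TESTS)` after every policy change plays the role of a regression suite: if someone later adds a rule that lets auditors write, the second test starts failing.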

Deep Testing

Our enhancement to this policy verification strategy is “deep testing”: policy verification in XACML-DL is able to uncover security issues that are not uncovered by testing against a PDP. We use the underlying OWL-DL reasoning services to automate generation of exhaustive tests against the policies and policy sets to provide a much deeper, comprehensive test coverage automatically. And we’re able to discover security issues, primarily around separation of duty constraints, that are not discovered in shallow testing.

Currently, our application is primarily intended to be an API that provides access to the XACML-DL services. As such, we’re only providing a command line application which supports the features of the API. For example, for the policy verification service described above the following command

java com.clarkparsia.policy.PolicyCmd --service verification --input-policy-base /my/policyset.xml --input-test-base /my/request/directory --deep-verification --verbose --explain

will load the policy sets in policyset.xml and evaluate the (unit) tests in the directory using “deep verification” (i.e., deep testing). The application will return detailed information (--verbose) about the tests (e.g. the decision for each test), including a counterexample for each test that fails (--explain).

Deep Testing and XACML ROI

Deep testing (or verification — we need to settle on one term for this, clearly!) is one of the features of XACML-DL and consists in the evaluation of XACML requests using an OWL-DL reasoner, in contrast to shallow testing which is the standard PDP XACML Request evaluation. With deep verification we test the satisfiability of a DL concept expression which is constructed from three input parameters: (1) XACML policy file, (2) XACML test condition, and (3) the type of test condition (e.g. permit, deny). Because we can use Pellet to check if there exists a model for a concept expression, deep testing automatically verifies all possible conditions, while with shallow verification we have to explicitly think about and encode them as test cases.

And that’s tedious, error-prone, and exactly the kind of job that policy engineers and developers should do but typically won’t, precisely because it’s annoying.

But using our approach to managing policies, we can generate that stuff automatically, thus radically increasing the confidence one can have (rationally) in the quality and coverage of one’s policies. There’s no point moving to XACML if all you end up doing is building a game-able system for attackers. The ROI to XACML, that is, to externalizing security policy out of application code, is only real if the underlying policies enforce the policy designer’s actual intent.
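The difference between the two styles can be sketched with a toy example in Python. To be clear about the assumptions: this does not use Pellet or any DL reasoning. Brute-force enumeration over a small, finite set of roles stands in for the satisfiability check, and the `PERMITS` table, the role names, and the separation-of-duty property are made up for illustration. The point is only that the search covers every combination, including ones nobody thought to write a test for.

```python
from itertools import combinations

ROLES = ("clerk", "manager", "auditor")

# Toy policy: (role, action) pairs that are permitted; all else is denied.
PERMITS = {("clerk", "submit"), ("manager", "approve"), ("auditor", "read")}


def permitted(roles, action):
    """A subject is permitted if any role it holds permits the action."""
    return any((role, action) in PERMITS for role in roles)


def all_role_sets(roles):
    """Yield every non-empty combination of roles a subject could hold."""
    for size in range(1, len(roles) + 1):
        for combo in combinations(roles, size):
            yield frozenset(combo)


def find_counterexample(conflicting_actions):
    """Search for a subject allowed to perform all conflicting actions.

    Returns the offending role set, or None if separation of duty holds
    for every possible subject.
    """
    for roles in all_role_sets(ROLES):
        if all(permitted(roles, action) for action in conflicting_actions):
            return roles
    return None
```

Here `find_counterexample(("submit", "approve"))` returns the role set containing both clerk and manager: no single-role test would have exposed it, but a subject holding both roles can submit and approve the same item. A reasoner does the analogous search symbolically, so it is not limited to a small enumerable space.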

Uncategorized

Beautiful plumage: Topic Maps Not Dead Yet

June 27th, 2008

Echoing recent discussion of Semantic Web “Killer Apps”, there is an “are Topic Maps dead?” thread on the topicmaps mailing list. Signs of life offered include www.fuzzzy.com (‘Collaborative, semantic and democratic social bookmarking’, Topic Maps meet social networking; featured tag: ‘topic maps’) and a longer list from Are Gulbrandsen, who suggests a predictable hype-cycle dropoff is occurring, as well as a migration of discussions from email into the blog world. For which, see the topicmaps planet aggregator, and through which I indirectly find Steve Pepper’s blog and an interesting post on how TMs relate to RDF, OWL and the Semantic Web (though I’d have hoped for some mention of SKOS too).

Are Gulbrandsen also cites NZETC (the New Zealand Electronic Text Centre), winner of The Topic Maps Application of the year award at the Topic Maps 2008 conference; see Conal Tuohy’s presentation on Topic Maps for Cultural Heritage Collections (slides in PDF). On NZETC’s work: “It may not look that interesting to many people used to flashy web 2.0 sites, but to anybody who have been looking at library systems it’s a paradigm shift”.

Other Topic Map work highlighted: RAMline (Royal Academy of Music rewriting musical history). “A long-term research project into the mapping of three axes of musical time: the historical, the functional, and musical time itself.”; David Weinberger blogged about this work recently. Also MIPS / Institute for Bioinformatics and Systems Biology who “attempt to explain the complexity of life with Topic Maps” (see presentation from Volker Stümpflen (PDF); also a TMRA’07 talk).

Finally, pointers to opensource developer tools: Ruby Topic Maps and Wandora (Java/GPL), an extraction/mapping and publishing system which amongst other things can import RDF.

Topic Maps are clearly not dead, and the Web’s a richer environment because of this. They may not have set the world on fire but people are finding value in the specs and tools, while also exploring interop with RDF and other related technologies. My hunch is that we’ll continue to see a slow drift towards the use of RDF/OWL plus SKOS for apps that might otherwise have been addressed using TopicMaps, and a continued pragmatism from tool and app developers who see all these things as ways to solve problems, rather than as ends in themselves.

Just as with RDFa, GRDDL and Microformats, it is good and healthy for the Web community to be exploring multiple similar strands of activity. We’re smart enough to be able to flow data across these divides when needed, and having only a single technology stack is, I think, intellectually limiting, socially impractical, and technologically short-sighted.

Uncategorized

SemTech 2008: Nova Spivack (Radar Networks) - “Experience from the Cutting Edge of the Semantic Market”

May 20th, 2008

Nova Spivack of Radar Networks gave a keynote talk at the 2008 Semantic Technologies Conference this morning.

He started off by giving some background to Twine. Twine is a service that lets you share what you know. When Nova pitched the original idea for the underlying platform to VCs in 2003, he was told that it was a technology in search of a problem. Thanks to DARPA and SRI, Nova had carried out some research in this field for a few years. The initial proposal to VCs was to develop next-generation personal assistants based on the Semantic Web. After the initial knock-back, Nova went out again to raise funding, and Paul Allen stepped in as the first outside angel with Vulcan Capital.

Radar started working on the first commercial version of the underlying platform and also began work on the Twine application. The platform underneath Twine is not something they’ve talked about much so far, and they will discuss it (not at this conference) in the Fall. Radar also want to allow non-Semantic Web savvy people to build applications that use the Semantic Web without doing any programming.

Twine was announced last October at the Web 2.0 Summit. They began the invite-only beta soon after that. The focus of Twine is interests. It’s a different type of social network. Facebook is often used for managing your relationships, LinkedIn for your career, and Twine is for your interests. He called it “interest networking” as opposed to social networking.

With Twine, you can share knowledge, track interests with feeds, carry out information management in groups or communities, build or participate in communities around your interests, and collaborate with others. The key activities are organise, share and discover.

Twine allows you to find things that might be of interest to you based on what you are doing. The key “secret sauce” is that everything in Twine is generated from an ontology. The entire site - user interface elements, sidebar, navbar, buttons, etc. - comes from an application ontology.

Similarly, the data is modelled on an ontology. Twine isn’t limited to these ontologies. Radar are beginning the process of bringing in other ontologies and using them in Twine. Later, they will allow people to make their own ontologies (e.g. to express domain specific stuff). In the long run, the community infrastructure will allow people to have a more extensible infrastructure.

Twine does natural language processing on text, mainly providing auto tagging with semantic capabilities. It has an underlying ontology with a million instances of thousands of concepts to generate these tags (right now, they are exposing just some of these). Radar are also looking at statistical analyses or clustering of related content, more of which we will see in the Fall (mainly, which people, items and interests are related to each other). For example, “here are a bunch of things that are all about movies you like”. Twine uses machine learning to create these clusters.

Twine search also has semantic capabilities. You can filter bookmarks by the companies they are related to, or filter people by the places they are from. Underneath Twine, they have also done a lot of work on scaling.

Consumer prime-time launch of Twine is slated for the Fall. A good few bugs still have to be addressed, but Nova says there has been a “wonderful flowering of participation and friendships” in Twine. Many networks of like-minded people with common interests are being formed, and it is very interesting to see this take place. Nova himself has 500 contacts in Twine, and just 300 in Facebook. He now uses it as his main news source. David Lewis (the top Twiner) has 1000+ contacts in Twine.

Twine wants to bring semantics to the masses, and is not just aiming at Semantic Web researchers: it has to be mainstream. The main common thread in feedback received is that the interface needs to be simplified more. (Nova says he shaved his head as part of this new simpler interface :-)) Someone who knows nothing about structured data or auto tagging should be able to figure out in a few minutes or even seconds how to use it. It takes a few days at the moment to get a sense of the value, but Nova says it can be very addictive when you get into it.

Individuals are the first market, even if you are on your own and don’t have any friends :-) It is even more valuable if you are connected to other people and join groups, giving a richer network effect. The main value proposition is that you can keep track of things you like and people you know, and capture knowledge you think is important.

Motley Fool recently talked about Google killers. Twine is not one, according to Nova, as it is not trying to index the entire Web. Twine is about the information that you think is important, not everything available. Twine also pulls in related things (e.g. from links in an e-mail), capturing information around the information that you bring in.

When groups start using Twine, collective intelligence starts to take place (by leveraging other people who are researching stuff, finding things, testing, commenting, etc.). It’s a type of communal knowledge base similar to other things like Wikia or Freebase. However, unlike many public communal sites, in Twine more than half of the data and activities are private (60%). Therefore privacy and permission control is very important, and it goes deep into the Twine data.

Initially Radar had their own triple store, an LGPL one from the CALO project. They found that it didn’t scale towards web-scale applications, and it didn’t have the levels of transaction control you’d need from an enterprise application. They decided to go for a SQL database (PostgreSQL) with WebDAV. However, relational databases weren’t optimised for the “shape” of data that they were putting into it, so it needed to be tweaked. They’ve had no performance issues so far, but they may move to a federated model next year. Twine uses an eight-element tuple store (subject-predicate-object, provenance, time stamp, confidence value, and other statistics about the triple or item itself). They can do predicate inferencing across statements, access control, etc. The platform is all written in Java, and Twine then sits on top of that.
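To picture what a statement with its metadata attached might look like, here is a rough Python sketch. It is an illustration only, not Radar’s schema: the talk named six of the eight elements (subject, predicate, object, provenance, time stamp, confidence), so the remaining statistics are lumped into a hypothetical `stats` mapping, and the field names, example values and the `select` helper are my own assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Statement:
    """One row of a tuple store: a triple plus metadata about the triple."""
    subject: str
    predicate: str
    obj: str
    provenance: str       # who or what asserted the statement
    timestamp: datetime   # when it was asserted
    confidence: float     # 0.0 to 1.0
    # Placeholder for the further statistics the talk did not spell out.
    stats: dict = field(default_factory=dict)


def select(statements, predicate=None, min_confidence=0.0):
    """Filter on the metadata directly, with no reified helper triples."""
    return [
        s for s in statements
        if (predicate is None or s.predicate == predicate)
        and s.confidence >= min_confidence
    ]


now = datetime(2008, 5, 20, tzinfo=timezone.utc)
STORE = [
    Statement("item:1", "hasTag", "tag:SemanticWeb", "autotagger", now, 0.62),
    Statement("item:1", "hasTag", "tag:Twine", "user:alice", now, 1.0),
    Statement("item:1", "addedBy", "user:alice", "system", now, 1.0),
]
```

With this layout, `select(STORE, predicate="hasTag", min_confidence=0.9)` returns only the tag a user asserted by hand. Keeping provenance and confidence as columns of the statement itself is one way to avoid the extra triples that RDF reification would need for the same information.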

Next he talked about the Twine beta status. There have been 20,000 beta testers in the last 30 days, 9,000 twines created and 150,000 items added; 60% of twines are private, and new features are being added every four weeks (in point releases). Some of the feature requests they’ve received include import capabilities, interoperability with other apps, and the ability to use other ontologies.

Twine will stay in invite beta for the summer. Soon, they will take off the password door to the public twines, so that they will all be visible to search engines. Radar will be SEO-ing the content automatically, so you will see more “walk-ins” after that happens. They will still be able to control who gets an account, but stuff will be publicly accessible.

In the Fall, Radar will open it so that anyone can open an account. You will be able to really customise Twine, to author and develop rich semantic content. Nova says that Twine will then be a step beyond blogs and wikis when it happens (but he can’t say much about the new stuff for now).

Next, there were some questions.

Q: The first one was about privacy. What if you add something and then later you decide that you want to delete it - is it really deleted or does Twine keep it around?

A: Nova answered that currently, it is not really deleted, it goes into a non-visible triple. But they will be doing that (really deleting it) soon.

Q: What is the approach to interoperability with Twine? What other types of semantic applications will Twine work with?

A: Today, Twine works with e-mail (in / out), RSS (get feeds out), and browsers (e.g. for bookmarking). There have been lots of requests for interoperability with mindmaps, various databases, enterprise applications, etc., so Radar are giving it a lot of thought. Twine has to provide APIs. They have a REST and a SPARQL API: they are not fully ready just yet, but by the end of the year Twine will have a usable REST API. Unfortunately, Radar can’t handle the long tail of requests for features, there’s just too much, but an API will help people to make their own add-ons.

Then there’s the ontology level. You will be able to get the data about you or related to you out of Twine in RDF. You should also be able to get stuff out using other ontologies that are common, e.g. using FOAF, SIOC (yay!), or Dublin Core.

They are also looking at specific adaptors that they need to build. For example, this includes importers for del.icio.us, Digg, desktop bookmark files, Outlook contacts, and a bunch of others. They will be rolling out some of these in the Fall timeframe. Also, there may be a demand for Lotus Notes interoperability - or Exchange - possibly. Radar may actually look at other semantic applications like Freebase that they could interoperate with first. They have already hardcoded in some interoperability with Amazon for example.

Q: When Radar went to VCs and were turned down, was Twine part of the pitch? (For the second time around with Paul Allen, the questioner presumed that Nova did have it as part of the pitch.)

A: In 2003, Radar had a desktop-based semantic tool called “Personal Radar”. It was basically a Java-based P2P “Twine” using RDF. It had lots of eye candy and visualisations. The VCs said “semantic what?” and it was extremely hard to explain P2P, Semantic Web, RDF, and knowledge sharing to them. He said the VCs are mainly interested in when you are going to make money for them. But most of his pitch was blue sky, with no business plan, demonstrating a piece of technology, and pushing the fact that he knows people will need it. Paul Allen was more visionary, and he really believes adding structure to the Web is inevitable. He was willing to take a bet before they were in business. Then they went on to get Series A funding. The VCs said it was too early, but they eventually got it. Series B wasn’t as hard, and it fell into place in a matter of weeks, so it was a good round.

Even though there’s a lot of talk about the Semantic Web in the press and on the Web, most VCs are still figuring it out now and they are interested in making just one bet in the space. The main thing you need to avoid is being a platform without having any applications to show. It has to be compelling, where you can envisage users using them. Valley VCs are jaded about platforms.

Q: As one imports information from various places, what exactly is there in Twine that will prevent a person having to merge any duplicate objects?

A: Nova said there is limited duplication detection at the moment, but this will be improved in a few months. Most people submit similar bookmarks and it is reasonably straightforward to identify these, e.g. when the same item is arrived at through different paths on a website and has different URLs.
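As an illustration of the “same item, different URLs” case, here is a small Python sketch of URL canonicalisation. This is a guess at the general technique, not a description of what Twine does: the list of ignored query parameters and the normalisation rules (lower-casing the host, dropping fragments and trailing slashes, sorting the query) are assumptions chosen for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that only record how the visitor arrived.
IGNORED_PARAMS = {"ref", "sessionid"}


def canonical_url(url):
    """Reduce a bookmark URL to a form that ignores navigation noise."""
    parts = urlsplit(url)
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS and not key.startswith("utm_")
    )
    path = parts.path.rstrip("/") or "/"
    # The empty last element drops the #fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Group URLs sharing a canonical form; keep groups with 2+ members."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return [group for group in groups.values() if len(group) > 1]
```

Two bookmarks such as `http://example.org/post/42?a=1&b=2&ref=home` and `HTTP://Example.org/post/42/?utm_source=feed&b=2&a=1#top` end up with the same canonical form and would be flagged as candidates for merging. Anything beyond this, such as the same article on two different hosts, needs content-based comparison.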

Q: Ivan Herman from the W3C asked if Radar were considering leveraging the linked open data community?

A: Nova said that DBpedia would be one of those main sources of data that they want to integrate with - the FOAF-scape, the SIOC-o-sphere, and DBpedia. Wikipedia URIs are already being used to identify tags, and this is something they will leverage.

Q: How can copyright be managed in Twine?

A: Nova said that it’s thanks to the Digital Millennium Copyright Act (DMCA). It provides a safe harbour if you cannot reasonably guard against anything and everything being uploaded (and are unaware of it). Twine’s user agreement says please do not add other people’s copyrighted material. Fair use is okay, and if you share something copyrighted, it is better to have a blurb with a link to the main content. In other words, Twine is using the same procedure as other UGC sites.

Q: How are Radar going to make money?

A: Twine is focused on advertising as the first revenue stream. Twine has semantic profiles of users and groups, so it can understand their interests very well. Twine will start to show sponsored content or ads based on these interests. If something is extremely relevant to your interests, then it is almost like content (even if it is sponsored). They will be pilot testing this advertising soon.

Q: Have Radar been approached by Google or Facebook, given that the value proposition for Twine is very interesting?

A: Nova said they are not trying to compete with Facebook (right now!), but rather they are trying to find the magic formula that will work for Twine right now. Facebook has a lot of fluffy stuff: vampires, weird games, etc. Nova said he’d prefer to spin the bottle with a real person. Twine will focus on professional people who have a stronger need for a particular interest, doing things technically that are outside the scope of what they are doing at the moment.

Q: Why does Twine use tuple storage: why is it not using a quad?

A: Nova said it’s faster in their system, so for performance reasons they decided to avoid reification.

(I will also post my notes from Eric Miller’s keynote in the next day or three.)

English

CELT talk / WWW@15 on Morning Ireland / Ulrich Schnauss

May 2nd, 2008

A mixed-up blog post, but I haven’t the energy to write three separate posts, so here’s a three-in-one:

  • On Wednesday, I gave a talk at CELT, NUI Galway about “Learning via the Social Web”, which was a slightly-revised version of the one I gave in February. Again, there was an amazing turnout, and there will be a webcast made available via the CELT website at a later date. For now, you can access the PowerPoint slides here.
  • Yesterday, Damien Mulley and I were interviewed by Richard Downes on