Archive

Archive for March, 2009

SPIN Webcast

March 31st, 2009

No, it isn't an elaborate April Fools' joke. Tomorrow, TopQuadrant is sponsoring another webcast with Semantic Universe. The topic of this one is SPIN, a Semantic Web modeling language based on SPARQL and organized like an object-oriented model. That just scratches the surface, so feel free to attend the webinar (registration is free at Semantic Universe) to see the demo and the details.

Uncategorized

UKSG09 Uncertain vision in sunny Torquay

March 31st, 2009

Glorious sunshine greeted the opening of the first day of UKSG 2009 in Torquay yesterday. The stroll along the seafront from the conference hotel (Grand in name and all facilities, except Internet access - £1/minute for dialup indeed!) was in delightfully sharp contrast to the often depressing plane and taxi rides to downtown conference centres.

The seaside theme was continued with the bright conference bags. Someone had obviously got hold of a job lot of old deckchair canvas. More than 700 academic librarians, publishers and supplier representatives settled down in the auditorium of the Riviera Centre to hear about the future of their world.

The first keynote speakers were very different in topic and delivery, but all three left you with the impression of change coming over the next few years, the shape of which they were not totally sure of.

First up was Knewco Inc’s Jan Velterop, whose pitch was a somewhat meandering treatise on the wonders and benefits of storing metadata in triples – something he kept saying he would explain later. The Twitter #uksg09 channel was screaming “when is he going to tell us about triples” and “what’s a triple” whilst he was talking. He eventually got there, but I’m not sure how many of the audience understood the massive benefits of storing and linking data in triples that we at Talis are fully aware of. Coincidentally, for those who did get his message, I was posting about the launch of the Talis Connected Commons for open, free storage of data – in triples, in the Talis Platform.

Next up was Sir Timothy O’Shea from the University of Edinburgh, who talked about the many virtual things they are doing up in Scotland. You can take your virtual sheep from your virtual farm to the virtual vet, and even on to a virtual post mortem. His picture of the way information technology is playing its part in changing life at the university, apart from being a great sales pitch for it, left him predicting that this was only the early stage of a massive revolution. As to where it was going to lead us in a few years, he was less clear.

Joseph Janes, of the University of Washington Information School, was one of those great speakers who dispensed with any visual aids or prompts and delivered a very entertaining 30 minutes comparing our entry into this new world of technology-enhanced information access with his experience as an American wandering around a British seaside town. His message was that we should expect the next few years to feel very similar on the surface, as we will recognise most of the components, but to be very different when you analyse them. As an American he recognises cars, buses, adverts and food, but in Britain they travel on the wrong side of the road, are different shapes, and are products he doesn’t recognise. As we travel into an uncertain but exciting future, don’t be fooled by recognising a technology; watch how it is being used.

A great start to the day, which included a good break-out session from Huddersfield’s Dave Pattern. He ended his review of OPACs and predictions about the development of OPAC 2.0 and beyond with a heads-up about my session today, which caused me to spend a couple of hours in the hotel bar, the only place with Wifi, tweaking my slides. It would be much easier to follow Mr Janes’ example and deliver my message off the cuff without slides – not this time perhaps ;-)

Looking forward to another good day – even if the sun seems to have deserted us.

English

Integrity Constraints, Reasoning, and a Preview Release

March 30th, 2009


In a previous post Evren introduced some of the work we’ve been doing lately to turn OWL into an expressive schema or data validation language. In other words, using OWL to specify and implement integrity constraints for RDF and other data.

Simple Integrity Constraints & Reasoning

In this post I want to give a simple example to motivate the integration of OWL reasoning with integrity constraint checking. Consider the case encountered when instance data is expressed in terms of the most specific concepts in an ontology:

:Citizen a owl:Class .

:Man a owl:Class .

:Woman a owl:Class .


:ssn a owl:DatatypeProperty .


:Citizen owl:disjointUnionOf ( :Man :Woman ) .

So, in this ontology there are three concepts: citizen, man, woman; and all citizens are men or women but not both. And, further, there is a property, Social Security Number.

The instance data is

:Marge a :Woman .
:Homer a :Man ;
  :ssn "123-45-6789" .

So: there is a woman, Marge; and a man, Homer, who has a Social Security Number. In this example, how can we say that all citizens should have Social Security Numbers? Like this:

:Citizen rdfs:subClassOf [
  a owl:Restriction ;
  owl:onProperty :ssn ;
  owl:cardinality 1 ] .

But without reasoner integration, this constraint won’t apply to either Marge or Homer, because the constraint refers to citizens, not to men or women. Which is as it should be since duplicating the constraint is not only inaccurate but error-prone (DRY, after all).

Since our integrity constraint checker uses OWL reasoning, however, it will infer that Marge and Homer are both citizens:

:Marge a :Citizen .
:Homer a :Citizen .

Thus, when it applies the integrity constraint (that citizens must have SSNs) it will produce a validation error: Marge is a citizen, which we know by reasoning, but the data does not contain Marge’s SSN.
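To make the difference concrete, here is a minimal Python sketch of the same check. It is not the validator's API and it does no real OWL reasoning: the dictionaries, the `inferred_types` helper and the `violations` function are all hypothetical names invented for illustration. The only inference it performs is the subclass step that follows from the disjoint union (every man and every woman is a citizen), and it counts property values under a closed-world reading, which is the integrity-constraint interpretation described above.

```python
# Hypothetical in-memory stand-in for the ontology and instance data above.
# Man and Woman are subclasses of Citizen (implied by the disjoint union).
SUBCLASS_OF = {"Man": {"Citizen"}, "Woman": {"Citizen"}}
ASSERTED_TYPES = {"Marge": {"Woman"}, "Homer": {"Man"}}
VALUES = {"ssn": {"Homer": ["123-45-6789"]}}

# Constraint: every Citizen has exactly one ssn value.
CONSTRAINTS = [("Citizen", "ssn", 1)]


def inferred_types(individual):
    """Return the asserted types plus every superclass reachable from them."""
    seen = set()
    todo = list(ASSERTED_TYPES.get(individual, ()))
    while todo:
        cls = todo.pop()
        if cls not in seen:
            seen.add(cls)
            todo.extend(SUBCLASS_OF.get(cls, ()))
    return seen


def violations(use_reasoning=True):
    """List (individual, class, property, actual_count) for each failed check."""
    found = []
    for individual in sorted(ASSERTED_TYPES):
        if use_reasoning:
            types = inferred_types(individual)
        else:
            types = ASSERTED_TYPES[individual]
        for cls, prop, expected in CONSTRAINTS:
            if cls not in types:
                continue  # the constraint does not apply to this individual
            actual = len(VALUES.get(prop, {}).get(individual, []))
            if actual != expected:
                found.append((individual, cls, prop, actual))
    return found


print(violations(use_reasoning=False))  # []
print(violations(use_reasoning=True))   # [('Marge', 'Citizen', 'ssn', 0)]
```

Without the inference step the constraint silently matches nobody; with it, Marge is reported as a citizen with no SSN, which is the validation error described above.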

Bad Workarounds

Integration with reasoning allows us to do better data modeling, since it prevents us from repeating the constraints for all of the most specific class types (men, women) when it’s simpler to put the constraint on a more general class (citizen). And this is more accurate data modeling, too, since the requirement for SSNs is not dependent on a person’s gender.

The obvious workaround, if you don’t have reasoning integrated with constraint checking, is to write integrity constraints in terms of the most specific concepts, which is almost always an unnecessary proliferation of constraints–violating DRY–and may be inaccurate modeling, too.

This workaround makes constraint authoring and maintenance much more difficult. By integrating with the reasoner, we support a more natural and scalable approach to integrity constraint specification, one that leverages the ontology by placing constraints appropriately high in the class hierarchy.

Preview Release

If you want to play with a preview release of our OWL-based Integrity Constraint validator, you can download it now. Note that it’s not licensed under open source terms. This version will likely be released open source in a future version of Pellet, but we’re not doing that today. Information about installation, use, and the forum for more information and bug reports (please!) are all included in the README.txt. Details of the evaluation license terms are in the LICENSE.txt.

Edited: 2009-06-18 to update version at download link.

English, Integrity Constraints

Free hosting for Open Data

March 30th, 2009

Over on our sister blog Nodalities, my colleague Leigh Dodds has announced the launch of the Talis Connected Commons.

True to our desire to see a truly open web of data, under the terms of the Connected Commons scheme Talis is offering free access to the [Talis] Platform for the purposes of hosting public domain data. And the offer isn’t just limited to free hosting: the data access services, including access to a public SPARQL endpoint, are also freely available.

The terms of the offer are as follows: if you own, or are creating, a public domain dataset then you can store that data in the Platform as RDF, for free. We’re setting an initial cap of 50 million triples on each dataset, but that should be plenty of space in which to collect some really interesting data.

So have you got, or do you want to create, up to 50 million triples that you would like to put in the public domain, along with up to 10GB of content? Yes? Well, get yourself over to the Connected Commons page and check whether you qualify. There is also a FAQ to give you more detail.

The Connected Commons is for all sorts of data, but I’m positive that the library world provides a rich source of such open data sets - get in there guys and get your data openly linked and out there.

English

Barometer of the social web and the semweb in Europe.

March 30th, 2009

More logic, greater efficiency, less subjection to Google's ranking whims, more horizontal sources of knowledge, more relevant results, and more relevant advertising without having to threaten privacy: these are some of its promises.

I am preparing a new workshop session on the topic (Web 3.0) for next Wednesday, April 1, in Barcelona for CETEI-Espiral. Alongside the programme I have already run on other occasions, it will include some of the first results of the “Semantic Web Awareness Barometer”, carried out a few weeks ago.

The study was carried out by the Semantic Web Company Vienna, Corporate Semantic Web at FU Berlin and the Know Center in Graz, and, given the participation rates by country, the results can be considered valid for the European context. It includes some interesting and, I think, fairly reliable data on the social web, social software and its current penetration in companies:

Social software:

1. In terms of penetration in companies, wikis are the most used tool, ahead of social bookmarking services and social networks. In order, the most used systems are wikis, blogs, feeds, social networks, virtual environments, podcasts and social bookmarks.

2. Differing applications and usage patterns of social software.

3. Differing notions about the benefits and problems of social software (among the problems, lack of time is reported as the main reason for not using social software).

Semantic Web:

1. The Semantic Web is already familiar to many.

2. Self-taught nature of the semweb. The experts are experts because they took an interest in the subject and trained themselves.

3. Importance of the Semantic Web for companies: search is the “killer app” (improved relevance of results is what is valued most). Lower costs of integration with applications and greater control of data may be important aspects.

4. Differing notions about the barriers: the lack of success stories, the scarcity of Semantic Web experts within the organisation and the absence of a corporate culture around it could be the most important.

5. The competencies required of knowledge workers and the dynamics of collaboration in working groups could change and evolve thanks to more efficient mechanisms such as those the semweb could provide.

6. Profitable and market-ready in 2 – 5 years. The crisis, and the need it creates to look for new solutions, could have an influence.

7. There are no differences by region, or by competence and familiarity with new technologies, in awareness of the Semantic Web.


Here is the link to the full report (pdf): Semantic Web Awareness Barometer 2008 – Preliminary Results


Related:

Presentation: Construyendo entre todos la web semántica.

Spanish

Linked Data is not owl:sameAs Semantic Web

March 30th, 2009

While some people work heavily on extending the semantic web infrastructure, like Talis Connected Commons or OpenLink’s Amazon EC2 Instantiation, others have started to bring the semantic web closer to developers and therefore to a much broader audience: they offer search facilities or Linked Data navigators like OpenLink’s Entity Finder or DERI’s VisiNav.

These kinds of applications should not be confused with “semantic web” end-user applications like Google’s Wonderwheel or INTSPEI’s Cloudlet. Adding some semantics to existing user interfaces can be helpful, and users are obviously ready for such experiments. This is NOT the innovation that the semantic web will bring, but it is a very important step to be taken in parallel with the linked data initiative.

Let’s take a look at Cloudlet: This tool is an easy-to-use free Firefox extension that adds context-sensitive tag clouds to the most popular search engines and helps people navigate through their search results more efficiently. The previous version of Search Cloudlet worked with Google and Yahoo; the new version also works with Twitter. It adds Tag Clouds, Author Clouds, Recipient Clouds and Hashtag Clouds to Twitter search, Twitter user profiles and home pages. See some reviews on this popular tool.

Cloudlet is a child of the Web. INTSPEI has learned all the lessons of Web 2.0, especially how to promote ideas using the blogosphere and how to identify market trends as early as possible, and it generates some obvious added value for users. Sure, it doesn’t make use of linked data yet, but as a typical representative of the fast-growing “semantic search evolution” it reminds me of Chris Welty’s famous insight: “In the Semantic Web, it is not the Semantic which is new, it is the Web which is new.”

Web 1.0 was the WWW without tons of network effects. Web 2.0 changed that a lot.

Linked Data is not the Semantic Web; it’s the foundation for it. From a software developer’s or an IT architect’s perspective it might seem as though those two concepts were the same. But this community represents a very small percentage of all web users.

So where is the User’s Web in the Linked Data architecture? If you look at TimBL’s Linked Data principles, you can clearly see that this is a “Web” for developers.

But things evolve. And some Web companies will jump on the bandwagon and will, for instance, improve their tag clouds, their semantic search, their recommender systems (Twine?) or their similarity search a lot by making use of linked data.

Just as semantic search is becoming mainstream right now (or call it “semantic search 2.0”), linked data will (in about three years, I guess) become part of a lot of mainstream applications. Linked data will generate tons of new network effects, maybe even new business models, and it won’t be avant-garde anymore. It will be part of the Semantic Web.

English

Talis Connected Commons launches

March 29th, 2009

Yesterday I attended the OKCon Open Knowledge Conference. The conference — which was attended by around 70 people (by my rough count) — brought together a wide variety of people to present on a range of topics from knowledge transfer for sustainable development through to linked data and the semantic web. A really broad range of issues that ranged from the social to the technical. While I’m not sure that the mix always worked, I came away having learnt about a number of interesting projects. I also noticed a definite theme centred on the need for easier publishing and sharing of data and information between development projects.

Which is why I was pleased to be able to announce at the end of my talk a new initiative from Talis called the Talis Connected Commons. We’ve been working on this plan for a while, so it was great to be able to finally share the details publicly. The essence of the scheme is that you can now host public domain data in the Talis Platform for free, and immediately use the existing Platform services to interact with that data. That covers everything from simple data access and searching features through to a SPARQL endpoint, with outputs in a range of formats including RDF/XML, RSS and JSON.
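As a rough illustration of what talking to a SPARQL endpoint looks like from code, here is a minimal Python sketch that prepares a query request following the standard SPARQL protocol (the query text travels in a `query` parameter, and the result format is requested through the `Accept` header). The endpoint URL below is a made-up placeholder, not a real Platform address, and `build_sparql_request` is a hypothetical helper name; the sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder only: substitute the SPARQL endpoint URL of your own store.
ENDPOINT = "http://example.org/stores/my-dataset/services/sparql"

# A deliberately simple query: the first ten triples in the dataset.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint, query,
                         accept="application/sparql-results+json"):
    """Prepare (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method())   # GET
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would execute the query against whichever endpoint you supply.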

There are a couple of quite reasonable conditions that apply. Firstly, the data has to be truly in the public domain, which means using one of the currently defined open data licences (i.e. CC0 or Open Data Commons PDDL). Secondly, there’s an upper limit on the storage: 50 million triples and 10GB of supporting content. But that’s plenty of room in which to host some interesting data.

Personally I think this is great news for open data projects as it means that there is an immediately available infrastructure and API into which you can pour your data. And, importantly, retrieve it again afterwards; there are plenty of ways to get data into and out of the Platform. This means that the focus can be on the data collection and publishing, which is where it should be.

There should be a lot of useful and interesting data sets that can be published in this way. I’m expecting that the scheme should be of immediate interest to people working with public sector information and around publishing of open scientific data. For more information on the scheme check out the homepage and the detailed FAQ.

It’s great to be working for a company that takes open data this seriously. And it is a concrete sign of its commitment to helping build a truly open data commons. (We’re hiring, btw.)

English, Semantic Web, Work

iPhone coding for web developers

March 28th, 2009

This week the London Flash Platform User Group ran an evening of iPhone developer talks. My talk, “iPhone Coding For Web Developers” seemed to go down well. As a web developer, I’ve found the iPhone development environment exciting in its power and possibilities, but also perplexing in its lack of basic facilities that I’d take for granted in a modern dynamic language.

This talk (based on a previous blog post here) goes into some detail about how I use HTTP, JSON and other web-oriented tech in my iPhone work.

English, Talks, Uncategorized, iphone

Twine is Now Integrated with Twitter

March 27th, 2009

We've integrated Twine and Twitter so you can "tweet what you twine" -- it's surprisingly easy and cool. Try it!

English, Microblogging, bookmarking, social bookmarking, social media, tweet, twine, twitter

Check out the Resolved Entity Analyzer

March 26th, 2009

Developers - we've posted information about the Resolved Entity Analyzer in the Showcase. It analyzes entities and outputs an XML file containing simplified data gathered from OpenCalais and Linked Data.

Check it out here.

English, Linked Data, Official Blog, RDF output, disambiguation

Generation of SPARQL Queries

March 25th, 2009

We produced a webinar a couple weeks back that shows off some of the cool things we can do to help people cope with creating and managing SPARQL queries.  There were some technical difficulties on the day of the webinar itself that were disappointing, but I understand that the recording (linked here) came out well (I can't listen to it - you know how hard it is to listen to your own voice). 

I have shown in particular the query generation stuff live to a few audiences since then - most of them - well, all of them so far - have been pretty excited about it.  I find that even as an experienced SPARQLer, I use the automated generation quite a lot myself.

Uncategorized

the next google

March 25th, 2009

Maybe you have noticed it already: this morning something new appeared in Google’s search engine interface, namely a set of related search suggestions based on your search query. Google described the enhancement as follows:

Starting today, we’re deploying a new technology that can better understand associations and concepts related to your search, and one of its first applications lets us offer you even more useful related searches (the terms found at the bottom, and sometimes at the top, of the search results page).

I tried it. If you type in “time travel” you also get search proposals like “theory of relativity time travel” or “wormhole time travel”. Google announced that the service is available in various languages. A direct test in German is a little disillusioning: searching for “zeit reise” (the same concept as above, in German) leads to alternative searches like “reisen 50er jahren” (travel in the 50s) and “reisen im mittelalter” (travel in the Middle Ages).

Even if this semantic-like extension of the basic search function still needs some tuning, the point is getting clearer: Google, too, is working to get more meaningful results out of its search algorithms. And parts of the semantic methodology are finding their way into mainstream services like search engines - as we saw with Wolfram Alpha some days ago. So keep your eyes open - maybe tomorrow morning you’ll find another piece of the semantic puzzle embedded in one of your favorite web apps.

English

Mailing Lists and Social Semantic Web

March 25th, 2009

Today I received my printed copy of the book Social Web Evolution: Integrating Semantic Applications and Web 2.0 Technologies, which includes a chapter titled Mailing Lists and Social Semantic Web, gathering all the work done over the last few years around SWAML, SIOC, mailing lists and the Social Semantic Web.

This is the first book that I’ve written, so I’m very proud of it. Thank you to the co-authors of the chapter (Diego, Lian, Labra and Patricia), the editors of the book, and all the people who helped us during these years.

If you want to take a look at the book, you can find it on Amazon or just use the preview provided by Google Books.

English, Spanish

Automated Categorization of Search Results, a New Era?

March 23rd, 2009

Since the hakia Galleries went online, we have received nothing but praise. Our proprietary approach to “Aspect Categorization” shines with examples in topics ranging from music to health. We currently cover more than a million popular queries.

hakia’s fully automated gallery production where the search results are categorized according to the query can be seen in the following demo link where 1425 different car brands and models are covered.

Car Brands & Models.

This is part of our ongoing effort to spread this capability to all search queries, effectively creating a new organization of the content on the entire Web, in a way as distinct as how Wikipedia invented its own style.

Microsoft’s recent news about KUMO and its screen-shots leave no doubt that some people are already convinced this is the way to progress in search.

Aspect categorization is different than what some search engines are already doing. For example, dividing the SERP into Web Results, Videos, News, Images, etc., is not aspect categorization. However, when the categories are related to the query, such as Obama’s Speeches and Quotes, Obama’s Fans, etc., (for the query Obama) then it is aspect categorization.

Aspect categorization in search is a tough business. It requires careful off-line analysis to determine how the categories are going to be decided algorithmically, which resources will be identified for crawling, and how the results will be detected to fit in.

The effectiveness of this approach in the broad search space is yet to be seen, and the users will have the last word as always. The tech bloggers and authors will be able to make their own judgment and recognize the limitations and imitations. In light of our patent application in progress, we are also anxious to see where all this leads to. Some exciting times ahead. Until then happy searching at hakia.

English, Technology

“The Social Semantic Web”: now available to pre-order from Springer and Amazon

March 23rd, 2009

Our forthcoming book entitled “The Social Semantic Web”, to be published by Springer in Autumn 2009, is now available to pre-order from both Springer and Amazon.

20090323a

An accompanying website for the book will be at socialsemanticweb.net.

English

Open now: LOD Triplification Challenge 2009

March 23rd, 2009

The annual Linking Open Data Triplification Challenge (part of this year’s I-Semantics conference, 2 - 4 September 2009, Graz/Austria) awards prizes to the most promising triplifications of existing Web applications, Websites and data sets.

The challenge (Patron: Sir Tim Berners-Lee) is open to anyone interested in applying Semantic Web and Linked Data technologies. We envision submissions such as the following:

  • Applications of Linked Data tools and techniques such as for example Triplify, Virtuoso or D2RQ on custom Web applications and data sets exposing a large quantity and variety of content.
  • Implementations of exporters and mappers from existing content repository formats (such as mbox mailing list archives, BibTeX, XML schemas etc.) into RDF and Linked Data.
  • Adoptions / configurations of Triplify for standard Web applications, such as Wikis, Weblogs, Webshops, Forums, Web galleries, ERP/CRM systems and Web calendar software. You can find popular Web applications at SourceForge, for example.
  • Ports of the Triplify script to other Web application programming languages such as Python, Ruby, Perl or ASP. The Triplify script is very small (<300 lines of code). The port should be as compatible as possible with the current reference implementation, but should integrate well with the environment given by the programming language.
  • Applications showcasing the benefits of Linked Data to end-users such as for information syndication, specialized search, browsing or augmentation of content.
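For readers new to the idea, the following Python sketch shows roughly what a "triplification" does: it turns rows of an application's data into RDF triples, here serialised as N-Triples lines. This is not the Triplify script or a port of it. The function names, the example rows and the `http://example.org/` URIs are all assumptions made up for illustration, and every value is emitted as a plain string literal for simplicity.

```python
def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def triplify(rows, base_uri, vocab_uri, key="id"):
    """Turn dict rows into N-Triples lines, one triple per non-key column."""
    lines = []
    for row in rows:
        subject = f"<{base_uri}{row[key]}>"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key becomes the subject URI; skip empty cells
            literal = escape_literal(str(value))
            lines.append(f'{subject} <{vocab_uri}{column}> "{literal}" .')
    return lines


# Hypothetical rows, as they might come out of a weblog's database.
posts = [
    {"id": 1, "title": 'Hello "LOD"', "author": "alice"},
    {"id": 2, "title": "Second post", "author": None},
]

for line in triplify(posts, "http://example.org/post/", "http://example.org/vocab#"):
    print(line)
```

A real submission would map columns to established vocabularies and link to other data sets rather than minting ad hoc predicate URIs as this sketch does.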

Submissions should consist of a two page description in JUCS format of the application, accompanied by (a link to) the software source code and a link to an online demo. The descriptions should be submitted electronically via email to Michael Hausenblas with the subject Triplification Challenge Submission by May 30th, 2009.

English

Semantic Web for Dummies

March 21st, 2009

Semantic Universe has presented the book Semantic Web for Dummies, a great book (in English) from the “Dummies” series that explains in simple terms all the opportunities that Web 3.0 offers both ordinary users and online businesses, the open document formats RDF and OWL, and everything you need to know to dive into the Semantic Web.

In addition, if you register as a user on the Semantic Universe site (at no cost) you can download the chapter “The Data Web at Work for Business” for free, in which Jeff Pollock, the author of the book, explains how web technologies are applied at work. And if you want the complete book ... you can get it on Amazon for less than 14 euros.

Amazon, Books, Spanish, Web 3.0, Semantic Web

Happy days are here again…

March 20th, 2009

As you may have noticed we've had some intermittent stability and response time issues over the last few days. We believe we have the problem solved - or at least quarantined.

We have one particularly high volume user who submits a very wide range of content types to us. Due to some errors in the way they were using the API and some errors in the way we were handling errors (whew..) - we were seeing system utilizations that were off the chart.

We've moved that user to their own little quarantine island until we get things worked out with them. As soon as we did this - the remainder of our servers paused for a moment, took a deep breath, and then went back to almost idle - where they belong.

So - things are looking good. We'll continue to keep a close eye on the system and make sure things stay settled down.

We learned a few things about how to debug these types of errors and will be faster in the future.

Thanks all for your patience.

Tom

English, Official Blog

System Stability

March 20th, 2009

All:

We're having some intermittent system stability problems. We have all hands on deck working on it and will give you an update as soon as we have more information.

Tom

English

Semantic Web Awareness Barometer 2008 - Preliminary Results

March 20th, 2009

First results from our last online survey “Semantic Web Awareness Barometer” are now available. We conducted the survey togetehr with the Corporate Semantic Web Initiative from the FU Berlin and the Know Center in Graz. We got 256 valid cases (from 561 responses) which reveal some intertesting results concerning the experience , expectations and readiness for Social Software and the Semantic Web. In short:

Social Software
1. Wikis are king! Social Bookmarking stays behind.
2. Differring applications & usage patterns of social software
3. Differring notions about the benefits of and barriers to Social Software
Semantic Web
1. Semantic Web is something familiar!
2. Application-oriented catch up – but where are the young academics?
3. “I taught myself about the Semantic Web.”
4. Semantic Web has corporate relevance! Search is the killer app! Integration costs & data control might be important aspects.
5. Differing notions about the barriers?
6. Competencies and collaboration will change …
7. Time to market 2 – 5 years!
8. No differences in region, IT competence & familiarity
We will give a short presentation at today’s Semantic Web Meetup in Berlin. If you can’t join us, don’t worry! You can download the slides right here: Semantic Web Awareness Barometer 2008 - Preliminary Results
A detailed report will be available by April.

English

Google and the Semantic Web: About Quad Stores and URIs

March 20th, 2009

Just recently Google launched another interesting service called “In Quotes”. It pulls quotes from stories linked from Google News, so users can conveniently compare the opinions of, for example, politicians.

On closer inspection, every person whose quotes are listed has a URI: Barack Obama has the “qsid” tPjE5CDNzMicmM.

It seems like “qsid” stands for “Quad Store ID” which would perfectly support such a URI based system.
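For readers who have not met the term, a quad store keeps statements as (subject, predicate, object, graph) tuples, where the fourth element typically records where a statement came from. The sketch below only illustrates that idea and is in no way a description of how Google's system works: the `QuadStore` class, the `ex:` property names, the example.org URIs and the placeholder quote values are all assumptions made up for the example; only the qsid value comes from the service.

```python
from collections import namedtuple

# A quad is a triple plus a fourth element naming the graph (source) it belongs to.
Quad = namedtuple("Quad", "subject predicate object graph")


class QuadStore:
    """Tiny in-memory quad store (hypothetical, for illustration only)."""

    def __init__(self):
        self._quads = set()

    def add(self, subject, predicate, obj, graph):
        self._quads.add(Quad(subject, predicate, obj, graph))

    def match(self, subject=None, predicate=None, obj=None, graph=None):
        """Return all quads matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj, graph)
        return sorted(
            q for q in self._quads
            if all(want is None or want == have for want, have in zip(pattern, q))
        )


# Hypothetical URIs; only the qsid value itself appears in the service.
PERSON = "http://example.org/person?qsid=tPjE5CDNzMicmM"
STORY_1 = "http://example.org/news/story-1"
STORY_2 = "http://example.org/news/story-2"

store = QuadStore()
store.add(PERSON, "ex:name", "Barack Obama", STORY_1)
store.add(PERSON, "ex:said", "quote-1", STORY_1)  # placeholder, not a real quote
store.add(PERSON, "ex:said", "quote-2", STORY_2)  # placeholder, not a real quote

# Everything the person said, regardless of which story it came from.
for quad in store.match(subject=PERSON, predicate="ex:said"):
    print(quad.object, "from", quad.graph)
```

Because every statement hangs off the same person URI, quotes from different stories can be gathered with a single pattern match, which is the kind of lookup a URI-based system would need.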

Is Google slowly moving toward the Semantic Web?

English

Call for Presenters: Epic PAWS Meetup

March 19th, 2009

The next Palo Alto Semantic Web (PAWS) Meetup will be hosted by Microsoft. The format will be many short demos of semantically inclined products and technologies.

Got a unique product based on semantic technology? Do you have a geeky demo that, even though it’s demoed from the command line, will knock the socks off people? Spread the word, because we’re looking for companies and people to show off cool products. Please email kconry at microsoft dot com if you are interested, or if you’d like to suggest a company that is doing something different in semantics.

Each presenter will get five minutes for their demonstration and five minutes for questions from the audience. This is a great opportunity to disseminate your product to like-minded people, get feedback from a smart group, and get people excited about what you’re doing. Come learn, geek out, meet interesting people and show off your work!

The meetup will take place on the evening of April 14 at Microsoft’s Silicon Valley Campus in Mountain View. We’ll have more details once we secure the presenters, but please save the date.

We are accepting proposals until Wednesday, April 1. Submit now!

English

Usability and ontologies: Twine leads a new step in the evolution of the semweb

March 19th, 2009

I was interviewed the other day about web 3.0, the semantic web, and despite my experience in several workshops on the subject I found it hard to bring it closer to the user without mentioning the first application that has taken on that job: Twine.

Besides the news about this new and important development, I wanted to share some of the things I told them:

On whether there are companies working today on building or developing web 3.0….

“Many companies are working toward the semantic web and can push Google (an essential element in making it “mainstream”) to acknowledge that it is adopting it, or to adopt it in some areas. Twine, an application that is in some ways acting as a bridge toward it (it is already almost as popular as delicious), is about to launch a service to make writing semantic data (ontologies) easier. Other companies such as Hakia or Yahoo have been incorporating all of this into their search engines for some time… and Kumo, the new search engine that Microsoft will launch this year and that could pose serious competition for Google, will adopt elements of the semweb in addition to its own vertical and syntactic search mechanisms.”

On whether the semantic web replaces the current one….

“The semantic web is a layer added to the current web; it does not replace it. It adds meaning visible to search engines, which improves the efficiency and user experience of the web, but in formal terms the design of what we see today is not going to change much.”

On why some search engines have not adopted the semweb, or have adopted it only partially:

“Building ontologies (languages, algorithms that would tell computers when a thing is a place, a person, an event, an organization, what relationships it has with other things, what components make it up, etc.) is complex, and today no search engine can be very precise across all domains of knowledge. Until all the ontologies have been developed, they will have to combine different systems, as Microsoft will.”

As a general idea, it is a more efficient web, better adapted to our natural language as humans, more independent of us. (Presentación: Construyendo entre todos la web semántica. is a good basic tutorial for understanding it.)

But let's get to the news that prompted this post: Twine, which seems to be growing to the point of approaching the popularity of delicious, is preparing a “usable” tool for creating ontologies, the foundation for ensuring the efficiency of the semantic web.

We have written a lot here about Twine. Its bookmarklet, which has just been improved, is perhaps its most popular feature. With a single click on pages with suitable microformats (or other semantic markers), content is tagged automatically, with Twine classifying each key concept as a person, a place or another type of information, to improve its efficiency as a portal for searches and recommendations related to our interests.

The time savings and the precision (in English) are an important advantage over Delicious, and we users seem to have valued them accordingly:

According to Nova Spivack in RWW, if Twine keeps growing at its current rate it will have overtaken Delicious, one of the fundamental tools of web 2.0, by the summer of this year, 2009.

As for the new application, which is expected to launch this year, it is worth noting that it will be released under an Open Source license, and that the ontologies created will not work only in Twine but can be used by any application.

The complexity, the work involved in creating ontologies (according to Wikipedia, they are the formulation of exhaustive and rigorous conceptual schemas within one or more given domains, with the aim of facilitating the communication and sharing of information between different systems and entities), means that to date few have been created, most of them on technology topics.

The difficulty of semantically annotating content is, as we saw, one of Google's main arguments for not implementing semantic technologies in its search engine. Usable tools for creating ontologies could therefore radically change this scenario, marking an important step in the evolution of the semantic web.

It will not be the first of its kind: Protege and CmapTools Ontology editor, which builds on the idea of concept maps to help with ontology editing, have been at work for some time. Microsoft too, which seems to have been betting on the semantic web as a strategic advantage for some time, recently announced a new semantic annotation add-in for Word 2007 that lets authors annotate words or phrases with terms defined in external ontologies.

But if this promise from Twine can work as intensely on usability as the company has done so far with its application for interest networks, I think it can significantly accelerate the evolution of the web.

To finish, here is the El caparazón user on Twine.


Spanish

Tales from the SIOC-o-sphere part #9

March 19th, 2009

It’s been another exciting six months in terms of SIOC-related developments. Here’s a summary:


English

MIT adopts universal open access policy

March 19th, 2009

Yesterday the MIT faculty approved a university-wide open access policy. The full text of the resolution, which passed unanimously, is available on Peter Suber’s Open Access News blog. Here’s an excerpt.

“Each Faculty member grants to the Massachusetts Institute of Technology nonexclusive permission to make available his or her scholarly articles and to exercise the copyright in those articles for the purpose of open dissemination. In legal terms, each Faculty member grants to MIT a nonexclusive, irrevocable, paid-up, worldwide license to exercise any and all rights under copyright relating to each of his or her scholarly articles, in any medium, provided that the articles are not sold for a profit, and to authorize others to do the same. The policy will apply to all scholarly articles written while the person is a member of the Faculty except for any articles completed before the adoption of this policy and any articles for which the Faculty member entered into an incompatible licensing or assignment agreement before the adoption of this policy. … The Provost’s Office will make the scholarly article available to the public in an open-access repository. The Office of the Provost, in consultation with the Faculty Committee on the Library System will be responsible for interpreting this policy, resolving disputes concerning its interpretation and application, and recommending changes to the Faculty.”

I have to say I am conflicted about this and wish I was more informed. As a researcher, I am 100% for the right to make papers describing our results freely available. But I also recognize that publishers and professional societies are an essential part of our research infrastructure and their business models are partially built on copyright and controlling access to content.

Just as we are seeing big changes in mainstream media, we will probably see related changes in publishers, including professional societies. We’ll have to wait and see if they represent a phase shift to a new and better model or simply the collapse of the old one.

The analogy between the two is far from perfect. Traditional MSM publishers pay a professional staff to research, write and edit stories. Journal publishers and professional societies don’t typically pay their authors, who increasingly deliver camera-ready copy or near camera-ready electronic copy.

English

Fill out your Web Browser brackets: IE8, Chrome, Safari

March 19th, 2009

I guess it’s time for March browser madness, with a fast new Safari 4 beta, the release of IE 8, and a new Google Chrome beta. Let’s add Firefox so that the pairings work out. Of course, none of the browsers are doing well in the Pwn2Own contest.

English

UMBC Alumni reception at National Cryptologic Museum 3/25

March 18th, 2009

Here’s an item of possible interest to UMBC alumni in the area. The UMBC Alumni Association is holding a special tour and evening of networking at the National Cryptologic Museum from 6-8pm on Wednesday March 25. If you have never visited the museum, it’s an opportunity to see some very interesting exhibits on ciphers and codes, including a working Enigma machine. UMBC President Freeman Hrabowski will be there to meet with and talk to the participants. You can get more information and register for the event online or contact Monique Armstrong (phone: 410-455-1879).

English

Cloudera offers a simpler Hadoop distribution

March 18th, 2009

We are early in the era of big data (including social and/or semantic) and more and more of us need the tools to handle it. Monday’s NYT had a story, Hadoop, a Free Software Program, Finds Uses Beyond Search, on Hadoop and Cloudera, a new startup that is offering its own Hadoop distribution designed to be easier to install and configure.

“In the span of just a couple of years, Hadoop, a free software program named after a toy elephant, has taken over some of the world’s biggest Web sites. It controls the top search engines and determines the ads displayed next to the results. It decides what people see on Yahoo’s homepage and finds long-lost friends on Facebook.”

“Three top engineers from Google, Yahoo and Facebook, along with a former executive from Oracle, are betting it will. They announced a start-up Monday called Cloudera, based in Burlingame, Calif., that will try to bring Hadoop’s capabilities to industries as far afield as genomics, retailing and finance. The company has just released its own version of Hadoop. The software remains free, but Cloudera hopes to make money selling support and consulting services for the software. It has only a few customers, but it wants to attract biotech, oil and gas, retail and insurance customers to the idea of making more out of their information for less.”

Cloudera’s distribution, currently based on Hadoop v0.18.3, uses RPM and comes with a Web-based configuration aid. The company also offers some free basic training in MapReduce concepts, using Hadoop, developing appropriate algorithms and using Hive.
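For anyone who has not met the MapReduce concepts mentioned above, here is a minimal single-machine sketch of the idea in plain Python. It is not Hadoop's API and not taken from Cloudera's training material; the function names and the sample documents are made up for illustration. It shows the usual three steps for a word count: map each input to key/value pairs, group the pairs by key, then reduce each group to one result.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Made-up sample input.
docs = ["big data needs big tools", "Hadoop handles big data"]
print(word_count(docs))
```

In a real cluster the map and reduce calls run on many machines and the grouping step moves data between them; the sketch keeps everything in one process so the data flow is easy to follow.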

English

Nice video shows how hidden structured data from the Drupal content management system can lead to semantic search

March 18th, 2009

(Cross-posted at socialmedia.net.)

Via Drupal creator Dries Buytaert’s post entitled RDFa and Drupal and Stéphane Corlosquet’s post about RDFa and Drupal examples and use cases, there is a really cool video that demonstrates how the structured data that is available in many Drupal deployments (but is difficult to leverage due to HTML representations) can be exposed and leveraged using RDFa semantic data. The video shows deep searches of Drupal data using Yahoo! SearchMonkey and also some visual navigations of this linked data. The possibilities are very exciting, as Dries says:

Google and Yahoo! are getting increasingly hungry for structured data. It is no surprise, because if they could build a global, vertical search engine that, say, searches all products online, or one that searches all job applications online, they could disintermediate many existing companies. [...] Hundreds of thousands of Drupal sites contain vast amounts of structured data, covering an enormous range of topics [and these structures] can be associated with rich, semantic meta-data that Drupal could output in its XHTML as RDFa. For example, say we have an HTML textfield that captures a number, and that we assign it an RDF property of ‘price’. Semantic search engines then recognize it as a ‘price’ field. Add fields for ‘shipping cost’, ‘weight’, ‘color’ (and/or any number of others) and the possibilities become very exciting.

The video is below:

This effort has been growing over the past year, since it was championed by Rasmus Lerdorf (the creator of PHP) and proposed by Dries himself at DrupalCon 2008. Based on Stéphane’s roadmap for RDFa in Drupal 7, the video shows some modules that have been developed for Drupal 6 to demonstrate the power of having embedded RDFa representations of Drupal structures. RDFa is currently being integrated into the core of Drupal 7.

There’s a nice line in the video about this embedded data:

It’s machine readable and now we have access to all of the machine-readable fields available to us before. Very quick, very simple, just what RDFa is supposed to be: human readable data [text], formatting data [HTML] and machine-readable data [RDFa] all in the same document, all inline, all describing the same thing.
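To make the “all inline, all in the same document” point concrete, here is a rough sketch of what a page fragment with RDFa `property` attributes might look like and how a program could pull the values out. The markup, the `ex:` vocabulary and the product values are hypothetical and not taken from the Drupal modules in the video, and the extractor is a deliberately simplified stand-in for a real RDFa parser (it only reads the text of flat elements that carry a `property` attribute).

```python
from html.parser import HTMLParser

# Hypothetical product markup: the same fragment carries human-readable text,
# HTML formatting and machine-readable RDFa attributes.
PAGE = """
<div xmlns:ex="http://example.org/vocab#" about="http://example.org/product/1">
  <h2 property="ex:name">Blue widget</h2>
  <p>Price: <span property="ex:price">19.99</span></p>
  <p>Weight: <span property="ex:weight">0.5</span> kg</p>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text of elements that carry a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_properties(html):
    parser = PropertyExtractor()
    parser.feed(html)
    return parser.properties


print(extract_properties(PAGE))
```

A browser renders the fragment as an ordinary heading and two paragraphs, while a crawler that understands the attributes can read the price and weight as named fields, which is the kind of machine-readable data the quote describes.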

(See also this great video and deck of slides about the “Practical Semantic Web and Why You Should Care” by Boris Mann from DrupalCon 2009.)

English