Archive for October, 2009

Business opportunities for Artificial Intelligence in the (post)Web 2.0

October 31st, 2009

I have now uploaded to Slideshare the slides I used in the talk I gave at the Jornadas Imaginática 2009, titled "Oportunidades de negocio para la Inteligencia Artificial en la (post)Web 2.0" (Business opportunities for Artificial Intelligence in the (post)Web 2.0). ...

Spanish

More efficient environments for scientific and citizen collaboration thanks to the Semantic Web

October 31st, 2009

I will shortly post the programme for Virtual Educa Buenos Aires 2009, which takes me to Buenos Aires in a few days and where I will present, as a teaching example of “the Semantic Web in education”, collaborative environments with semantic support for scientific research 2.0 (semantic e-science).

Two of this blog's main themes converge in it: the semantic web and the social web. Knowledge is also one of the two main axes (along with the relational one) on which I base my explanation of what the new web is:

At the Rensselaer Polytechnic Institute, a new experiment is under way to optimise how the web works as a vehicle for knowledge. It creates an environment that isolates the most relevant variables (social, semantic), in order to amplify their effect, for building better knowledge on the web.

The aim is to increase and optimise the ways of accessing scientific knowledge (to a level unprecedented in the history of science, they say). It also includes funding for the semantic annotation of scientific content.

The idea, somewhat ambitious in my opinion (I have seen it described as “the democratisation of knowledge”) and admittedly not very different from what Wikipedia has been doing, “would allow scientists, teachers and any citizen to review the data, interpret them, verify them, share information, continue studies that were left unfinished and, above all, truly understand the words they find on the web”.

We will see how this new rapprochement between the language of science and everyday language evolves (Wolfram Alpha, although more closed and less interoperable, is another example of the same attempt). For now, it is an ambitious project:

In the words of Deborah McGuinness, the project's second principal investigator and a Constellation professor, “Semantic technologies lower the barrier to entry for doing science. With the semantic web we can build a bridge between the question someone wants to ask with their limited scientific language and the extreme complexity of the underlying data.”

Fox, the project's principal investigator and a professor at the Tetherless World Constellation at Rensselaer, comments: “there will be far-reaching new opportunities to review data. It may not be the traditional peer review that is customary in scientific publishing, because many people will not be experts, but each user will bring a very legitimate point of view to the data, particularly when they use them in new and different ways.”


  • Cathedrals vs. bazaars of knowledge: greater relevance, diversity:

Let me use the metaphor of the Cathedral and the Bazaar (from monopolies to distributed and diverse knowledge), because the subject seems to be heading that way:

Easier data sharing through semantic technologies will facilitate things traditionally tied to scientific knowledge. In the case of citations, access to certain data sets can be controlled with semantic tags on the sources, allowing users to credit the original creators properly and also allowing the creator to keep track of who is accessing or using their data. This could make it easier for scientists to cite online services in journals, McGuinness says.

We will see how this initiative evolves. For now, the open spirit in which it is framed promises new data for the “Linked Data web”, beyond new walled gardens, as a distributed environment for collaboratively building knowledge too: the best expression of the open semantic web that benefits us all.

News sources: Source 1, Source 2.

2009, Ciencia, Evolución, Planeta educativo, Spanish, TRABAJOS DESTACADOS, Web 3.0, Web Semántica, Wikipedia, ciencia 2.0, colaboración científica, comunidades, cultura 2.0, e-ciencia, e-learning2.0, e-science, educacion, educación 2.0, filosofía, filtrado de contenidos, fundamentos, futurismo, herramientas semánticas, innovación, inteligencia colectiva, linked data web, universidad2.0, web3.0, zeitgeist evolución

New York Times publishes Linked Open Data

October 30th, 2009

Like many newspapers, the New York Times links the first mention of well-known entities in its articles to a reference page. For example, a mention of Barack Obama links to a page which is a collection of basic information on President Obama and links to relevant stories and other resources that the Times has created.

Now the Times is also using RDF to publish some of this information as linked open data. Yesterday the Times announced the publication of an LOD collection covering about 5,000 people at http://data.nytimes.com/ under a Creative Commons 3.0 Attribution License, and it plans to put its full collection of 30K topics online soon.

“Over the last several months we have manually mapped more than 5,000 person name subject headings onto Freebase and DBPedia. And today we are pleased to announce the launch of http://data.nytimes.com and the release of these 5,000 person name subject headings as Linked Open Data.

Over the next several months, we plan to expand http://data.nytimes.com to include each of the nearly 30,000 subject headings we use to power Times Topics pages, a collection that includes locations, organizations and descriptors in addition to person names.”
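To make the idea of "mapping subject headings onto Freebase and DBpedia" a little more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the Times' actual data: the subject-heading URI and the Freebase URI are hypothetical placeholders, and the use of `skos:prefLabel` and `owl:sameAs` is an assumption about how such a mapping is commonly expressed in RDF.

```python
# Illustrative sketch only: the NYT and Freebase identifiers below are
# hypothetical placeholders, not real published URIs.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

NYT_HEADING = "http://data.nytimes.com/example/obama_barack"  # hypothetical

# A tiny "graph": a list of (subject, predicate, object) triples.
triples = [
    (NYT_HEADING, SKOS_PREF_LABEL, "Obama, Barack"),
    (NYT_HEADING, OWL_SAME_AS, "http://dbpedia.org/resource/Barack_Obama"),
    (NYT_HEADING, OWL_SAME_AS, "http://rdf.freebase.com/ns/example.obama"),  # hypothetical
]


def same_as(subject, graph):
    """Return every resource the graph declares equivalent to `subject`."""
    return [o for s, p, o in graph if s == subject and p == OWL_SAME_AS]


def to_ntriples(graph):
    """Serialise the triples in an N-Triples-like form (URIs in <>, literals quoted)."""
    lines = []
    for s, p, o in graph:
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(same_as(NYT_HEADING, triples))
```

The point of the links is that a consumer who starts from a Times heading can follow them to other data sets describing the same person.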

English, Ontologies, RDF, Semantic Web

Stephen Abram – Open in Libraries Technology & Education

October 30th, 2009

Stephen Abram is Vice President, Innovation for library system vendor SirsiDynix.  He is the track keynote speaker for The Open Movement in Libraries, Technology & Education track, on the third day of the conference.

In this first podcast in our Online Information 2009 series, Stephen first explores the meaning of the often overused concept of openness.  Are we talking about openness of systems, software APIs, open source, approach, minds, libraries, or a combination of several of these?

With such a broad topic, it was inevitable that we addressed many aspects of the influence of technology and attitudes on the way libraries are evolving.  Touching on the library system industry and how it has changed and is changing, speculating on the future of libraries, and considering external influences from our rapidly changing world, this is a great introduction to his presentation and the track it kicks off.

English, Online Information, Podcast

Win $40K in the DARPA Network Challenge

October 29th, 2009

DARPA will hold the DARPA Network Challenge to explore how “broad-scope problems can be solved using Internet-based technologies.”

“To mark the 40th anniversary of the Internet, DARPA has announced the DARPA Network Challenge, a competition that will explore the role the Internet and social networking plays in the timely communication, wide area team-building and urgent mobilization required to solve broad scope, time-critical problems.

The challenge is to be the first to submit the locations of ten moored, 8 foot, red weather balloons located at ten fixed locations in the continental United States. Balloons will be in readily accessible locations and visible from nearby roadways.”

According to the rules, the balloons will be on display from 10:00AM to 4:00PM on Saturday, 5 December 2009. A prize of $40,000 will be awarded to the first participant to submit the latitude and longitude of all ten weather balloons within the contest period, which ends on 14 December 2009.

English, Web, darpa, internet, social media

What’s After the Real Time Web?

October 28th, 2009
In typical Web-industry style we're all focused minutely on the leading trend-of-the-year, the real-time Web. But in this obsession we have become a bit myopic. The real-time Web, or what some of us...

Collective Intelligence, English, Global Brain and Global Mind, Group Minds, Memes & Memetics, Mobile Computing, My Best Articles, Politics, Semantic Web, Society, Systems Theory, Technology, The Future, The Metaweb, The Semantic Graph, Transhumans, Web 3.0, Web Technology, Web/Tech, Wild Speculation, evolution, futurism, global brain, global mind, government, real time, real-time web, science, search, social networks, software, the singularity, the stream

Some of my (very) preliminary opinions on Google Wave

October 28th, 2009

I was interviewed by Marie Boran from Silicon Republic recently for an interesting article she was writing entitled “Will Google Wave topple the e-mail status quo and change the way we work?”. I thought that maybe my longer answers may be of interest and am pasting them below.

Disclaimer: My knowledge of Google Wave is second hand through various videos and demonstrations I’ve seen… Also, my answers were written pretty quickly!

As someone who is both behind Ireland’s biggest online community boards.ie and a researcher at DERI on the Semantic Web, are you excited about Google Wave?

Technically, I think it’s an exciting development – commercially, it obviously provides potential for others (Google included) to set up a competing service to us (!), but I think what is good is the way it has been shown that Google Wave can integrate with existing platforms. For example, there’s a nice demo showing how Google Wave plus MediaWiki (the software that powers the Wikipedia) can be used to help editors who are simultaneously editing a wiki page. If it can be done for wikis, it could aid with lots of things relevant to online communities like boards.ie. For example, moderators could see what other moderators are online at the same time, communicate on issues such as troublesome users, posts with questionable content, and then avoid stepping on each other’s toes when dealing with issues.

Does it have potential for collaborative research projects? Or is it heavyweight/serious enough?

I think it has some potential when combined with other tools that people are using already. There’s an example from SAP of Google Wave being integrated with a business process modelling application. People always seem to step back to e-mail for doing various research actions. While wikis and the like can be useful tools for quickly drafting research ideas, papers, projects, etc., there is that element of not knowing who is doing stuff at the same time as you. Just as people are using Gtalk to augment Gmail by being able to communicate with contacts in real-time when browsing e-mails, Google Wave could potentially be integrated with other platforms such as collaborative work environments, document sharing systems, etc. It may not be heavyweight enough on its own but at least it can augment what we already use.

Where does Google Wave sit in terms of the development of the Semantic Web?

I think it could be a huge source of data for the Semantic Web. What we find with various social and collaborative platforms is that people are voluntarily creating lots of useful related data about various objects (people, events, hobbies, organisations) and having a more real-time approach to creating content collaboratively will only make that source of data bigger and hopefully more interlinked. I’d hope that data from Google Wave can be made available using technologies such as SIOC from DERI, NUI Galway and the Online Presence Ontology (something we are also working on).

If we are to use Google Wave to pull in feeds from all over the Web will both RSS and widgets become sexy again?

I haven’t seen the example of Wave pulling in feeds, but in theory, what I could imagine is that real-time updating of information from various sources could allow that stream of current information to be updated, commented upon and forwarded to various other Waves in a very dynamic way. We’ve seen how Twitter has already provided some new life for RSS feeds in terms of services like Twitterfeed automatically pushing RSS updates to Twitter, and this results in some significant amounts of rebroadcasting of that content via retweets etc.

Certainly, one of the big things about Wave is its integration of various third-party widgets, and I think once it is fully launched we will see lots of cool applications building on the APIs that they provide. There have been a few basic demonstrator gadgets shown already like polls, board games and event planning, but it’ll be the third-party ones that make good use of the real-time collaboration that will probably be the most interesting, as there’ll be many more people with ideas compared to some internal developers.

Is Wave the first serious example of a communications platform that will only be as good as the third-party developers that contribute to it?

Not really. I think that title applies to many of the communications platforms we use on the Web. Facebook was a busy service but really took off once the user-contributable applications layer was added. Drupal was obviously the work of a core group of people but again the third-party contributions outweigh those of the few that made it.

We already have e-mail and IM combined in Gmail and Google Docs covers the collaborative element so people might be thinking ‘what is so new, groundbreaking or beneficial about Wave?’ What’s your opinion on this?

Perhaps the real-time editing and updating process. Often times, it’s difficult to go back in a conversation and add to or fix something you’ve said earlier. But it’s not just a matter of rewriting the past – you can also go back and see what people said before they made an update (“rewind the Wave”).

Is Google heading towards unified communications with Wave, and is it possible that it will combine Gmail, Wave and Google Voice in the future?

I guess Wave could be one portion of a UC suite but I think the Wave idea doesn’t encompass all of the parts…

Do you think Google is looking to pull in conversations the way FriendFeed, Facebook and Twitter does? If so, will it succeed?

Yes, certainly Google have had interests in this area with their acquisition of Jaiku some time back (everyone assumed this would lead to a competitor to Twitter; most recently they made the Jaiku engine available as open source). I am not sure if Google intends to make available a single entry point to all public waves that would rival Twitter or Facebook status updates, but if so, it could be a very powerful competitor.

Is it possible that Wave will become as widely used and ubiquitous as Gmail?

It will take some critical mass to get it going; integrating it into Gmail could be a good first step.

And finally – is the game changing in your opinion?

Certainly, we’ve moved from frequently updated blogs (every few hours/days) to more frequently updated microblogs (every few minutes/seconds) to being able to not just update in real-time but go back and easily add to / update what’s been said any time in the past. People want the freshest content, and this is another step towards not just providing content that is fresh now but a way of freshening the content we’ve made in the past.

English, Facebook, Google, Google Talk, Jaiku, SIOC, Searching, Semantic Web, friendfeed, google wave, social software, twitter

Linked Data Flows: A new picture to illustrate the “openness” we mean

October 28th, 2009

(Original post taken from “About the Social Semantic Web”)

A lot of activity around Linking Open Data (“LOD”) and the associated data sets, which are nicely visualised as a “cloud”, has been going on for quite a while now. It is exciting to see how the rather academic “Semantic Web” and all the work associated with this disruptive technology can now be transformed into real business use cases.

What I have observed in the last few months, especially in business communities, is the following:

  • “Linked Data” sounds interesting for the business people because the phrase creates a lot of associations in a second or two; also the database crowd seems to be attracted by this web-based approach of data integration
  • “Web of Data” is somehow misleading because many people think that this will be a new web which replaces something else. Same story with the “Semantic Web”
  • “Linking Open Data” sounds dangerous and not trustworthy to many companies

For insiders it is clear that the “openness” of data, especially in commercial settings, can be controlled and in many cases has to be controlled, for instance by defining the right licensing models. But here we are still at the beginning, as a workshop at ISWC 2009 has illustrated.

Anyway, looking at the characteristics of Linked Data Flows, they can be one-way or mutual. In some cases data from companies will be put into the cloud and can be opened up for many purposes; in other use cases it will stay inside the boundaries. In other scenarios only (open) data from the web will be consumed and linked with corporate data, but no data will be exposed to the world (except the fact that data was consumed by an entity).

And of course: On many other occasions datasets and repositories will be opened up partly depending on the CCs (or similar, not yet defined attributes) and the underlying privacy regulations one wants to use.

This makes clear that LOD / Linking Open Data is just one detail of a bigger picture. Since companies (and governments) play a crucial role to develop the whole infrastructure, we need to draw a new picture that illustrates the various Linked Data Flows in a better way:

[Image: linkeddataworld]

To conclude, the best thing would be to talk about Linked Data in general and refer to Linking Open Data only in the right context. Even though business people know better, the term “open” is still associated with “free” and “dubious provenance”. And given that hardly anybody has provided hard evidence on the ROI of open business models, the “open argument” counts for little in a time of decreasing economic prosperity.

So what would be critical to get the Linked Data thing running is to provide the corresponding business and licensing models for your Linked Data strategy. But this includes having a good understanding of the assets you want to capitalize on. Given that metadata assets are still a novel and vastly unexplored business field, which so far lacks a regulated supply and demand structure, there are still lots of structural obstacles that hinder the uptake of Linked Data. Providing more of the same in a laissez-faire mode, as TimBL criticized at this year’s Web 2.0 Summit, might be inspiring for the in-crowd, but it might not be sufficient to build a linked data business.

Corporate Semantic Web, English, Linked Data, Linked Data & Open Data, Web of Data, business model, linking open data, media

OWL 2 becomes a W3C recommendation

October 27th, 2009

OWL 2, the new version of the Web Ontology Language, officially became a W3C standard yesterday. From the W3C press release:

“Today W3C announces a new version of a standard for representing knowledge on the Web. OWL 2, part of W3C’s Semantic Web toolkit, allows people to capture their knowledge about a particular domain (say, energy or medicine) and then use tools to manage information, search through it, and learn more from it. Furthermore, as an open standard based on Web technology, it lowers the cost of merging knowledge from multiple domains.”

AI, English, KR, OWL, Ontologies, Semantic Web

OWL 2 is a W3C Recommendation

October 27th, 2009

Today W3C announces a new version of a standard for representing knowledge on the Web. OWL 2, part of W3C's Semantic Web toolkit, allows people to capture their knowledge about a particular domain (say, energy or medicine) and then use tools to manage information, search through it, and learn more from it. As an open standard based on Web technology, OWL 2 lowers the cost of merging knowledge from multiple domains. More than a dozen implementations of OWL 2 are already available. The standard consists of 13 documents, of which 4 are instructional.

Activity news, English

Google's social search lives in the Global Graph of knowledge, not in the Social, leisure one

October 27th, 2009

As you can see, Facebook also wants to spread our content :) It offers us the new button, which you can see already installed here, in the image and likeness of the traditional Backtype or Tweetmeme buttons for Twitter (a tremendous dilemma for FB, one moment imitating Twitter's Black Swan, the next trying to keep its own identity and stay faithful to its own uses).

You can find it, with the corresponding instructions, on the Facebook blog.

But what I wanted to talk about today is the news, in line with the future personalised web 3.0 that we have been pointing to for some days now: social search is now a reality (in English and in the US version). They illustrate it well with the following example:

“Lots of people write about New York, so if I run a search, what my best friends have written probably won't be what appears first in the results. Personalisation tried to address this, but Social Search goes a step further and finds relevant public content from our friends and contacts: ‘Results from people in your social circle for New York.’”

This video has more information on how it works:

We were discussing it yesterday: Microsoft and Facebook are playing hard in the battle for the social graph. And in that sense, whatever our “almost political” affinity with any of these environments, the fact is that it is difficult to talk about the social graph without taking Facebook into account.

Gmail, Google Reader and the services we have added to our Google profile (Twitter, FriendFeed, etc.), with the exception of FB, will be the ones Google treats as relevant sources. That, against an almost universal Facebook-Bing pairing now that it integrates Twitter, may leave Google at a clear disadvantage.

As David and Luisa rightly said on yesterday's post on Bitácoras: OK, but they will do something to fix it...

And that something seems to be called Twitter, which joins Google Social Search, along with FriendFeed and many other services, to create scenarios and contexts of interest that are truly valuable, relevant and meaningful.

I think there are things we forget, things the giant itself distorts by offering the example above. We are probably talking, once again, about the different perspectives (social vs. interest-based) of networks, about different uses:

Would you search contextually in Facebook-Bing (that is, according to a micro-universe generated from the preferences and interests of your FB contacts) for data or content relevant to academic or professional projects? A search on Delicious or, sometimes, on Twitter (like the one this Google experiment offers) might be more suitable in that case.

Is the leisure sphere, that of purely social, relational networks, so juicy that nobody wants to give it up? Is our friends' opinion, in other words, so relevant to our purchasing decisions? Is our judgement once again being underestimated and stereotyped?

I would say that Google is wrong in its focus, in its examples. The moment of the social graph has passed; the truly significant graph now (Berners-Lee's Giant Global Graph, characteristic of a mature stage of the web) includes our professional communities and our networks of interest (remember the term Personal Learning Networks) more than our contacts on social networks.

Why go back to Facebook's narrow-mindedness, to the conservative, adolescent air of a platform that starts from, and was designed around, our “real-life” contacts, if we already seemed to have moved beyond that?

2009, Anuncios generales, Aprendizaje, Evolución, Facebook, Google, Knowledge Management, PLEs, Planeta educativo, Redes sociales, Spanish, búsqueda social, cibercultura, comunidades, e-learning2.0, filtrado de contenidos, ggg, google social search, google vs. facebook, herramientas para blogs, inteligencia colectiva, lifestreaming, medios, web3.0, zeitgeist evolución

More changes at Facebook, Google search and the social graph: new battles in the social network war

October 26th, 2009

We saw yesterday how the real-time web, whose importance has grown thanks to the indexing of Twitter and FB (the latter only in Bing), was swelling the Groundswell wave, that is, the importance of consumer/user opinions on the web and therefore the basis for companies adopting Social Media strategies.

The problem of a two-speed web, which gave rise to multiple startups built on the Twitter API, affected the main search engines and turned out to be, together with the social sphere (the social graph), one of the biggest “holes”, weaknesses or sectors in which Google did not have a near-absolute monopoly.

The following chart, updated with the latest moves, summarises all of this, the strengths and weaknesses of each environment:

[Image: searchwars2]

As we can see, when it comes to social search, and until Wave becomes more than an attempt to conquer the social graph, Google has lost the battle:

Although they are announcing new Social search features (social contextual web), which would let users opt for results derived from content created by their contacts on various social networks, the fact that Google cannot index content, contacts and profiles on Facebook, which is more open to Bing but just as private and closed to the giant, leaves it at a clear disadvantage against the Facebook-Bing alliance.

What may not be a problem for Google today, given that most users' Facebook activity is private (Facebook seems to be actively trying to change that), could turn into a much bigger advantage for its rivals once that changes and all those data can help define the social micro-universe that shapes each consumer's opinions and purchases.


Information overload and filters:

But the story of social software's evolution over the last few days did not end there. I was about to close this post when the following appeared:

It seems that Facebook, facing a future scenario of interoperability (walled gardens are increasingly hard to sustain) and fierce competition among social networks, is working hard to offer added value, and a few hours ago it announced its new home page:

Antonio at Error500 highlights it and interprets it as a strategy of differentiation, of design driven by use, when he reports that Facebook, which had lately become a lifestreaming, status-update service indistinguishable from Twitter (live feed), is changing its home page to a “news feed that does not contain everything commented on and uploaded by contacts, but a selection of the most interesting things shared in recent days.”

I don't know whether this kind of battle helps the web evolve. I think that, one way or another, they keep pushing FB towards openness. And that is always good news.

2009, Bing, Facebook, Google, Marketing, P2P, Planeta educativo, Redes sociales, Sociedad de la conversacion, Spanish, Web en tiempo real, bing y facebook, bing y twitter, buscadores alternativos, comunidades, cultura 2.0, empresa 2.0, google wave, google y twitter, groundswell, guerra redes sociales, lifestreaming, medios, real-time web, social media, sociología, twitter, web 2.0, web social, web3.0, zeitgeist evolución

Prisoner’s Dilemma and the Golden Balls game show

October 25th, 2009

Golden Balls is a UK game show with a final round, Split or Steal, that is similar to the prisoner’s dilemma. The two contestants have to simultaneously choose to split the prize or try to steal it. If both choose split, they each get half. If one chooses split and the other steal, then the stealer gets it all. If they both choose steal, neither gets anything. While the payoff matrix is not exactly that for the PD, it has a similar effect on the strategy. Check out this video of a Split or Steal round for £100,000. (Spotted on Hacker News)
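The rules above are simple enough to write down as a payoff function. The sketch below is just an illustration (the function names are my own); it encodes the four outcomes and checks how "steal" compares with "split" from one contestant's point of view. It shows where the game departs from the classic PD: stealing is never worse than splitting, but against a stealer both choices pay nothing, so the dominance is only weak.

```python
CHOICES = ("split", "steal")


def payoffs(choice_a, choice_b, prize):
    """Return (payoff to A, payoff to B) for one Split or Steal round."""
    if choice_a == "split" and choice_b == "split":
        return (prize / 2, prize / 2)  # both split: half each
    if choice_a == "steal" and choice_b == "steal":
        return (0, 0)                  # both steal: nobody wins anything
    if choice_a == "steal":
        return (prize, 0)              # A steals from a splitter
    return (0, prize)                  # B steals from a splitter


def weakly_dominates(strategy, other, prize):
    """True if `strategy` never pays A less than `other`, whatever B chooses."""
    return all(
        payoffs(strategy, b, prize)[0] >= payoffs(other, b, prize)[0]
        for b in CHOICES
    )


def strictly_dominates(strategy, other, prize):
    """True if `strategy` always pays A more than `other`, whatever B chooses."""
    return all(
        payoffs(strategy, b, prize)[0] > payoffs(other, b, prize)[0]
        for b in CHOICES
    )


if __name__ == "__main__":
    prize = 100_000
    for a in CHOICES:
        for b in CHOICES:
            print(a, b, payoffs(a, b, prize))
    print("steal weakly dominates split:", weakly_dominates("steal", "split", prize))
    print("steal strictly dominates split:", strictly_dominates("steal", "split", prize))
```

In the standard prisoner's dilemma, defecting strictly dominates cooperating; here the stealer who meets another stealer gains nothing over a splitter, which is the difference in the payoff matrix.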

AI, Agents, English, social media

Will Linked Data mean an early end for Marc & RDA?

October 25th, 2009

For the uninitiated, NGC4LIB is a library-focused mailing list which has a reputation for often engaging in massive discussions and disagreements around the minutiae of future cataloguing and library-focused metadata practices.  They have recently been involved in one of these great debates, stimulated by the comments of Sir Tim Berners-Lee in a recent interview.  As is often the case on this list, the debate wandered well off topic into the realms of FRBR and its alternatives before being brought back on topic by Jim Weinheimer, who started the conversation in the first place.

A statement in Jim’s contribution caught my eye:

Implementing linked data, although it would be great, is years and years away from any kind of practical implementation

Implementing linked data is already well underway with many groups across the globe.  For instance, there are a couple that we at Talis are closely involved with.  Following on from Sir Tim’s interview comments, the British Government is currently running a closed beta, soon to be opened, of data.gov.uk.  Through this site they are not only opening up data in many forms such as CSV, like their American cousins at data.gov, but they are also starting to encode it in RDF and publish it via the Talis Platform, which provides a SPARQL (the query language of the Linked Data web) endpoint.  This approach not only lets anyone download the raw data, but also enables them to query it for whatever they have in mind.  If you want a sneak preview of how such data is queried, take a look at some of these examples.  In a similar vein, metadata from BBC programmes and music is being harvested into Talis Platform stores.  Again these are open to anyone to innovate with; check out these screencasts to see some of the early possibilities.
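For readers who have not seen a SPARQL endpoint before, here is a minimal sketch of what "querying it" amounts to at the HTTP level. It only builds the request URL and sends nothing over the network. The endpoint address is a hypothetical placeholder (not the actual data.gov.uk or Talis Platform address), and the query is a generic "show me ten triples" probe rather than one tailored to any real data set; the one standard piece is that a SPARQL query can be passed as the `query` parameter of a GET request.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used purely for illustration.
ENDPOINT = "http://example.org/sparql"

# A generic exploratory query: return any ten triples in the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Return the GET URL that carries `query` in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    print(build_request_url(ENDPOINT, QUERY))
```

Fetching that URL from a real endpoint would return the matching rows, which is what makes the published data something you can interrogate rather than just download.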

Ah but that is not bibliographic data, I hear someone cry – it’ll never catch on in libraries. I get the impression from some comments on the NGC4LIB list that it will not be possible for ‘our’ data to participate in this Linked Data web until ‘we’ have predicted all possible uses for it, analysed them, and developed a metadata standard to cope with every eventuality. There are already a few examples of the library world engaging with RDF and Linked Data, one obvious one being the Library of Congress with LCSH, another the National Library of Sweden. Neither of these examples is encoding the kind of detail you would expect in a Marc record; they are using ontology to describe associated concepts such as subjects.

There has been some ontology development towards this larger goal with Bibo (Bibliographic Ontology Specification). Although not there yet, Bibo is good enough to be used in live applications wishing to encode bibliographic data. One such example is Talis Aspire. Underpinned by the same Platform as the UK Government and BBC Linked Data services, it uses the Bibo ontology to describe resources in an academic context.

Alongside data.gov.uk there is a Google Group conversation taking place. The refreshing part of this conversation is that it is between the producers of the data sets, those developing the way it should be encoded into RDF, and those who want to consume it. Several times you will see a difference of opinion between those that want to describe the data to its fullest, and those that wish to extract the most value from it. “I agree that is a cleaner way of encoding, but can you imagine how complex the query will be to extract what I want!”. This approach is not unusual in the Linked Data world, where producers and consumers get together, pragmatically evolving a way forward. Dataincubator.org is an open place where such pragmatic development and evolution is taking place. Check out examples of a subset of Open Library data (note this is an example of data, not a user interface).

Another bibliographically focused experiment can be found at semanticlibrary.org. From some of the example links on the home page, you can see that building in this way enables very different ways of exploring metadata. People, subjects, publishers, works, editions and series are all equally valid starting points to explore from.

Doth the bell toll for Marc and RDA?
Not for a long old time – ontologies like Bibo, and the results of work at Dataincubator.org and semanticlibrary.org, may well lead to more open, useful and, most importantly, linked access to data previously limited to library search interfaces. That data has to come from somewhere though, and the massive global network of libraries encoding their data using Marc, and maybe soon RDA, is ideally placed to continue producing rich bibliographic metadata: metadata to be fed into the Linked Data web in the most appropriate form for that purpose. There will continue to be a place for current cataloguing practices and processes for a significant period, supporting and enabling the bibliographic part of the Linked Data web, not being replaced by it.

No doubt the NGC4LIB conversation on this topic will continue. Regardless of how it progresses, there is a current need and desire for bibliographic data in the linked data web.  The people behind that desire, and the innovation to satisfy it, may well have come up with a satisfactory solution, for them, whilst we are still talking.

English, Linked Data, Open Data, Semantic Web, Talis, Talis Platform

Syndicating trust? Mediawiki, Wordpress and OpenID

October 25th, 2009

Fancy title but simple code. A periodic update script is setting user/group membership rules on the FOAF wiki based on a list of trusted (for this purpose) OpenIDs exported from a nearby blog. If you’ve commented on the blog using OpenID and it was accepted, this means you can also perform some admin actions (page deletes, moves, blocking spammers etc.) on the FOAF wiki without any additional fuss.

Both Wordpress blogs and Mediawiki wikis have some support for OpenID logins.

The FOAF wiki until recently only had one Sysop and Bureaucrat account (a bureaucrat has the same privileges as a Sysop except for the ability to create new bureaucrat accounts). So I’ve begun an experiment exploring the idea of pre-approving certain OpenIDs for bureaucrat activities. For now, I take a list of OpenIDs from my own blog; these appear to be just the good guys, but this might be because only real humans have commented on my blog via OpenID. With a bit of tweaking I’m sure I could write SQL to select out only OpenIDs associated with posts or comments I’ve accepted as non-spammy, though.

So now there’s a script I can run (thanks tobyink and others in #swig IRC for help) which compares an externally supplied list of OpenID URIs with those OpenIDs known to the wiki, and upgrades the status of any overlaps to be bureaucrats. Currently the ’syndication’ is trivial since the sites are on the same machine, and the UI is minimal; I haven’t figured out how best to convey this notion of ‘pre-approved upgrade’ to the people I’m putting in an admin group. Quite reasonably they might object to being misrepresented as contributors; who knows.
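The comparison step can be pictured as a simple set intersection. The following is only a sketch of that idea, not the actual script: the function names, the normalisation rule (trim whitespace and ignore a trailing slash) and the example.org addresses are all assumptions made for illustration.

```python
def normalise(openid: str) -> str:
    """Assumed normalisation: trim whitespace and ignore one trailing slash."""
    openid = openid.strip()
    return openid[:-1] if openid.endswith("/") else openid


def openids_to_promote(trusted, wiki_known, already_bureaucrats=()):
    """Return wiki OpenIDs that appear on the trusted list and are not yet promoted."""
    trusted_set = {normalise(u) for u in trusted}
    promoted = {normalise(u) for u in already_bureaucrats}
    return sorted(
        u for u in wiki_known
        if normalise(u) in trusted_set and normalise(u) not in promoted
    )


# Hypothetical data: OpenIDs exported from the blog and those known to the wiki.
blog_trusted = ["http://alice.example.org", "http://carol.example.org/"]
wiki_openids = [
    "http://alice.example.org/",
    "http://bob.example.org/",
    "http://carol.example.org/",
]
print(openids_to_promote(blog_trusted, wiki_openids,
                         already_bureaucrats=["http://carol.example.org/"]))
```

Keeping the list of already promoted accounts separate means the script can be re-run periodically without touching anyone twice.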

But all that aside, take a look and have a think. This kind of approach has a lot going for it. We will have all kinds of lists of people, groups of people, and in many cases we’ll know their OpenIDs. So why not pool what we know? If a blog or wiki has information about an OpenID that shows it is somehow trustworthy, or at least not obviously a spammer, there’s every reason to make notations (eg. FOAF/RDFa) that allow other such sites to harvest and integrate that data…

See also Dan Connolly’s DIG blog post on this, and the current list of Bureaucrats on the FOAF Wiki (and associated documentation). If your name’s on the list, it just means your OpenID was on a pre-approved list of folk who I trust based on their interactions with my own blog. I’d love to add more sources here and make it genuinely communal.

This is all part of the process of getting FOAF moving again. The brains of FOAF is in the IssueTracker page, and since the site was damaged by spammers and hackers recently I’m trying to make sure we have a happy / wholesome environment for maintaining shared documents. And that’s more than I can do as a solo admin, hence this design for opening things up…

English, FOAF, RDFa, Semantic Web, SocialWeb, Technology, coding, ggg, openid, privacy

Remote remotes

October 23rd, 2009

I’ve just closed the loop on last weekend’s XMPP / Apple Remote hack, using Strophe.js, a library that extends XMPP into normal Web pages. I hope I’ll find some way to use this in the NoTube project (eg. wired up to Web-based video playing in OpenSocial apps), but even if not it has been a useful learning experience. See this screenshot of a live HTML page, receiving and displaying remotely streamed events (green blob: button clicked; grey blob: button released). It doesn’t control any video yet, but you get the idea I hope.

Remote apple remote HTML demo, screenshot showing a picture of handheld apple remote with a grey blob over the play/pause button, indicating a mouse up event. Also shows debug text in html indicating ButtonUpEvent: PLPZ.

This webclient needs the JID and password details for an XMPP account, and I think these need to be from the same HTTP server the HTML is published on. It works using BOSH or other tricks, but for now I’ve not delved into those details and options. Source is in the Buttons area of the FOAF svn: webclient. I made a set of images, for each button in combination with button-press (‘down’), button-release (‘up’). I’m running my own ejabberd and using an account ‘buttons@foaf.tv’ on the foaf.tv domain. I also use generic XMPP IM accounts on Google Talk, which work fine although I read recently that very chatty use of such services can result in data rates being reduced.

To send local Apple Remote events to such a client, you need a bit of code running on an OSX machine. I’ve done this in a mix of C and Ruby: imremoted.c (binary) to talk to the remote, and the script buttonhole_surfer.rb to re-broadcast the events. The ruby code uses Switchboard and by default loads account credentials from ~/.switchboardrc.

I’ve done a few tests with this setup. It is pretty responsive considering how much indirection is involved: but the demo UI I made could be prettier. The + and – buttons behave differently to the left and right (and menu and play/pause); only + and – send an event immediately. The others wait until the key is released, then send a pair of events. The other keys except for play/pause will also forget what’s happening unless you act quickly. This seems to be a hardware limitation. Apparently Apple are about to ship an updated $20 remote; I hope this aspect of the design is reconsidered, as it limits the UI options for code using these remotes.

I also tried it using two browsers side by side on the same laptop, and two laptops side by side. The events get broadcast just fine. There is a lot more thinking to do about serious architecture, where passwords and credentials are stored, etc. But XMPP continues to look like a very interesting route.

Finally, why would anyone bother installing compiled C code, Ruby (plus XMPP libraries), their own Jabber server, and so on? Well hopefully, the work can be divided up. Not everyone installs a Jabber server. My thinking is that we can bundle a collection of TV and SPARQL XMPP functionality in a single install, such that local remotes can be used on the network, but also local software (eg. XBMC/Plex/Boxee) can also be exposed to the wider network – whether it’s XMPP .js running inside a Web page as shown here, or an iPhone or a multi-touch table. Each will offer different interaction possibilities, but they can chat away to each other using a common link, and common RDF vocabularies (an area we’re working on in NoTube). If some common micro-protocols over XMPP (sending clicks or sending commands or doing RDF queries) can support compelling functionality, then installing a ‘buttons adaptor’ is something you might do once, with multiple benefits. But for now, apart from the JQbus piece, the protocols are still vapourware. First I wanted to get the basic plumbing in place.

Update: I re-discovered a useful ‘which bosh server do you need?’ which reminds me that there are basically two kinds of BOSH software offering; those that are built into some existing Jabber/XMPP server (like the ejabberd installation I’m using on foaf.tv) and those that are stand-alone connection managers that proxy traffic into the wider XMPP network. In terms of the current experiment, it means the event stream can (within the XMPP universe) come from addresses other than the host running the Web app. So I should also try installing Punjab. Perhaps it will also let webapps served from other hosts (such as opensocial containers) talk to it via JSON tricks? So far I have only managed to serve working Buttons/Strophe HTML from the same host as my Jabber server, ie. foaf.tv. I’m not sure how feasible the cross-domain option is.

Update x2: Three different people have now mentioned opensoundcontrol to me as something similar, at least on a LAN; it clearly deserves some investigation

English, FOAF, Jabber/XMPP, SocialWeb, Technology, coding, ggg

W3C has published three HCLS-related Interest Group Notes

October 23rd, 2009

The W3C Semantic Web in Health Care and Life Sciences Interest Group (HCLS) is pleased to announce the publishing of three Interest Group notes by the Scientific Discourse Task Force:

These notes describe how one can use the Semantic Web to express and integrate scientific data from different domains and from heterogeneous services. It is hoped that they will inspire further contributions to the ongoing work of the Health Care and Life Sciences Interest Group and its Scientific Discourse Task Force, as well as inspire those in other domains to exploit the Semantic Web.

Activity news, English

The impact of the economic recession on university library services

October 23rd, 2009

Senior managers in libraries have been managing fluctuating budgets for years now, but have managed to maintain service provision. However, the prospect of deeper financial cuts introduces the real possibility of reductions in opening hours and staff development, as well as limitations in resource provision. The decreasing value of sterling will continue to impact UK libraries in what is now an internationalised supply chain, and the shifting demands and expectations of students and academics will of course continue to have an impact.

This is how Head Librarians in UK universities currently perceive the oncoming impact of the economic downturn, according to The impact of the economic recession on university library and IT services, a report published last month by JISC, SCONUL and UCISA that seeks to address some of the questions that are taxing most if not all of us about how the UK’s economic problems are going to play out in the academic library sector. The report considers IT services alongside university libraries, and we have blogged about the impact of the recession on IT services on our Education blog.

To pretend that the recession somehow marked the start of budgetary restrictions in academia would be to mythologise the recent past, and this report doesn’t fall into that trap, quoting one respondent to the study, the Head Librarian of a post-1992 university:

I’ve had year on year cuts every year I’ve been here… but what we’re facing now actually is nothing new for us. We’ve had hefty audit difficulties but we’re through that now, but [the audit difficulties] resulted in fall backs, which resulted in budget cuts. So I’m quite expecting 09/10 to be difficult; I’m expecting 10/11 to be more difficult, but it’s within a context of never having much fat on the bones anyway. I know I’m going to cope with it because I’ve been doing it for the last seven years, I’m not coming from a position of plenty to a position of poverty.

An opportunity for review?

But lest we should feel that we’re on a never-ending downward spiral, the report is clear that the library service remains essential to the institution’s core mission of learning, teaching and research. And although there is realism that the “achievable” cost reductions of 2009/10 will give way to much more challenging conditions, there is also a sense of “looking at the bright side”, i.e. seeing an opportunity to review current practices and services to ensure that they remain fit for purpose:

It’s an opportunity for us to look at what we do well, where we have maximum benefit and add true value to activities both that are delivered by this department and also that this department contributes to the faculties and to other departments in the university. [Pre-1992 University]

It’s all the more praiseworthy, given the chronic budgetary challenges that university libraries have endured, that a shift to a more customer-focused service has nonetheless been achieved. It’s all the more remarkable that one of the principal manifestations of this transformation has been a breadth of service provision, with cataloguing and collections management giving way to “a service that delivers a wide range of information management tools across a very broad spectrum of format”.

Social learning spaces at risk?

The physical library building is a huge element of this service transformation. As the report notes:

Changing the physical space of the library so it works better for students has consequently increased their use of the library space (but not necessarily the library resources). So with the shift of resources online, evidence suggests students are now spending more time within library buildings than they have in the past; the library has become a social study space.

The ability to continue to improve and develop social learning spaces, as recommended by the report, may well be compromised by capital budget cuts; according to the report, capital budgets are more likely to be hit than recurrent spend. Estate budgets, including storage and social learning spaces, may well be endangered, although the acknowledged status of social learning spaces as market differentiators in the competition between institutions to attract students may mitigate this to an extent.

With library design and service enhancements such as extended library open hours now at risk, the problem as I see it is the difficulty of taking away something that has previously been given, a problem that is all the more acute when applied to something that is perceived as an entitlement. So these changes, should they occur, will require delicate handling, especially in the customer-centric services now offered on all campuses.

Rationalising resources?

Another fundamental aspect of academic library provision discussed in the report is information resources. Most libraries are planning to renegotiate their journal portfolio and software licences in coming years, and are also prepared to cut journal subscriptions and book purchases in preference to staff losses. The impact on university life of cancelled subscriptions has yet to be evaluated, although the report does point out that reductions in spend will have a knock-on effect of weakening library purchasing power in the supply chain.

In the meantime, libraries are prioritising measures such as consortial purchasing alongside JISC collections, and also the emerging Open Access model, as a combined means of managing costs in journal subscriptions. Whilst the report suggests liaison with academics to identify e-resources that could possibly be discontinued due to insufficient use, the widespread licensing of national deals can hinder rationalisation of individual titles.

On top of global price increases, UK university library spending power has also been adversely impacted by the drop in the value of sterling. The report notes that no university has developed a plan to mitigate for the impact of currency fluctuations (a problem that extends beyond the library) even though it is a source of concern to everyone.

A choice of two negatives?

Of course we don’t know for certain how the budgetary challenges will impact the university library; all that the report has done is to open up the minds of Library Directors and synthesise the findings, valuable though that certainly is. But the report makes a number of general points that are applicable whatever the outcome.

Firstly, the report points out that libraries will need tools at their disposal for assessing their impact, value and costs, as the sector as a whole comes under increased cost pressure.

And secondly, libraries will inevitably have to make a choice between carrying out multiple cuts across the whole range of services or identifying entire areas to cut instead. The multiple cut scenario entails a risk of devaluing the overall offering, and dashing user expectations right across the board. On the other hand, cutting an entire service area, even if it’s a real minority taste, is bound to cause pain.

A choice of two negatives – let’s hope that the future offers more than this.

English, Higher Education, Libraries

First drafts for SPARQL 1.1 published

October 23rd, 2009

The W3C SPARQL Working Group published the First Public Working Draft of six SPARQL 1.1 specifications. SPARQL is the query language of the Semantic Web, and SPARQL 1.1 enhances the SPARQL landscape with:

Activity news, English

I Jornadas sobre Lógica, Computación e Inteligencia Artificial (First Workshop on Logic, Computation and Artificial Intelligence). Second session

October 22nd, 2009

In the second session of this workshop we attended the following talks: Professor Eugenio Roanes, in his lecture Algunas aplicaciones de las bases de Gröbner en Inteligencia Artificial (Some applications of Gröbner bases in Artificial Intelligence; see the presentation slides here), introduced the...

Spanish

I Jornadas sobre Lógica, Computación e Inteligencia Artificial (First Workshop on Logic, Computation and Artificial Intelligence). First session

October 22nd, 2009

On 13 and 14 November we organized the I Jornadas sobre Lógica, Computación e Inteligencia Artificial, as a scientific welcome event for students and as a tribute to Professor Luis M. Laita de la Rica, professor emeritus of the Un...

Spanish

The next revolution will be in 2011: the next web belongs to mobile social networks

October 21st, 2009

These are days of news on the web. Not only because Windows 7 is arriving, with the easy outstanding task of improving on Vista, nor because of the topic we will discuss in a few hours, the change in the Real Time Web landscape (with Google and Bing indexing Twitter, and Facebook too in the case of the latter).

Deeper changes, even revolutions, are being predicted at the important Web 2.0 Summit taking place these days. It is at that forum that, every year, Morgan Stanley analyst Mary Meeker gives a fast-paced yet in-depth presentation of the new trends on the web:

[Slide from Mary Meeker's presentation]

The evolution of mobile social networks matters in a general context of growth, of change, even of revolution: Tim O’Reilly stressed in his opening talk that the revolution is comparable to the one we experienced in 2004, with the emergence of Web 2.0. Meeker reflected this in the following chart, comparing the evolution of the sector to what we experienced in 2001:

[Slide from Mary Meeker's presentation]

In short, we could be talking about 2011, after the current recovery and thanks to the evolution of mobile technology, as the start of a new era on the web.

[Slide from Mary Meeker's presentation]

Mobile social networks:

We see it in the third point: platforms that combine social networks with the mobile web will mean an unprecedented change in communication and e-commerce:

[Slide from Mary Meeker's presentation]

Thus, according to the report, “Mobile devices will evolve as remote controls to expand real-time web-based services, including location-based services, creating opportunities and ‘empowering’ consumers in a disruptive and transformative way.”

In her opinion it is Apple that is leading the change, with the iPhone/iTouch/iTunes ecosystem, plus accessories and services, being the one that has seen the strongest growth in the history of consumer technology. Not forgetting the importance, growing all the time and in my opinion more sustainable in the long term, of the open mobile web, with Google Android at its head.

As for social networks, will Facebook lead them in mobile environments too? Will Twitter gain ground? Will a new company appear?

Foursquare looks like a promising option in that respect…


These would be, according to Meeker, the main content distribution and publishing platforms today. The emergence of Demand Media is significant; we will look at it in more depth soon on El caparazón:

[Slide from Mary Meeker's presentation]

In short, the next web may well depend on, or be built on, the meeting point between social networks and mobile telephony, without forgetting other trends such as the semantic web, the web of things or the real-time web, which we have discussed here on other occasions.

We saw the other day how “mobile learning” was a key topic at the upcoming Online Educa 2009. The people at àtic 2a have been equally far-sighted in this respect; I will tell you about their conference shortly, and I will be speaking there on 29 November, precisely about the Mobile Internet and Web 3.0.

2009, General announcements, Apple, Evolution, Facebook, Google, Planeta educativo, Spanish, Semantic Web, Real-time web, cloud computing-web 4.0, communities, web development, devices, web evolution, futurism, future of the web, iphone, lifestreaming, mobile app, mobile phones, citizen journalism, mobile social networks, web trends, tim o´reilly, web 2.0, web 2.0 summit, web 2.0 summit 2009, web 2009, web 2011, web of things, web3.0, zeitgeist evolution

Data Independence at ISWC

October 21st, 2009

Back on August 1, Ralph Hodgson declared Data Independence Day, to celebrate the opening of oegov, a website that collects and organizes ontologies and data sets about government. Along with recent developments in open data in the US government, this creates an opportunity to mash up government data in a way that has not been possible before.

We're celebrating next week at ISWC with a tutorial on building semantic web applications for government. The tutorial will show attendees how to use semantic web standards to create their own data mashup applications. A lot of the features of the semantic web come into play: distributed vocabularies (using SKOS, of course), linked open data, RSS, etc. The idea is that each attendee will walk away from the workshop with their own app that they created from data now available from the government.

Controlled vocabularies play a big role in this - bigger than you might have thought possible.  After all, if two people use a common controlled vocabulary well, they can share data.  But if they use it badly, well, then data quality issues dominate.  Fortunately, there are some controlled vocabularies being used in the government in a pretty consistent way.  They are published in convenient forms on OEGov, where they can be used as terminology hubs for mashing up information. 
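As a small illustration of a controlled vocabulary acting as a terminology hub, the sketch below joins two data sets on shared vocabulary terms and reports any keys that fall outside the vocabulary as data quality issues. The terms, figures and function name are invented for this example and are not taken from OEGov.

```python
def mash_up(spending, staffing, vocabulary):
    """Join two {term: value} data sets on terms from a shared controlled vocabulary.

    Returns the joined records plus a sorted list of keys that are not
    vocabulary terms and therefore cannot be linked.
    """
    joined = {
        term: {"spending": spending[term], "staffing": staffing[term]}
        for term in vocabulary
        if term in spending and term in staffing
    }
    off_vocabulary = sorted((set(spending) | set(staffing)) - set(vocabulary))
    return joined, off_vocabulary


# Hypothetical vocabulary and data sets.
vocabulary = {"agency/energy", "agency/transport"}
spending = {"agency/energy": 120, "agency/transport": 80}
staffing = {"agency/energy": 15, "Dept. of Transport": 9}  # free-text key, not a vocabulary term

joined, problems = mash_up(spending, staffing, vocabulary)
print(joined)    # only the energy agency links up
print(problems)  # the free-text key is flagged
```

The transport figures fail to join only because one source used a free-text label instead of the shared term, which is the "used badly" case described above.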

The workshop is part of the International Semantic Web Conference 2009, to be held near Washington, DC from 25-29 October (the workshop itself will be held on Oct 26 in the afternoon, and you can register without attending the whole conference!).  The conference this year has a special focus on government data and applications, and should be a great event for anyone interested in openness of government data.

Government, SKOS

The List of OpenCalais Implementations Grows

October 21st, 2009

Add 10 to the list of innovative sites and services that use OpenCalais to reduce costs, deliver compelling content experiences and mine the social web for insight. See our press release for more details on each.

We are thrilled to recognize the following new sites and services that are changing the way we engage with news and the social Web. They join a growing number of others in media, publishing, blogging, and news aggregation who use OpenCalais.

The newest publishers, joining CBS Interactive / CNET, Huffington Post, DailyMe and others in using OpenCalais, include:

The New Republic (http://www.tnr.com) – The new website uses OpenPublish, an OpenCalais-enabled Drupal-powered Content Management System (CMS) to increase editorial productivity, improve search engine optimization, and drive reader engagement, including faceted search, recommended reading sidebars and – coming soon – automatically generated topic hubs.

Al Jazeera English’s new blogging network (http://blogs.aljazeera.net/) – features Al Jazeera correspondents from around the world. All posts in the new blog network are semantically tagged using OpenCalais for optimal search and navigation. It also uses a Creative Commons license and allows users to sign in to comment using Facebook Connect, Twitter or OpenID.

Slate Magazine’s News Dots Network (http://slatest.slate.com/features/news_dots/default.htm) – News Dots visualizes the most recent topics in the news as a concise network of related topics. Like a human social network, the news tends to cluster around popular topics, and most stories are more closely related than one might think. Behind the scenes, News Dots scans all the articles from major publications—about 500 a day—and submits them to OpenCalais to identify the relevant people, places, companies, topics, etc.

I *heart* Sea (http://iheartsea.com/) – is a hyperlocal news aggregation site that collects some of the best blogs in Seattle, especially those serving the Capitol Hill area. I *heart* Sea uses OpenCalais to automatically tag the keywords of the blog posts it aggregates, to make it easier to find related information.

Innovative new media monitoring and intelligence tools using OpenCalais include:

Tattler (app) (http://tattlerapp.com/) – is an open source topic monitoring tool for today's Web. Tattler finds and aggregates content from the Web on topics users ask it to monitor. Using OpenCalais and other Semantic Web technologies, Tattler mines news, websites, blogs, multimedia sites, and other social media like Twitter, to find mentions of the issues most relevant to users’ selected topics, making it easy for users to filter, organize, share, and take action on content gathered from the real-time Web.

Interceder (http://www.interceder.net) – is a social media monitoring tool that makes it easy to track trending topics and search through the latest content from major news Web sites, blogs, Twitter and YouTube. Interceder uses the Daylife API, OpenCalais, Freebase, and Yahoo! Pipes to retrieve the latest news from major news websites.

AskJot (http://www.askjot.com) – Ask Jot is a tool for analyzing web pages for keywords, and displaying them as links to search results from various services around the Web. Developed by John Wright of Wright Labs and formerly known as Semantalyzr, Ask Jot uses OpenCalais, The New York Times article search API, DBPedia, the Yahoo! Answers API, the flickr API and many more.

New services using OpenCalais to deliver intelligent content experiences include:

Feedly (http://www.feedly.com) – This Firefox plug-in brings to life user-selected inputs from Google Reader, friendfeed, Twitter, RSS feeds and more in an easy-to-read and engaging magazine-style format. Feedly uses OpenCalais and other semantic technologies for clustering, linking and organizing the content experience in an intuitive fashion that is nicely integrated into the browsing experience.

OpenPublish (http://www.opensourceopenminds.com/openpublish) – Based on the popular open source publishing platform Drupal, OpenPublish is a next-generation CMS that has been tailored to the needs of today's online publishers (magazines, newspapers, journals, trade publications, broadcast and wire services). Developed by Phase2 Technology, it uses semantic metatagging from OpenCalais to streamline content operations, automatically create topic hubs and recommend related articles and archived ‘more from this author’ stories.

DocumentCloud (http://www.documentcloud.org) – Founded by reporters from The New York Times and ProPublica, and funded by the Knight Foundation, DocumentCloud is a unique online resource that will offer public access to news reporters’ original source materials, including documents, media files and more. OpenCalais processes materials available through DocumentCloud to make it easy for users to explore connections between newsmakers, corporations, transactions and even quotations across documents and across the full collection of source information.

English, Official Blog, media, news aggregation, publishing

Surveying and Classifying SPARQL Extensions

October 20th, 2009

I realised recently that, while a lot of work has been done on creating and exploring interesting extensions to the SPARQL query language, there has yet to be a systematic survey of the range of different extensions that are currently implemented in various RDF triplestores. Or if there has been a survey, then I’ve clearly missed it.

To get a better idea of what kinds of extensions are available, I’ve set myself the task of surveying those currently implemented. I intend to write up and share the results of that work through this blog.

Rationale

I think that pulling together a list of extensions is a useful activity which should:

  • Help researchers and implementors to have a clearer view of existing work, thereby encouraging further experimentation
  • Promote convergence on a core set of useful extensions that could be implemented across a number of triplestores.
  • Help users to have a clearer understanding of what SPARQL extensions are currently supported in particular triplestores, letting them make informed decisions about which extensions to use when writing and sharing queries

It looks like the SPARQL Working Group may well be adding a standard library of extension functions to the next revision of the query language, so the timing of this work should help contribute to that effort. However, I’m looking beyond their immediate goals and hope to encourage the implementor community to explore models similar to the EXSLT effort, which has been successful in creating a set of community-designed extensions for XSLT transformations. I see no reason why the same process can’t be applied to SPARQL extensions.

Clarity of which extensions are portable across triplestores is important to allow users to experiment with various triplestore implementations and services. If data is going to be truly portable, then this will be an important consideration.

With that in mind I’ve begun digging into the available documentation for a number of different triplestores. I’ve decided to organize my work by surveying each of the three different types of SPARQL extension.

Types of SPARQL Extension

It’s possible to extend the SPARQL query language in any of the following three ways:

  • Extension Functions
  • Property Functions (aka “Magic Predicates”)
  • Language Extensions

Let’s look at each of these in turn.

Extension Functions

Extension Functions are explicitly described by the current SPARQL specification under the banner of “extensible value testing”. The standard library of extensions that may be added to SPARQL 1.1 will fall into this category. Extension Functions are simple function calls that can be used within a FILTER in a SPARQL query to carry out some specific extra logic that cannot be handled by matching triple patterns. Examples of extension functions include substring testing, string concatenation, date tests, etc.

The specification indicates that these extension functions should have a unique URI, allowing them to be globally identified. Few engines are publishing useful information at these URIs, but this seems like it would be a useful thing to do. These URIs should be grounded in the web too.
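As a rough illustration, a call to an extension function inside a FILTER might look like the query embedded in the sketch below. The ex:containsIgnoreCase function and the example.org namespace are hypothetical, invented purely for this example. The Python around the query expands the prefixed function name into the full URI that identifies it; it relies on a naive regular expression rather than a real SPARQL parser, so treat it as a sketch only.

```python
import re

# Hypothetical query: ex:containsIgnoreCase is an invented extension function,
# and http://example.org/functions# is an invented namespace.
QUERY = """
PREFIX ex: <http://example.org/functions#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label
WHERE {
  ?s rdfs:label ?label .
  FILTER ( ex:containsIgnoreCase(?label, "sparql") )
}
"""

def prefixes(query):
    """Map each declared prefix to its namespace URI."""
    return dict(re.findall(r"PREFIX\s+(\w*):\s*<([^>]*)>", query))

def function_uris(query):
    """Expand every prefixed name directly followed by '(' to a full URI.

    Naive on purpose: a prefixed name followed by an opening parenthesis
    is assumed to be a function call.
    """
    ns = prefixes(query)
    calls = re.findall(r"(\w+):(\w+)\s*\(", query)
    return [ns[prefix] + local for prefix, local in calls if prefix in ns]

print(function_uris(QUERY))
```

The resulting URI is the one at which an engine could publish a description of the function.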

Property Functions

Property Functions (aka “Magic Predicates”, or “Magic Properties”) are extensions to the triple matching process that is carried out when a SPARQL query is executed. This means that property functions don’t appear in a FILTER expression like an extension function. They instead appear within the graph pattern of the query. Unlike extension functions, which have a syntax like a conventional function call, property functions use Turtle syntax and appear, to the untrained eye, as standard triple patterns.

For example, a property function that could split a resource URI into a namespace and a localname might look like this in a SPARQL query:


?uri a rdfs:Class.
?uri ex:splitURI (?namespace ?localname).

In that example the property function ex:splitURI has as its input each of the URIs that are bound to the ?uri variable, and as its output binds the namespace URI and localname of those URIs to two new variables.
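To make the input and output behaviour concrete, here is a minimal Python sketch of what an engine might do when it evaluates ex:splitURI. The splitting rule (cut after the last “#” or “/”) and the helper names split_uri and apply_split_uri are assumptions for illustration; an actual implementation may split differently.

```python
def split_uri(uri):
    """Split a URI after its last '#' or '/' into (namespace, localname).

    Assumed rule for illustration; if neither character occurs, the
    namespace is empty and the whole URI is the localname.
    """
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]

def apply_split_uri(bindings):
    """For each solution that binds ?uri, add ?namespace and ?localname."""
    results = []
    for row in bindings:
        namespace, localname = split_uri(row["uri"])
        results.append({**row, "namespace": namespace, "localname": localname})
    return results

# Solutions produced by the ordinary triple pattern "?uri a rdfs:Class".
solutions = [
    {"uri": "http://www.w3.org/2000/01/rdf-schema#Class"},
    {"uri": "http://xmlns.com/foaf/0.1/Person"},
]

for row in apply_split_uri(solutions):
    print(row["namespace"], row["localname"])
```

Each incoming solution is extended with two new bindings, which is the sense in which the property function takes its input from the subject and delivers its output through the list in the object position.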

There are other ways to structure the inputs and outputs of a property function, depending on its purpose, but the important things to recognise are that:

  • the property function is written as a conventional triple pattern
  • parameters can be passed from either the subject or object portions of the triple (or potentially both)
  • similarly, output can be bound to variables that appear in either the subject or object portions of the triple
  • one technique for passing multiple parameters or generating multiple output values is to allow specification of an RDF list in the object portion of the triple

Property functions are very powerful as they allow arbitrarily complex logic to be used to extend the triple matching process. One common use is to extend the matching process by calling out to specialised indices or logic, e.g. for full-text indexing or geospatial functions and reasoning.

It is worth noting that Property Functions are not explicitly licensed by the current SPARQL specification. The specification does not describe them at all: they are simply allowed by the fact that they conform to the overall SPARQL grammar.

Testing whether a query uses Property Functions would therefore require one of two things. Either a validator (such as the one that Dan Brickley describes here) has explicit knowledge of the function, e.g. based on its URI, or implementors publish some useful information at those URIs so that a validator can dereference a predicate’s URI and determine whether it is a “real” predicate or an extension. I’m not aware of any implementation that currently does this.
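The “explicit knowledge” option could be sketched roughly as follows. This assumes the query has already been parsed into triple patterns with fully expanded predicate URIs; the registry contents and the function name find_property_functions are hypothetical.

```python
# Hypothetical registry of predicate URIs known to be property functions.
KNOWN_PROPERTY_FUNCTIONS = {"http://example.org/functions#splitURI"}

def find_property_functions(triple_patterns, known=KNOWN_PROPERTY_FUNCTIONS):
    """Return the triple patterns whose predicate is a known property function."""
    return [pattern for pattern in triple_patterns if pattern[1] in known]

# Triple patterns as (subject, predicate, object) tuples; an RDF list in the
# object position is represented here as a Python tuple of variables.
patterns = [
    ("?uri",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://www.w3.org/2000/01/rdf-schema#Class"),
    ("?uri",
     "http://example.org/functions#splitURI",
     ("?namespace", "?localname")),
]

for subject, predicate, obj in find_property_functions(patterns):
    print("property function used:", predicate)
```

A registry like this only recognises the functions it has been told about, which is why publishing descriptions at the function URIs would be the more scalable route.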

Language Extensions

The final category of SPARQL extensions is extensions to the language itself. This type of extension involves amending the grammar of the language to include new operators, keywords, and types of expression. Examples of this type of extension include sub-queries and aggregates (e.g. min and max). The forthcoming SPARQL 1.1 specification will standardise these and a few other language extensions that have been commonly implemented.

Arguably, if you change the grammar of a language then you’re creating a new language: “SPARQL plus some extensions”. So some care needs to be taken with this type of extension if you want queries to be portable.

In my view, while there is plenty of scope for the community to collaborate and converge on common extensions of all of the types I’ve described here, the best place for language extensions to be formally ratified and agreed on is the SPARQL Working Group. I personally don’t expect the Working Group to have to, or want to, sign off on every extension function or property function, but interoperability is ultimately best served by co-ordinating language extensions through the Working Group. Naturally this should happen after the implementor community has had a period of experimentation and research. This is obviously the process that has happened to date, and hopefully this will continue as the language continues to evolve. A bit of collective action ought to help ensure interoperability in other areas.

A Survey

For my survey of SPARQL extensions I’ve decided to tackle things in the order in which I have presented them here: I will first look at Extension Functions, then Property Functions, and then Language Extensions. For the reasons I’ve already outlined, I think the community is best served by organizing itself around standardising the first two of those types of extension, and Extension Functions seem like the lowest hanging fruit.

I’m intending to do the survey in as open a way as possible, and want to ensure that I include as many different implementations as possible. Having said that, initially I’m going to impose some editorial control simply to ensure consistency and quality. Implementors, feel free to drop me a line with information on your extensions or, preferably, pointers to the relevant documentation. I’ll also stress that while this survey has obvious relevance for my day job, this is a personal project, so things will progress as quickly as I’m able to find time to push them forward.

I’m going to send regular status updates to the public-sparql-dev mailing list as that is the correct place for further discussion. I’ll also summarize my findings in further blog posts here. I’ve already begun the process of cataloguing Extension Functions as you can see by my recent email to the mailing list. I still have to include some additional information helpfully provided by OpenLink and to also update the entries for Mulgara to list its support for some of the EXSLT functions.

One other task I have on my list is to help provide some guidance on how implementors should publish information about their SPARQL extensions. It would be useful to have some descriptive metadata for these available from the relevant URIs. I’m intending to spend some time at Vocamp DC pulling together a vocabulary for that purpose. Let me know if you’re attending and want to collaborate.

English, Programming, Projects, Semantic Web, Television

links for 2009-10-20

October 20th, 2009