Ontology for Media Resource 1.0, API for Media Resource 1.0 Drafts Published
The W3C Media Annotations Working Group has published Working Drafts of Ontology for Media Resource 1.0 and API for Media Resource 1.0. The former defines a core vocabulary for describing media resources on the Web. It is built on a core set of properties covering basic metadata for media resources, and it also defines syntactic and semantic-level mappings between elements of existing formats. The ontology is intended to foster interoperability among the various metadata formats currently used to describe media resources on the Web. The latter defines a client-side API for accessing metadata related to media resources on the Web.
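As a rough illustration of the mapping idea, the sketch below re-keys format-specific metadata onto a small shared vocabulary. The core property names (`title`, `creator`, `date`), the mapping table and the sample records are assumptions chosen for this example; they are not the normative tables from the Working Draft.

```python
# Illustrative mapping from (format, native key) to a shared core property.
# These entries are assumptions for the example, not the draft's own tables.
CORE_MAPPINGS = {
    ("dc", "title"): "title",
    ("dc", "creator"): "creator",
    ("dc", "date"): "date",
    ("id3", "TIT2"): "title",    # ID3v2 title frame
    ("id3", "TPE1"): "creator",  # ID3v2 lead performer frame
}


def to_core(fmt, metadata):
    """Return metadata re-keyed by core property; unmapped keys are dropped."""
    core = {}
    for key, value in metadata.items():
        prop = CORE_MAPPINGS.get((fmt, key))
        if prop is not None:
            core[prop] = value
    return core


# Hypothetical records describing the same resource in two formats.
id3_record = {"TIT2": "Example Song", "TPE1": "Example Artist", "TCON": "Jazz"}
dc_record = {"title": "Example Song", "creator": "Example Artist"}
```

Once both records are expressed in the shared vocabulary they can be compared directly, which is the kind of interoperability the ontology aims at.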
Google's Public Data Explorer open to everyone
The United States and the UK have recently joined the movement to reuse public government data. The Basque Country, Asturias and Catalonia, which is even committing to opening APIs so that the general public can work with the data and build applications or mashups with the structured data of the Linked Data Web, are other examples closer to home.
Meteorological data, official statistics, tourism, traffic accidents, airline passengers, agriculture, school results and so on have been indexed by Google for a year and can now be part of Google's Public Data Explorer.
If we think about the dissolving line between the professional and the amateur, this Google development, along with similar ones for producing charts (Google Charts), seems to keep reinforcing that trend.
So this can be useful for small organizations that lack the resources to build more complex systems but need to visualize different data and trends. In this sense it can be a good tool (like YouTube Direct) for small news organizations that want to compete with the big media. It can also, of course, make a range of independent research possible.
We can now, as the example shows, embed charts, tables and other visual representation tools in our sites. Because they are dynamic, the chart updates whenever the data is updated.
At the moment there are 13 datasets available, from education statistics in California to the World Bank's world development indicators. There is also public data from the U.S. Centers for Disease Control (remember Flu Trends, the tracking of the disease that Google offered us), the U.S. Bureau of Economic Analysis, Eurostat, the Organisation for Economic Co-operation and Development and the California Department of Education.
Four visualization options and dynamic selection of criteria and variables are some of the things you can try in this one on the evolution of unemployment in Spain, which I am sure will interest friends (a recent and interesting initiative) who work in career guidance and employment:
Flickcurl C API to Flickr 1.17 Released
In the last few days I released Version 1.17 of my Flickcurl C library interface to the Flickr API. It adds complete support for three recently introduced sets of APIs.
Added 15 new functions for the new Stats API calls announced 2010-03-03:
flickr.stats.getCollectionDomains,
flickr.stats.getCollectionReferrers,
flickr.stats.getCollectionStats,
flickr.stats.getPhotoDomains,
flickr.stats.getPhotoReferrers,
flickr.stats.getPhotosetDomains,
flickr.stats.getPhotosetReferrers,
flickr.stats.getPhotosetStats,
flickr.stats.getPhotoStats,
flickr.stats.getPhotostreamDomains,
flickr.stats.getPhotostreamReferrers,
flickr.stats.getPhotostreamStats,
flickr.stats.getPopularPhotos and
flickr.stats.getTotalViews.
Added 8 new functions for the new People and “photos of” people API calls announced 2010-01-21:
flickr.photos.people.add,
flickr.photos.people.delete,
flickr.photos.people.deleteCoords,
flickr.photos.people.editCoords,
flickr.photos.people.getList and
flickr.people.getPhotosOf.
Added 3 new functions for the new, unannounced (and seemingly incomplete) Gallery API calls:
flickr.galleries.addPhoto,
flickr.galleries.getList and
flickr.galleries.getListForPhoto.
Updated the flickcurl(1) utility to support the new gallery, people photos and stats API calls.
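For readers who want to see what one of these method names looks like on the wire, here is a minimal sketch that builds an unsigned Flickr REST URL in Python. It bypasses Flickcurl entirely and is not how the library is implemented. The helper name `build_call_url` and the placeholder key are hypothetical, and the sketch leaves out authentication and request signing, which the stats methods require.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; the call below is only constructed, never sent.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_call_url(method, api_key, **params):
    """Build an unsigned REST URL for one Flickr API method call."""
    query = {
        "method": method,
        "api_key": api_key,
        "format": "json",
        "nojsoncallback": "1",
    }
    query.update(params)
    # Sorting keeps the output stable, which helps when comparing URLs.
    return REST_ENDPOINT + "?" + urlencode(sorted(query.items()))


# "YOUR_API_KEY" is a placeholder, not a working key.
url = build_call_url("flickr.stats.getTotalViews", "YOUR_API_KEY", date="2010-03-03")
```

A real client would still need to sign the request and attach an auth token before Flickr would answer a stats call.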
See the Release Notes for full details.
Get it at: http://download.dajobe.org/flickcurl/flickcurl-1.17.tar.gz (GPL2 / LGPL2 / Apache2.)
This is what I do for fun between releasing Redland RDF libraries, more of which soon…
Predicate Based Services
sameAs.org is a great service on a number of different levels. It provides a much needed piece of Semantic Web infrastructure and it achieves that through a simple clean interface and API. You don’t even need to know anything about RDF to get value from the service. In short it’s one of those nice web services that do one thing and do it really well.
I use the service as a frequent example in my talks and training sessions on Linked Data. For example, while it’s useful to review techniques for linking together datasets, in practice you can achieve a lot by simply doing a series of look-ups against sameAs.org. I’ve had some happy experiences of discovering connections between datasets without having to do any manual linking.
More than a few times recently I’ve been thinking that it would be useful to repeat what Hugh Glaser and Ian Millard achieved with sameAs.org, but for a number of other common RDF predicates.
In my opinion there are a small number of general predicates that will act as the backbone for the web of data. At the head of the predicate long tail we’ll find properties like: owl:sameAs, but also useful properties like dc:subject, foaf:knows and foaf:primaryTopic.
The topic based predicates (dc:subject, foaf:primaryTopic, foaf:topic, et al) are particularly useful for discovering documents and material that relate to a specific resource. An index of these would be extremely useful for inter-linking between content from different news and media organisations for example. I’d envisage that “topicOf.org” might index a range of different topic related predicates and expose some useful discovery tools, relations and equivalencies. Dan Brickley has a nice diagram that shows how these different predicates inter-relate.
“topicOf” is currently top of my list of these predicate based services. But the same approach would work in other contexts. For example, a service that indexed foaf:knows would be useful for social networking applications, although I think that area is already well served by existing services. But what about:
- “reviewsOf.org”: find reviews about a specific resource. I believe Tom Heath has thought about doing something like this for Revyu
- “depictionsOf.org”: find pictures of a specific resource (foaf:depiction), e.g. person, place or thing (and reliably, not like the Flickr Wrapper)
- “madeBy.org”: find documents, photos, or other resources that were made by a particular person (dc:creator, foaf:maker)
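To make the idea concrete, here is a minimal in-memory sketch of what a “topicOf” style index might do: scan triples, keep only the topic-like predicates, and index documents by the resource they are about. The function name, the set of predicates treated as topic-like and the example URIs are all assumptions for illustration. A real service would work over crawled RDF and full URIs rather than prefixed names.

```python
from collections import defaultdict

# Predicates treated as "topic-like" for this sketch (an assumption).
TOPIC_PREDICATES = {"dc:subject", "foaf:primaryTopic", "foaf:topic"}


def build_topic_index(triples):
    """Map each topic resource to the set of documents that point at it."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in TOPIC_PREDICATES:
            index[obj].add(subject)
    return index


# Hypothetical triples from two different publishers.
triples = [
    ("http://news.example/a1", "dc:subject", "http://example.org/id/london"),
    ("http://blog.example/p9", "foaf:primaryTopic", "http://example.org/id/london"),
    ("http://blog.example/p9", "foaf:maker", "http://example.org/id/alice"),
]
index = build_topic_index(triples)
```

Looking up `index["http://example.org/id/london"]` returns both documents even though they use different predicates. Swapping the predicate set for `{"foaf:depiction"}` or `{"dc:creator", "foaf:maker"}` would give the core of the other services listed above, bearing in mind that each predicate has its own subject and object direction.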
I can think of all sorts of useful purposes for these services. I also think that they could offer additional ways of engaging with the broader developer community and getting them to buy into the Linked Data vision.
Anyone want to have a crack at implementing some of these?
Why OpenCalais?
Over the last few months you’ve probably seen a number of announcements about how OpenCalais has been chosen by one organization or another to support its business.
In a number of recent meetings I’ve been asked the (very fair) question, Why OpenCalais and not one of the other entity extraction services out there?
Given that the question seems to be coming up more often as the number of extraction services increases, I thought I’d set out my best understanding of why many of the major players we’ve announced (and an equal number we haven’t) have chosen to go with OpenCalais. At the end, I’ll also mention a few reasons why others haven’t chosen OpenCalais.
So, in no particular order, why do organizations choose Calais?
Thomson Reuters
OpenCalais is provided by Thomson Reuters – the largest professional information organization in the world.
If you’re interested in kicking around some semantic technologies in your spare time this doesn’t really matter. If you’re incorporating those technologies deep within your business – or, as is true with many users – actually building a new business on top of them, this becomes pretty important. Basically – you need to know that the service is going to be there for you.
Facts & Events
With the increase in structured content assets like Wikipedia / DBpedia, it’s become pretty easy to knock out a basic entity extraction tool. And – while we like entity extraction as much as anyone else – it’s really just the tiniest starting point in what you can and will need to do.
OpenCalais extracts a wide range of facts and events from unstructured content and lets you know what’s happening in your content – not just tags for things.
- Facts are things like “John Doe is CEO of XYZ Corporation.”
- Events are things like “XYZ Corporation today announced that it would acquire ACME Corporation.”
OpenCalais is the only service that does this in a production-strength manner.
Reliability
OpenCalais stays up. It’s hosted in mirrored data centers thousands of miles apart from one another. It’s monitored 24/7. It basically doesn’t go down, even during system upgrades and maintenance. We stopped adding 9s after we got beyond 99.99% uptime.
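For a sense of scale, 99.99% uptime allows at most about 52.56 minutes of downtime in a 365-day year. The small sketch below shows the arithmetic. The function name is made up for illustration and the figure is a generic calculation, not a published OpenCalais service level.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(uptime_percent):
    """Minutes of downtime per year allowed at a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


four_nines = max_downtime_minutes(99.99)   # about 52.56 minutes
three_nines = max_downtime_minutes(99.9)   # about 525.6 minutes
```

Each extra 9 cuts the allowance by a factor of ten, which is why going beyond four nines gets hard quickly.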
Accuracy
We’ve been building the tools underneath OpenCalais for over a decade. They’ve been used by hundreds of organizations and many many thousands of end users. One of the things we’ve learned is that accuracy matters. While no NLP system is perfect, we’re convinced ours is the best and we have some ideas in the pipeline to increase accuracy even more.
Integration
We basically focus on providing great semantic plumbing. But we know that not everyone wants to be a plumber. We’ve worked to integrate (or motivate others to integrate) OpenCalais with a wide range of tools including Drupal, WordPress, WordPress Multiuser, Oracle, Lucene, Coldfusion, Flash, Firefox, Prolog, Lisp, Django, Java, PHP, Python, Alfresco, Perl, .NET, Ruby, TopBraid and a few others.
From content management systems to language-specific libraries – there are lots of ways to get started quickly.
Linked Data
We’re serious about Linked Data. We’re also worried about the proliferation of incorrect links and the effects of link rot. So, rather than just pointing to Linked Data assets out on the cloud and risking that they’ll go stale, we host our own Linked Data cloud, which is kept up to date with both Thomson Reuters contributed content as well as regularly validated links to other sources such as DBpedia, Freebase and others.
SocialTags
Pure semantic extraction is great – but sometimes you need more. If you’re writing about Porsches and Ferraris you’d probably like to have categorization concepts like “sports cars” and “automobiles” returned to you with your semantic metadata. OpenCalais does this via our ever-improving SocialTags concept tagging capability. It’s good now, and it’s going to get a lot better soon.
Focus
OpenCalais is here to provide great semantic plumbing. We’re not trying to sell ads. We’re not trying to provide the prettiest decorations for blogs. We build the plumbing – you architect the solutions.
Now, in a spirit of transparency, here’s why some people don’t choose OpenCalais:
Languages
We’re great in English and okay in French and Spanish (we extract entities but neither facts nor events in these two languages). We intend to implement more languages in the future – but for the time being we’re concentrating our efforts on improved functionality and accuracy in English.
Complexity
OpenCalais isn’t a simple tagging tool. What it returns to the calling application is a reasonably complex RDF construct. It takes a little time to get up to speed on RDF and how to use it in your applications. We think it’s worth it because it’s the most flexible and powerful format we know of.
Performance in Knowledge Domain ‘x’
Where ‘x’ is fashion or square dancing or rugby. OpenCalais is optimized for performance in the general world of business – that’s where we excel.
We have extended OpenCalais to take steps in other areas (such as sports, media, etc.) – but if you need deep semantic extraction capabilities related to protein binding – there are better places to look.
New HTML+RDFa draft published
Semantic Technologies Monthly Review. February 2010
Linked data and Semantics
Pellet 2.0.2: Maintenance Release
We’re happy to announce the second Pellet maintenance release of the 2.0 series, Pellet 2.0.2. This release fixes several issues and includes updates to support the latest reasoner interfaces in OWLAPI version 3.0.0. The complete set of tickets closed for this release is listed on the Trac page for this release. Pellet 2.0.2 is available for download.
We’ve also released an updated Pellet Reasoner Plug-in for Protégé 4 to work with Pellet 2.0.2.
Bueda API Turns Tags into RDF URIs
A large percentage of the content that users deal with on a daily basis is created by other users. Every minute more than 90,000 videos and images are uploaded to YouTube, Flickr and other social media websites, yet this content generates a relatively small share of revenue compared with traditional media. We believe that one reason for this is the publisher's inability to understand high-density content that lacks an adequate description. With mobile platforms giving users easy ways to upload rich media, this problem will grow rapidly.
Tags are an attempt to mitigate this problem. They give users an easy way to label content with the labels that make sense to them. Their strength lies in their simplicity and in the freedom to use anything as a tag, which enables an accurate description of the content from the user's perspective. Yet this strength is also a weakness when it comes to the publisher's ability to understand that content. A tag is, realistically speaking, any sequence of characters. It could be a well-formed word, a company name, a person's name, an ISBN, a concatenation of dates and words, and so on. Coverage and disambiguation make this a hard problem to solve.
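To illustrate why "any sequence of characters" is awkward to interpret, here is a toy classifier that guesses a coarse type for a raw tag. It is not Bueda's method, which is proprietary. The category names and regular expressions are assumptions for this sketch, and the ISBN check looks only at shape, not at the checksum.

```python
import re


def classify_tag(tag):
    """Guess a coarse type for a raw tag using simple, illustrative heuristics."""
    # Drop hyphens and spaces so "978-0-306-40615-7" is seen as 13 digits.
    compact = tag.replace("-", "").replace(" ", "")
    if re.fullmatch(r"\d{9}[\dXx]|\d{13}", compact):
        return "isbn-like"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", tag):
        return "date-like"
    if re.fullmatch(r"[A-Za-z]+", tag):
        return "word"
    return "other"


# Hypothetical tags of the kinds described above.
samples = ["978-0-306-40615-7", "2010-03-03", "sunset", "nyc2010trip"]
labels = [classify_tag(t) for t in samples]
```

Even this toy version shows the gap: "nyc2010trip" falls into "other", and a tag like "apple" is a "word" whose meaning (fruit or company) is still unresolved. That is the coverage and disambiguation problem.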
Bueda addresses this problem by presenting a new solution in the form of an API that can be used by developers to get clean information from noisy tags. It provides a low friction way of tapping into the latest in semantic analysis for tags in a scalable platform.
Bueda provides actionable information that enables targeted advertising, content recommendation, search engine optimization and semantic search, amongst other things. Even though the biggest impact might be in high-density content, such as rich media and pictures, the platform is open to any application and use case.
Bueda is a CMU spin-off and uses proprietary technology for Semantic Resource integration, which integrates heterogeneous data sources to provide open-domain coverage in a distributed and scalable framework. Bueda is also an Alphalab alum and is currently funded by Innovation Works.
Bueda is currently in private beta. However, Semantic Focus readers have access to some exclusive API keys.
The future of research and the research library
According to a recent report from DEFF, Denmark’s Electronic Research Library:
There are three aspects of the functions of the research library that can be seen as providing potential scenarios. The library as a learning centre focusing on the provision of learning materials and support for learning processes. The library as a knowledge centre being a co-creator in the production of knowledge closely connected to active research groups. The library as a meta-knowledge institution working as a catalyst for knowledge synthesis, the organisation, evaluation and consolidation of knowledge.
As well as exploring this typology in greater detail, the report The future of research and the research library also describes a couple of more concrete and familiar scenarios.
Firstly, one that might have benefited from a deeper exploration in the report:
… up-to-date physical locations where the students can study with other students and in that way get a sense of a working day and a working community. In that way, the library will become more of a social zone, instead of the quiet room for lonely absorption which it is traditionally known for.
And secondly, one that is very much informed by the information literacy role of modern university libraries:
“’The touching library’, i.e. a research library which can touch and move its users through its competence to select and qualify knowledge, and which is touched and moved by its users in order to deliver the best possible product.”
What about the report itself?
It’s ambitious. Very ambitious. It’s also universal in its scope – only occasionally delving into Denmark-specific structures and scenarios. I can’t hope to do justice to the richness of its content in one single blog, so I can only present a subjective take.
Essentially, the report seeks to answer the following questions:
- Does the research library have a future?
- What future roles are open to the research library?
- Would a roadmap be useful?
Instinctively I draw away from the idea of a roadmap. There are simply too many variables and broad forces over which we have so little control, notwithstanding the excellent framework that this report has provided. After reading the report twice, I’m unsure whether it has answered these questions. Certainly no roadmap is forthcoming. Nevertheless, for those of us who spend time pondering the future of the university library, it provides excellent food for thought.
Seismic change and disruption
It’s especially useful in terms of the material it presents for understanding the scale of disruption that the research library is undergoing.
Massive technological changes in the area of research, knowledge production, publishing and communication are influencing the way research is done and the functions of the research library in supporting and facilitating research and learning. Digital technology in its many forms is at the centre of the changes. The old functions of the research library are thus served in new ways. New forms of research emerge and new ways of learning too, and consequently not only new ways of serving old functions but also new functions serving new needs.
On the historical value of the research library, the report states:
The original form of value creation of the research library was based on minimising expenditure for acquisition and availability of books and journals. By having a central store it was possible to acquire fewer entities and by making these available it was possible to maximise their use. Books were expensive and few could afford large private libraries.
The report goes on to make the point that this cost-effectiveness is found today in the licensing of e-journals and databases, but the value is surely diminished where the number of users is factored into the cost of the licence, in a way that was not the case with a printed monograph.
There are also broader changes in the research and educational systems, not least the global expansion of higher education and the perception of digital technology as a means of resolving the resultant problems and tensions. In research too there is much change, with more collaborative styles and the ascendant trend towards interdisciplinary research being two obvious examples.
I know that one bright and joyous day I will pick up a report that talks about the impact of cultural relativism on an institution (the library) that has served as an absolutist custodian of authoritative artefacts. Sadly, that day is not today, and I just have to live with that (or write my own).
What history tells us
By and large, this isn’t an easy read. As I’ve said, it’s highly theoretical and enormously broad. However, the report does present a very digestible history of the research library. Space constraints preclude even an attempt to do this justice, but what I will say is that it answered many of my questions about how precisely the research library model has been disrupted. As is so often the case, the Internet has not simply thrown a deadly missile into a centuries-old static model; it should instead be seen as the latest and most disruptive change in the history of the research library, following on the heels of other catalysts such as the shift away from books in favour of scientific journals.
Research library and the innovation economy
The other interesting thing about the historical narrative of this report is that it presents a degree of historical continuity in the relationship between the research library and more focused, problem-driven innovative activities in the broader economy. The report notes that a massive amount of research is being done in the knowledge-intensive private sector. It makes a very valid point that the limitations on access to digital resources (which is mainly restricted to academia) are problematic, especially for SMEs.
What about curiosity driven research?
The report states that:
The British sociologist of science Steve Fuller has made a distinction between two ways in which research and universities create value. One is the direct creation of knowledge that can be used in making processes and products available in a market. This is the role of research in innovation. It contributes to the creation of financial capital. In this knowledge is seen as instrumental. The other way is through the creation of degree programmes and public education and making knowledge publicly available.
It wasn’t clear to me when reading the report where curiosity-driven research sits in this model, and indeed in the report as a whole. Yet it is surely of vital importance, even in today’s instrumental thinking around research and economic innovation. You could even argue that it assumes an even greater importance – we surely need to make huge leaps in our thinking to achieve the necessary scale of economic restructuring in most Western economies, and thinking needs to be as unrestrained as possible.
The central dilemma of the intermediary
The report provides some valuable pointers in terms of the role of the librarian and the competences that will be required. Our old friend disintermediation plays a major role in the discomfort that librarians have experienced for many years now:
New players are appearing as important and can take over some of the functions or parts of these. Publishers can provide access to journals on-line via their own servers, and universities and scientific groups or societies can provide access to digital repositories of papers and books.
As one interviewee said:
The dilemma is that you on one hand do something for the user and make yourself indispensable, and on the other hand you create the user in your own picture [sic] and thus make yourself dispensable.
This quotation surely goes to the heart of the pain of disintermediation, and reminded me forcibly of my days as a special librarian in the metals industry.
To my mind, the most optimistic statement in the whole report was this one:
Our belief about who we are does influence what we perceive as possible.
It really is true that even in adverse conditions, a little bit of self-belief can make a lot of difference, and this report has at least delivered some clarity to a highly complex landscape.
HTML5 and Semantics
Do you understand me?
Everyware, Internet of Things, data visualization and the scientific society
Here is what will complement the base material I am preparing for an upcoming postgraduate course, in the section on the Internet of Things. It is also one of the 3.0 Insights we have been noting here lately.
Between the Internet of Things, creative data visualization and net art, a review of some of the projects presented at Visualizar09 brings us closer to the idea of a measurable, predictable, transparent society with scientific qualities, the one we are moving toward:
Data visualization is a cross-cutting discipline that uses the immense communicative power of images to explain, in an understandable way, the relationships of meaning, cause and dependency that can be found among the great abstract masses of information generated by scientific and social processes.
Web Squared, as defined by Tim O'Reilly and John Battelle, would refer to the exponential increase in data that this process of integration between the web and reality entails.
In the Air, by Nerea Calvillo, is one of the proposals. The different levels of pollutants in the city of Madrid are rendered as a three-dimensional visualization. There are two media for representing them: a virtual one, on the Internet, and a tangible one, a temporary installation. These two media represent two scales of the city: the whole city and the area around Medialab. The intention is to form a mesh of data points across the city to build a surface that can be visualized. The data is refreshed daily, weekly and monthly from the municipal website.
In the same vein but in another context, this time outside the festival, the world map monitoring global climate change offers an interesting perspective.
Lazarillo GPS, by Carolina Paola Caluori Funes, brings a social approach to the concept of visualization. It aims to assist different groups of people with disabilities through a GPS-based mapping system. It works with people with disabilities who follow tourist routes carrying GPS receivers and report which areas are accessible and which are blocked by architectural barriers.
This is the subject addressed, at the level of the “home” user, by Everyware: The Dawning Age of Ubiquitous Computing, one of the few books published on the Internet of Things (Web Squared and its data, which grows exponentially when it tries to reflect complex reality), which shows us what the world would be like if everything were connected to the internet (“everyware”).
It was published 4 years ago, so I was interested in a recent interview with its author, Adam Greenfield, that updates its concepts and trends:
What has changed?
- Mobile telephony:
Mobile telephony, and the direction it has taken, is what was unforeseeable when the book was written. The arrival of the iPhone in 2007 was so disruptive, and changed the web landscape so much, that not even Greenfield, an early adopter of mobile blogging (“moblog”), could have imagined it.
- RFID
The other side of the coin is RFID technology. Greenfield warns us that it still has a long way to go. We have yet to see RFID in credit cards and home theatres.
Even so, RFID technologies may evolve and could at some point be replaced by more sophisticated identification and tracking technologies.
- Cities
Adam Greenfield is now working on his next book, The City Is Here For You To Use, and describes, among other things, which cities impress him most in their use of the Internet of Things.
Korea, Singapore and some municipalities in East Asia seem to be the ones that have worked hardest on their identity, their presence and the creation of what it occurs to me we could call the digital mirror.
Cities with new demands, and technology companies (such as Cisco or Intel) responding with Internet of Things products and services, are among the things he will cover.
- iPad:
Asian culture explains the difference, the success of the Internet of Things. Quality of life can be sold as a service in Korea, with things like the Internet fridge (the idea of a “connected” refrigerator that tells us the properties of our food and builds diets, combinations and so on from them), which in the West may seem too absurd to warrant a dedicated device.
By contrast, Adam thinks the Western market may value a ubiquitous device that sits between different objects (and their sensors) and the network. The iPad, portable between different domestic spaces and interacting with refrigerators but also with a network of cars and so on, could be the key.
Imagine the smart fridge... and the iPad encouraging us to follow a balanced diet when we hold it up between ourselves and the food... or the same iPad as a portable interface for renting cars as a service (Zipcars), as one more of the resources the network can offer.
Book Review - Pull
I really wanted to hate this book. A book about the Semantic Web for 'business'. With a pretentious title. I expected misinformation and fluff.
That was the only disappointment about this book. From about the third page, I came to realize that Siegel not only knows Semantic Web technology well, he also knows how to do research. Among other things, this book is a treasure trove of relevant technology trends, all cited and referenced. The notes alone are worth the purchase price.
I expected a hodge-podge of technology promises like the ones we usually hear - just like Google, but better. Will make sense of all your documents, so you don't have to. What I got instead was a coherent story of a future information culture - as different from what we know today, as today's world is different from what we knew before the web.
This isn't written as a business book, but as a futurist book. In many ways, the apparently pretentious title is actually conservative - in the future Siegel paints, a lot more things are transformed than just your business. But I have to say, had he called it "The Semantic Web will Change your Life," I might not have picked it up.
This book goes into sophisticated detail on a lot of points I have made at various times in my own blog, but it does so with more depth and more courage than I have. You'll find out why the Semantic Web vision doesn't fit into Google's current business model. Why Natural Language Processing is interesting, but tangential. Why IBM is still around, and will be for a while. And a whole lot more. I have recommended this book to just about everyone I know.
If you read this book, and you don't have a few ideas about start-ups you might try, then you know you really don't have an entrepreneurial bone in your body. This is what the Semantic Web can really be.
RIF Production Rules Dialect Revised; Last Call for Comments
During the implementation phase of the Rule Interchange Format (RIF), the Working Group discovered a problem with the design of the Production Rules Dialect. This problem is addressed with a new Last Call Working Draft that changes the way actions are handled to more closely match existing production rule engines. Please send comments and RIF implementation reports to public-rif-comments@w3.org.
Closing out 2009: taking stock and promises for 2010
Google once again takes the prize for best large company of 2009. After collecting awards in 2004 and 2006, this year the Mountain View company stood out by bringing to market the Android platform, Google Apps, Google Chrome, Google Maps and Google Voice, demonstrating a flexible business model and positioning itself as a new competitor in markets (such as mobile telephony) it had not yet reached.

Among small companies the winner was Aardvark: a social search engine that combines artificial intelligence, natural language processing and social collaboration. You can ask Aardvark anything, and it will try to find the right person to provide an answer. Unlike Yahoo Answers, this application does not maintain a repository of questions and answers but tries to find an expert within your own social network.
For 2010, again according to ReadWriteWeb, keep an eye on Wolfram|Alpha, a search engine launched in May last year that has disappointed more than one expert. The great expectation came from its difference from Google: Wolfram|Alpha provides answers, not links.
In recent months its knowledge base has been expanded so that it can face the coming year with renewed strength.
In 2010 and the coming years, the key applications will be those that let people use, access and integrate the data produced by the collaborative work of internet users within the Web 2.0 philosophy. Wolfram|Alpha is a serious candidate for application of the year for 2010.
Five years to gestural interfaces, 20 (on average) for artificial intelligence to match human intelligence
Today I leave you a few more 3.0 insights, this time on HCI (human-computer interaction) and artificial intelligence.
Remember Minority Report? Well, according to the NYT, we are five years away from what you see in the images and videos becoming reality. The gestural interfaces that a privileged few of us were enjoying very recently in the UOC Innovation lab will, they say, be commonplace.
John Underkoffler, a science consultant on the film, has worked for the last 10 years at Oblong Industries to build the gestural interface we see, called the g-speak Spatial Operating Environment, which he presented last Friday at the annual TED conference taking place these days.
Sculpture, virtual pottery: until now I had not thought about how all this relates to learning, but it looks like it will be useful both for simulating precise manual processes (surgery?) and for more complex things, such as tangible, manual, physical interaction with databases. In this respect, the Tangible Media Group at MIT is working on the g-stalt interface, which blurs even further any boundary between the digital and the real and brings us a little closer to the Web squared.
It seems that this type of interface has already been tried out in universities and in top US Fortune companies, and that several PC and console makers will launch gestural interfaces before the end of the year, according to the Times.
Impressive, but no less so than the curious example of collective intelligence about artificial intelligence presented in Hplusmagazine:
A significant number of people informed about Artificial Intelligence (AI) believe it is likely that artificial general intelligence (AGI) will reach human level or beyond by the middle of this century. A group of experts gathered at an AI conference was asked when they estimated that artificial intelligence would reach each of these milestones:
- Passing the Turing test, holding a conversation as well as a human.
- Solving problems at the same level as third-grade pupils.
- Producing scientific research of Nobel level.
- Going beyond human intelligence, towards superhuman intelligence.
The chart adds some nuance to the results:
Anyway… I don't know whether you still feel like looking into the future. If so, the 4 essential videos for understanding Web 3.0 that we published some time ago can round off the visionary time you have devoted to this article.
Update: The figure comes from a survey of a group of scientists who are experts in AI, and refers to the average number of years they believe we are from each of the 4 indicators (passing the Turing test, i.e. holding a conversation as well as a human; solving problems at the same level as third-grade pupils; producing scientific research of Nobel level; going beyond human intelligence, towards superhuman intelligence).
It reflects the optimism of the majority, not any kind of scientific prediction beyond the confidence we place in the collective intelligence of the experts.
Linking Open Data to Thesaurus Management
The Vienna-based company punkt. netServices is just about to release a demo version of their PoolParty service, a SKOS-based thesaurus management tool with linked data capabilities. I had the chance to pre-read a white paper and test their service. Here is a brief overview. You can also try a demo.
Purpose
PoolParty was conceived to facilitate various applications such as
- Semantic search engines
- Recommender systems (similarity search)
- Corporate bookmarking
- Annotation- & tag recommender systems
- Autocomplete services and facetted browsing.
These use cases can be achieved either by using PoolParty stand-alone or by integrating it with existing Enterprise Search Engines, Document Management Systems or Enterprise Wikis.
Thesaurus Management
PoolParty aims to be easy to use for people without a strong Semantic Web background or special technical skills. The GUI is entirely web-based and utilizes AJAX so the user can e.g. quickly merge two concepts via drag & drop. An overview of the thesaurus can be gained with a tree or a graph view on the concepts.
PoolParty also helps to semi-automatically add concepts to a thesaurus as it can be used to analyse documents (e.g. web pages or PDF files) relevant to a thesaurus’ domain in order to glean candidate terms. This is done by the key-phrase extractor of KEA. The extracted terms can be selected by the user, thereby becoming “free concepts” which later can be integrated into the thesaurus, turning them into “approved concepts”.
Documents can be searched in various ways: by keyword search in the full text, by searching for their tags, or by semantic search and similarity search. Semantic search takes into account not only a concept’s preferred label, but also its synonyms and the labels of its related concepts. The user can manually remove query terms used in semantic search. Boost values for the various relations considered in semantic search may also be adjusted. The recommendation mechanism for document similarity calculation works in the same way.
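The query expansion behind this kind of semantic search can be sketched roughly as follows. This is only an illustration of the idea, not PoolParty's implementation: the `Concept` class, the `expand_query` function and the boost values are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical, simplified stand-in for a SKOS concept."""
    pref_label: str
    alt_labels: list = field(default_factory=list)  # synonyms
    related: list = field(default_factory=list)     # related Concept objects


# Illustrative boost values per relation type (assumed, user-adjustable).
DEFAULT_BOOSTS = {"pref": 1.0, "alt": 0.8, "related": 0.5}


def expand_query(concept, boosts=DEFAULT_BOOSTS, removed=()):
    """Return {term: weight} for a concept, skipping manually removed terms."""
    weighted = {}

    def add(term, weight):
        if term in removed:
            return
        # Keep the highest weight if a term is reachable in several ways.
        weighted[term] = max(weight, weighted.get(term, 0.0))

    add(concept.pref_label, boosts["pref"])
    for label in concept.alt_labels:
        add(label, boosts["alt"])
    for other in concept.related:
        add(other.pref_label, boosts["related"])
    return weighted


mojito = Concept("Mojito")
rum = Concept("Rum", alt_labels=["Ron", "Rhum"], related=[mojito])

# The user has removed the term "Rhum" from the expanded query.
terms = expand_query(rum, removed={"Rhum"})
print(terms)
```

The resulting weights could then be passed to whatever full-text engine does the actual matching; lowering the `related` boost makes the search behave more like a plain label lookup.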
PoolParty by default also publishes a Semantic Wiki version of its thesauri, which provides an alternative way to browse and edit concepts. Through this feature anyone can get read access to a thesaurus, and optionally also edit, add or delete labels of concepts. Search and autocomplete functions are available here as well. The Wiki’s XHTML source is also enriched with RDFa, thereby exposing all RDF metadata associated with a concept to be picked up by RDF search engines and crawlers. (See two examples: Cocktail thesaurus & Standard Thesaurus for Economics)
PoolParty also supports the import of thesauri in SKOS (including several consistency checks) or Zthes format. Those functionalities can also be consumed as stand-alone web services via PoolParty SKOS Services. Additionally, lists of concepts and their labels can be imported via CSV files.
Linked (Open) Data
PoolParty not only publishes its thesauri as Linked Open Data (in addition to a SPARQL endpoint), but it also consumes LOD in order to expand thesauri with information from LOD sources.
Concepts in the thesaurus can be linked to e.g. DBpedia via a service like Georgi Kobilarov’s DBpedia lookup service, which takes the label of a concept and returns possible matching candidates. The system suggests relevant resources from DBpedia and the user can select the one that matches the concept from their thesaurus, thereby creating a skos:exactMatch relation between the concept URI in PoolParty and the DBpedia URI. The same approach can be used to link to other SKOS thesauri available as Linked Data.
Other triples can also be retrieved from the target data source, e.g. the DBpedia abstract can become a skos:definition and geographical coordinates can be imported and be used to display the location of a concept on the map, where appropriate. The DBpedia category information may also be used to retrieve additional concepts of that category as siblings of the concept in focus, in order to populate the thesaurus.
PoolParty is capable of importing a SKOS thesaurus from a Linked Data server, and may also receive updates to thesauri imported this way. This feature has been implemented in the course of the KiWi project funded by the European Commission. KiWi also contains SKOS thesauri and exposes them as LOD. Both systems can read a thesaurus via the other’s LOD interfaces and may write it to their own store. This is facilitated by special Linked Data URIs that return e.g. all the top-concepts of a thesaurus, with pointers to the URIs of their narrower concepts, which allow other systems to retrieve a complete thesaurus through iterative dereferencing of concept URIs.
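The iterative dereferencing described here amounts to a graph traversal starting from the top concepts. A minimal sketch, assuming a caller-supplied `fetch_narrower` function that stands in for the HTTP lookup of a concept URI (the function name, the `ex:` URIs and the in-memory store are invented for illustration and are not part of PoolParty or KiWi):

```python
from collections import deque


def harvest_thesaurus(top_concept_uris, fetch_narrower):
    """Collect every concept URI reachable from the top concepts.

    `fetch_narrower(uri)` is a hypothetical callback that dereferences a
    concept URI and returns the URIs of its narrower concepts.
    """
    seen = set(top_concept_uris)
    queue = deque(top_concept_uris)
    while queue:
        uri = queue.popleft()
        for child in fetch_narrower(uri):
            if child not in seen:  # guards against cycles and polyhierarchy
                seen.add(child)
                queue.append(child)
    return seen


# Stand-in for real HTTP dereferencing: a tiny in-memory thesaurus.
FAKE_STORE = {
    "ex:drinks": ["ex:cocktails", "ex:spirits"],
    "ex:cocktails": ["ex:mojito"],
    "ex:spirits": ["ex:rum"],
    "ex:mojito": [],
    "ex:rum": [],
}

concepts = harvest_thesaurus(["ex:drinks"], lambda uri: FAKE_STORE.get(uri, []))
print(sorted(concepts))
```

Tracking the URIs already seen matters because a concept may have more than one broader concept, so the same URI can be reached along several paths.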
Additionally, KiWi and PoolParty publish lists of concepts created, modified, merged or deleted within user-specified time frames. With this information the systems can learn about updates to one of their thesauri in an external system. They can then compare the versions of concepts in both stores and may write the corresponding updates to their own store.
This means each system decides autonomously which data it accepts, and there is no risk of a system pushing data that might lead to inconsistencies into an external store. Data transfer and communication are achieved using REST/HTTP; no other protocols or middleware are necessary. Also, no rights management for external systems is needed, which otherwise would have to be configured separately for each source.
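The pull-based update model can be illustrated with a small sketch. The data shapes here are assumptions chosen for the example (integer version numbers, a `(uri, version, label)` change record, an `accept` policy callback); the source does not describe the actual formats KiWi and PoolParty exchange.

```python
def pull_updates(local, remote_changes, accept):
    """Apply remote changes to `local` only where the local policy agrees.

    local          -- dict: concept URI -> (version, label)
    remote_changes -- iterable of (uri, version, label) records published
                      by the other system for some time frame
    accept         -- local policy callback; the remote side cannot bypass it
    """
    applied = []
    for uri, version, label in remote_changes:
        current = local.get(uri)
        is_newer = current is None or version > current[0]
        if is_newer and accept(uri, label):
            local[uri] = (version, label)
            applied.append(uri)
    return applied


local_store = {"ex:rum": (1, "Rum"), "ex:gin": (3, "Gin")}
changes = [
    ("ex:rum", 2, "Rum (spirit)"),  # newer than the local copy
    ("ex:gin", 2, "Genever"),       # older than the local copy, ignored
    ("ex:spam", 1, ""),             # rejected by the local policy below
]

# Example policy: refuse concepts with an empty label.
applied = pull_updates(local_store, changes, accept=lambda uri, label: bool(label))
print(applied, local_store)
```

Because the receiving side does the comparison and the writing, nothing the external system publishes can change the local store unless the local policy allows it.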
Technology
The software is written in Java and utilizes the SAIL API, so it can be used with various triple stores. The thesaurus management itself (viewing, creating and editing SKOS concepts and their relationships) can be done in an AJAX Frontend based on Yahoo User Interface (YUI). Editing of labels can alternatively be done in a Wiki style HTML frontend. For key-phrase extraction from documents PoolParty uses a modified version of the KEA 5 API, which is extended for the use of controlled vocabularies stored in a SAIL Repository (this module is available under GNU GPL). The analysed documents can be stored and indexed in Lucene/Solr or any other (enterprise) search system along with extracted and semantically related concepts.
Rasqal 0.9.18 RDF Query Library Released
Update: you want 0.9.19, not 0.9.18, after a package configuration issue was found. Links fixed.
This release of Rasqal adds draft syntax support for the SPARQL 1.1 Update language being developed by the W3C SPARQL Working Group. The SPARQL 1.1 Update W3C Working Draft of 2010-01-26 introduces the first syntax design with some uncertainties and gray areas still present (no grammar spec section yet). I added what I thought would work, avoiding the ambiguous WITH forms where everything is optional. Since this is draft work, this extra parsing is only done when the ‘laqrs’ query language syntax is chosen. LAQRS stands for LAQRS adds to Querying RDF in SPARQL.
This is just syntax and API support in Rasqal, so you can prepare update queries, but there is no code to execute them. The API gives access to the decoded SPARQL Update operations (INSERT, DELETE with or without DATA) and graph operations (CLEAR, DROP etc.). There is still more to do: the syntax will change in later drafts, and there is no API to stream triple inserts/deletes during parsing, which is needed to handle uploading and downloading large triple blocks. That would require a rewrite of the SPARQL parser to use a different technology than flex+bison (maybe lemon, maybe Ragel) as well as new APIs.
Rasqal has several things to finish for SPARQL 1.0 support (UNION and nested OPTIONALs don’t work) but the recent rewrite of the query engine internals should make other SPARQL 1.1 parts such as aggregate functions and nested queries, a lot easier to do than with the old query engine. I will probably remove the old query engine from the codebase soon.
The second substantial change is a set of APIs moved from private to public in rasqal.h to enable the construction of query result sets and query result set rows (rasqal_row) via the public API. This allows query results to be read from a syntax or constructed by API as well as serialized to result formats, without any query being executed. With this addition, Rasqal can provide SPARQL results syntax support for other applications that may have created query results via a different method. It can read query results from the SPARQL XML format (the standard format), and write or serialize them to SPARQL XML, SPARQL JSON, CSV, TSV and an ASCII Table format. This functionality is all available via Triplr where you can make HTTP GET URLs for saved queries.
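To give a feel for what reading one results format and writing another involves, here is a small sketch that converts SPARQL XML results into simple CSV. It uses only Python's standard library and does not call Rasqal; the function name and the sample data are invented for illustration, and the CSV layout is an assumption rather than Rasqal's exact output.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def sparql_xml_to_csv(xml_text):
    """Convert a SPARQL XML result set into CSV text (one row per result)."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variables)
    for result in root.findall("sr:results/sr:result", NS):
        row = dict.fromkeys(variables, "")  # unbound variables stay empty
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text or ""
        writer.writerow([row[v] for v in variables])
    return out.getvalue()


SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="name"/><variable name="page"/></head>
  <results>
    <result>
      <binding name="name"><literal>Rasqal</literal></binding>
      <binding name="page"><uri>http://example.org/rasqal</uri></binding>
    </result>
    <result>
      <binding name="name"><literal>Raptor</literal></binding>
    </result>
  </results>
</sparql>"""

csv_text = sparql_xml_to_csv(SAMPLE)
print(csv_text)
```

Note that this flattening discards whether a value was a URI, a literal or a blank node, and drops any datatype or language tag.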
The final change is in the area of resilience. The functions in the public API have been updated so that when invalid or NULL pointers are given, the functions return failure or NULL / false rather than try to use the pointer and probably crash. Hopefully I caught all of them. The release testing (as usual) included valgrind memory leak checking of all of the 100s of tests and there were no leaks or buffer overruns found.
This is also the first Rasqal release since switching to GIT as the source control for the Redland libraries so the source pointers have moved to git.librdf.org where details of how to check it out can be found.
So in summary, the main changes in this release are:
- 0.9.19: Fix rasqal.pc to Requires raptor again.
- Add initial draft parsing and API (NOT execution) support for SPARQL 1.1 Update W3C Working Draft of 2010-01-26.
- Add public APIs (row, results, result formatter, variables table) so that query results can be built, read and written without a query.
- Add API resilience checks for invalid NULL pointer arguments.
- Many other bug fixes and improvements were made.
Fixed Issues:
- 0000320: Add a void* user_data field to rasqal_variable
- 0000323: Official MIME Type for JSON isn’t text/json
- 0000343: Mime type for ‘table’ results format is text/plan
- 0000345: MIME Type and URI for TSV and CSV
- 0000347: rasqal linking fix
See the Rasqal 0.9.19 Release Notes for the full details of the changes.
Download: at http://download.librdf.org/source/rasqal-0.9.19.tar.gz
Talis News for Public Libraries February 2010
This month Talis is proud to announce its accreditation on the e4libraries scheme, recognising our ongoing commitment to electronic trading. We also bring you a couple of reports from across the sector on Web 2.0 and social software, as well as a round-up of public library news.
We welcome your feedback on any of the topics covered; email marketing@talis.com.
Alison Kershaw
Head of Products, Talis
News
Facebook as a library tool
Libraries, in the past few years, have begun to examine the possibilities that social networking sites like MySpace and Facebook provide as a tool for library awareness and marketing. This report examines reported versus actual use of Facebook in libraries to identify discrepancies between intended goals and actual use.
A guide to using Web 2.0 in libraries
SLIC and CILIPS have recently released a guide on using Web 2.0 in libraries. These guidelines have been provided to highlight the potential of social media within library services and to encourage organisations to reassess restrictive practices regarding access.
British Library to offer 65,000 ebooks for free
The British Library will make over 65,000 works of fiction from the 19th century available to download for free as ebooks. The project, funded by Microsoft, will initially make the ebooks available only to owners of the Amazon Kindle, although these will include the original typeface and illustrations. Paperback versions will also be available to purchase from Amazon and will look like the original 19th century first editions.
How the library world reacted to the iPad announcement
It is only a couple of weeks since the announcement of the iPad, but the library world has already had its say on the matter. The Digital Librarian comments on the potential of the iPad, whereas Talis’ own Tom Heath isn’t getting excited by it at all. The Bookseller reports on UK publishers hailing the iBook moment, but Scott Douglas doesn’t think the ‘Kindle Killer’ is particularly killer at all.
Talis gains e4libraries accreditation
Talis is now an accredited e4libraries supplier, under a scheme introduced by BIC. The accreditation acknowledges the strengths of Talis’ supply chain management suite, comprising Talis Gateway (which supports the full EDI procurement cycle) and Talis Keystone’s finance and CRM integration.
New Talis Prism catalogues
Two more public libraries have gone live with their Talis Prism 3 implementations. Leicestershire Libraries are now live with their new catalogue, running Prism 2 and Prism 3 in parallel. Most recently, Highlands Libraries launched Prism 3, also running it alongside Prism 2. Read more about these implementations on the Prism blog. In our latest blog post we have a report on January Prism 3 usage and an up-to-date list of Talis customers who have launched versions of Prism 3.
Events
Talis Decisions Open Day, Birmingham
The Talis Decisions Open Day is a free event taking place on Tuesday 20 April at the Talis office in Birmingham. Come and learn how to get the most out of your management reporting.
Talis Integration Open Days – Solving the integration conundrum
Taking place on 18 February and 6 May 2010 at the Talis offices, the day will explore ways in which our integration solution can deliver savings and service enhancements by linking with finance, CRM and identity management systems.
For further details and to register your free place visit our Integration events page.
Talis Open Day – Your library on the Web
The next in our successful series of Talis Open Days on 4 March 2010 focuses on how to optimise your library’s presence on the web. Join us at the Talis offices for this free event to discover how Talis products can promote your library among your users, and across the wider authority.
Improving efficiency within public libraries – an event with Bridgeall Libraries
The pressures facing public libraries to offer best value are putting even the most efficient library services in the UK under severe scrutiny. Innovative solutions that bring further efficiencies into the library service are therefore crucial at this time. Join us at this half-day session on 26 February 2010 to discover how smartsm, the unique and powerful stock lifecycle management solution, can help libraries to get more from their current stock holdings. To register or for more information, please email: brendan.pearce@smartsm.com.
Across the sector
A library with a difference
Lying on a beach reading a book is possibly one of life’s greatest pleasures. In Australia this was made easier than ever when Ikea set up a library on Bondi Beach giving sunbathers and surfers alike the chance to read whilst getting a tan.
Talis News for Academic Libraries February 2010
This month Talis is proud to announce its accreditation on the e4libraries scheme, recognising Talis’ ongoing commitment to electronic trading.
Our products are also moving forward – the Talis Decisions Universes are available for download, and Talis Assure 1.3 is progressing well through beta test. And we’d love to hear from you if you’re interested in beta testing the Talis Alto Client Release during March.
Alison Kershaw,
Head of Products
News from Talis
Talis gains e4libraries accreditation
Talis is now an accredited e4libraries supplier, under a scheme introduced by BIC. The accreditation acknowledges the strengths of Talis’ supply chain management suite, comprising Talis Gateway, which supports the full EDI procurement cycle, Talis Keystone finance and CRM system integration, and RFID interoperability.
University of Chichester goes live with Prism 3
The University of Chichester has gone live with Prism 3. The university will run it in parallel with Prism 2 for a short trial period, before moving to Prism 3 as its default catalogue. If you’d like to know more, a recent Talis Prism 3 development webinar is now available to view or download.
Talis Assure is in beta test
Talis Assure 1.3 beta test is making good progress in the three participating libraries, and is expected to be available on general release at the end of February.
Talis Alto Client Release – Call for beta testers
We are now working on a client-only release of Talis Alto, which will not involve a server upgrade. Libraries must already be running Talis Alto 5.0 to take this release. If your library is interested in beta testing this release during March, please contact Anne Stacey.
Upgrading to Talis Alto 5.0
Fourteen academic libraries have now upgraded to Talis Alto 5.0. We advise those customers thinking of upgrading during the Easter or summer holidays to contact their account manager to schedule a date.
New Alto 5 Decisions Universe
The latest release of Talis Decisions Universes, complementing Alto 5, is now available for download.
University of Manchester implements Talis Bridge Pro for Sorter
The University of Manchester has successfully installed Talis Bridge Pro for sorters. This has enabled them to implement 2CQR’s 7-bin sortation unit for processing self-returns at its John Rylands Library. In operation since December 2009, the unit is currently processing around 38,000 items per month, and is part of a broader initiative to convert the library ground floor into a social space.
Come and meet us at these events
Talis Decisions Open Day, Birmingham
The Talis Decisions Open Day is a free event taking place on Tuesday 20 April 2010 at the Talis office in Birmingham. Come and learn how to get the most out of your management reporting.
Talis Integration Open Day – Solving the integration conundrum
Taking place on 29 April 2010 at the Talis offices, the day will explore ways in which our integration solution can deliver savings and service enhancements by linking with finance, CRM and identity management systems.
For further details and to register your free place visit our Integration events page.
Authority Control workshop, Birmingham
Talis will be holding an Authority Control workshop on Wednesday 10 March 2010 at the Talis office in Birmingham. Come and share your practices with other libraries, and ensure that we’ll be meeting your ongoing requirements in this area. To register for this workshop, please email Talis Events.
LILAC 2010, Limerick
The popular annual LILAC conference, presenting new perspectives on information literacy, takes place this year at the Limerick Strand Hotel from 29th to 31st March 2010. As sponsor, Talis will be selecting the best paper in the Innovative Practice theme.
8th Annual JISC conference, London
JISC’s 8th annual conference will underline the imperative of integrating technology into all aspects of universities’ strategic planning to ensure survival. Visit our exhibition stand at the conference which takes place on 12-13 April 2010 at the Queen Elizabeth II Conference Centre, London.
UKSG 33rd Annual Conference and Exhibition, Edinburgh
UKSG’s 33rd annual conference, bringing together the library and the publishing communities, will take place from 12th to 14th April 2010.
Developments across the sector
How the library world reacted to the iPad announcement
- Talis’ Tom Heath isn’t in the least bit excited about the iPad
- Phil Bradley’s first thoughts on the iPad
- Roy Tennant on how the iPad knocks the Kindle into a cocked hat
- UK publishers hail the iBook moment according to Bookseller
- Early thoughts on the iPad from Library Web Chic
- Josh Greenberg on the implications of iPad and books in iTunes for libraries
- The Digital Librarian on the potential of the iPad despite its shortcomings
- Scott Douglas’ disappointment with the iPad
Perceptions report
In the Perceptions 2009 survey, the most popular library management system is available exclusively on a Software as a Service basis. The survey also contends that interest in open source library management systems is weak outside the community of early adopter libraries.
James Clay talks with Talis
In this podcast, Sarah Bartlett talks with James Clay from Gloucestershire College, ALT’s Learning Technologist of the Year, 2009. Among his responsibilities are 2 library sites, each attracting over 1,000 learners a day. It’s particularly useful to hear James characterise his students who will presently be making their way to university, in terms of their relationship to technology.
Horizon report
A brief summary of the 2010 Horizon report, covering emerging technologies in higher education, is available on the Talis Education blog.
Elsewhere in the blogosphere
- Good commentary from Eric Hellman on the acquisition of LibLime by PTFS
- James Clay nominates Classics as his iPhone App of the Week
- Marshall Breeding reports that King’s Fund goes live with Koha open source LMS
- Scott Leslie proposes a wikipedia OPAC mash-up
- Selina Lock’s 3 part write-up of the Innovations in Reference Management event at Open University
- Dorothea Salo on the implications of EBSCO’s exclusive access to several popular e-journals
- The e.corrado.us blog examines the relevance of the Horizon report to libraries
New Search Experience at Hakia
With today’s update at hakia.com, we are coming out of a period of silence during which we made several updates to our offerings on the Web and in enterprise search.
We worked on two elements of progress: (1) automation and (2) relevancy. In both cases, semantic technology is the enabler.
On the automation front, the new hakia.com brings 10 full sets of search results with a single click. You can see the quick progress as the segments come in. These search result segments include Web, Galleries, Credible sources, Pubmed, News, Blogs, Twitter, Wikipedia, Images, and Videos. (Twitter and Wikipedia will be available next week.)
Instead of displaying blurbs from such segments, which is a common practice today, we thought the user should have the full result set in one click, available to him/her for each search.
Although the process of displaying 10 segments may look slow, it is faster than doing 10 searches separately using any search engine. Furthermore, increased bandwidth and faster CPUs will make this step instantaneous in the near future.
For the minimalists, the SERP has accordion buttons (little triangles). You can choose what to view and what to hide by opening or closing the segments, as shown below. Your preferences are remembered the next time you search or visit hakia.com.
We believe that the future of search will shift from the domination of a single recipe to the presentation of different segments, almost like restaurants having different menus. Automation is the key to progress in this direction.
On the relevancy front, the relevancy of search results is elevated via our semantic technology at various levels depending on the segment. While Galleries have the highest level of semantic treatment, Credible, Pubmed, News, and Blogs have moderate levels of semantic treatment. All these segments are QDEXed content. The remaining segments receive a light level of semantic treatment, mostly on-the-fly, via our SemanticRank algorithm.
At hakia, we are also working on exciting real-time and enterprise search products where the impact of semantic technology is most visible. Stay tuned and expect related announcements in coming weeks.
Date and place of the ‘RDF Next Steps’ Workshop settled
A few weeks ago W3C announced the organization of an "RDF Next Steps" Workshop. At the time of the announcement the dates and the place of the Workshop were not settled yet.
They are now. The Workshop will take place on 26 and 27 June 2010, hosted by the National Center for Biomedical Ontology (NCBO) at Stanford University. Note that those dates fall on the weekend after the SemTech2010 conference, held nearby in San Francisco.
The call for papers for the Workshop has been updated, and also includes details on how to submit position papers.
Algorithmic recruitment with GitHub
In my new job in Berlin I’ve been asked to hire some people to help prototype new, secret projects. Berlin has a superb tech scene but as I’m new in town it’s taking me a little time to get to know everyone. While that’s going on, I wrote some code to help me explore Berlin’s developer community.
When I’m hiring, one of the things I always want to see is evidence of personal projects. Over the last two years, GitHub has become an amazing treasure trove of code, with the best social infrastructure I’ve ever seen on a developer site. GitHub profiles let the user set their location, so I started with a few web searches for Berlin developers. This finds hundreds of interesting people, but how do I prioritise them?
Another thing that I look for when building a good team is someone’s personal network. I’ve always believed strongly in spending lots of time at conferences meeting passionate people who are smarter than me. A good developer can make themselves even more productive by knowing who to email, IM or DM to answer a question when they’re stuck.
A recent article by Stowe Boyd on centrality and influence in social networks reminded me of some of the network analysis we use behind the scenes calculating recommendations for the Dopplr Social Atlas. So I wrote some code to query the GitHub API and analyse the social graph of the Berlin subset of their users.
The JRuby code uses Yahoo BOSS to do the web search. After querying the GitHub API for each user’s followers it builds an in-memory graph using the Java Universal Network/Graph Framework. Then it ranks each user node in the graph using the Betweenness Centrality algorithm. You can see the simple source code on my github.
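For readers who want to see what the ranking step does, here is a self-contained sketch of betweenness centrality using Brandes' algorithm. It is written in Python rather than JRuby, does not use the Java Universal Network/Graph Framework, and is not the code from my github. It treats follower links as undirected ties, which is an assumption made to keep the example short, and the four-person graph is invented.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each node to an iterable of its neighbours; every
    neighbour must itself appear as a key.
    """
    scores = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search, counting shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in reverse breadth-first order.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in scores.items()}


# Hypothetical follower graph: bob connects three otherwise unlinked people.
followers = {
    "alice": ["bob"],
    "bob": ["alice", "carol", "dave"],
    "carol": ["bob"],
    "dave": ["bob"],
}

scores = betweenness_centrality(followers)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], scores)
```

A node scores highly when many shortest paths between other people pass through it, which is why the metric favours connectors over people who simply have many followers.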
To sanity-check the results I ran it for a couple of cities I already know well: London and San Francisco. Here are the top 5 for each, which seem quite plausible to me:
San Francisco
- Chris Wanstrath, GitHub
- Tatsuhiko Miyagawa, Six Apart
- Leah Culver, Six Apart
- Square Inc
- Aman Gupta, ruby eventmachine maintainer
London
- James Darling
- London Ruby User Group
- Mark Norman Francis
- Dan Webb (recently moved to Twitter in SF)
- Carlos Villela, Thoughtworks
My choice of metric biases these lists towards connectedness and influence — it can’t measure ability. It’s only measuring GitHub users, and they are biased towards Ruby, Perl and Javascript. But seeing names there that I trust gives me confidence that it’ll help me find interesting people in Berlin.
Hopefully some of those people are reading this blog post right now. Others outside Berlin might be interested to know that Nokia does a superb job of relocating people, with everything taken care of by shipping companies and local agents. If you love the web, JavaScript, mobile, user experience, social networks, location, enormous datasets and currywurst, you should get in touch.
Update of the ‘vCard in RDF’ document
Announcing Our Upcoming MeetUp: Business Opportunities from Semantic Technologies with Author David Siegel
English, Knowledge Management, Knowledge Representation, Linked Data, Semantic Web, meetup, semantic web meetup
And we’re back! OpenCalais 4.3 is running on all servers
We are happy to report that we have resolved the bug that we identified in the initial 4.3 release, and that the new and improved OpenCalais 4.3 is up and running on all servers.
As a quick reminder, here are the new features and expanded capabilities of the OpenCalais service. As always, please let us know if you run into any issues or have any questions.
New in OpenCalais 4.3
Improved ‘Social Tags’: We are expanding on our popular Social Tags categorization technique by adding more generalized, aggregate tags.
For example, if a blogger is comparing the racing performance of sports cars like the Ferrari 308 GTB and Porsche 959, OpenCalais 4.3 will suggest auto racing and motorsport as Social Tags, in addition to the more obvious sports cars.
NEW! ‘News Names’: We are instituting a process of name normalization that represents a first step toward our more robust vision for person disambiguation. Whenever a partial or extended name appears in content, OpenCalais 4.3 will return the names it finds as usual, but will now also suggest the most commonly used form of that same name.
For example, for articles containing Barack Obama, Obama or Barack Hussein Obama, OpenCalais will suggest not only the partial or extended name it found, but also the more frequently used Barack Obama.
New Entities, Facts and Events in English, including:
- New Natural and Manmade Disaster attributes that reveal these disasters’ effects
- Supporting data for upcoming events that will enable OpenCalais to recognize new Movies, Music Albums, etc., as well as anticipated Medical Treatments
- More Political Events and new items such as Diplomatic Relations, Political Endorsements, Poll Results and Voting Results
- Enhanced Person Career extraction that includes political party affiliations where those are included in the text.
The 4.3 release also features improved Simple Format and Microformat outputs, as well as several extraction bug fixes. For technical details, please see the full release notes here.
3.0 insights: Windows Phone 7, Mobile World Congress in Barcelona and alternatives to Hulu
Filtering, evaluation, critical mediation of knowledge: we were talking recently about the concept of the Content Curator. The role will surely remind you of what we bloggers do when we offer what might be called “insights” or “highlights” (I prefer the former): quick, generative views (ones that inspire ideas or action) of what matters most in current technology.
Under the tag “3.0 insights” (“inspirations”) I will be posting what I come to think we need to know during this third decade of the web.
I am involved in several projects (in education, knowledge management and social media) for which I do this kind of updating: curating the latest content on trends from the main English-language publications and adapting it to specific contexts.
In fact, I believe that, although the fallacy that technology is not important is invoked so often, its evolution also shapes the uses we make of it, in an increasingly interesting and complex interaction between wanting (what we project from the social side) and being able (what technology progressively allows us to do). So “here we go”, in a new format, with what I consider some of the latest relevant topics or trends:
At present we can watch Hulu from outside the US by hiding our location. But that remains a “geek” trick that not everyone wants to, or is able to, pull off. For that reason, TVGorge recently launched a site that aggregates other sites but without that kind of restriction. There are doubts about whether it is legal, a question I think they address by stating that they respect third-party property rights (Hulu, CBS, Tv.com, TVDuck, TVGuide, etc.).
Whatever the case, and however long it lasts, the quantity and quality of the content this mashup aggregates is impressive: all or nearly all episodes of internationally successful series such as Californication, 30 Rock, Heroes, Lost, CSI, Mad Men, Grey’s Anatomy and The Simpsons, and up to 128 TV shows.
(Via Download Squad)
- Windows Phone 7 (with Windows 7 for mobile devices) will be presented at the Mobile World Congress (formerly known as the 3GSM World Congress). It takes place in Barcelona from 15 to 18 February and is one of the most important international trade fairs in the mobile phone industry.
More interactive and participatory applications, a boost for user-generated mobile content, and synergies with television could be the most notable developments, along with new devices such as the new Microsoft Windows phone. This is a shift by software makers towards hardware, probably driven by the trend towards services rather than mobile applications. The old struggle, the old utopia of open data and APIs and interoperability between platforms, is starting to matter, as Vinton Cerf points out (we can also see it in iPad and iProp and the generative web).
Finally, you may be interested in the videos from the 2009 edition on the Mobile World Live channel. There is a summary in this video:
Happy Sunday.
3.0 insights, Net-art, curiosities on the web, Planeta educativo, Spanish, conferences, content curator, innovation, lifestreaming, mobile devices, augmented reality, mobile telephony, video documentaries, web3.0
























