Thursday, April 02, 2009

New Bio2RDF query services

The 0.3 release adds the ability to link to license providers, so the license that applies to a namespace can be found by following a URL. The URL syntax for this is /license/namespace:identifier. It was easier to require the identifier than to make it optional. So far the identifier portion is not used, so it merely has to be present for the URL to resolve, but in future different licenses could be returned depending on the identifier, which is useful for databases that are not released entirely under a single license.
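As a rough illustration of the URL pattern described above, here is a minimal Python sketch. The function name, the base address http://localhost:8080 and the geneid:4129 example are placeholders chosen for illustration, not part of the release.

```python
def license_url(base, namespace, identifier):
    """Build a /license/namespace:identifier URL.

    `base` is a hypothetical server address. The identifier is not yet
    used by the service, but it must be present for the URL to resolve.
    """
    if not namespace or not identifier:
        raise ValueError("both namespace and identifier are required")
    return f"{base.rstrip('/')}/license/{namespace}:{identifier}"


if __name__ == "__main__":
    # Placeholder base address and identifier, for illustration only.
    print(license_url("http://localhost:8080", "geneid", "4129"))
```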

We provide countlinks and countlinksns, which count the number of reverse links to a particular namespace and identifier, either from all namespaces or from within a given namespace respectively. Currently these only work on Virtuoso endpoints because they rely on aggregation extensions to SPARQL. The URL syntax is /countlinks/namespace:identifier and /countlinksns/targetnamespace/namespace:identifier.
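The two URL forms can be sketched in Python as follows. The helper names, the base address and the example namespaces are assumptions made for illustration. The sketch takes targetnamespace to be the namespace the links come from, as the description above suggests.

```python
def countlinks_url(base, namespace, identifier):
    """URL for counting reverse links to namespace:identifier from all namespaces."""
    return f"{base.rstrip('/')}/countlinks/{namespace}:{identifier}"


def countlinksns_url(base, target_namespace, namespace, identifier):
    """URL for counting reverse links to namespace:identifier
    from within target_namespace only."""
    return (
        f"{base.rstrip('/')}/countlinksns/"
        f"{target_namespace}/{namespace}:{identifier}"
    )


if __name__ == "__main__":
    base = "http://localhost:8080"  # hypothetical server address
    print(countlinks_url(base, "geneid", "4129"))
    print(countlinksns_url(base, "uniprot", "geneid", "4129"))
```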

There is also the ability to count the number of triples in each SPARQL endpoint that point to a given Bio2RDF URI (or its equivalent identifier for non-Bio2RDF SPARQL endpoints). This is available at /counttriples/namespace:identifier.

We also provide search and searchns, which run text searches using SPARQL, either globally or within a particular namespace. These searches aren't currently linked to the rdfiser search pages, which may be accessed using certain searchns URIs. All searches use the Virtuoso full-text search mechanism, i.e. bif:contains. Other SPARQL endpoints haven't been implemented yet, even with regex, because regex matching is reasonably slow, but it would be simple to construct such a query if people thought it was necessary. The URL syntax is /search/searchTerm and /searchns/targetnamespace:searchTerm.
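For readers unfamiliar with bif:contains, the following Python sketch shows the general shape of a Virtuoso full-text query. It is not the query the search service actually sends: the function name, the triple pattern and the default limit are assumptions, and search terms containing quotes or backslashes are rejected rather than escaped.

```python
def fulltext_query(term, limit=50):
    """Build a SPARQL query string using Virtuoso's bif:contains.

    Illustrative only: the pattern matches any literal object whose
    text contains `term`, and returns the subject and the literal.
    """
    if not term.strip():
        raise ValueError("search term must not be empty")
    if any(ch in term for ch in "\"'\\"):
        raise ValueError("quotes and backslashes are not supported in this sketch")
    return (
        "SELECT DISTINCT ?s ?o WHERE { "
        "?s ?p ?o . "
        # The term is wrapped in single quotes inside the string literal
        # so that multi-word terms are treated as a phrase.
        f"?o bif:contains \"'{term}'\" . "
        "} "
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(fulltext_query("kinase", limit=10))
```

A regex-based fallback for other endpoints would replace the bif:contains pattern with a FILTER regex(...) clause, at the cost of the slower matching mentioned above.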

The coverage of each of these queries over the current Bio2RDF namespaces can be found here.

If anyone regularly runs queries on biology-related databases (possibly already written in SPARQL) that could be parameterised or turned into Pipes, it would be great to include them in future distributions for others to use.
