Tuesday, September 12, 2006

Try querying the Bio2RDF 50-million-triple store with SeRQL

It is now possible to query the Bio2RDF triple store using Sesame's SeRQL query engine. The current triple store contains all annotations for human and mouse from UniProt, Affymetrix and GeneID. It also contains all GO term definitions and OMIM disease descriptions. It holds 50 million triples, and the native RDF store in Sesame weighs 3 GB. Watch how fast it is! Thanks to the Sesame team for their great work.

The following question illustrates the power of this semantic web technology:

What are the most significant GO terms associated with Paget disease according to UniProt and GeneID annotations, using Affymetrix annotations as a cross-reference table?
This is the corresponding SeRQL query:

select distinct b1, u1, g1
from
{<http://bio2rdf.org/omim:602080>} <http://bio2rdf.org/bio2rdf#xOMIM> {b},
{b} rdfs:label {b1},
{b} <http://bio2rdf.org/bio2rdf#xOMIM> {c},
{d} <http://bio2rdf.org/affymetrix#xOMIM> {c},
{d} <http://bio2rdf.org/affymetrix#xSwissProt> {u},
{u} rdfs:label {u1},
{u} <http://bio2rdf.org/uniprot#xGO> {g},
{g} rdfs:label {g1}
union
select distinct b1, u1, g1
from
{<http://bio2rdf.org/omim:602080>} <http://bio2rdf.org/bio2rdf#xOMIM> {b},
{b} rdfs:label {b1},
{b} <http://bio2rdf.org/bio2rdf#xOMIM> {c},
{d} <http://bio2rdf.org/affymetrix#xOMIM> {c},
{d} <http://bio2rdf.org/affymetrix#xEntrez_Gene> {u},
{u} rdfs:label {u1},
{u} <http://bio2rdf.org/bio2rdf#xGO> {g},
{g} rdfs:label {g1}
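
The two halves of the union above differ only in two predicates: the Affymetrix cross-reference (xSwissProt or xEntrez_Gene) and the GO link (uniprot#xGO or bio2rdf#xGO). If you want to ask the same question about another OMIM entry, a minimal Python sketch like the one below could generate the query text from a single template. The names BRANCH_TEMPLATE, BRANCHES and build_query are illustrative assumptions, not part of Bio2RDF or Sesame; the predicate URIs are copied from the query above.

```python
# Illustrative helper, not part of Bio2RDF or Sesame.
# Doubled braces ({{ }}) are literal SeRQL braces; single braces are
# placeholders filled in by str.format.
BRANCH_TEMPLATE = """select distinct b1, u1, g1
from
{{<{disease}>}} <http://bio2rdf.org/bio2rdf#xOMIM> {{b}},
{{b}} rdfs:label {{b1}},
{{b}} <http://bio2rdf.org/bio2rdf#xOMIM> {{c}},
{{d}} <http://bio2rdf.org/affymetrix#xOMIM> {{c}},
{{d}} <{xref}> {{u}},
{{u}} rdfs:label {{u1}},
{{u}} <{go}> {{g}},
{{g}} rdfs:label {{g1}}"""

# (Affymetrix cross-reference predicate, GO predicate) for each branch,
# taken from the query in this post.
BRANCHES = [
    ("http://bio2rdf.org/affymetrix#xSwissProt",
     "http://bio2rdf.org/uniprot#xGO"),
    ("http://bio2rdf.org/affymetrix#xEntrez_Gene",
     "http://bio2rdf.org/bio2rdf#xGO"),
]


def build_query(disease_uri):
    """Return the two-branch SeRQL union query for one disease URI."""
    parts = [
        BRANCH_TEMPLATE.format(disease=disease_uri, xref=xref, go=go)
        for xref, go in BRANCHES
    ]
    return "\nunion\n".join(parts)


if __name__ == "__main__":
    print(build_query("http://bio2rdf.org/omim:602080"))
```

Calling build_query with the Paget disease URI reproduces the text of the query shown above, which can then be pasted into the query form.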


Here is where to submit it (click on the SeRQL-S menu option):


Export the results in tabular format and analyse them with your favorite spreadsheet's pivot table tool.
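
If you would rather script the pivot step than use a spreadsheet, here is a minimal Python sketch. It assumes the export is a tab-separated file whose header row carries the query's variable names (b1, u1, g1); the actual export layout may differ, so adjust the column names accordingly. It counts how many distinct proteins or genes (u1) point to each GO term (g1), which is only a rough proxy for "most significant". The function name and the sample rows are made up for illustration and are not real Bio2RDF output.

```python
import csv
import io
from collections import defaultdict


def count_genes_per_go_term(tsv_text):
    """Return (GO label, number of distinct u1 values) pairs, largest first."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    genes_by_term = defaultdict(set)
    for row in reader:
        # A set ignores repeated (u1, g1) pairs, e.g. the same pair
        # reached through several Affymetrix probes.
        genes_by_term[row["g1"]].add(row["u1"])
    counts = [(term, len(genes)) for term, genes in genes_by_term.items()]
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Made-up rows for illustration only.
SAMPLE = (
    "b1\tu1\tg1\n"
    "disease X\tprotein A\tterm 1\n"
    "disease X\tprotein B\tterm 1\n"
    "disease X\tprotein A\tterm 2\n"
)

if __name__ == "__main__":
    for term, n in count_genes_per_go_term(SAMPLE):
        print(f"{term}\t{n}")
```

To run it on a real export, read the file's contents into a string and pass that to count_genes_per_go_term in place of SAMPLE.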


Possibilities are limitless. Enjoy.