[OTDev] Dataset RDF

Christoph Helma helma at in-silico.de
Wed Dec 2 20:05:58 CET 2009


Dear all,

I have tried to implement Nina's proposal for the dataset RDF, but found
the solution with ot:dataEntry not very straightforward to use. In fact
I would prefer simple "compound, feature, value" triples in the datasets
(this also seems to be easier for merging datasets from different
origins).

What we want within a dataset service is to manage a set of graphs
(every dataset is a graph) and to attach metadata (source, identifier,
title, ...) to each graph (dataset). So a collection of datasets could
look like this in N3:

# data
dataset1 {
	compound1, feature, value1.
	compound2, feature, value2.
	...
}

dataset2 {
	compound1, feature, value1.
	compound2, feature, value2.
	...
}
...

# metadata
dataset1, owl:title, title1.
dataset1, owl:source, source1.
...
dataset2, owl:title, title2.
...

A quick search showed that some solutions have been proposed for such a
situation (e.g. named graphs, contexts, quads).  My impression is that
storage is not a problem (Redland e.g. has contexts), but RDF/XML does
not support the straightforward serialization of named graphs (N3 does,
and several extensions of RDF formats have been proposed, e.g. TriG and
TriX).
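
To make the named graph / context idea concrete, here is a minimal sketch
in plain Python (standard library only). It is not Redland's API and not a
proposal for the implementation; the class QuadStore and its methods add,
describe and merge are hypothetical names chosen for illustration. Each
dataset name acts as a context holding a set of triples, and metadata are
triples whose subject is the dataset name itself.

```python
from collections import defaultdict


class QuadStore:
    """Hypothetical in-memory store: one set of triples per dataset name."""

    def __init__(self):
        # context (dataset name) -> set of (compound, feature, value) triples
        self.graphs = defaultdict(set)
        # triples whose subject is a dataset name, kept apart from the data
        self.metadata = set()

    def add(self, graph, compound, feature, value):
        """Add one data triple to the named graph."""
        self.graphs[graph].add((compound, feature, value))

    def describe(self, graph, predicate, obj):
        """Attach one metadata triple to the named graph."""
        self.metadata.add((graph, predicate, obj))

    def merge(self, *names):
        """Merge several datasets: a plain set union of their triples."""
        merged = set()
        for name in names:
            merged |= self.graphs.get(name, set())
        return merged


store = QuadStore()
store.add("dataset1", "compound1", "feature", "value1")
store.add("dataset1", "compound2", "feature", "value2")
store.add("dataset2", "compound1", "feature", "value1")
store.add("dataset2", "compound3", "feature", "value3")
store.describe("dataset1", "title", "title1")

merged = store.merge("dataset1", "dataset2")
```

With flat triples, merging datasets from different origins reduces to a
set union, so the triple shared by both datasets is stored only once in
the merged result.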

To ensure interoperability with other systems I would prefer to stick
to RDF/XML despite its limitations (although N3, Turtle or JSON would
fit my taste better).

My first idea to circumvent the extensive use of blank nodes is to
solve the problem on the API level (as it was previously suggested by
Nina):

GET /dataset/{id}             would return the plain dataset (without metadata)
GET /dataset/{id}/metadata    would return the metadata

This would allow us to exchange datasets that contain mainly "compound,
feature, value" triples in a simple standard format without losing the
possibility to obtain dataset metadata.

It would also allow algorithm and model services to request data only,
while applications could work with metadata alone.
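
The split between the two URIs could be sketched roughly as follows. This
is only an illustration in plain Python, not an actual service: the
DATASETS dictionary, the ROUTE pattern and the get function are
hypothetical, and a real service would serialize the returned triples as
RDF/XML instead of returning Python tuples.

```python
import re

# Hypothetical in-memory content; a real service would read from an RDF store.
DATASETS = {
    "1": {
        "data": [
            ("compound1", "feature", "value1"),
            ("compound2", "feature", "value2"),
        ],
        "metadata": [("title", "title1"), ("source", "source1")],
    },
}

# Matches /dataset/{id} and /dataset/{id}/metadata, nothing else.
ROUTE = re.compile(r"^/dataset/(?P<id>[^/]+)(?P<meta>/metadata)?$")


def get(path):
    """Return (status, triples) for a GET request on the given path."""
    match = ROUTE.match(path)
    if match is None or match.group("id") not in DATASETS:
        return 404, []
    dataset = DATASETS[match.group("id")]
    if match.group("meta"):
        # Metadata triples use the dataset URI itself as their subject.
        subject = "/dataset/" + match.group("id")
        return 200, [(subject, p, o) for p, o in dataset["metadata"]]
    # Plain dataset: only "compound, feature, value" triples.
    return 200, list(dataset["data"])


data_status, data = get("/dataset/1")
meta_status, meta = get("/dataset/1/metadata")
```

Here a model service would call only get("/dataset/1"), while an
application listing datasets would call only get("/dataset/1/metadata").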

What do you think?
Christoph

PS The RDF specifications for models and algorithms worked well so far.


