[OTDev] Dataset RDF

Wed Dec 2 20:30:40 CET 2009

Dear Christoph, All,

Christoph Helma wrote:
> Dear all,
>
> I have tried to implement Nina's proposal for the dataset RDF, but found
> the solution with ot:dataEntry not very straightforward to use. In fact
> I would prefer simple "compound, feature, value" triples in the datasets
> (this seems to be also easier for merging datasets from different
> origins).
>   
Could you explain in more detail what the issues are? How do the
alternative options make merging better or worse? The Java implementation
was straightforward, and there are already running services. Merging data
is one of the strengths of RDF; nothing extraordinary is needed.
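
As a minimal sketch of why merging needs nothing special: if each dataset
is treated as a set of (subject, predicate, object) triples, merging is a
set union. The plain-tuple representation and every identifier below are
illustrative assumptions, not the API of any OpenTox service; a real RDF
store additionally keeps blank nodes from different sources apart.

```python
# Illustrative only: RDF graphs modelled as Python sets of
# (subject, predicate, object) tuples. All identifiers are hypothetical.

def merge(*graphs):
    """Merge graphs by set union; duplicate triples collapse automatically."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

origin_a = {
    ("ex:entry1", "ex:compound", "ex:compound1"),
    ("ex:entry1", "ex:feature", "ex:feature1"),
    ("ex:entry1", "ex:value", "value1"),
}
origin_b = {
    ("ex:entry1", "ex:compound", "ex:compound1"),  # also stated in origin_a
    ("ex:entry2", "ex:compound", "ex:compound2"),
    ("ex:entry2", "ex:feature", "ex:feature1"),
    ("ex:entry2", "ex:value", "value2"),
}

merged = merge(origin_a, origin_b)
```

The statement shared by both origins appears only once in `merged`, and
neither input graph is modified.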

Please note that one cannot have simple "compound, feature, value" triples
in RDF and still be able to relate features to other ontologies, which was
one of the most important reasons for moving to RDF.
> What we want within a dataset service is to manage a set of graphs
> (every dataset is a graph) and to attach metadata (source, identifier,
> title, ...) to each graph (dataset). So a collection of datasets could
> look like this in N3:
>
> # data
> dataset1 {
> 	compound1, feature, value1.
> 	compound2, feature, value2.
> 	...
> }
>
>   
The problem here, as already explained, is that a feature cannot be both
a predicate and a resource. Thus, we can't assign other properties to the
feature, nor relate it to objects from other ontologies.
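
A rough sketch of the alternative, where the feature is a resource rather
than a predicate. Triples are again plain tuples; only ot:dataEntry is
taken from the proposal under discussion, while ot:compound, ot:values,
ot:feature, ot:value, ex:relatedTo and the blank node labels are
hypothetical names chosen for illustration.

```python
# Illustrative only: each data entry is a node linking a compound to a
# (feature, value) pair, so the feature stays an ordinary resource.
# Every name except ot:dataEntry is a hypothetical placeholder.

def entry_triples(dataset, compound, feature, value, n):
    """Return the triples for one data entry, using blank nodes numbered n."""
    entry = f"_:entry{n}"
    fv = f"_:fv{n}"
    return [
        (dataset, "ot:dataEntry", entry),
        (entry, "ot:compound", compound),
        (entry, "ot:values", fv),
        (fv, "ot:feature", feature),
        (fv, "ot:value", value),
    ]

graph = []
graph += entry_triples("dataset1", "compound1", "feature1", "value1", 1)
graph += entry_triples("dataset1", "compound2", "feature1", "value2", 2)

# Because feature1 is a resource, further statements can be made about it.
graph.append(("feature1", "ex:relatedTo", "other:SomeConcept"))

def values_for(graph, feature):
    """Return (compound, value) pairs recorded for the given feature."""
    fv_nodes = {s for s, p, o in graph if p == "ot:feature" and o == feature}
    result = []
    for fv in sorted(fv_nodes):
        value = next(o for s, p, o in graph if s == fv and p == "ot:value")
        entry = next(s for s, p, o in graph if p == "ot:values" and o == fv)
        compound = next(
            o for s, p, o in graph if s == entry and p == "ot:compound"
        )
        result.append((compound, value))
    return result

def annotations_of(graph, resource):
    """Return (predicate, object) pairs whose subject is the resource."""
    return [(p, o) for s, p, o in graph if s == resource]
```

Here `values_for(graph, "feature1")` recovers the "compound, feature,
value" view, while `annotations_of(graph, "feature1")` shows the extra
statement attached to the feature itself.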

> dataset2 {
> 	compound1, feature, value1.
> 	compound2, feature, value2.
> 	...
> }
> ...
>
> # metadata
> dataset1, owl:title, title1.
> dataset1, owl:source, source1.
> ...
> dataset2, owl:title, title2.
> ...
>
> A quick search showed that some solutions have been proposed for such a
> situation (e.g. named graphs, contexts, quads).  My impression is that
> storage is not a problem (Redland e.g. has contexts), but RDF/XML does
> not support the straightforward serialization of named graphs (N3 does,
> and several extensions of RDF formats have been proposed (TriG, TriX)).
>
>   
IMHO all serializations are equivalent, and if something can be
described in RDF, it can be serialized in any of its formats.
Otherwise, it is just not valid RDF (one of the reasons is explained above).

> To ensure interoperability with other systems I would prefer to stick
> to RDF/XML despite its limitations (although N3, Turtle or JSON would
> fit my taste better).
>   
Serialization is completely irrelevant; the first step is a valid data
model.
> My first idea to circumvent the extensive use of blank nodes is to
> solve the problem on the API level (as it was previously suggested by
> Nina):
>
> GET /dataset/{id}            would return the plain dataset (without metadata)
> GET /dataset/{id}/metadata   would return the metadata
>   
This is independent of the proposed RDF representation of objects.
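
To illustrate that independence, here is a minimal sketch of how a service
could tell the two quoted URLs apart before deciding which triples to
return. The function name, the regular expression and the returned labels
are assumptions made for this example only; nothing here reflects an
existing implementation.

```python
import re

# Hypothetical routing helper for the two URL patterns quoted above:
#   /dataset/{id}           -> data only
#   /dataset/{id}/metadata  -> metadata only
_ROUTE = re.compile(r"^/dataset/(?P<id>[^/]+)(?P<meta>/metadata)?$")

def route(path):
    """Return (kind, dataset_id), or None if the path matches neither pattern."""
    match = _ROUTE.match(path)
    if match is None:
        return None
    kind = "metadata" if match.group("meta") else "data"
    return (kind, match.group("id"))
```

Whatever RDF model is chosen, the same split applies: the handler for
"data" would serialize the dataset triples and the handler for "metadata"
the descriptive triples about the dataset.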
> This would allow us to exchange datasets that contain mainly "compound,
> feature, value" triples in a simple standard format without losing the
> possibility to obtain dataset metadata.
>   
It would be best if you could design an RDF data model with Protege or
another modeling tool, implementing your ideas. Otherwise, there is
not much sense in using RDF if it cannot be imported into
triple stores and queried.

> It would also allow algorithm and models services to request data only,
> while applications could work with metadata alone. 
>
> What do you think?
>   
I don't mind changing the proposal, but I would insist on a valid data
model, designed with a modeling tool and serializable to any of the
RDF formats, so that any of the available RDF tools can be used.

Otherwise we are just designing another custom format in RDF
disguise.

> Christoph
>
> PS The RDF specifications for models and algorithms worked well so far.
>   
These rely on Features being resources, not predicates, so it is
not clear how that fits with the "compound, feature, value" proposal.

Best regards,
Nina