[OTDev] Dataset RDF

Christoph Helma helma at in-silico.de
Thu Dec 3 19:21:52 CET 2009


Dear Nina, All,
> > Maybe some of my confusion arises also from the fact that
> >
> > - I have to insert triples (and create anonymous nodes) "by hand" with
> >   Redland (AFAIK there is no automated mechanism to create more complex
> >     statements - but the documentation is very sketchy)
> >   
> Same for other languages - I have put some examples last days at
> http://opentox.org/data/documents/development/RDF%20files/JavaOnly/JenaExamples,
> these should be more or less similar for all languages.

I will have a look at the examples - JavaOnly did put me off ;-)

> 
> > - I have problems to translate the syntactic sugar of your examples into
> >   bare-bones triples
> >   
> Well, this is a good point, I can add examples in NTriple format.  
> Personally, I switch into "triple" mode in Protege to examine triples.

I will try that (although I prefer to work with a text editor).
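
To make "bare-bones triples" concrete, here is a minimal sketch in plain
Python (no RDF library) that expands prefixed names into full URIs and prints
N-Triples lines for two typed-literal statements. The namespace URIs for
dsstox and lazar and the compound id are assumptions for illustration only;
the thread does not define them.

```python
# Hypothetical namespace URIs; the thread never expands these prefixes.
PREFIXES = {
    "dsstox": "http://example.org/dsstox#",
    "lazar": "http://example.org/lazar#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(term):
    """Expand a prefixed name such as dsstox:MultiCellCall to a full URI."""
    prefix, local = term.split(":", 1)
    return PREFIXES[prefix] + local


def ntriple(subject, predicate, lexical, datatype):
    """Format one statement with a typed literal as an N-Triples line."""
    return '<%s> <%s> "%s"^^<%s> .' % (
        subject, expand(predicate), lexical, expand(datatype))


# Hypothetical compound URI standing in for the {id1} placeholder.
compound = "http://myservice/compound/1"

lines = [
    ntriple(compound, "dsstox:MultiCellCall", "true", "xsd:boolean"),
    ntriple(compound, "lazar:MultiCellCallPredicted", "true", "xsd:boolean"),
]
print("\n".join(lines))
```

Each output line is one complete subject/predicate/object statement with no
prefixes or nesting, which is what a text editor workflow needs.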

> > I am proposing two things to reduce the complexity:
> >   
> 
> > The most straightforward solution to handle sets of graphs (i.e.
> > multiple datasets) is to use named graphs, context, quadruples (you name
> > it - the concepts are more or less the same). Most RDF
> > libraries/datastores support this, but it is not straightforward to
> > express these concepts in RDF/XML. Instead of using a workaround that
> > complicates things, I would suggest to let the dataset service handle
> >   
> It is the recommended way to create data models with triples, one could
> model lot more complicated things with simple predicate logic...
> > datasets (see my previous post). The beneficial side effect is, that
> > we can simplify the RDF model to a large extent. The first dataset
> >   
> Thus we simplify the syntax, at the expense of losing an essential
> functionality, which was the original reason to use RDF.

I am not sure if this is true, as it is easy to merge RDF from different
origins (i.e. datasets) on demand.
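
What I mean by merging on demand, as a minimal sketch: if each dataset is
treated as a set of (subject, predicate, object) tuples, merging is a set
union. This uses plain Python tuples rather than Redland, and all URIs are
hypothetical. It assumes the graphs contain no anonymous nodes; those would
have to be renamed apart before merging.

```python
def merge(*graphs):
    """Return the union of several graphs; duplicate triples collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


# Hypothetical URIs, used only for illustration.
COMPOUND = "http://myservice/compound/1"
MEASURED = "http://example.org/dsstox#MultiCellCall"
PREDICTED = "http://example.org/lazar#MultiCellCallPredicted"

dataset_a = {(COMPOUND, MEASURED, "true")}
dataset_b = {
    (COMPOUND, MEASURED, "true"),    # same statement as in dataset_a
    (COMPOUND, PREDICTED, "true"),
}

merged = merge(dataset_a, dataset_b)
print(len(merged))  # 2: the shared statement is stored only once
```

Note that the merged set no longer records which dataset a statement came
from; that information has to be kept elsewhere (e.g. by the dataset service).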

> Regarding the quads, IMHO, it complicates the setup, because we can't
> use the most popular serialization formats and not all libraries have
> support for contexts.

Yes, that is my point (although I would prefer them if they were more
"standard").

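For readers unfamiliar with quads: a quad is a triple plus a fourth element
naming the graph (context) it belongs to. The following minimal sketch, in
plain Python with hypothetical URIs and no RDF library, groups quads back
into one triple set per dataset.

```python
from collections import defaultdict


def graphs_by_context(quads):
    """Group (s, p, o, context) quads into one set of triples per context."""
    result = defaultdict(set)
    for s, p, o, context in quads:
        result[context].add((s, p, o))
    return dict(result)


# Hypothetical dataset and feature URIs, used only for illustration.
DS1 = "http://myservice/dataset/1"
DS2 = "http://myservice/dataset/2"
COMPOUND = "http://myservice/compound/1"

quads = [
    (COMPOUND, "http://example.org/dsstox#MultiCellCall", "true", DS1),
    (COMPOUND, "http://example.org/lazar#MultiCellCallPredicted", "true", DS2),
]

graphs = graphs_by_context(quads)
print(sorted(graphs))
```

The fourth element keeps dataset membership without any extra predicate in
the model, but it is exactly the part that RDF/XML cannot express directly.
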
> And we have a rather simple data structure (set with some structured
> entries within), which needs just one additional predicate to be modeled
> without involving named graphs.
> 
> In fact I've tried couple of times to simplify the current proposal in
> Protege, but without success. This is just a non-binary relationship,
> which can't be modeled with single predicate.  One can try using
> rdfs:Containers for dataset, instead of predicate relating dataset and
> dataentry, but this results in going into OWL-Full  language, where
> automatic reasoning is much harder than OWL-DL. 
> Advice from experts is highly appreciated. 
> > example can e.g. be rewritten without any loss of information as
> >
> > # multiple features/compound, simple features
> >     <http://myservice/compound/{id1}> dsstox:MultiCellCall "true"^^xsd:boolean .
> >     <http://myservice/compound/{id1}> lazar:MultiCellCallPredicted "true"^^xsd:boolean .
> >
> > (assuming that dsstox:MultiCellCall, lazar:MultiCellCallPredicted
> > provides the feature definitions).
> >   
> This is what I am trying  to tell since a while - the assumption is
> wrong. One can't mix predicates and objects.  Once you have used
> dsstox:MultiCellCall in the place of predicate (property), it can't be
> considered a resource anymore, you can't have statements
> dsstox:MultiCellCall  owl:sameAs something, nor dsstox:MultiCellCall 
> dc:title "something" nor  dsstox:MultiCellCall  ot:units "something" .
> You can't relate this feature to Models, Validation objects, etc.

This is the point I did (do) not understand. I had assumed that
subjects, predicates and objects are URIs and that it is OK to make
statements about predicate URIs by inserting them as subjects. I have
also tried to make statements about predicate URIs in Redland without
getting errors. I do not know, however, whether this has any effect on
querying/reasoning.
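
A minimal sketch of what I tried, written with plain Python tuples instead of
Redland: the same URI appears once in predicate position and once in subject
position. The feature URI and the title string are hypothetical. The sketch
only shows that such statements can be stored and looked up at the triple
level; it says nothing about how an OWL-DL reasoner would treat them.

```python
# Hypothetical URIs and title, used only for illustration.
COMPOUND = "http://myservice/compound/1"
FEATURE = "http://example.org/dsstox#MultiCellCall"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

triples = {
    (COMPOUND, FEATURE, "true"),             # FEATURE used as predicate
    (FEATURE, DC_TITLE, "Multi cell call"),  # FEATURE used as subject
}


def used_as_predicate(uri, graph):
    """True if the URI occurs in predicate position of any triple."""
    return any(p == uri for _, p, _ in graph)


def describe(uri, graph):
    """Return all (predicate, object) pairs stated about the URI."""
    return {(p, o) for s, p, o in graph if s == uri}


print(used_as_predicate(FEATURE, triples))
print(describe(FEATURE, triples))
```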

> If we go this direction, we simply abandon the power of RDF/OWL
> (querying, reasoning) for features/datasets and are treating it as pure
> serialization format, not much different than ARFF or  MS Excel.  We
> could have stayed with XML as well and not lose couple of months for
> educating ourselves.
> 
> If it is fine for other partners, OK.  Implementation-wise there is not
> problem for ambit, I am not changing the internal structures anyway,
> just adding more code to generate different serializations.  But we just
> lose lot of nice querying options , ability to linking to external
> ontologies, etc.

If we move to RDF, we definitely should not treat it as just another
serialization format, but use these facilities. Maybe you can explain in
more detail why it is impossible to make statements about URIs that
have been used as predicates (just for my own understanding of RDF).

Best regards,
Christoph


