[OTDev] Experiments with RDF

Nina Jeliazkova jeliazkova.nina at gmail.com
Fri Oct 1 07:32:36 CEST 2010


Hi Pantelis, All,


On Thu, Sep 30, 2010 at 9:33 PM, chung <chvng at mail.ntua.gr> wrote:

> Hi all,
> During a round (rectangular, in fact) table discussion in Rhodes we
> questioned the efficiency of web services based on RDF and in particular
> its OWL-DL variant. I gathered some statistics using ToxOtis while
> experimenting with downloading and parsing datasets. We also tested
> the performance of ToxOtis in converting dataset objects into Weka
> objects (weka.core.Instances); the latter are useful to users of Weka.
> These are preliminary results and we must not jump to conclusions, but
> we can start a discussion around some performance issues. Java
> developers may use ToxOtis as a kind of client-profiler for their
> services. Find attached a draft report that attempts to correlate the
> size of a dataset with the computational effort needed for its parsing.
>


Would it be possible to run further experiments - in particular:

- Split the reported time into the time needed to download the RDF
representation from the server and the time needed to parse and load the
RDF as a Jena object. The reason for asking is that these two parts can be
optimized by different approaches (minimizing file size by prefixing or
compression for the former, and exploiting different Jena storage models for
the latter).

- Report the time to parse RDF into different in-memory Jena models (the ones
from
http://jena.sourceforge.net/javadoc/com/hp/hpl/jena/ontology/OntModelSpec.html;
I am not sure which one is being used in the tests now).

- Report timings using a slightly different approach to converting to Weka
instances, namely: retrieve the URIs of the compounds first, and then retrieve
the features for each compound in subsequent calls.

- Report timings when using Jena persistent storage instead of the in-memory
one (http://openjena.org/TDB/, http://openjena.org/SDB/).
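
To make the first point concrete, here is a rough sketch of how the two
phases could be timed separately. It is only an illustration and not how
ToxOtis does it: the class SplitTiming, the Parser interface and the method
names are hypothetical. The idea is to read the whole response into memory
first, so that the parse timing does not include any network wait. With Jena,
the parser passed in would presumably create a model and call
Model.read(InputStream, String) on the buffered bytes; the demo below uses a
trivial byte-counting parser so that it runs with the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;

// Hypothetical helper: times "download" and "parse" as two separate phases.
class SplitTiming {

    /** Parser callback; with Jena this is where the model would be populated. */
    interface Parser<T> {
        T parse(InputStream in) throws Exception;
    }

    /** Parsed object plus the two separately measured durations. */
    static final class Result<T> {
        final T parsed;
        final long downloadNanos;
        final long parseNanos;
        final int bytes;

        Result(T parsed, long downloadNanos, long parseNanos, int bytes) {
            this.parsed = parsed;
            this.downloadNanos = downloadNanos;
            this.parseNanos = parseNanos;
            this.bytes = bytes;
        }
    }

    /**
     * Phase 1: open the source and drain it fully into a byte array.
     * Phase 2: parse from the in-memory copy only.
     */
    static <T> Result<T> measure(Callable<InputStream> source, Parser<T> parser)
            throws Exception {
        long t0 = System.nanoTime();
        byte[] body;
        try (InputStream in = source.call()) {
            body = readFully(in);
        }
        long t1 = System.nanoTime();
        T parsed = parser.parse(new ByteArrayInputStream(body));
        long t2 = System.nanoTime();
        return new Result<>(parsed, t1 - t0, t2 - t1, body.length);
    }

    /** Reads a stream to its end and returns its content. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a dataset; a real run would open the dataset URI instead.
        byte[] fake = "<rdf:RDF/>".getBytes(StandardCharsets.UTF_8);
        Result<Integer> r = measure(
                () -> new ByteArrayInputStream(fake),
                in -> readFully(in).length); // trivial "parser": counts bytes
        System.out.println("bytes=" + r.bytes
                + " downloadNanos=" + r.downloadNanos
                + " parseNanos=" + r.parseNanos);
    }
}
```

Buffering the whole response costs memory proportional to the dataset size,
so this is suitable for measurements rather than for production use. The same
harness could be reused for the other points by swapping the parser (for
example, one per model specification or storage backend).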

If we find an optimal setting after these experiments, the next step would
be to try working with datasets comparable in size to the raw malaria
data. Ideally, it would be nice to compare with RDF libraries other than
Jena, but this may require more effort.

Best regards,
Nina



>
> Best regards,
> Pantelis S.
>
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development
>
>


