[OTDev] Experiments with RDF

Wed Oct 6 15:32:37 CEST 2010

Hi Nina,

On Wed, 2010-10-06 at 15:52 +0300, Nina Jeliazkova wrote:

> Hi Pantelis,
> 
> I guess there is a typo in your report, where you say "Jena was about 14
> seconds faster than StAX based on 32 successive measurements that are
> presented in the following figure", but on the figure response times using
> Jena (red line) are higher than StAX.

  You're right, that was just a typo! Your StAX-based implementation is
indeed faster, and I'm also interested in using it for serializing datasets
and other objects into RDF. Could you send me a link and maybe some
hints on how to use your source code? If possible, please let me know of
any dependencies I need.

> 
> Also, the statement "It was shown that StAX outperforms the internal
> implementation of Jena for parsing RDF documents" is not entirely correct,
> as currently StAX is used for writing (serializing), not for parsing RDF
> documents.
> 

Yes, I will rephrase that.

> Finally, as we discussed off-list, it would be good to split the response
> time into a download time and an RDF parse time.
> 

That is the next step...
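
A minimal sketch of how the two timings could be separated: download the
whole response into memory first, then time the parser on the buffered
bytes. This is only an illustration in Python, not the code used for the
report. The names fetch_bytes, parse_rdf_xml and timed_request are
hypothetical, and the plain XML parser is just a stand-in for whichever RDF
library is being measured.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET


def fetch_bytes(url):
    """Download the whole response body into memory (no parsing yet)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def parse_rdf_xml(payload):
    """Stand-in parser (assumption): reads RDF/XML as plain XML and
    counts elements. A real measurement would call the RDF library
    under test here instead."""
    root = ET.fromstring(payload)
    return sum(1 for _ in root.iter())


def timed_request(url, fetch=fetch_bytes, parse=parse_rdf_xml):
    """Return (download_seconds, parse_seconds, parse_result)."""
    t0 = time.perf_counter()
    payload = fetch(url)      # network transfer plus server-side time
    t1 = time.perf_counter()
    result = parse(payload)   # time spent in the parser only
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, result
```

Because the payload is fully buffered before parsing starts, the second
interval no longer depends on the network. Note that the first interval
still includes the time the server needs to produce the document.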

Best regards,
Pantelis

> Best regards,
> Nina
> 
> On Mon, Oct 4, 2010 at 6:12 PM, chung <chvng at mail.ntua.gr> wrote:
> 
> > Hi Christoph,
> >   What is the dimension (number of features and compounds) of this dataset?
> > By the way, I have made some more measurements that you will find
> > attached.
> >
> > Best regards,
> > Pantelis
> >
> > On Mon, 2010-10-04 at 10:21 +0200, Christoph Helma wrote:
> >
> > > Dear all,
> > >
> > > I have just returned from holidays. In the attachment I am sending you a
> > > few benchmarks for OWL-DL serialisation for various libraries (RDF.rb,
> > > Redland with Ruby bindings, Redland with SWIG/Ruby bindings, direct
> > > serialisation to strings (ntriples)) I have made before our meeting.
> > > All of them use some internal housekeeping to avoid duplicate triples
> > > (Triples creation ...). Algorithms are not 100% comparable (Objects are
> > > sometimes created during triple creation, sometimes during triples
> > > insertion), but in general the bottleneck is the creation of the RDF
> > > graph (Triples insertion into model ...). Serialisation itself is rarely
> > > a problem (also not parsing).
> > >
> > > For future experiments I would suggest sharing some benchmark datasets
> > > (large, medium, small) - I will still have to read all the
> > > messages/attachments of this thread in detail.
> > >
> > > Best regards,
> > > Christoph
> > > _______________________________________________
> > > Development mailing list
> > > Development at opentox.org
> > > http://www.opentox.org/mailman/listinfo/development