[OTDev] OWL-DL performance/scalability problems

Mon Sep 6 11:23:31 CEST 2010

Further reduction of the below RDF/XML can be done with:

On Mon, Sep 6, 2010 at 10:49 AM, Nina Jeliazkova
<jeliazkova.nina at gmail.com> wrote:
> <rdf:RDF

xmlns="http://what.ever.matches/ot:"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"

>  xml:base="http://ambit.uni-plovdiv.bg:8080/ambit2/">
>
>  <ot:Dataset rdf:about="dataset/1">

including that first @xmlns on the root element would make this simply:

<Dataset rdf:about="dataset/1">
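A quick way to check that the default-namespace form names the same element is to parse both variants and compare the expanded names. This is only a sketch using Python's standard xml.etree.ElementTree; the namespace URI is the placeholder used above, and the two-line documents are cut-down stand-ins for the real dataset, not the actual service output.

```python
import xml.etree.ElementTree as ET

OT_NS = "http://what.ever.matches/ot:"  # placeholder namespace from this message
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Original style: every OpenTox element carries the ot: prefix.
prefixed = f'''<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ot="{OT_NS}">
  <ot:Dataset rdf:about="dataset/1"/>
</rdf:RDF>'''

# Reduced style: the same namespace declared as the default on the root.
defaulted = f'''<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{OT_NS}">
  <Dataset rdf:about="dataset/1"/>
</rdf:RDF>'''

def child_tags(doc):
    """Return the namespace-expanded tag of each child of the root element."""
    return [child.tag for child in ET.fromstring(doc)]

# Both spellings expand to the same {namespace}localname.
assert child_tags(prefixed) == child_tags(defaulted)

# Bytes saved by dropping the prefix in this tiny fragment.
saved_bytes = len(prefixed.encode("utf-8")) - len(defaulted.encode("utf-8"))
print(child_tags(defaulted), saved_bytes)
```

The saving per element is only a few bytes, so any real gain would come from how often the prefix repeats in a large dataset.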

>      <ot:dataEntry>
>      <ot:DataEntry>
>        <ot:values>
>          <ot:FeatureValue>
>            <ot:value rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>            >formaldehyde</ot:value>

The first @xmlns plus the second (@xmlns:xsd) would convert this line into:

<value rdf:datatype="xsd:string">formaldehyde</value>

Though you could also argue that the rdf:datatype can be left out, as
that knowledge would be encoded in the ontology...
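One caveat worth checking before relying on the shortened datatype: at the XML level, namespace declarations apply to element and attribute names, not to attribute values, so a parser hands over "xsd:string" as a literal string. A common workaround in RDF/XML files is an XML entity. The sketch below uses Python's standard xml.etree.ElementTree purely to illustrate this; the helper name datatype_of is made up for the example, the namespace is the placeholder from above, and whether a particular RDF library tolerates the prefixed form is something to test rather than assume.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"
OT_NS = "http://what.ever.matches/ot:"  # placeholder namespace from this message

def datatype_of(doc):
    """Return the rdf:datatype attribute value exactly as the XML parser reports it."""
    return ET.fromstring(doc).get(f"{{{RDF_NS}}}datatype")

# Prefix inside the attribute value: the XML parser does not expand it.
with_prefix = (
    f'<value xmlns="{OT_NS}" xmlns:rdf="{RDF_NS}" xmlns:xsd="{XSD_NS}" '
    'rdf:datatype="xsd:string">formaldehyde</value>'
)

# Entity inside the attribute value: expanded by the XML parser itself.
with_entity = (
    f'<!DOCTYPE value [<!ENTITY xsd "{XSD_NS}">]>'
    f'<value xmlns="{OT_NS}" xmlns:rdf="{RDF_NS}" '
    'rdf:datatype="&xsd;string">formaldehyde</value>'
)

print(datatype_of(with_prefix))
print(datatype_of(with_entity))
```

The entity form keeps the line short while still giving the consumer the full datatype URI.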

Otherwise, I am still not sure I understand where the exact bottleneck
is... this exercise seems to indicate it is the volume of the RDF/XML
serialization...

Christoph, what does your full dataflow look like? How often is the
RDF/XML serialized and deserialized? What generates the data and how
do you create the RDF? Would it be possible to skip Redland and any
other RDF library altogether? That is, is it merely serialization, or do
you process the data somewhere too?

Egon

-- 
Dr E.L. Willighagen
Post-doc @ Uppsala University (only until 2010-09-30)
Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers