[OTDev] Are there some sample dataset services available ?

Nina Jeliazkova nina at acad.bg
Fri Feb 19 20:27:59 CET 2010


Hi,

surajit ray wrote:
> Hi Nina,
>
> OK, got it working with the attached file (included the code from lines 56-65).
>
> Which brings me to two questions:
>
> a) The SMILES seems to be the value of a feature. Therefore, when posting a
> dataset URI to the Maxtox algo, should another parameter called feature name
> be included? (Otherwise, how would we know which feature contains the SMILES?)
>   
If owl:sameAs points to <http://www.opentox.org/api/1.1#SMILES>,
then it is a SMILES.

See 
http://ambit.uni-plovdiv.bg:8080/ambit2/feature?sameas=http%3A%2F%2Fwww.opentox.org%2Fapi%2F1.1%23SMILES


But if you are asking how to retrieve structures, then you should not
rely on this field. The structures are retrieved by compound URL, which
you can find in the RDF in the form
/compound/{id} or
/compound/{id}/conformer/{id}

You can retrieve SMILES, MOL or other representations from these
URLs by requesting the corresponding MIME types (see the wiki page on dataset).
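
As a rough illustration of requesting a representation by MIME type, here is a
minimal sketch using only the JDK (Java 11+). The compound id 1 in the URL is
hypothetical, and the two MIME type strings are assumptions taken from the
chemical/* family, so please check them against the wiki before relying on them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CompoundRequest {
    // Assumed MIME types; verify against the list on the dataset wiki page.
    static final String SMILES = "chemical/x-daylight-smiles";
    static final String MOL = "chemical/x-mdl-molfile";

    // Builds a GET request that asks the server for one representation
    // of the compound via the Accept header.
    static HttpRequest forRepresentation(String compoundUrl, String mimeType) {
        return HttpRequest.newBuilder(URI.create(compoundUrl))
                .header("Accept", mimeType)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical compound URL of the form /compound/{id}.
        HttpRequest request = forRepresentation(
                "http://ambit.uni-plovdiv.bg:8080/ambit2/compound/1", SMILES);
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: "
                + request.headers().firstValue("Accept").orElse(""));
    }
}
```

The sketch only builds and prints the request. Sending it with
HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
would return the body in the requested format, if the server supports it.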
> b) how do I retrieve the value contained in the Feature node in the
> statement :
>
>                 RDFNode value = fv.getProperty(OT.value).getObject();
>                 out.write(String.format("%s=%s\n",
>                         //Feature
>                         fv.getProperty(OT.feature).getObject().toString(),
>                         //Value
>                         value));
>
> Value is a string representation of the node, like so:
> Cc1ccc(N=Nc2c(O)ccc3ccccc23)c(c1)N(=O)O^^http://www.w3.org/2001/XMLSchema#string
>
> I would like to retrieve just the value (the SMILES string ...)
>   
See above: don't rely on the feature field for retrieving structures. This
is explained in the Compound and Dataset pages of the OpenTox API.
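
For feature values in general (not for structures), the string shown above is
a typed literal rendered as value^^datatype. In Jena, when the node is a
literal, Literal.getLexicalForm() returns the value part without the datatype.
The sketch below shows the same separation on the plain string using only the
JDK. The class and method names are made up for illustration.

```java
class TypedLiteral {
    // Returns the part before the last "^^" marker, or the input unchanged
    // when there is no datatype suffix.
    static String lexicalForm(String rendered) {
        int marker = rendered.lastIndexOf("^^");
        return marker < 0 ? rendered : rendered.substring(0, marker);
    }

    public static void main(String[] args) {
        String rendered = "Cc1ccc(N=Nc2c(O)ccc3ccccc23)c(c1)N(=O)O"
                + "^^http://www.w3.org/2001/XMLSchema#string";
        // Prints only the SMILES part of the rendered literal.
        System.out.println(lexicalForm(rendered));
    }
}
```

Working on the Literal object directly is preferable when Jena is available,
since it does not depend on how the node happens to be printed.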


Regards,
Nina
> Thanks
> Surajit
>
> On Fri, Feb 19, 2010 at 11:06 PM, Nina Jeliazkova <nina at acad.bg> wrote:
>
>   
>> Hi Surajit,
>>
>> A quick answer - the problem is on line 56:
>> OntResource dataset = OT.OTClass.Dataset.getOntClass(jenaModel);
>> Here the dataset variable holds the RDF node that is the declaration
>> of the type Dataset, not the dataset entry itself. Consequently, there
>> are no ot:dataEntry properties and it doesn't iterate over the model.
>>
>> To get the dataset entry itself, you might again use SimpleSelector and
>> look for nodes of type ot:Dataset, i.e. (null, rdf:type, ot:Dataset),
>> replacing these with the corresponding Jena types.
>>
>> Regards,
>> Nina
>> surajit ray wrote:
>>     
>>> Hi Nina,
>>>
>>> I was trying to read the dataset off your link in the RDF format.
>>>
>>> Attached is a Java program to do the same. However, although line 59 does
>>> output the Jena model, it is not iterating over the model using the
>>> parsedataset function (all the code I have used is from the OpenTox website).
>>> Could you please tell me what I am doing wrong here?
>>> (The Java code has a main function so you can run it standalone.)
>>>
>>> Thanks
>>> Surajit
>>>
>>> On Fri, Feb 19, 2010 at 7:51 PM, Christoph Helma <helma at in-silico.de> wrote:
>>>
>>>> Excerpts from Jörg Kurt Wegner's message of Mon Feb 15 23:52:22 +0100 2010:
>>>>
>>>>> Nina, Surajit,
>>>>>
>>>>>> http://ambit.uni-plovdiv.bg:8080/ambit2/dataset
>>>>>> The formats (RDF, MOL, SMILES, CSV, arff, CML) can be retrieved via
>>>>>> specifying the corresponding mime type.
>>>>>
>>>>> Nice, I admit I am not reading all the posts on this list and you might have
>>>>> answered this already earlier.
>>>>> Anyway, I gotta ask:
>>>>>
>>>>> 1. Some of the data sets are simply empty, at least the first few in the list.
>>>>> Why?
>>>>>
>>>>> 2. Cross-indexing could be clearly enriched by enabling InChIKeys
>>>>> http://www.iupac.org/inchi/release102final.html
>>>>> and then using one of the services around for pulling more indices and data, e.g.
>>>>> http://inchis.chemspider.com/
>>>>> http://cactus.nci.nih.gov/chemical/structure
>>>>>
>>>>> 3. In other words, just in case some structures might need curation, I would
>>>>> rather prefer seeing the correct ones pulled from ChemSpider and you just host
>>>>> identifiers and tox endpoints ;-)
>>>>>
>>>>> 4. Finally, are there JSON data fetching options, too? I guess this is easier
>>>>> for (me) linking multiple sources in a browser, scripting, or wrapper
>>>>> approach. Again, a universal chemistry ID like InChIKey or ChemSpiderID is much
>>>>> appreciated.
>>>>>
>>>> +1 for JSON/YAML
>>>>
>>>> I initially used InChIKeys as identifiers for compounds but have
>>>> reverted to plain InChIs (despite URI encoding problems), because there
>>>> is no way to calculate structures from InChIKeys (except by storing them
>>>> in a database). I do not understand why it is necessary to use encryption
>>>> instead of, say, URI-safe base64 encoding.
>>>>
>>>> Regards,
>>>> Christoph
>>>> _______________________________________________
>>>> Development mailing list
>>>> Development at opentox.org
>>>> http://www.opentox.org/mailman/listinfo/development



