[OTDev] Are there some sample dataset services available ?

Christoph Helma helma at in-silico.de
Fri Feb 19 15:21:26 CET 2010


Excerpts from Jörg Kurt Wegner's message of Mon Feb 15 23:52:22 +0100 2010:
> Nina, Surajit,
> 
> > http://ambit.uni-plovdiv.bg:8080/ambit2/dataset
> > The formats (RDF, MOL, SMILES, CSV, ARFF, CML) can be retrieved by
> > specifying the corresponding MIME type.
> Nice, I admit I am not reading all the posts on this list and you might have
> answered this already earlier.
> Anyway, I gotta ask:
> 
> 1. Some of the data sets are simply empty, at least the first few in the list.
> Why?
> 
> 2. Cross-indexing could be clearly enriched by enabling InChIKeys
> http://www.iupac.org/inchi/release102final.html
> and then using one of the services around for pulling more indices and data, e.g.
> http://inchis.chemspider.com/
> http://cactus.nci.nih.gov/chemical/structure
> 
> 3. In other words, just in case some structures might need curation, I would
> rather prefer seeing the correct ones pulled from ChemSpider and you just host
> identifiers and tox endpoints ;-)
> 
> 4. Finally, are there JSON data fetching options, too? I guess this is easier
> for (me) linking multiple sources in a browser, scripting, or wrapper
> approach. Again, a universal chemistry ID like InChIKey or ChemSpiderID is much
> appreciated.

+1 for JSON/YAML

I initially used InChIKeys as identifiers for compounds but have
reverted to plain InChIs (despite URI encoding problems), because there
is no way to calculate structures from InChIKeys (except by storing them
in a database). I do not understand why it is necessary to use a one-way
hash instead of, say, URI-safe base64 encoding.
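
To illustrate the reversible alternative, here is a minimal Python sketch
that turns an InChI into a URI-safe token and back, with plain percent-encoding
shown for comparison. The function names are hypothetical and not part of any
existing service; the ethanol InChI is only an example input.

```python
import base64
from urllib.parse import quote, unquote


def inchi_to_token(inchi: str) -> str:
    """Encode an InChI as a URI-safe base64 token (hypothetical helper)."""
    raw = base64.urlsafe_b64encode(inchi.encode("utf-8"))
    # Drop the '=' padding so the token contains only A-Z, a-z, 0-9, '-' and '_'.
    return raw.decode("ascii").rstrip("=")


def token_to_inchi(token: str) -> str:
    """Recover the original InChI from a token (hypothetical helper)."""
    # Restore the padding that inchi_to_token() removed.
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"  # ethanol, example input

token = inchi_to_token(inchi)          # safe to use as a URI path segment
percent_encoded = quote(inchi, safe="")  # the plain percent-encoding route

print(token)
print(percent_encoded)
print(token_to_inchi(token) == inchi)
```

Unlike an InChIKey, the token can be decoded back to the full InChI without
a lookup table, at the cost of being longer and of variable length.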

Regards,
Christoph


