[OTDev] [quixote-qcdb] Re: [BlueObelisk-discuss] Quixote project for computational chemistry and future possibilities for Blue Obelisk

Peter Murray-Rust pm286 at cam.ac.uk
Mon Oct 25 12:28:34 CEST 2010


This is tremendous!

On Mon, Oct 25, 2010 at 10:33 AM, Nina Jeliazkova <jeliazkova.nina at gmail.com> wrote:

>
>
> On 25 October 2010 12:16, Egon Willighagen <egon.willighagen at gmail.com> wrote:
>
>> On Mon, Oct 25, 2010 at 11:13 AM, Nina Jeliazkova
>> <jeliazkova.nina at gmail.com> wrote:
>> >> Perhaps as a start the data can just be pulled and put in one of the
>> >> current OpenTox servers?
>> >
>> > No problem with this - do you have in mind a particular piece of data to
>> > start with?
>>
>> Check here... http://quixote.wikispot.org/Front_Page
>>
>
>
> OK, thanks.
>
> OpenTox dataset services accept CML for upload (it's just a POST via the
> OpenTox API), so this is trivial.
>

We see CML as the primary data representation for complex objects
(molecules, spectra, matrices, etc.) but are also creating RDF for the
annotations and for searching.
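
The upload described in the quoted text (a plain POST of a CML file to a dataset service) might be sketched as below. This is only an illustration under assumptions: the service URL is inferred from the dataset link later in this message, `chemical/x-cml` is the commonly used CML media type rather than something confirmed in this thread, and `build_cml_upload` is a hypothetical helper name.

```python
import urllib.request

# Assumption: the dataset collection URL, inferred from the dataset link in this thread.
DATASET_SERVICE = "http://ambit.uni-plovdiv.bg:8080/ambit2/dataset"

# Assumption: the conventional chemical MIME type for CML.
CML_MIME = "chemical/x-cml"


def build_cml_upload(cml_bytes, service_url=DATASET_SERVICE):
    """Build (but do not send) a POST request carrying a CML document."""
    return urllib.request.Request(
        service_url,
        data=cml_bytes,
        headers={"Content-Type": CML_MIME},
        method="POST",
    )


if __name__ == "__main__":
    request = build_cml_upload(b"<cml xmlns='http://www.xml-cml.org/schema'/>")
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch checkable without touching the server.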

>
> This one
>
>
> http://neptuno.unizar.es/files/public/datasets/Quixote/examples/HCO-L-Ala-NH2.cml
>
> is already here
>
> http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/5552
>
> It can be retrieved with GET in several formats, including RDF, CML, MDL,
> SMILES (selected via the HTTP Accept header).
>

We also use content negotiation.
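
As a rough sketch of what content negotiation on that dataset URL looks like from a client, the snippet below builds GET requests that differ only in their Accept header. The media types are assumptions (the standard RDF/XML type plus the conventional chemical MIME types), not values confirmed by either server, and `build_dataset_request` is a hypothetical helper name.

```python
import urllib.request

# Dataset URL quoted in this thread.
DATASET_URL = "http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/5552"

# Assumption: media types the server might honour for each representation.
MIME_TYPES = {
    "rdf": "application/rdf+xml",
    "cml": "chemical/x-cml",
    "sdf": "chemical/x-mdl-sdfile",
    "smiles": "chemical/x-daylight-smiles",
}


def build_dataset_request(fmt, url=DATASET_URL):
    """Build (but do not send) a GET request asking for one representation."""
    return urllib.request.Request(url, headers={"Accept": MIME_TYPES[fmt]})


if __name__ == "__main__":
    for fmt in MIME_TYPES:
        request = build_dataset_request(fmt)
        print(fmt, "->", request.get_header("Accept"))
    # Fetching would be: urllib.request.urlopen(build_dataset_request("cml")).read()
```

The same URL identifies the dataset in every case; only the Accept header changes which serialization is requested.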

>
> For Gaussian files it should be feasible as well; does CDK read Gaussian
> files?
>

JUMBO does, and the transformation is designed to be lossless by using a
dictionary structure.

>
> Once uploaded it automatically becomes structure searchable and searchable
> by any properties in the uploaded file, not to mention that all OpenTox
> algorithms can be applied.
>

WOW!!!!!!! Windmills here we come!

This is now unstoppable.

Basically the Blue Obelisk has - over the years - built the components.
Because they use interoperable technology it's easy to link them. Because we
are delighted to re-use other people's work we scale rapidly. Because we
cover all the major areas we are now in N**2 mode at least.

[To make the compchem side clear: the CML representation is designed to
capture the whole logfile - frequencies, NMR, energies, convergence,
populations, etc. This is much more than the extraction of coordinates. Also,
because we use dictionaries it becomes possible to search for properties
calculated by any code. Please join the Quixote effort if you are interested
in creating or re-using compchem results.]



P.

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069


