[OTDev] Uploading non-standard datasets

chung chvng at mail.ntua.gr
Mon Sep 27 17:03:34 CEST 2010


On Mon, 2010-09-27 at 20:03 +0530, surajit ray wrote:
> Hi,
> 
> Well having them as features will not cut it - for the simple reason
> that the "feature" in this case belongs to the input dataset (or
> whichever set is being worked upon). 

As far as I know, both in the general OpenTox design and in the AMBIT
implementation, features are separate from datasets and can stand alone.
That is, you can have a feature that does not appear in any dataset. A
feature may point to a dataset through the object property
'ot:hasSource', but this does not bind the feature to that dataset.
However, I'm not sure I understood you correctly.

> The compound itself may not have
> a substructure but it may be a part of a dataset which when examined
> will have the substructure appearing while doing the pairwise
> comparisons.

If you need to establish a relationship between such a feature and a
compound (so that given the feature you can retrieve the
fragment/compound to which it refers in any supported MIME type), then
we can extend the range of the property ot:hasSource to also include
ot:Compound and assign a compound URI to such features, i.e. something
like:

</feature/123>
	a ot:Feature ;
	ot:hasSource </compound/435> .

But then I'm not sure whether the following are also needed:

1. Declare that the feature above is an ot:SubstructureFeature (new) or
at least declare that it is boolean.

2. Make it explicit that the above compound is an ot:Fragment (new)

Maybe we can go without introducing extra classes. 
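
To make the proposal concrete, here is a minimal sketch that writes out
such a feature description as Turtle text. It is only an illustration:
the namespace bound to the ot: prefix and the host name someserver.com
are assumptions, the helper feature_turtle is hypothetical, and a
compound URI as the value of ot:hasSource is the extension proposed
above, not something the current API is known to accept.

```python
# Namespace assumed for the ot: prefix; check it against the ontology in use.
OT_NS = "http://www.opentox.org/api/1.1#"


def feature_turtle(feature_uri, compound_uri):
    """Return a Turtle description of a feature whose source is a compound."""
    return (
        f"@prefix ot: <{OT_NS}> .\n"
        "\n"
        f"<{feature_uri}>\n"
        "\ta ot:Feature ;\n"
        f"\tot:hasSource <{compound_uri}> .\n"
    )


# Hypothetical URIs, following the /feature/123 and /compound/435 example.
doc = feature_turtle(
    "http://someserver.com/feature/123",
    "http://someserver.com/compound/435",
)
print(doc)
```

Whether the feature should additionally be typed as boolean or as a new
substructure class is left open here, as in the two points above.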

> 
> Using the features system in this manner is not (IMHO) the solution to
> this problem. It will be really cumbersome to maintain the feature
> URIs, which may number in the many thousands and will be extremely
> transient. In effect, a lot of resources will be hogged by a
> system which could do with a much simpler implementation.

That is not really a problem. A feature is a very small entry in a
database, and there are enterprises that maintain databases of tens of
terabytes or even more.

> Moreover a certain feature in such a system will be a part of a
> compound if it's a part of Dataset A and may not be a part of the same
> compound when examined in Dataset B.
> 

This is true. For example, if a compound does not contain C=O, it is
obvious that it will not contain CC=O or, in general, RC=O.
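
The C=O example can be sketched as a small pruning rule: once a fragment
is known to be absent, every larger fragment that contains it must be
absent too. This is only a sketch under stated assumptions. The
containment table below is hypothetical and written by hand; nothing
here parses SMILES or performs a real substructure search.

```python
# Hypothetical, hand-written containment table: each key fragment contains
# every fragment in its value set. A real system would derive this with a
# cheminformatics toolkit.
CONTAINS = {
    "CC=O": {"C=O"},
    "CCC=O": {"CC=O", "C=O"},
}


def implied_absent(known_absent):
    """Return fragments that must be absent because one of their parts is."""
    return {
        fragment
        for fragment, parts in CONTAINS.items()
        if parts & known_absent
    }


# If C=O is absent from a compound, both larger fragments are ruled out.
print(sorted(implied_absent({"C=O"})))
```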

> Summing up, here are a few things I would like in the next API
> 
> a) Ability to upload bulk compounds from scratch, using a dataset
> construct (and not posting single compounds)

I think this is supported. You can POST a dataset with a set of new
compounds. If one or more compounds are not found in the server's
database, they should be created.
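
As a rough illustration of such a bulk upload, the sketch below prepares
a POST request carrying an SDF file, without sending it. The host name
someserver.com, the /dataset path, and the chemical/x-mdl-sdfile media
type are assumptions; check which paths and MIME types your server
actually accepts. The helper build_dataset_post is hypothetical.

```python
import urllib.request


def build_dataset_post(server, sdf_bytes):
    """Prepare (but do not send) a POST of SDF content to a dataset service."""
    return urllib.request.Request(
        url=f"{server}/dataset",  # assumed path of the dataset service
        data=sdf_bytes,
        method="POST",
        headers={"Content-Type": "chemical/x-mdl-sdfile"},  # assumed MIME type
    )


# Placeholder body; a real call would pass the bytes of an SDF file.
req = build_dataset_post("http://someserver.com", b"placeholder, not valid SDF")
print(req.get_method(), req.full_url, req.get_header("Content-type"))
```

Sending the request would be a separate step, for example with
urllib.request.urlopen(req), against a server you are allowed to write
to.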

> b) Ability to assign features to datasets

You mean "to append" features or have some structured meta information
about the dataset itself?

> c) Ability to have non-standard datasets/compounds which contain
> substructures rather than molecules.
> 
> Regards
> Surajit

Best regards,
Pantelis
> 
> On 27 September 2010 18:31, chung <chvng at mail.ntua.gr> wrote:
> > Hi Surajit,
> >   As far as I can understand, you have a problem similar to the one I
> > was discussing with Alexey from IBMC. You need a way to define which
> > substructures are present in a certain structure. For this purpose you
> > have to use features and not compounds. So you need a collection of
> > features, each of which corresponds to a certain substructure.
> > However, in Ambit you can create a new compound by POSTing it
> > to /compound in a supported MIME type (e.g. SMILES, SDF etc.), for example
> > 'curl -X POST --data-binary @/path/to/file.sdf -H Content-type:blah/blah
> > +sdf http://someserver.com/compound'. What is needed in OpenTox, though,
> > is a collection of substructures in a feature service and a way to
> > look up a certain feature according to its structure (e.g. by providing
> > its SMILES representation).
> >
> > Best Regards,
> > Pantelis
> >
> > On Mon, 2010-09-27 at 14:18 +0530, surajit ray wrote:
> >
> >> Hi Nina,
> >>
> >> I need to upload some fragments (I have SMILES representations) into a
> >> dataset. Is this possible in the current framework?
> >>
> >> To elaborate:
> >> Currently I am uploading a dataset with compounds as links to the
> >> respective compound URIs (which happens at the end of the online
> >> MaxtoxTest service). How would I upload new compounds (with SMILES/mol
> >> representations)? And secondly, if these (the upload set) happen to be
> >> fragments (and not molecules), is there a way to store such information
> >> using the ambit dataset service?
> >>
> >> Thanx
> >> Surajit
> >> _______________________________________________
> >> Development mailing list
> >> Development at opentox.org
> >> http://www.opentox.org/mailman/listinfo/development
> >>




