[OTDev] API extension summary
Tobias Girschick tobias.girschick at in.tum.de
Mon Jan 18 14:12:00 CET 2010
- Previous message: [OTDev] API extension summary
- Next message: [OTDev] [Fwd: Re: Feature Generation Algorithms: Avoiding duplicates]
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Pantelis, All,

> [...] Starting with a brief definition, a DoA service tells whether a
> > compound can be used by a model.

I don't agree here. The service should tell whether a model can be used
to predict a compound that is in the model's AD.

> So, at first, a POST operation at :
> >
> > /algorithm/{doa_id}
> >
> > of a model_uri, will return another model uri. Clients can now post
> > datasets on that model uri to get another dataset which has an extra
> > feature (call it for instance http://sth.com/feature/doa ) which is
> > boolean and 1 corresponds to "compound belongs to the doa of the
> > underlying models" and to the opposite.

I am not sure I do understand the process you propose completely. Can
you give a detailed work flow? The extra feature you are proposing will
be dependent on the training dataset of the prediction model. This will
have to be encoded.

> >
> Seems fine, besides that output of AD model might be probability -based,
> rather than yes/no.

Agreed.

> This could be handled via multiple feature_uris,
> returned by the model.
> > This proposal assumes no modification of the current API and if there is
> > no objection on that we could implement a doa service based on the
> > method of "leverages".
> >
> If accepted, algorithm types ontology will need to be extended with a
> subclass for applicability domain.

Agreed.

Best regards,
Tobias

>
> Best regards,
> Nina
> >
> > Best regards,
> > Pantelis
> >
> >
> > On Mon, 2010-01-18 at 11:46 +0200, Nina Jeliazkova wrote:
> >
> >> Hello All,
> >>
> >> Some discussion points for today meeting:
> >>
> >> 1. Data processing Algorithms. All algorithms are subclasses of
> >> http://www.opentox.org/api/1.1#Algorithm
> >>
> >> Generic input parameters:
> >> dataset_uri (as with other algorithms)
> >> parameters
> >>
> >> a) Data cleanup algorithms. Algorithm, which is a subclass of
> >> http://www.opentox.org/algorithms.owl#DataCleanup
> >> input parameters: generic
> >> output parameters: dataset_uri
> >>
> >> b) Feature selection algorithms , subclass of
> >> http://www.opentox.org/algorithmTypes.owl#FeatureSelection
> >> input parameters: generic
> >> output parameters: feature_uris[]
> >>
> >> c)Supervised learning algorithms , subclass of
> >> http://www.opentox.org/algorithmTypes.owl#Supervised
> >> input parameter: prediction_feature
> >> output parameters: dataset_uri
> >>
> >> d)Descriptor calculation algorithms subclass of
> >> http://www.opentox.org/algorithms.owl#DescriptorCalculation
> >>
> >> input parameters: generic
> >> output parameters: dataset_uri
> >>
> >> http://opentox.org/dev/apis/api-1.1/Algorithm entry is (partially) updated
> >>
> >>
> >> 3) How to identify features, generated by an algorithm and specific set
> >> of parameters:
> >>
> >> According to current opentox.owl, a Feature can be assigned Algorithm,
> >> Model or Dataset as its origin (via property ot:hasSource). There is
> >> no support for Algorithm + Parameters, except if the specific case of a
> >> Model can be regarded as Algorithm + Parameter instance.
> >>
> >> One possible solution could be:
> >> - define superclass A, which is determined by Algorithm + Parameters
> >> - Make Model subclass of A
> >> - define domain of ot:hasSource as classes A and Dataset
> >> - Find a nice name for the superclass A
> >>
> >> This will be searchable via ontology service.
> >>
> >> Question: Can we directly use Model to denote descriptors, especially
> >> descriptors, which require datasets to be calculated?
> >>
> >> 3. Dataset API
> >> Reminder: the dataset API 1.1 allows specifying feature URI and compound
> >> URI on GET operations:
> >>
> >> http://opentox.org/dev/apis/api-1.1/dataset
> >> Query a dataset GET /dataset/{id} *compound_uris[]* and/or *feature_uris[]* to select compounds and features;
> >>
> >> These are very flexible means to get slices of a dataset (features = columns, compounds = rows ), or merging data across different datasets, without the need to download/upload dataset content.
> >>
> >> However, there have been some concerns, regarding the length of the URL. The proposal is to extend the same approach to allow POST and PUT operations to specify datasets via dataset_uri, compound_uris and feature_uris.
> >>
> >>
> >> Create a new dataset POST /dataset
> >> Dataset representation in a supported MIME type. MIME type to be
> >> specified via *Content-type* header.
> >> New URI /dataset/{id} or redirect to task URI (for large uploads)
> >> 200,202,400,503
> >>
> >> Update a dataset PUT /dataset/{id}
> >> Data representation in a supported MIME type; entries for existing
> >> compound/feature pairs will be overwritten, entries for new
> >> compound/features will be added
> >> Dataset or task URI
> >> 200,202,400,404,503
> >>
> >>
> >> *Proposal: *
> >> 3.1. If MIME type is *application/www-form-urlencoded*, allow
> >> dataset_uri , feature_uris[] and compound_uris[] are input parameter for
> >> PUT and POST operations. This will facilitate assigning new dataset
> >> id to client specified subsets of data. URL length is not an issue
> >> anymore, since parameters are passed via POST content body.
> >>
> >> example:
> >> POST /dataset
> >> dataset_uri=http://myservice/dataset/1
> >> feature_uris[]=/selectedfeature1
> >> feature_uris[]=/selectedfeature2
> >>
> >> 3.2. For file uploads, agree on fixed name for file upload parameter
> >> in *application/www-form-urlencoded *- e.g. *file_upload*.
> >> When uploading content other than RDF (e.g. MOL, SDF, SMILES), there are
> >> currently no means how to assign metadata (even file name is not
> >> available when POSTing content other than RDF).
> >>
> >> 4. Query API. There is currently no agreed API on querying for .
> >> There are some custom implementations:
> >>
> >> Query for property/identifier value
> >> http://ambit.uni-plovdiv.bg:8080/ambit2/compound?property=CAS&search=50-00-0
> >> <http://ambit.uni-plovdiv.bg:8080/ambit2/compound?search=55-55-0>
> >> or
> >> /compound?search=phenolphthalein
> >> <http://ambit.uni-plovdiv.bg:8080/ambit2/compound?search=phenolphthalein>
> >>
> >> Proposal: /compound?search=value&sameas=http://url_from_an_ontology , e.g.
> >>
> >>
> >> /compound?search=50-00-0&sameas=http://www.opentox.org/api/1.1#CASRN
> >>
> >> Substructure
> >> /query/smarts?search=c1ccccc1O&max=100
> >> <http://ambit.uni-plovdiv.bg:8080/ambit2/query/smarts?search=c1ccccc1O&max=100>
> >>
> >> Similarity
> >> /query/similarity?search=c1ccccc1&threshold=0.8
> >> <http://ambit.uni-plovdiv.bg:8080/ambit2/query/similarity?search=c1ccccc1&threshold=0.8>
> >>
> >>
> >>
> >> AFAIK, IST implementation uses /compound/{id} API , which seems
> >> reasonable for first two cases, but there might be issues with embedding
> >> non-ascii symbols in {id} (e.g. InChI , Smiles)
> >>
> >> Best regards,
> >> Nina
> >>
> >>
> >>
> >> _______________________________________________
> >> Development mailing list
> >> Development at opentox.org
> >> http://www.opentox.org/mailman/listinfo/development
> >>
> >>
> >
> >
> > _______________________________________________
> > Development mailing list
> > Development at opentox.org
> > http://www.opentox.org/mailman/listinfo/development
> >
>
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development

--
Dipl.-Bioinf. Tobias Girschick

Technische Universität München
Institut für Informatik
Lehrstuhl I12 - Bioinformatik
Bolzmannstr. 3
85748 Garching b. München, Germany

Room: MI 01.09.042
Phone: +49 (89) 289-18002
Email: tobias.girschick at in.tum.de
Web: http://wwwkramer.in.tum.de/girschick
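
Illustration for point 3.1 of the quoted message (POST /dataset with dataset_uri and feature_uris[] in the request body): a minimal Python sketch of how a client might encode such a body. The helper name build_dataset_subset_body is hypothetical and not part of any OpenTox library; the parameter names and example URIs are taken from the quoted example. The registered name of the form media type is application/x-www-form-urlencoded, which is what a client would normally send as Content-type. No request is actually sent here.

```python
from urllib.parse import urlencode, parse_qs


def build_dataset_subset_body(dataset_uri, feature_uris=(), compound_uris=()):
    """Encode a dataset-subset request as a form body (hypothetical helper).

    Repeated keys (feature_uris[] / compound_uris[]) carry the list values,
    as in the example from the proposal.
    """
    pairs = [("dataset_uri", dataset_uri)]
    pairs += [("feature_uris[]", uri) for uri in feature_uris]
    pairs += [("compound_uris[]", uri) for uri in compound_uris]
    # A list of (key, value) tuples keeps the order and allows repeated keys.
    return urlencode(pairs)


# Same values as the example in the proposal.
body = build_dataset_subset_body(
    "http://myservice/dataset/1",
    feature_uris=["/selectedfeature1", "/selectedfeature2"],
)

# Decoding the body again shows what a server would see.
decoded = parse_qs(body)
print(body)
print(decoded)
```

Because the parameters travel in the body rather than in the query string, the number of selected features or compounds is not limited by URL length, which is the concern the proposal addresses.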