[OTDev] API extension summary
Nina Jeliazkova nina at acad.bg
Mon Jan 18 10:46:03 CET 2010
- Previous message: [OTDev] Questions to feature generation and feature selection
- Next message: [OTDev] API extension summary
Hello All,

Some discussion points for today's meeting:

1. Data processing algorithms

All algorithms are subclasses of http://www.opentox.org/api/1.1#Algorithm

Generic input parameters:
  dataset_uri (as with other algorithms)
  parameters

a) Data cleanup algorithms
   Subclass of http://www.opentox.org/algorithms.owl#DataCleanup
   input parameters: generic
   output parameters: dataset_uri

b) Feature selection algorithms
   Subclass of http://www.opentox.org/algorithmTypes.owl#FeatureSelection
   input parameters: generic
   output parameters: feature_uris[]

c) Supervised learning algorithms
   Subclass of http://www.opentox.org/algorithmTypes.owl#Supervised
   input parameter: prediction_feature
   output parameters: dataset_uri

d) Descriptor calculation algorithms
   Subclass of http://www.opentox.org/algorithms.owl#DescriptorCalculation
   input parameters: generic
   output parameters: dataset_uri

The http://opentox.org/dev/apis/api-1.1/Algorithm entry is (partially) updated.

2. How to identify features generated by an algorithm and a specific set of parameters

According to the current opentox.owl, a Feature can be assigned an Algorithm, Model or Dataset as its origin (via the property ot:hasSource). There is no support for Algorithm + Parameters, unless a Model can be regarded as an instance of Algorithm + Parameters.

One possible solution could be:
- define a superclass A, which is determined by Algorithm + Parameters
- make Model a subclass of A
- define the domain of ot:hasSource as classes A and Dataset
- find a nice name for the superclass A

This will be searchable via the ontology service.

Question: Can we directly use Model to denote descriptors, especially descriptors which require datasets to be calculated?

3. Dataset API

Reminder: the dataset API 1.1 allows specifying feature URIs and compound URIs on GET operations:
http://opentox.org/dev/apis/api-1.1/dataset

  Query a dataset
    Method and URI: GET /dataset/{id}
    Parameters:     *compound_uris[]* and/or *feature_uris[]* to select compounds and features

This is a very flexible means of getting slices of a dataset (features = columns, compounds = rows), or of merging data across different datasets, without the need to download/upload dataset content. However, there have been some concerns regarding the length of the URL.

The proposal is to extend the same approach to allow POST and PUT operations to specify datasets via dataset_uri, compound_uris and feature_uris.

  Create a new dataset
    Method and URI: POST /dataset
    Parameters:     Dataset representation in a supported MIME type. MIME type to be specified via the *Content-type* header.
    Result:         New URI /dataset/{id} or redirect to task URI (for large uploads)
    Status codes:   200, 202, 400, 503

  Update a dataset
    Method and URI: PUT /dataset/{id}
    Parameters:     Data representation in a supported MIME type; entries for existing compound/feature pairs will be overwritten, entries for new compound/feature pairs will be added
    Result:         Dataset or task URI
    Status codes:   200, 202, 400, 404, 503

*Proposal:*

3.1. If the MIME type is *application/x-www-form-urlencoded*, allow dataset_uri, feature_uris[] and compound_uris[] as input parameters for PUT and POST operations. This will facilitate assigning a new dataset id to client-specified subsets of data. URL length is no longer an issue, since the parameters are passed in the request body.

example:
  POST /dataset
  dataset_uri=http://myservice/dataset/1
  feature_uris[]=/selectedfeature1
  feature_uris[]=/selectedfeature2

3.2. For file uploads, agree on a fixed name for the file upload parameter in *application/x-www-form-urlencoded*, e.g. *file_upload*. When uploading content other than RDF (e.g. MOL, SDF, SMILES), there is currently no means of assigning metadata (even the file name is not available when POSTing content other than RDF).

4. Query API

There is currently no agreed API for querying. There are some custom implementations:

Query for property/identifier value
  http://ambit.uni-plovdiv.bg:8080/ambit2/compound?property=CAS&search=50-00-0 <http://ambit.uni-plovdiv.bg:8080/ambit2/compound?search=55-55-0>
  or
  /compound?search=phenolphthalein <http://ambit.uni-plovdiv.bg:8080/ambit2/compound?search=phenolphthalein>

  Proposal: /compound?search=value&sameas=http://url_from_an_ontology , e.g.
  /compound?search=50-00-0&sameas=http://www.opentox.org/api/1.1#CASRN

Substructure
  /query/smarts?search=c1ccccc1O&max=100 <http://ambit.uni-plovdiv.bg:8080/ambit2/query/smarts?search=c1ccccc1O&max=100>

Similarity
  /query/similarity?search=c1ccccc1&threshold=0.8 <http://ambit.uni-plovdiv.bg:8080/ambit2/query/similarity?search=c1ccccc1&threshold=0.8>

AFAIK, the IST implementation uses the /compound/{id} API, which seems reasonable for the first two cases, but there might be issues with embedding symbols that are not URL-safe in {id} (e.g. InChI, SMILES).

Best regards,
Nina
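
A minimal client-side sketch of points 3.1 and 4, assuming Python's standard urllib.parse module. The function names are hypothetical and the host http://myservice is taken from the example in 3.1; nothing here is prescribed by the OpenTox API. It shows how the dataset_uri / feature_uris[] / compound_uris[] parameters could be serialized into a form-encoded request body, and how SMILES or InChI strings could be percent-encoded before being placed in a query string or in a {id} path segment.

```python
from urllib.parse import quote, urlencode


def dataset_subset_body(dataset_uri, feature_uris=(), compound_uris=()):
    """Build a form-encoded body selecting a slice of an existing dataset.

    Hypothetical helper: repeated feature_uris[] / compound_uris[] keys
    follow the convention used in the example of point 3.1.
    """
    pairs = [("dataset_uri", dataset_uri)]
    pairs += [("feature_uris[]", uri) for uri in feature_uris]
    pairs += [("compound_uris[]", uri) for uri in compound_uris]
    # urlencode percent-encodes both keys and values, so '[' and ']'
    # become %5B and %5D and URIs used as values are fully escaped.
    return urlencode(pairs)


def query_url(base, path, **params):
    """Build a query URL with every parameter value percent-encoded."""
    return f"{base}{path}?{urlencode(params)}"


def compound_id_segment(identifier):
    """Encode an identifier (e.g. an InChI) for use as one path segment.

    safe="" makes quote() escape '/' as well, so the identifier cannot
    be mistaken for several path segments.
    """
    return quote(identifier, safe="")


if __name__ == "__main__":
    # Body for: POST /dataset with Content-type application/x-www-form-urlencoded
    print(dataset_subset_body(
        "http://myservice/dataset/1",
        feature_uris=["/selectedfeature1", "/selectedfeature2"],
    ))
    # Substructure query with a SMILES containing reserved characters
    print(query_url("http://myservice", "/query/smarts", search="C(=O)O", max=100))
    # InChI embedded in /compound/{id}
    print("/compound/" + compound_id_segment("InChI=1S/CH2O/c1-2/h1H2"))
```

Whether a given service decodes an escaped '/' (%2F) inside a path segment is server-dependent, which is one reason passing such identifiers as query parameters may be the safer choice.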