[OTDev] Some Questions
chung chvng at mail.ntua.gr
Mon Dec 21 11:01:23 CET 2009
Hi Nina,

On Mon, 2009-12-21 at 01:36 +0200, Nina Jeliazkova wrote:
> Dear Pantelis, All,
>
> chung wrote:
> > On Fri, 2009-12-18 at 11:59 +0200, Nina Jeliazkova wrote:
> > > Dear Pantelis, All,
> > >
> > > chung wrote:
> > > > However...
> > > >
> > > > * When building a new model, the service has to generate a new feature
> > > > that will be used as a pointer to the predictions made by this model. Is
> > > > this operation implemented?
> > >
> > > Provided we don't have currently API for creating new features (these
> > > are more or less embedded in datasets), there are two options I can
> > > think of:
> > >
> > > 1) Extend the API to include Feature creation (independent of a
> > > dataset), e.g. /feature POST rdf-representation-of-a-feature (similar
> > > to feature_definition POST in API 1.0)
> >
> > The current version of the API (see
> > http://opentox.org/dev/apis/api-1.1/Feature ) includes that operation.
> > The client posts an RDF representation and creates a new feature URI.
>
> OK. This is now implemented in
> http://ambit.uni-plovdiv.bg:8080/ambit2/feature . POST any RDF
> representation of a Feature object, as specified in opentox.owl with
> application/rdf+xml , or text/n3 content type to create a new
> feature.
>
> > > 2) Embed the RDF representation of the new feature into the RDF
> > > representation of the dataset, which is then POST-ed to a dataset service.
> > >
> > > e.g. if a feature is described with triples and there is no feature
> > > URI, the dataset service will assume a new feature is to be created;
> >
> > That's also a solution, but the embedded feature will not be globally
> > accessible. I think the first solution is better.
>
> I was assuming the dataset service will create a feature with globally
> accessible URI. This also could be used with
> http://ambit.uni-plovdiv.bg:8080/ambit2/dataset
>
> To create a dataset, post an RDF representation of a dataset to
> http://ambit.uni-plovdiv.bg:8080/ambit2/dataset with content-type
> application/rdf+xml , or text/n3.
>
> > > In fact the two options are not mutually exclusive, with the first more
> > > generic and the later requiring a single POST call to the dataset
> > > service, instead of two (one for feature creation and one for posting a
> > > dataset).
> >
> > Indeed in the first case we need two requests but the feature creation
> > will not be very time consuming and the Content-length the client has to
> > post is not that big, so I think the first option is not prohibitive.
> >
> > > Are there any preferences?
> >
> > I'm in favor of the first solution. So if there are no objections I
> > think we should proceed this way (because we're running out of time...)
>
> OK, please test any of the above.

I managed to create some features on the ambit server using the
following curl command:

curl -v -i -X POST -H 'Content-type:application/rdf+xml' -d '
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ot="http://www.opentox.org/api/1.1#"
    xmlns:j.0="http://purl.org/net/nknouf/ns/bibtex#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" >
  <rdf:Description rdf:about="http://www.opentox.org/api/1.1#hasSource">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.opentox.org/api/1.1#units">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DatatypeProperty"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://ambit.uni-plovdiv.bg:8080/ambit2/reference/11889">
    <rdfs:seeAlso>http://sth.com/feature/0</rdfs:seeAlso>
    <dc:title>http://sth.com/feature/0</dc:title>
    <dc:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://ambit.uni-plovdiv.bg:8080/ambit2/reference/11889</dc:identifier>
    <rdf:type rdf:resource="http://purl.org/net/nknouf/ns/bibtex#Entry"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.opentox.org/api/1.1#Feature">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://purl.org/net/nknouf/ns/bibtex#Entry">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://ambit.uni-plovdiv.bg:8080/ambit2/feature/13001">
    <dc:type>http://www.w3.org/2001/XMLSchema#string</dc:type>
    <ot:hasSource rdf:resource="http://ambit.uni-plovdiv.bg:8080/ambit2/reference/11889"/>
    <owl:sameAs>http://www.opentox.org/api/1.1#http://sth.com/feature/0</owl:sameAs>
    <ot:units></ot:units>
    <dc:title>http://sth.com/feature/0</dc:title>
    <dc:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://ambit.uni-plovdiv.bg:8080/ambit2/feature/13001</dc:identifier>
    <rdf:type rdf:resource="http://www.opentox.org/api/1.1#Feature"/>
  </rdf:Description>
</rdf:RDF>' http://ambit.uni-plovdiv.bg:8080/ambit2/feature

This generates a feature with a name I specify myself (that is,
http://ambit.uni-plovdiv.bg:8080/ambit2/feature/13001 ). Is it possible
for the service to assign this name automatically to the created
feature resource?

> After several trials and errors, I finally managed to use an ambit
> dataset to create MLR model, as specified here
> https://opentox.ntua.gr/index.php?p=guide
>
> It seems the NTUA algorithm service expects parameters dataset_uri and
> target to be within the posted content, rather than in the URL (my
> initial assumption). Do we have this specified in the API ?

I think this is compliant with
http://opentox.org/dev/apis/api-1.1/Model (is it?). I assume that the
target is a parameter of the algorithm, defined within the RDF
representation of the algorithm. These parameters are provided within
the posted content (-d 'dataset_uri=...&target=...').

> It would help with troubleshooting if in case of missing input the
> service return client_error_bad_request with some explanation, than
> internal server error (500).
>
> Here is the successful call
> 1) curl -X POST -d
> 'dataset_uri=http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/30&target=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/12913' http://opentox.ntua.gr:3000/algorithm/mlr
>
> The dataset itself is a copy of http://opentox.ntua.gr/ds.rdf, created
> via POSTing its RDF/XML representation to
> http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/

This request fails in the case of svm models on opentox.ntua.gr, but it
works fine on my localhost. I will deploy the latest version, which I
think will fix this.

> 2)Unsuccessful call - here the dataset contains not only numerical,
> but also string columns.
>
> ambit:/home/nina# curl -X POST -d
> 'dataset_uri=http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/6&target=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951' http://opentox.ntua.gr:3000/algorithm/mlr -v
> * About to connect() to opentox.ntua.gr port 3000 (#0)
> * Trying 147.102.82.32... connected
> * Connected to opentox.ntua.gr (147.102.82.32) port 3000 (#0)
> > POST /algorithm/mlr HTTP/1.1
> > User-Agent: curl/7.18.2 (x86_64-pc-linux-gnu) libcurl/7.18.2
> OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.8 libssh2/0.18
> > Host: opentox.ntua.gr:3000
> > Accept: */*
> > Content-Length: 122
> > Content-Type: application/x-www-form-urlencoded
> >
> < HTTP/1.1 500 empty String
> < Content-Type: text/html; charset=ISO-8859-1
> < Content-Length: 284
> < Date: Sun, 20 Dec 2009 23:09:53 GMT
> < Server: Noelios-Restlet/2.0m3
> < Connection: close
> <
> <html>
> <head>
> <title>Status page</title>
> </head>
> <body>
> <h3>empty String</h3><p>You can get technical details <a
> href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1">here</a>.<br>
> Please continue your visit at our <a href="/">home page</a>.
> </p>
> </body>
> </html>
> * Closing connection #0

The entries of the dataset
http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/6 are not treated as
purely numerical because they are declared to be of type xsd:string, so
internally I handle them as strings, not as numbers. Modifying this
dataset so that these datatypes become xsd:double would fix the
problem. However, I should return an explanatory message and a proper
status code.

The text/x-arff representations you provide include string and numeric
declarations for the features of the dataset, so I think we should do
something like that in the RDF.
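
To illustrate the kind of check I have in mind, here is a minimal
sketch in Python. It is only an assumption about how such a check could
look, not the code of the NTUA service: the function name
check_numeric, the (feature URI, value, datatype) tuple layout and the
list of accepted XSD types are all made up for the example.

```python
from typing import List, Tuple

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical list of datatypes a regression trainer would accept.
NUMERIC_TYPES = {
    XSD + "double",
    XSD + "float",
    XSD + "decimal",
    XSD + "integer",
    XSD + "int",
}


def check_numeric(values: List[Tuple[str, str, str]]) -> Tuple[int, str]:
    """Validate (feature_uri, lexical_value, datatype_uri) entries.

    Returns an HTTP status code and an explanatory message:
    400 for input the client has to fix, 200 otherwise.
    """
    for feature_uri, lexical, datatype in values:
        if datatype not in NUMERIC_TYPES:
            return 400, (
                f"Feature {feature_uri} is declared as {datatype}; "
                "a numeric XSD datatype is required."
            )
        try:
            float(lexical)
        except ValueError:
            return 400, (
                f"Value {lexical!r} of feature {feature_uri} "
                "is not a valid number."
            )
    return 200, "OK"


feature = "http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951"
print(check_numeric([(feature, "1", XSD + "string")]))
print(check_numeric([(feature, "1.5", XSD + "double")]))
```

With a check like this, a value declared as xsd:string is rejected with
a 400 and a readable reason even if it looks like a number, instead of
surfacing later as a 500.
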
Structurally, RDF representations contain much more (meta)information
about the objects they describe than ARFF files do, so IMHO the
datatype of each feature, which text/x-arff already carries, has to be
included in the RDF as well. At the very least, in order not to modify
the RDF standards we adopted in API 1.1, we should use proper XSD
datatypes for every value. After all, 1^^double, 1^^string and
1^^nominal are not the same and won't (shouldn't) be handled the same
way by a training algorithm.

> 3) Unsuccessful call:
> If the dataset URI contains query parameters (in this case specifying
> to include only 3 numerical features), I am not sure if it is parsed
> correctly by the NTUA service, or feature_uris[] parameter is
> perceived as a separate one to the dataset_uri parameter. The entire
> dataset URI should read:
> 'dataset_uri=http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/6?feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11938&feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11947&feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951'
>
> The entire (unsuccessful) call :
> ambit:/home/nina# curl -X POST -d
> 'dataset_uri=http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/6?feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11938&feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11947&feature_uris[]=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951&target=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951' http://opentox.ntua.gr:3000/algorithm/mlr -v
>
> * About to connect() to opentox.ntua.gr port 3000 (#0)
> * Trying 147.102.82.32... connected
> * Connected to opentox.ntua.gr (147.102.82.32) port 3000 (#0)
> > POST /algorithm/mlr HTTP/1.1
> > User-Agent: curl/7.18.2 (x86_64-pc-linux-gnu) libcurl/7.18.2
> OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.8 libssh2/0.18
> > Host: opentox.ntua.gr:3000
> > Accept: */*
> > Content-Length: 329
> > Content-Type: application/x-www-form-urlencoded
> >
> < HTTP/1.1 500 empty String

I haven't implemented those feature_uris[]=... yet :-)

> 4)Unsuccessful call (same as above, but with dataset_uri URL encoded)
>
> ambit:/home/nina# curl -X POST -d
> 'dataset_uri=http%3A%2F%2Fambit.uni-plovdiv.bg%3A8080%2Fambit2%2Fdataset%2F6%3Ffeature_uris%5B%5D%3Dhttp%3A%2F%2Fambit.uni-plovdiv.bg%3A8080%2Fambit2%2Ffeature%2F11938%26feature_uris%5B%5D%3Dhttp%3A%2F%2Fambit.uni-plovdiv.bg%3A8080%2Fambit2%2Ffeature%2F11947%26feature_uris%5B%5D%3Dhttp%3A%2F%2Fambit.uni-plovdiv.bg%3A8080%2Fambit2%2Ffeature%2F11951&target=http://ambit.uni-plovdiv.bg:8080/ambit2/feature/11951'
> http://opentox.ntua.gr:3000/algorithm/mlr -v
> * About to connect() to opentox.ntua.gr port 3000 (#0)
> * Trying 147.102.82.32... connected
> * Connected to opentox.ntua.gr (147.102.82.32) port 3000 (#0)
> > POST /algorithm/mlr HTTP/1.1
> > User-Agent: curl/7.18.2 (x86_64-pc-linux-gnu) libcurl/7.18.2
> OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.8 libssh2/0.18
> > Host: opentox.ntua.gr:3000
> > Accept: */*
> > Content-Length: 409
> > Content-Type: application/x-www-form-urlencoded
> >
> < HTTP/1.1 500 For input string: "NC"
>
> Most important question so far is - is the way of specifying
> parameters as ascii data content and using syntax like below agreed
> and sufficient?
>
> dataset_uri=aaaa&target=bbbbb
>
> Do the services expect these parameter values to be URL encoded -

As far as I know, you may use non-URL encoded parameters.

> otherwise it is impossible to use e.g. URIs with query parameters.

I guess you can do that, but I have to check this out.
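
While checking, a small Python sketch may help show what is at stake
for call 3. It only uses the standard urllib.parse module and shows how
a generic form parser splits the two variants of the body; it is an
illustration and says nothing about how the NTUA service actually
parses its input. The variable names are mine.

```python
from urllib.parse import parse_qs, urlencode

BASE = "http://ambit.uni-plovdiv.bg:8080/ambit2"

# A dataset URI that carries its own query string, as in call 3.
dataset_uri = (
    f"{BASE}/dataset/6"
    f"?feature_uris[]={BASE}/feature/11938"
    f"&feature_uris[]={BASE}/feature/11947"
    f"&feature_uris[]={BASE}/feature/11951"
)
target = f"{BASE}/feature/11951"

# Unencoded body: every '&' inside dataset_uri is taken as a
# separator between form parameters.
raw_body = f"dataset_uri={dataset_uri}&target={target}"
raw_params = parse_qs(raw_body)

# Encoded body: urlencode() percent-encodes each value, so the
# dataset URI survives as one parameter.
encoded_body = urlencode({"dataset_uri": dataset_uri, "target": target})
encoded_params = parse_qs(encoded_body)

print(sorted(raw_params))      # parameter names seen without encoding
print(sorted(encoded_params))  # parameter names seen with encoding
print(encoded_params["dataset_uri"][0] == dataset_uri)
```

In the unencoded case the parser sees feature_uris[] as a separate
parameter and dataset_uri is cut off after the first feature URI. On
the command line, curl's --data-urlencode option can be used instead of
-d to encode a value in the same way.
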

Best Regards,
Pantelis

> Best regards,
> Nina
>
> > Best Regards
> > Pantelis
> >
> > > > * When a client posts a dataset on a model to make a prediction, then
> > > > the service generates a new dataset which (according to the API) should
> > > > be posted to a dataset service. Is this operation available?
> > >
> > > ambit services accept SDF datasets on POST currently, and RDF upload
> > > will be available later today (if everything works right).
> > >
> > > > * How can I calculate a feature value for a certain compound URI? Is
> > > > there an example (e.g. curl command)?
> > >
> > > Perhaps we need "compound_uri" parameter for algorithm API, similar to
> > > Model API ?
> > >
> > > AFAIK TUM are developing descriptor calculation service, it will make
> > > sense to synchronize parameter names.
> > >
> > > Hope this helps,
> > > Nina
> > >
> > > > Best Regards,
> > > > Pantelis
> > > >
> > > > _______________________________________________
> > > > Development mailing list
> > > > Development at opentox.org
> > > > http://www.opentox.org/mailman/listinfo/development