[OTDev] Significant milestone reached -- MLR model training
Vedrin Jeliazkov vedrin.jeliazkov at gmail.com
Fri Jan 1 18:44:37 CET 2010
- Next message: [OTDev] Significant milestone reached -- MLR model training
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Pantelis,

2009/12/31 chung <chvng at mail.ntua.gr>:
> The command:
>
> curl -v http://opentox.ntua.gr:3000/model/20761/predicted
>
> returns the predicted uri:
>
> http://ambit.uni-plovdiv.bg:8080/ambit2/feature/http%3A%2F%2Fsomeserver.com%2Ffeature%2F101Default
>
> But then it seems that this resource returns a status code 404 - not
> found. This URI was returned by a post on your feature creation service.
> Could you take a look at that?

I've just checked that the above mentioned URI returns code 200 and the
associated rdf when accessed either by curl, FF or IE:

D:\curl-7.19.6-ssl-sspi-zlib-static-bin-w32>curl -iv http://ambit.uni-plovdiv.bg:8080/ambit2/feature/http%3A%2F%2Fsomeserver.com%2Ffeature%2F101Default
* About to connect() to ambit.uni-plovdiv.bg port 8080 (#0)
*   Trying 194.141.27.28... connected
* Connected to ambit.uni-plovdiv.bg (194.141.27.28) port 8080 (#0)
> GET /ambit2/feature/http%3A%2F%2Fsomeserver.com%2Ffeature%2F101Default HTTP/1.1
> User-Agent: curl/7.19.6 (i386-pc-win32) libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3
> Host: ambit.uni-plovdiv.bg:8080
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Apache-Coyote/1.1
Server: Apache-Coyote/1.1
< Date: Fri, 01 Jan 2010 16:49:16 GMT
Date: Fri, 01 Jan 2010 16:49:16 GMT
< Vary: Accept-Charset, Accept-Encoding, Accept-Language, Accept
Vary: Accept-Charset, Accept-Encoding, Accept-Language, Accept
< Accept-Ranges: bytes
Accept-Ranges: bytes
< Server: Restlet-Framework/2.0m6
Server: Restlet-Framework/2.0m6
< Content-Type: application/rdf+xml;charset=UTF-8
Content-Type: application/rdf+xml;charset=UTF-8
< Transfer-Encoding: chunked
Transfer-Encoding: chunked

<
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ot="http://www.opentox.org/api/1.1#"
    xmlns:j.0="http://purl.org/net/nknouf/ns/bibtex#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" >
[RDF contents skipped]

> Note that the same RDF representation for feature, when posted to your
> feature creation service, sometimes returns a valid URI for the created
> features while once every now and then the above URI is returned and no
> feature seems to be created. Nina made some amendments including the
> upgrade from Restlet m3 to m6 which improved the performance of the
> service (such issues have become much more rare) but it seems there is
> still a problem.

Yes indeed, there are some further problems that we still have to fix
(please see below), however they are not due to the restlet POST bug,
which we have hopefully solved with the upgrade from 2.0M3 to 2.0M6.

> Another issue is that curl
> http://opentox.ntua.gr:3000/model/20767/predicted returns the URI
> http://ambit.uni-plovdiv.bg:8080/ambit2/feature/29065 which is not
> included in the uri-list at
> http://ambit.uni-plovdiv.bg:8080/ambit2/feature
>
> Check out:
>
> curl -H 'Accept:text/uri-list'
> http://ambit.uni-plovdiv.bg:8080/ambit2/feature | grep 29065

Well, you're both right and wrong. What happens here is that we have
defined a default maximum number of returned resources, which is
currently set to 100. The rationale is that we're trying to avoid
overloading our development server with queries which could return
unexpectedly large responses (e.g. consider the EINECS dataset, which
has more than 100000 records...).
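To illustrate why the grep above can come back empty, here is a minimal sketch of a capped listing. It is a toy model, not the service code: the function uri_list, the parameter name max_items, the list contents and the ordering of the returned URIs are all assumptions made for the example. Only the cap of 100 and the feature id 29065 come from the discussion above.

```python
# Toy model of a capped uri-list response (not the actual service code).
BASE = "http://ambit.uni-plovdiv.bg:8080/ambit2/feature"
DEFAULT_MAX = 100  # default cap described above


def uri_list(all_uris, max_items=DEFAULT_MAX):
    """Return at most max_items URIs, as a capped listing would."""
    return all_uris[:max_items]


# Made-up data: 200 feature URIs whose last entry is feature/29065.
features = [f"{BASE}/{i}" for i in range(28866, 29066)]
target = f"{BASE}/29065"

truncated = uri_list(features)
print(len(truncated), target in truncated)   # 100 False
print(target in uri_list(features, 100000))  # True
```

The feature exists in both cases; it is simply cut off when only the first 100 entries are returned.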
This default limit can be further tuned by adding a max=<some number>
parameter to the URI, which helps to retrieve the full uri-list in this
particular case:

curl -H 'Accept:text/uri-list' http://ambit.uni-plovdiv.bg:8080/ambit2/feature?max=100000 | grep 29065
http://ambit.uni-plovdiv.bg:8080/ambit2/feature/29065

It might be worth mentioning that:

1) we have to better document this URI parameter;
2) we should perhaps consider applying such a policy only to a subset of
   URIs, in particular avoiding any limits for uri-lists;
3) we're planning to set up a (more scalable) production server by the
   end of Feb 2010 and might revise this policy or remove it altogether
   at that time.

> P.S. See the following for reference:
>
> A. Buggy:
> curl http://opentox.ntua.gr:3000/model/20761/predicted
> curl http://opentox.ntua.gr:3000/model/20763/predicted
> curl http://opentox.ntua.gr:3000/model/20764/predicted
> curl http://opentox.ntua.gr:3000/model/20765/predicted
>
> B. Correct:
> curl http://opentox.ntua.gr:3000/model/20766/predicted
> curl http://opentox.ntua.gr:3000/model/20767/predicted
> curl http://opentox.ntua.gr:3000/model/20760/predicted
> curl http://opentox.ntua.gr:3000/model/20762/predicted

The problem here is even more subtle. When our service receives a
feature POST request, it first checks whether this particular feature
already exists in the database. If it exists, the service returns a URI
like those in the "correct" set, pointing to the existing feature (e.g.
http://ambit.uni-plovdiv.bg:8080/ambit2/feature/29064). If the feature
doesn't exist, then the service creates it and returns a URI like those
in the "buggy" set (e.g.
http://ambit.uni-plovdiv.bg:8080/ambit2/feature/http%3A%2F%2Fsomeserver.com%2Ffeature%2F101Default).
In fact both URIs are correct (they point to the relevant resource),
however there's still one big problem.
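The two URI shapes described above can be sketched with a small in-memory model. This is only an illustration under assumptions, not the AMBIT implementation: the class ToyFeatureService, its dictionary lookup, the key "already-stored" and the id counter are hypothetical, and the sketch simply assumes that the identifier sent with the request is what ends up percent-encoded in the returned URI.

```python
from urllib.parse import quote

BASE = "http://ambit.uni-plovdiv.bg:8080/ambit2/feature"


class ToyFeatureService:
    """Hypothetical stand-in for the feature POST behaviour described above."""

    def __init__(self, known=None, first_new_id=1):
        self._ids = dict(known or {})  # feature key -> numeric id
        self._next_id = first_new_id   # id given to the next new feature

    def post(self, key):
        """Return the URI that a POST for `key` yields in this toy model."""
        if key in self._ids:
            # Feature already stored: numeric-id URI (the "correct" set).
            return f"{BASE}/{self._ids[key]}"
        # Feature not stored yet: create it and return a URI that embeds
        # the percent-encoded key (the "buggy" set).
        self._ids[key] = self._next_id
        self._next_id += 1
        return f"{BASE}/{quote(key, safe='')}"


service = ToyFeatureService(known={"already-stored": 29064})
print(service.post("already-stored"))
print(service.post("http://someserver.com/feature/101Default"))
```

In this sketch both calls succeed and both returned URIs identify a stored feature; they differ only in whether the lookup found a match before creation.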
The problem is that all of the above mentioned features probably should
have been recognized as identical (because they've been generated by
identical operations, run by SmokePing), so perhaps only one feature
should have been created in the database and returned to all subsequent
POST requests. In this sense all of the above mentioned URIs could be
considered buggy. The difference is that for those from the second set,
some features have been found to be identical only by chance...

In order to solve this issue it would be very helpful if you could send
us an example RDF for the feature POST request you're sending and/or
the code that generates it.

Last but not least, it would be nice if you could put some relevant
value in the dc:title property. When this value is absent (as it is
currently in your feature POST requests), we assign the RDF node id as
the feature name. This is a bug that we're going to fix (we'll assign
the sameAs URI you're providing instead). However, for the user
interface it would be much better to have an appropriate dc:title value.

Kind regards,
Vedrin