[OTDev] Location of permanently stored data

Mon Oct 12 13:11:46 CEST 2009

Excerpts from Nina Jeliazkova's message of Thu Oct 08 10:42:24 +0200 2009:
> Hello All,
> 
> I have posted a question related to this issue to the Yahoo REST group;
> hopefully we'll get some enlightenment from the REST gurus:
> http://tech.groups.yahoo.com/group/rest-discuss/message/13728
> 
> Thinking aloud, here is a proposal :
> 
>     * Introduce a Prediction resource (well, REST says that if you have
>       trouble mapping something to the REST style, invent a new
>       resource). This is basically a Dataset, generated by applying a
>       Model to another Dataset, so the representation formats are the
>       same as for the Dataset resource.
>     * GET on Prediction has the same behaviour as for the Dataset.
>     * POST on Prediction accepts a Dataset URI and a Model URI as
>       parameters and essentially creates a new Dataset. Upon creation,
>       the Model resource will be contacted; it will generate the
>       predictions and return them in some representation. The
>       representation will be used to create the Prediction URI. For
>       this to work, POST on the Model resource should return a
>       representation of the predictions, not a URI. There can even be
>       separate implementations of the Prediction resource, depending
>       on whether the Model is a remote or a local resource, but this is
>       transparent to the outside.
> 
> This decouples the locations of the original and predicted datasets and
> the model. The drawback I see is that the Model may be less RESTful
> (e.g. not creating a URI upon POST, but I think this is acceptable for a
> POST operation).
> 
> What do you think?

I am presently using something along these lines, but without an
explicit prediction resource. The basic workflow is:

descriptor calculation:

POST /algorithm/{descriptor_calculation_id} dataset_uri: returns feature_dataset_uri

model creation:

POST /algorithm/{model_creation_id} dataset_uri, feature_dataset_uri: returns model_uri

descriptor calculation for unknown compounds:

POST /algorithm/{descriptor_calculation_id} new_dataset_uri: returns new_feature_dataset_uri

prediction:

POST /model/{model_id} new_feature_dataset_uri: returns prediction_dataset_uri

get predictions:

GET prediction_dataset_uri
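
For illustration, the four POST steps above could be chained roughly as in
the following Python sketch. It is not the actual service code: the `post`
callable, the function name and the form parameter names `dataset_uri` and
`feature_dataset_uri` are assumptions made for this example, and the
transport is injected so that the sketch runs without a live service.

```python
from typing import Callable, Dict

# Hypothetical transport: takes a path or URI plus form parameters and
# returns the URI found in the response. A real client would do an HTTP POST.
Post = Callable[[str, Dict[str, str]], str]


def predict_workflow(post: Post, descriptor_id: str, model_creation_id: str,
                     dataset_uri: str, new_dataset_uri: str) -> str:
    """Run the four POST steps and return prediction_dataset_uri."""
    # descriptor calculation for the training compounds
    feature_dataset_uri = post(f"/algorithm/{descriptor_id}",
                               {"dataset_uri": dataset_uri})
    # model creation from the dataset and its features
    model_uri = post(f"/algorithm/{model_creation_id}",
                     {"dataset_uri": dataset_uri,
                      "feature_dataset_uri": feature_dataset_uri})
    # descriptor calculation for unknown compounds
    new_feature_dataset_uri = post(f"/algorithm/{descriptor_id}",
                                   {"dataset_uri": new_dataset_uri})
    # prediction: POST the new features to the model that was just created
    return post(model_uri, {"feature_dataset_uri": new_feature_dataset_uri})
```

The returned URI is what the final GET step would then dereference.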

The only problem that I have so far with this procedure is that I want
to provide supporting information (in my case neighbors, relevant
features, ...) together with my prediction. This does not fit into our
compound/features model of a dataset, so I am thinking about using a
separate prediction (or model/{id}/prediction/{id}) resource for this
purpose.
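
A rough Python sketch of what such a prediction resource might carry. The
class, its field names and the helper function are made up for this
example; only the model/{id}/prediction/{id} path pattern and the kinds of
supporting information come from the text above.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Prediction:
    """Hypothetical representation: a prediction plus supporting information."""
    compound_uri: str
    value: str
    neighbors: List[str] = field(default_factory=list)          # neighbor compound URIs
    relevant_features: List[str] = field(default_factory=list)  # feature identifiers


def prediction_path(model_id: str, prediction_id: str) -> str:
    # Path pattern model/{id}/prediction/{id} as mentioned above
    return f"/model/{model_id}/prediction/{prediction_id}"
```

Calling `asdict()` on an instance gives a plain dictionary that could be
serialised to whatever representation the service ends up using.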

Algorithm and model services read the location of the dataset service
from a configuration file.
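
As a minimal sketch of that lookup, assuming an INI-style file: the file
format, the section name `services`, the key `dataset_service_uri` and the
example URL are assumptions for this example, not the actual configuration
used by the services.

```python
import configparser

# Hypothetical configuration content; a real service would read it from disk.
EXAMPLE_CONFIG = """\
[services]
dataset_service_uri = http://localhost:8080/dataset
"""


def dataset_service_uri(config_text: str) -> str:
    """Return the dataset service location without a trailing slash."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser["services"]["dataset_service_uri"].rstrip("/")
```

Keeping the location in configuration means the algorithm and model
services can be pointed at a different dataset service without code changes.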

Best regards,
Christoph