[OTDev] Wrapper/Super Services and Descriptor Calculation

Christoph Helma helma at in-silico.ch
Wed Jan 26 15:56:06 CET 2011


Hi Pantelis, All,

Your proposal for passing parameters seems to be generic and
straightforward - I would suggest moving it into API 1.2.

Let me try to explain my conception of algorithms, models and
superservices once again, to make things clearer and to avoid further
confusion. I will look at them from a client's point of view, without
caring about implementation details:

Algorithms:

Almost every algorithm depends on other algorithms (either through
library calls or by using external REST services). For this reason it
does not make much sense to separate "Superalgorithms" from algorithms
(I think we have agreed on that for API 1.2).

For the ToxCreate and model validation use cases we need algorithms that
  - take a training dataset (with optional parameters) as input and
  - provide a prediction model (more on its properties below) as output.

As a client I do not care if the "Supermodel" is a one trick pony (with
hardcoded sub-algorithms) or a generic workflow system as in your
proposal, as long as it creates a prediction model from a training
dataset. For this reason there will be no generic "Superalgorithm"
interface; model parameters and usage will have to be documented by the
service developers.
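From the client side this boils down to a single POST per algorithm. A
minimal sketch in Python of how such a request could be assembled (the
service URIs and parameter names are hypothetical placeholders, not part
of the API - each service would document its own):

```python
from urllib.parse import urlencode

def build_training_request(algorithm_uri, dataset_uri, params=None):
    """Build the POST target and form-encoded body for creating a
    prediction model from a training dataset.

    `algorithm_uri` and the parameter names used here are illustrative
    assumptions; actual names must be documented by each service."""
    body = {"dataset_uri": dataset_uri}
    if params:
        body.update(params)  # optional, algorithm-specific parameters
    return algorithm_uri, urlencode(body)

# POSTing `body` to `uri` would return the URI of the new model
uri, body = build_training_request(
    "http://example.org/algorithm/mytrainer",   # hypothetical service
    "http://example.org/dataset/training-42",   # hypothetical dataset
    {"min_frequency": "5"})                     # hypothetical parameter
```

The point of the sketch is that the client never has to know which
sub-algorithms run behind that URI.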

Models:

For the ToxCreate and model validation use cases we need models that
  - take chemical structure(s) (without additional information) as input and
  - create a prediction dataset as output
  - are *immutable*, i.e. there should be no possibility to modify
    models once they are created (anything else would invalidate
    validation results and open possibilities for cheating)

A model can use a variety of algorithms (internal or through
webservices), it might use other models (e.g. consensus models) or
datasets (instance-based predictions). But as a client I do not want to
be bothered with these details (we store references to algorithms and
datasets in the model representation, but YMMV). All I need is a straightforward
interface with compound(s) as input and a dataset as output. Can we
agree on this interface for API 1.2? 
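The model interface would then be just as small: compound(s) in,
prediction dataset out, and no operation that mutates the model. A
client sketch under the same assumptions (URIs and the `compound_uri`
parameter name are placeholders):

```python
from urllib.parse import urlencode

class ModelClient:
    """Minimal sketch of a client for the proposed model interface.

    Models are immutable: a client only GETs the model representation
    or POSTs compound(s) for prediction; there is deliberately no
    update operation. All URIs below are hypothetical."""

    def __init__(self, model_uri):
        self.model_uri = model_uri

    def prediction_request(self, compound_uris):
        """Return the POST target and body; the service would answer
        with the URI of a newly created prediction dataset."""
        body = urlencode([("compound_uri", c) for c in compound_uris])
        return self.model_uri, body

client = ModelClient("http://example.org/model/7")  # hypothetical model
target, body = client.prediction_request(
    ["http://example.org/compound/1",
     "http://example.org/compound/2"])
```

Whether the model internally calls other models or webservices is
invisible behind this interface, which is exactly the point above.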

Pantelis: Your proposal seems to be focused on a generic (linear)
workflow implementation. While it would be worthwhile to have such an
implementation, I do not think we have to specify workflow systems at the
API level.
(BTW: Parallel workflows (e.g. for creating consensus models) and
generic DAG workflows (for experimental/data analysis that
involves merging, splitting) could also be interesting).

Best regards,
Christoph


