[OTDev] Normalization and PMML

Thu Nov 5 13:36:37 CET 2009

Dear All,
 I want to provide PMML representations of SVM models, but prior to
training I scale the training data so that each attribute has values
within [-1,1]. This is done in Weka using the class
weka.filters.unsupervised.attribute.Normalize. I get the coefficients of
the trained model from libSVM, and finally I want to create a PMML
representation of my model (version 3.2 or higher).
  Normalization is in fact a linear transformation of the initial data
according to the relation:

   X_k(i) = a_k*x_k(i) + b_k

where x_k is the k-th attribute of the dataset, x_k(i) its i-th
element, and X_k(i) the corresponding scaled value. So, I guess I have
to use the <LinearNorm> element of PMML, but I cannot tell (and the
documentation doesn't help!) whether "norm" stands for the coefficient
a_k and "orig" for the bias b_k.
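
To make the relation concrete, here is a minimal Python sketch of the scaling, assuming plain min-max scaling of one attribute onto [-1,1]. The function names and the sample column are hypothetical and only illustrate how a_k and b_k follow from the attribute's minimum and maximum; this is not Weka's or libSVM's API.

```python
def scaling_coefficients(values, lo=-1.0, hi=1.0):
    """Return (a, b) so that a*x + b maps [min(values), max(values)] onto [lo, hi]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant attribute has no range to stretch; the sketch simply refuses it.
        raise ValueError("constant attribute cannot be scaled")
    a = (hi - lo) / (x_max - x_min)   # slope a_k
    b = lo - a * x_min                # bias b_k, chosen so that x_min maps to lo
    return a, b


def scale(values, a, b):
    """Apply X_k(i) = a_k * x_k(i) + b_k to every element of one attribute."""
    return [a * x + b for x in values]


# Hypothetical attribute column, for illustration only.
column = [2.0, 4.0, 10.0]
a, b = scaling_coefficients(column)
scaled = scale(column, a, b)
print(a, b, scaled)
```

Under these assumptions the minimum of the attribute maps to -1 and the maximum to 1, so the pair (a_k, b_k) is fully determined by those two values.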

Another thing: is it correct that the <VectorInstance> elements hold
the scaled data?

Does anybody have any examples of SVM models with scaled data?

Best Regards,
Pantelis