[OTDev] Missing values [was Re: DataSet]

Rajarshi Guha rajarshi.guha at gmail.com
Tue Oct 6 16:50:16 CEST 2009


On Tue, Oct 6, 2009 at 10:39 AM, Nina Jeliazkova <nina at acad.bg> wrote:

>
> IMO, a dataset with missing values is a valid one. There exist algorithms
> in machine learning that can deal with missing values, usually
> preprocessing ones (there are Weka implementations as well).
> In any case, one might need to create a dataset without missing values
> from a dataset with missing values (by ignoring empty entries or
> applying something else).  I am not sure if there should be an API at the
> dataset level, the algorithm level, or the model level.  As I understand,
> Christoph is in favour of a dataset-level API - am I right?
>

I like the idea of having missing values represented within the dataset.

One thing that would be useful is a consistent notation to indicate a
missing value, something like 'NA'.
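
As a rough sketch of what a single agreed marker could buy us: the reader
below maps 'NA' to None while parsing, and a second helper derives a
dataset without missing values by ignoring incomplete entries. The column
names, the sample data, and both helper functions are made up for
illustration; none of this is part of any existing dataset API.

```python
import csv
import io

# Assumption: 'NA' is the agreed missing-value marker.
NA = "NA"

# Hypothetical sample data; compound names and descriptors are invented.
SAMPLE = """compound,logP,mw
c1,1.2,180.2
c2,NA,94.1
c3,0.4,NA
c4,2.7,151.2
"""


def parse_rows(text, marker=NA):
    """Parse CSV text into dicts, turning the marker into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value == marker else value) for key, value in record.items()}
        for record in reader
    ]


def drop_incomplete(rows):
    """Return only the rows that have no missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


rows = parse_rows(SAMPLE)
complete = drop_incomplete(rows)

print(len(rows), "rows read,", len(complete), "without missing values")
```

Keeping the missing entries in the parsed dataset and filtering them in a
separate step means an algorithm that can handle missing values still sees
the full data, while one that cannot simply works on the filtered copy.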

-- 
Rajarshi Guha
NIH Chemical Genomics Center


