[OTDev] Missing values [was Re: DataSet]
Tobias Girschick tobias.girschick at in.tum.de
Wed Oct 7 10:22:58 CEST 2009
Hi,

On Tue, 2009-10-06 at 14:54 -0400, Rajarshi Guha wrote:
> On Tue, Oct 6, 2009 at 1:41 PM, chung <chvng at mail.ntua.gr> wrote:
>
> > Dear Nina, Christoph, All,
> >
> > Datasets with missing values are valid,

I also think that datasets that contain missing values should be
considered valid.

> > however we have to bear in mind
> > some density/sparsity criteria at least for the time. It's absolutely
> > impossible to train a model (even a "bad" one), using the following
> > "diagonal" dataset:
> >
>
> But wouldn't the model development stage involve data cleaning to remove (or
> impute) missing values?

Something like that should definitely happen before the model is built.
IMO the question is rather whether we consider this kind of data cleaning
for the first prototype. If yes, I would propose something simple, like
removing the descriptor or inserting some default or easily calculable
value.

> And if there isn't sufficient information content,
> why would one build a model in the first place?

I agree here. The question is, where is the border? At what percentage
of missing values do I say: No, I don't build a model here?

Regards,
Tobias

--
Dipl.-Bioinf. Tobias Girschick

Technische Universität München
Institut für Informatik
Lehrstuhl I12 - Bioinformatik
Bolzmannstr. 3
85748 Garching b. München, Germany

Room: MI 01.09.042
Phone: +49 (89) 289-18002
Email: tobias.girschick at in.tum.de
Web: http://wwwkramer.in.tum.de/people/girschic
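
For illustration only, the "something simple" cleaning step discussed above might look like the sketch below: drop a descriptor when too many of its values are missing, and fill the remaining gaps with an easily calculable value (here the column mean). The function name `clean_descriptors`, the use of `None` as the missing-value marker, and the 0.5 cutoff are all assumptions made for this example; the thread does not settle on a threshold or on any particular implementation.

```python
from statistics import mean


def clean_descriptors(rows, max_missing=0.5):
    """Drop sparse descriptor columns, then mean-impute the remaining gaps.

    rows: list of equal-length lists; None marks a missing value (assumption).
    max_missing: hypothetical cutoff on the fraction of missing values
                 allowed per descriptor column; 0.5 is an arbitrary placeholder.
    Returns (cleaned_rows, kept_column_indices).
    """
    if not rows:
        return [], []
    n_rows = len(rows)
    kept = []   # indices of descriptors that survive the cutoff
    fill = {}   # column index -> mean of the observed values
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        missing_fraction = 1 - len(present) / n_rows
        if present and missing_fraction <= max_missing:
            kept.append(j)
            fill[j] = mean(present)
    cleaned = [
        [r[j] if r[j] is not None else fill[j] for j in kept]
        for r in rows
    ]
    return cleaned, kept


# A "diagonal" dataset: every compound has exactly one descriptor value.
diagonal = [
    [1.0, None, None],
    [None, 2.0, None],
    [None, None, 3.0],
]

# A denser dataset where only the last descriptor is mostly missing.
dense = [
    [1.0, 4.0, None],
    [2.0, None, None],
    [3.0, 6.0, 9.0],
]

print(clean_descriptors(diagonal))
print(clean_descriptors(dense))
```

With this cutoff, every descriptor of the diagonal dataset is dropped, which leaves nothing to train on; that is one way to make the "where is the border?" question concrete, since the answer depends entirely on the chosen `max_missing`.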