[OTDev] Some questions on the RDF for Datasets
chung chvng at mail.ntua.gr
Thu Dec 10 15:41:04 CET 2009
- Previous message: [OTDev] Java Example for parsing a dataset
- Next message: [OTDev] Some questions on the RDF for Datasets
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Nina, All,

First of all, thanks a lot for the code! I'm almost done with the parser, but I still have a question about the RDF. To retrieve the datatype of a feature, what I currently do is pick, for each feature, a "values" node and read the XSD datatype of its value. [This is more or less fine for regression algorithms!]

* I think that if a featureValue appears in different data entries with incompatible datatypes (e.g. date and double), this should raise an exception. For example, it is not normal to say that the molecular weight of compound A is 100 and that of compound B is "XYZ". What do you think?

* If a feature value shares the same datatype with all other values of the same feature (in all data entries), then the datatype can be thought of as a property of the feature too. So I think it would be convenient to declare the datatype on every feature, e.g.

  <http://someservder.com/feature/100> <dc:type> <XSD type URI>

  This will not perturb the structure of the RDF for the datasets. The benefit becomes clearer in the following case:

* When training a classification model, both Weka and LibSVM (as well as other libraries) need to know the range of the dependent variable a priori. One solution would of course be to collect the distinct values of that variable one by one (iterating over all values of that feature). However, it would again be more convenient if the datatype were a property of the feature itself.

* Do we have a formal way of denoting missing values, or will they simply not appear at all?

Opinions?

Best Regards,
Pantelis

On Wed, 2009-12-09 at 23:21 +0200, Nina Jeliazkova wrote:
> chung wrote:
> > Dear Nina, Tobias,
> > I'm trying to get access to anonymous Resources of an RDF document
> > using Jena. I want to iterate over all FeatureValue nodes and read their
> > value and the URI of the corresponding compound. Do you have any idea
> > how I could do that?
> >
> > Is there a way to retrieve these information in a List or via an
> > ExtendedIterator somehow?
> >
> Jena example at
> http://opentox.org/data/documents/development/RDF%20files/JavaOnly/JenaExamples/#section-22
>
> Best regards,
> Nina
> > Do you think that it would be more convenient if we didn't use anonymous
> > nodes. For example, dataEntries could be URIs in the form
> > http://opentox.org/dataEntry/xyz ... (Well I'm not either sure about
> > that).
> >
> > Tobias, I believe we're also working on parsing RDF documents (i.e.
> > using RDF representations to generate weka.core.Instances objects), so
> > we could collaborate on that.
> >
> > My source code can be found at http://github.com/sopasakis/yaqp .
> >
> > Best Regards,
> > Pantelis
> >
> > _______________________________________________
> > Development mailing list
> > Development at opentox.org
> > http://www.opentox.org/mailman/listinfo/development
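
Two ideas from the message above can be sketched in plain Java: rejecting a feature whose values carry incompatible XSD datatypes, and collecting the range of a nominal feature by iterating over its values. This is only a rough illustration and not the parser's actual code. The `Observation` holder, the method names and the `http://example.org/...` feature URIs are all hypothetical, and extracting the values from the RDF model with Jena is assumed to happen elsewhere. Comparing datatype URIs for strict equality is also a simplification: a real check might treat, say, integer and double as compatible.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class FeatureTypeCheck {

    /** Namespace of the XML Schema datatypes. */
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Hypothetical holder for one parsed value: feature URI, XSD datatype URI, lexical form. */
    static final class Observation {
        final String featureUri;
        final String datatypeUri;
        final String lexicalValue;

        Observation(String featureUri, String datatypeUri, String lexicalValue) {
            this.featureUri = featureUri;
            this.datatypeUri = datatypeUri;
            this.lexicalValue = lexicalValue;
        }
    }

    /**
     * Maps each feature to the single datatype shared by all of its values.
     * Throws if one feature appears with two different datatype URIs
     * (strict URI equality; an assumption made for this sketch).
     */
    static Map<String, String> inferDatatypes(List<Observation> observations) {
        Map<String, String> typeByFeature = new LinkedHashMap<>();
        for (Observation o : observations) {
            String previous = typeByFeature.putIfAbsent(o.featureUri, o.datatypeUri);
            if (previous != null && !previous.equals(o.datatypeUri)) {
                throw new IllegalStateException("Feature " + o.featureUri
                        + " has incompatible datatypes: " + previous + " and " + o.datatypeUri);
            }
        }
        return typeByFeature;
    }

    /** Collects the distinct values of one feature, e.g. the class labels of a dependent variable. */
    static SortedSet<String> nominalRange(List<Observation> observations, String featureUri) {
        SortedSet<String> range = new TreeSet<>();
        for (Observation o : observations) {
            if (o.featureUri.equals(featureUri)) {
                range.add(o.lexicalValue);
            }
        }
        return range;
    }

    public static void main(String[] args) {
        // Made-up data: feature/1 is numeric, feature/2 is a nominal class label.
        List<Observation> observations = Arrays.asList(
                new Observation("http://example.org/feature/1", XSD + "double", "100"),
                new Observation("http://example.org/feature/2", XSD + "string", "active"),
                new Observation("http://example.org/feature/1", XSD + "double", "180.2"),
                new Observation("http://example.org/feature/2", XSD + "string", "inactive"));
        System.out.println(inferDatatypes(observations));
        System.out.println(nominalRange(observations, "http://example.org/feature/2"));
    }
}
```

If the datatype were declared on the feature itself, as proposed in the message, the first pass over all values would no longer be needed to learn the type. The nominal range would still have to be collected or declared separately.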