[OTDev] RDF for dataset representation
Tobias Girschick tobias.girschick at in.tum.de
Mon Nov 2 12:15:53 CET 2009
Hi Christoph,

On Thu, 2009-10-29 at 12:34 +0100, Christoph Helma wrote:
> Excerpts from Nina Jeliazkova's message of Wed Oct 28 14:45:27 +0100 2009:
>
> > - Do we need to make distinction between different e.g. XLogP
> > implementations (I would say yes) ? Is it possible to handle this via
> > BO ontology, or we need an extension?
>
> Egon?
>
> > - What would be the best way to extend BO ontology (this is more a
> > question to Egon)?
> > - How would we handle quantities, defined in existing data sets (e.g.
> > all LogP flavours available in EPA DSSTOX), not calculated via OpenTox,
>
> For DSSTOX we can use the URI of the field definitions, e.g.
>
> <http://www.epa.gov/ncct/dsstox/StandardChemFieldDefTable.html#STRUCTURE_MolecularWeight> or
> <http://www.epa.gov/ncct/dsstox/CentralFieldDef.html#ActivityOutcome_CPDBAS_Rat>
>
> > or an user uploaded dataset.
>
> If the user is unable to link to an existing ontology, (s)he still can
> use local links (e.g. <#my_new_algorithm>) as predicates (although that
> will not be very useful to put the results into a meaningful context,
> but it can be sufficient for computational experiments).
>
> > - How to handle quantities, calculated via some algorithm, but with
> > different parameters (e.g. eHOMO calculated with AM1 or PM3).
>
> I think that this could be resolved at the ontology level (e.g.
> ontology:eHOMO/AM1 vs ontology:eHOMO/PM3 or
> ontology:eHOMO?parameters=AM1). See also the next point.
>
> > I would prefer that the property (e.g. blueobelisk:xlogp) refer to a
> > specific implementation, rather to the algorithm itself (same concept
> > as algorithm/model split we already invented).
> > The implementation itself will be linked to the algorithm.
>
> The predicate (i.e. property) could be the URI of the service that has
> calculated the value. To make the process completely reproducible, we
> would need to provide the POST URI together with all parameters - I am
> not sure if RDF supports this.
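To make the options discussed above a bit more concrete, here is a minimal Python sketch (standard library only) that writes the three variants as N-Triples: a DSSTOX field-definition URI used as predicate, parameter-specific predicates in the style of ontology:eHOMO?parameters=AM1, and a separate "calculation" node that carries the service URI together with its parameters. Everything under http://example.org/ (compound, ontology, vocabulary and service URIs) is a hypothetical placeholder and not an agreed OpenTox or Blue Obelisk term, and the eHOMO literals are dummy strings rather than computed results. It is meant as one possible encoding, not as the proposed one.

```python
from urllib.parse import urlencode

# Field-definition URI quoted in the thread.
DSSTOX_MW = ("http://www.epa.gov/ncct/dsstox/"
             "StandardChemFieldDefTable.html#STRUCTURE_MolecularWeight")

# Everything under example.org is a hypothetical placeholder.
EX = "http://example.org/"


def iri(value):
    """Wrap an absolute URI in N-Triples angle brackets."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def parameterised(base, params):
    """Build a parameter-specific property URI, e.g. .../eHOMO?parameters=AM1."""
    return base + "?" + urlencode(sorted(params.items()))


def calculation_triples(calc, compound, service, params, value):
    """Describe one calculation as its own node: compound, service, value, parameters."""
    triples = [
        (iri(calc), iri(EX + "vocab/compound"), iri(compound)),
        (iri(calc), iri(EX + "vocab/service"), iri(service)),
        (iri(calc), iri(EX + "vocab/value"), literal(value)),
    ]
    for name, setting in sorted(params.items()):
        triples.append(
            (iri(calc), iri(EX + "vocab/param/" + name), literal(setting)))
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


benzene = EX + "compound/benzene"
ehomo = EX + "ontology/eHOMO"

triples = [
    # Dataset value keyed by the DSSTOX field definition (benzene: 78.11 g/mol).
    (iri(benzene), iri(DSSTOX_MW), literal("78.11")),
    # Same quantity, different parameters: two distinct predicates.
    (iri(benzene), iri(parameterised(ehomo, {"parameters": "AM1"})),
     literal("dummy-AM1-result")),
    (iri(benzene), iri(parameterised(ehomo, {"parameters": "PM3"})),
     literal("dummy-PM3-result")),
]

# Alternative: keep service URI and parameters on a dedicated calculation node.
triples += calculation_triples(
    calc=EX + "calculation/1",
    compound=benzene,
    service=EX + "algorithm/eHOMO",
    params={"method": "AM1"},
    value="dummy-AM1-result",
)

print(to_ntriples(triples))
```

The calculation node is the usual way to attach more than one piece of information to a single value in RDF, so the service URI plus its parameters could be recorded this way, at the cost of one extra hop when querying.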
>
> > Looking into the current list of feature definitions in Ambit
> > (http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition ), most of
> > them can be mapped to existing or to-be-developed ontologies, but we
> > need to extend your proposal in a way to keep track of the source of the
> > data.
> >
> > For example it is important to know that feature MolWeight
> > <http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/12109> is
> > representing Molecular weight, but I would not want to lose the
> > information it came from ISSCAN_v3a_1153_19Sept08.1222179139.sdf
> > <http://www.epa.gov/NCCT/dsstox/sdf_isscan_external.html>
> >
> > http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/12109
> > <http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/11945>
> > This was the primary reason to invent feature definition to consist of
> > name + reference - I am sure this can be described in RDF as well.
> >
>
> Ah, now I get the idea behind the feature-definition.
>
> > Actually I was thinking of an (extensible) ontology for SMARTS defined
> > fragments; ChEBI ontology has lot of predefined groups that can be
> > used. Read across use case will benefit from that :)
>
> Yes, but this should support also arbitrary SMARTS
> substructures that come e.g. from supervised graph mining.
>
> > We would need a way to handle dynamically defined properties and even
> > ontologies. I am particularly thinking of user-defined datasets.
>
> I agree, but I am not sure how to keep user defined ontologies
> consistent. We would need a curation process (who is responsible?), but
> maybe a simple tagging system could also work.
>
> > There are several Java libraries , even Restlet in 2.x has some support
> > (no querying) - graph structure with serialization to several formats.
>
> RDF support in Ruby could be better.
> Redland (http://librdf.org) seems
> to be fairly powerful and has Ruby (as well as Perl, PHP, Python and C)
> bindings, but it requires manual compilation of at least 3 libraries
> (i.e. no convenient 'gem install redland').
>
> > > I suspect that RDF could be also useful for the representation of other
> > > OpenTox objects (Algorithms, Models, ...).

Regarding RDF: since it is not a format but rather a concept for
describing knowledge, this should be possible; otherwise I'd say RDF
doesn't deliver on its promises. And if we use RDF for features and
feature_definitions, it would be nice to have algorithms and models
represented consistently.

> > >
> > Yes. Could we have a closer look into Algorithm object in BO dictionary
> > and decide if it can be reused in OpenTox ?

You were thinking of this dictionary, I suppose:
http://qsar.sourceforge.net/dicts/blue-obelisk/index.xhtml

As far as I can see, the algorithms listed and described there are
solely for descriptor/property calculation (except maybe 2D Layout and
3D Geometry). At the moment there is no categorization for learning
algorithms and related concepts. I don't see any mapping or category
overlap between this dictionary and the algorithms we have implemented
so far. There are probably some descriptor calculations in CDK and
JOELib2 that map to the dictionary.
That does not mean the dictionary can't be reused, but we would have to
add a lot of categories and classifications (e.g. regression).

Best Regards,
Tobias

> > Munich ?
>
> Best regards,
> Christoph
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development

--
Dipl.-Bioinf. Tobias Girschick

Technische Universität München
Institut für Informatik
Lehrstuhl I12 - Bioinformatik
Boltzmannstr. 3
85748 Garching b. München, Germany

Room: MI 01.09.042
Phone: +49 (89) 289-18002
Email: tobias.girschick at in.tum.de
Web: http://wwwkramer.in.tum.de/people/girschic