IT606: Measuring the importance of annotation granularity to the detection of semantic similarity between phenotype profiles
Title | IT606: Measuring the importance of annotation granularity to the detection of semantic similarity between phenotype profiles |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Manda P, Balhoff JP, Vision TJ |
Conference Name | International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016) |
Date Published | 11/30/16 |
Publisher | CEUR-ws.org Volume 1747 |
Other Numbers | Vol-1747; urn:nbn:de:0074-1747-1 |
Abstract | In phenotype annotations curated from the biological and medical literature, considerable human effort must be invested to select ontological classes that capture the expressivity of the original natural language descriptions, and finer annotation granularity can also entail higher computational costs for particular reasoning tasks. Do coarse annotations suffice for certain applications? Here, we measure how annotation granularity affects the statistical behavior of semantic similarity metrics. We use a randomized dataset of phenotype profiles drawn from 57,051 taxon-phenotype annotations in the Phenoscape Knowledgebase. We compared query profiles having variable proportions of matching phenotypes to subject database profiles using both pairwise and groupwise Jaccard (edge-based) and Resnik (node-based) semantic similarity metrics, and compared statistical performance for three different levels of annotation granularity: entities alone, entities plus attributes, and entities plus qualities (with implicit attributes). All four metrics examined showed more extreme values than expected by chance when approximately half the annotations matched between the query and subject profiles, with a more sudden decline for pairwise statistics and a more gradual one for the groupwise statistics. Annotation granularity had a negligible effect on the position of the threshold at which matches could be discriminated from noise. These results suggest that coarse annotations of phenotypes, at the level of entities with or without attributes, may be sufficient to identify phenotype profiles with statistically significant semantic similarity. |
URL | http://ceur-ws.org/Vol-1747/IT606_ICBO2016.pdf |
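
For readers unfamiliar with the two metric families named in the abstract, the following is a minimal sketch of Jaccard and Resnik similarity over a toy ontology. It uses common textbook definitions, which may differ in detail from those used in the paper. The ontology, the class names, the annotation list, and all function names are hypothetical and chosen only for illustration; none of them come from the Phenoscape Knowledgebase.

```python
import math

# Hypothetical toy ontology (not from the paper): class -> direct parents.
PARENTS = {
    "anatomical_entity": [],
    "fin": ["anatomical_entity"],
    "pectoral_fin": ["fin"],
    "pelvic_fin": ["fin"],
    "skull": ["anatomical_entity"],
}


def ancestors(cls):
    """Return cls together with all of its superclasses."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def jaccard(a, b):
    """Shared superclasses divided by all superclasses of either class."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def information_content(annotations):
    """IC(c) = -log p(c), where p(c) is the fraction of annotations
    made to c or to any of its subclasses."""
    counts = {c: 0 for c in PARENTS}
    for cls in annotations:
        for anc in ancestors(cls):
            counts[anc] += 1
    total = len(annotations)
    return {c: -math.log(n / total) for c, n in counts.items() if n > 0}


def resnik(a, b, ic):
    """IC of the most informative common superclass.
    Assumption: classes with no annotations contribute an IC of 0."""
    common = ancestors(a) & ancestors(b)
    return max(ic.get(c, 0.0) for c in common)


def groupwise_jaccard(profile_a, profile_b):
    """Jaccard over the pooled superclass sets of two non-empty profiles."""
    set_a = set().union(*(ancestors(c) for c in profile_a))
    set_b = set().union(*(ancestors(c) for c in profile_b))
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical annotation corpus used only to estimate IC.
annotations = ["pectoral_fin", "pelvic_fin", "skull", "skull"]
ic = information_content(annotations)

print(jaccard("pectoral_fin", "pelvic_fin"))
print(resnik("pectoral_fin", "pelvic_fin", ic))
print(groupwise_jaccard(["pectoral_fin"], ["pelvic_fin", "skull"]))
```

In this sketch a pairwise comparison scores individual classes against each other, while the groupwise variant pools the superclasses of every annotation in a profile before comparing. How pairwise scores are aggregated across a whole profile (best match, mean, and so on) is left out because the record does not specify it.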