Similarity Assessment of Data in Semantic Web

dc.contributor.advisorMarek Z. Reformat (Electrical and Computer Engineering)
dc.contributor.authorDehleh Hossein Zadeh, Parisa
dc.contributor.otherChang-Shing Lee (National University of Tainan, Taiwan)
dc.contributor.otherKen Wong (Computing Science)
dc.contributor.otherPetr Musilek (Electrical and Computer Engineering)
dc.contributor.otherMarek Z. Reformat (Electrical and Computer Engineering)
dc.contributor.otherWitold Pedrycz (Electrical and Computer Engineering)
dc.contributor.otherDi Niu (Electrical and Computer Engineering)
dc.date.accessioned2025-05-29T03:21:28Z
dc.date.available2025-05-29T03:21:28Z
dc.date.issued2016-06
dc.description.abstractThe web is a constantly growing repository of information. The enormous amount of information available on the web creates a demand for automatic ways of processing and analyzing data. One of the most common activities performed by these processes is the comparison of data, done either to find something new or to confirm what we already know. In each case there is a need to determine the similarity between different objects and pieces of information. Determining similarity seems relatively easy for numerical data, but not for symbolic data. In order to make the data stored on the Internet more accessible, a new model of data representation has been introduced: the Resource Description Framework. Linked data provides an open platform for representing and storing structured data as well as ontologies. This aspect of data representation has been fully utilized to provide the fundamentals for the new forms of the Internet, Linked Data and the Semantic Web. In this thesis, we investigate the problem of determining semantic similarity between entities in which not just the lexical and syntactic information of entities is used, but the whole existing knowledge structure, including the instantiated ontology, is exploited. The idea is based on the fact that entities are interconnected and their semantics are defined via their connections to other entities as well as the metadata expressed as an ontology. We propose feature-based methods for similarity assessment of concepts represented in an ontology as well as in the less constrained Resource Description Framework. Membership functions are used to capture the importance of connections between entities at different hierarchy levels in the ontology. We leverage an importance-weighted, quantifier-guided operator to aggregate the similarity values related to different groups of properties.
In another proposed approach, we use concepts of possibility theory to determine lower and upper bounds of similarity intervals. In addition, we address contextual similarity assessment when only a specific context is taken into consideration. The idea of ranking entities’ features according to their importance in describing an entity is introduced. We propose an approach that calculates similarity measures for these categories of features and then aggregates them using fuzzy-expressed weights that represent the rankings of these categories. The promising results of our similarity method have encouraged us to extend it to a more comprehensive approach. As a result, we propose a technique for automatically identifying the importance of features and ranking them accordingly. Finally, we tackle the problem of using heterogeneous feature types to define entities. A method is described that uses fuzzy set theory and linguistic aggregation to compare features of different types. We deploy this technique in a practical pharmaceutical application, where the proposed similarity assessment is shown to be capable of finding relevant entities (drugs, in this case) in spite of the heterogeneous features used to define them.
dc.identifier.doihttps://doi.org/10.7939/R3W08WT0G
dc.language.isoen
dc.rightsThis thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
dc.subjectFuzzy set theory
dc.subjectOntology
dc.subjectResource description framework
dc.subjectSemantic web
dc.subjectEntity matching
dc.subjectSimilarity
dc.subjectRDF
dc.subjectInformation retrieval
dc.subjectLinked data
dc.titleSimilarity Assessment of Data in Semantic Web
dc.typehttp://purl.org/coar/resource_type/c_46ec
thesis.degree.disciplineSoftware Engineering and Intelligent Systems
thesis.degree.grantorhttp://id.loc.gov/authorities/names/n79058482
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy
ual.date.graduationSpring 2016
ual.departmentDepartment of Electrical and Computer Engineering
ual.jupiterAccesshttp://terms.library.ualberta.ca/public

Files

Original bundle

Name:
Dehlehhosseinzadeh_Parisa_201601_PhD.pdf
Size:
3.79 MB
Format:
Adobe Portable Document Format