Conceptual Indexing using Latent Semantic Indexing: A Case Study
Independent thesis Basic level (university diploma), 20 credits / 30 HE credits
Student thesis
Information Retrieval is concerned with locating information (usually text) that is relevant to a user's information need. Retrieval systems based on word matching suffer from the vocabulary mismatch problem, a common phenomenon in natural language use. The difficulty is especially severe in large, full-text databases, since such databases contain many different expressions of the same concept. One way to reduce the negative effects of vocabulary mismatch is for the retrieval system to exploit statistical relations between terms. This report examines the utility of conceptual indexing, using Latent Semantic Indexing (LSI), for improving the retrieval performance of a domain-specific Information Retrieval system. Techniques like LSI attempt to exploit and model global usage patterns of terms, so that related documents that may not share common (literal) terms are still represented by nearby conceptual descriptors. Experimental results show that the method is noticeably more efficient than the baseline for relatively complete queries. However, the current implementation did not improve the effectiveness of short, yet descriptive, queries.
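The idea that documents sharing no literal terms can still end up with nearby conceptual descriptors can be illustrated with a minimal sketch. It is not the implementation evaluated in the thesis: the four toy documents, the raw term counts, the rank k = 2, and the helper names `to_vector`, `fold_in`, and `cosine` are all assumptions made for illustration, and the truncated SVD is computed with NumPy.

```python
import numpy as np

# Toy corpus (assumed for illustration): "car" and "automobile" never co-occur,
# but both appear alongside "engine" and "wheel".
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "flower garden petal",
    "garden petal rose",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {term: i for i, term in enumerate(vocab)}

def to_vector(text):
    """Raw term-count vector over the toy vocabulary (hypothetical helper)."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            v[index[w]] += 1.0
    return v

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Term-document matrix: one row per term, one column per document.
A = np.column_stack([to_vector(d) for d in docs])

# Truncated SVD: keep only the k largest singular triplets.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # assumed rank for this toy example
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T  # rows of V_k represent documents

def fold_in(q):
    """Project a query term vector into the k-dimensional concept space."""
    return (U_k.T @ q) / s_k

q = to_vector("car")
literal = [cosine(q, A[:, j]) for j in range(len(docs))]
lsi = [cosine(fold_in(q), V_k[j]) for j in range(len(docs))]

for j, d in enumerate(docs):
    print(f"{d!r}: literal={literal[j]:.2f}  lsi={lsi[j]:.2f}")
```

In this sketch the query "car" has zero literal overlap with the "automobile" document, yet the two are close in the reduced space because they share a usage pattern; the gardening documents stay unrelated under both measures.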
Place, publisher, year, edition, pages
2015. 58 p.
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-263029
OAI: oai:DiVA.org:uu-263029
DiVA: diva2:856529
Ashcroft, Michae
Gällmo, Olle