Course Seminar Report
Title: Tuning the Slope and Pivot Parameters for Persian Document Length Normalization

Course: Intelligent Information Retrieval
Instructor: Farhad Oroumchian
Prepared by: Fatemeh Alavizadeh
Problem Definition, Goal and Importance:

Abstract


Document length can affect the computed relevance between a document and a query, yet an information retrieval system must handle documents of any length. Document length normalization is therefore used to separate relevance from length. Retrieval is most effective when a normalization strategy retrieves documents of each length with a probability close to their probability of relevance. Pivoted length normalization aims at this goal: it can be applied to modify any normalization function, reducing the gap between the probability of retrieval and the probability of relevance. The performance of a pivoted length normalization scheme depends heavily on two parameters, the pivot and the slope, and their best values are collection dependent. Much work has been done to tune these values on English-language collections; however, there is no comparable work for Persian. In this study, we tune these values for the Hamshahri document collection.
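
To make the roles of the two parameters concrete, the following is a minimal sketch of the pivoted normalization factor in the form proposed by Singhal, Buckley, and Mitra (1996): the old normalization factor is rotated around the pivot by the slope. The function name `pivoted_norm`, the sample values in `old_norms`, the choice of the collection average as the pivot, and the slope of 0.25 are illustrative assumptions, not values taken from this study.

```python
from statistics import mean


def pivoted_norm(old_norm: float, pivot: float, slope: float) -> float:
    """Pivoted normalization factor: (1 - slope) * pivot + slope * old_norm.

    old_norm is the factor produced by the original normalization
    function (for example, a cosine norm or a byte length).
    """
    return (1.0 - slope) * pivot + slope * old_norm


# Hypothetical old normalization factors for three documents.
old_norms = [50.0, 100.0, 150.0]

# Assumption: use the collection average of the old factors as the pivot.
pivot = mean(old_norms)

# Assumption: an illustrative slope; in practice it is tuned per collection.
slope = 0.25

for old in old_norms:
    new = pivoted_norm(old, pivot, slope)
    print(f"old norm {old:6.1f} -> pivoted norm {new:6.1f}")
```

With a slope below 1, a document whose old factor equals the pivot keeps the same factor, a document below the pivot receives a larger factor (so it is retrieved less readily), and a document above the pivot receives a smaller one (so it is retrieved more readily). Tuning then amounts to searching for the pivot and slope that give the best retrieval effectiveness on the target collection.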

 
References:

[1] Singhal, A., Buckley, C., Mitra, M.: Pivoted Document Length Normalization. Proceedings of the 19th Annual International ACM SIGIR Conference (1996) 21-29.
[2] Miller, C.: An Investigation of Length Normalization in Information Retrieval. University of Glasgow.
[3] Polettini, N.: The Vector Space Model in Information Retrieval: Term Weighting Problem. University of Trento.
[4] Greengrass, E.: Information Retrieval: A Survey (2000).
[5] Salton, G., McGill, M.: An Introduction to Modern Information Retrieval. McGraw-Hill (1983).


Copyright © 2007 DBRG-UT. All rights reserved. Designed by Aresh Dadlani, Mohamad Hasan Ahmadi