10 Jul

On Ranking Techniques for Desktop Search

PubDate(2008), PubPlace(TIS) Author(Cohen,Domshlak,Zwerdling)
keyword(Desktop search,Learning to rank)

Summary

A desktop search experiment built around a known-item search task.

Content

Background

  • Stuff I’ve Seen (SIS) : users sort results by last-update date more often than by IR ranking
    • The older the data, the less often it is used
  • Types of information stored on the desktop
    • Ephemeral : reminder, short-lived
    • Working : related to ongoing work
    • Archive : long-term resource

Contribution

  • Novel Feature
    • Level : the distance of a file from the top-level directory
    • DirRank : the probability of opening a file in a specific directory is proportional to the number of files previously opened from this directory (and its subdirectories), normalized by the number of files in the directory
  • Selectivity : combine the values of the content-similarity features, weighting each by the inverse of the number of files with a non-zero value for that feature.
    • e.g. if the user's query matches the filename field of 100 files, the score of the name field is divided by 100.
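
The three ideas above can be sketched in a few lines of Python. This is only a rough illustration based on these notes, not the paper's implementation: the function names (`level`, `dir_rank`, `selectivity_score`) and the data layout are made up here, DirRank is read as "opens from the directory subtree divided by files stored in that subtree", and summing the weighted field scores is an assumed way of combining them.

```python
from pathlib import PurePosixPath


def level(path, root):
    """Depth of a file below `root` (0 means the file sits directly in root)."""
    return len(PurePosixPath(path).relative_to(root).parts) - 1


def dir_rank(directory, opened_files, all_files):
    """Opens from `directory` (and its subdirectories) per file stored there.

    Assumption: `opened_files` is a log of previously opened paths and
    `all_files` lists every indexed path.
    """
    d = PurePosixPath(directory)

    def inside(p):
        # True when d is an ancestor directory of p
        return d in PurePosixPath(p).parents

    stored = sum(1 for p in all_files if inside(p))
    opened = sum(1 for p in opened_files if inside(p))
    return opened / stored if stored else 0.0


def selectivity_score(doc, field_scores):
    """Combine per-field similarity scores for `doc`.

    Each field's score is divided by the number of files that have a
    non-zero score in that field; the sum over fields is an assumption.
    `field_scores` maps field name -> {doc id -> similarity score}.
    """
    total = 0.0
    for scores in field_scores.values():
        matches = sum(1 for s in scores.values() if s > 0)
        if matches:
            total += scores.get(doc, 0.0) / matches
    return total
```

With this weighting, a field that matches only a handful of files contributes far more to the final score than a field that matches hundreds, which is the intuition behind the "divided by 100" example.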

Experiment

  • Queries are grouped by the number of results returned
  • As a query returns more results, date-related features become more useful. (selectivity)
  • Combining results by selectivity was found to be as effective as learning-based methods

Future Work

Comment

Reference

Tags : Paper,IR