
ECIR 2021: An Enhanced Evaluation Framework for Query Performance Prediction #31

soroush-ziaeinejad opened this issue Apr 1, 2022 · 1 comment
soroush-ziaeinejad commented Apr 1, 2022

Why did I choose this paper? QPP is concerned with estimating the effectiveness of a query under a given retrieval model, so reviewing the evaluation methods for QPP is directly relevant to the second part of my thesis.

Main problem:

This paper examines the evaluation methods commonly used for Query Performance Prediction (QPP) and proposes an enhanced evaluation framework for the task.
Applications:

  • Feedback to the user (how to reformulate the query)
  • Feedback to the system (e.g., query expansion)
  • Conversational search (when to ask a clarifying question)
  • Query suggestion

QPP tasks:

  • Classifying queries
  • Predicting the effectiveness of retrieval
  • Ranking queries based on effectiveness

Existing work:

  • Pre-retrieval predictors: analyze query and corpus statistics prior to retrieval
  • Post-retrieval predictors: also analyze the retrieval results
  • Gaps in the current evaluation methodology:
    • Performance is reduced to a single value (see the sketch after this list)
    • Hard to interpret
    • Unable to identify the hard queries
    • Cannot fully analyze the effects of different components
    • Hard to generalize
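
For context on that first gap: the conventional evaluation collapses a predictor's quality into one correlation coefficient between predicted scores and measured effectiveness. A minimal sketch in Python (the numbers are made up for illustration, not taken from the paper):

```python
# Conventional point-estimate QPP evaluation: one correlation per collection.
from scipy.stats import kendalltau, pearsonr

actual_ap = [0.42, 0.10, 0.55, 0.23, 0.31]  # measured AP per query
predicted = [0.60, 0.20, 0.70, 0.40, 0.25]  # QPP scores for the same queries

tau, _ = kendalltau(predicted, actual_ap)
r, _ = pearsonr(predicted, actual_ap)
print(f"Kendall's tau = {tau:.3f}, Pearson's r = {r:.3f}")
# The whole evaluation collapses to one number, which hides *which*
# queries the predictor failed on -- exactly the gap the paper targets.
```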

Inputs:

  • A list of queries
  • A corpus of documents
  • Retrieval models

Outputs:

A distribution that characterizes the performance over the input query set (rather than a single point estimate).

Method:

Advantages:

  • Outputs a distribution for the QPP task instead of relying on point estimates
  • Exploits multiple query formulations per topic together with ANalysis Of VAriance (ANOVA) modeling (sketched after this list)
  • Enables factor analysis
  • Enables failure analysis
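
A minimal sketch of how such an ANOVA decomposition could be fitted with statsmodels. The data file, column names, and factor structure below are my assumptions for illustration; the authors' actual implementation is in the repository linked at the end of this summary.

```python
# Hypothetical ANOVA decomposition of per-query prediction error.
# Assumed input: a long-format table with one row per
# (topic, query formulation, QPP model) combination and the observed
# scaled absolute rank error in column 'sare'.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("qpp_errors.csv")  # hypothetical file name

# Fit a linear model with the experimental factors as categorical effects,
# then decompose the variance with a type-II ANOVA table.
model = ols("sare ~ C(topic) + C(formulation) + C(qpp_model)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # variance explained by each factor
```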

Experimental Setup:

Dataset:
TREC Robust 2004 (Robust04): 528K documents, 249 topics

Parameters:

  • 16 QPP models
  • 4 different stoplists (plus no stopword removal)
  • 2 different stemmers (plus no stemming)
  • Average Precision (AP) to measure the effectiveness of the different retrieval pipelines

Metrics:

  • Average Precision (AP) or nDCG for retrieval effectiveness
  • Correlation between predicted and actual effectiveness
  • AP-induced scaled Absolute Rank Error (sARE_AP), sketched below
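
As I read the paper, sARE ranks the queries once by predicted score and once by actual AP, then takes the absolute rank difference scaled by the number of queries, yielding one error value per query instead of one correlation per collection. A small sketch with illustrative numbers:

```python
# Scaled Absolute Rank Error (sARE) per query, and its mean (sMARE).
import numpy as np
from scipy.stats import rankdata

actual_ap = np.array([0.42, 0.10, 0.55, 0.23, 0.31])  # measured AP
predicted = np.array([0.60, 0.20, 0.70, 0.40, 0.25])  # QPP scores

rank_actual = rankdata(actual_ap)  # rank induced by true effectiveness
rank_pred = rankdata(predicted)    # rank induced by the predictor
sare = np.abs(rank_pred - rank_actual) / len(actual_ap)  # one value per query
print(sare, sare.mean())  # the per-query distribution and sMARE
```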

Baselines:

  • Pre-retrieval
    • SCQ, AvgSCQ, MaxSCQ: CF-IDF statistics over corpus and query terms
    • SumVAR, AvgVAR, MaxVAR: variability of CF-IDF over corpus and query terms
    • AvgIDF, MaxIDF: IDF values of the query terms (sketched after this list)
  • Post-retrieval
    • Clarity: divergence between the language models of the top-ranked documents and the whole corpus
    • NQC: standard deviation of the top documents' retrieval scores (sketched after this list)
    • WIG: comparison of mean retrieval scores between the top documents and the whole corpus
    • SMV: combines the magnitude and the standard deviation of the top documents' scores
    • UEF: similarity of the initial result list to the re-ranked list
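
Toy sketches of one pre-retrieval baseline (AvgIDF/MaxIDF) and one post-retrieval baseline (NQC), following the one-line descriptions above. Normalization details may differ from the original predictor papers; all inputs are hypothetical.

```python
# Toy pre- and post-retrieval predictors; inputs are hypothetical.
import math
import statistics

def avg_max_idf(query_terms, doc_freq, n_docs):
    """AvgIDF / MaxIDF: aggregate the IDF values of the query terms."""
    idfs = [math.log(n_docs / doc_freq[t]) for t in query_terms]
    return sum(idfs) / len(idfs), max(idfs)

def nqc(top_k_scores, corpus_score):
    """NQC: standard deviation of the top documents' scores, normalized
    by the corpus score (one common formulation)."""
    return statistics.pstdev(top_k_scores) / abs(corpus_score)

print(avg_max_idf(["query", "performance"],
                  {"query": 120, "performance": 45}, n_docs=528_000))
print(nqc([12.3, 11.8, 11.1, 10.4, 9.9], corpus_score=8.7))
```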

Code:

The code of this paper is available at: https://github.com/Zendelo/QPP-EnhancedEval

Presentation:

The presentation of this paper is available at: https://www.youtube.com/watch?v=TOd1W1rujbg

hosseinfani (Member) commented:
@soroush-ziaeinejad where is the body?!
