The repository for the course 'Web Information Retrieval' (https://digital-sciences.de/en/modules/web-information-retrieval/) under the Digital Sciences Master's Degree at TH Köln.

AH-Tran/DSC_WIR

[WIR] Applying Science Models for Re-Ranking in IR

Introducing bibliometric-enhanced metadata to IR
This repository contains the code for the project 'Applying Science Models for Re-Ranking in IR'.
The project was developed for the course 'Web Information Retrieval' in the 'Digital Sciences' program at TH Köln (University of Applied Sciences Cologne).

The project explores using power laws in combination with graph-based metrics and bibliometrics, such as coreness, in an IR setting.

  • Four Graph Implementations: Co-citation Graph, Citation Graph, Lotka-Inspired Graph, Journal Graph
  • Metrics: Degree Centrality, Closeness, Betweenness, Distance to most popular node
  • Comparing BM25 Baseline vs. Re-ranking with Boosting Factor
  • Dataset: TREC-COVID / CORD-19
  • Metadata Enrichment: Semantic Scholar API
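The idea of re-ranking a BM25 baseline with a graph-based boosting factor can be illustrated with a minimal sketch. This is not the code from the notebooks: the function names, the multiplicative formula `score * (1 + boost * centrality)`, the `boost` value, and the toy data are all assumptions chosen for illustration, and degree centrality stands in for any of the metrics listed above.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def rerank(bm25_results, centrality, boost=0.5):
    """Re-rank (doc_id, score) pairs with a hypothetical multiplicative boost.

    Documents missing from the graph get centrality 0.0 and keep their score.
    """
    boosted = [
        (doc_id, score * (1.0 + boost * centrality.get(doc_id, 0.0)))
        for doc_id, score in bm25_results
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


# Toy citation links and toy BM25 scores (illustrative values, not project data).
citations = [("d1", "d2"), ("d3", "d2"), ("d4", "d2"), ("d1", "d3")]
centrality = degree_centrality(citations)
baseline = [("d1", 10.0), ("d2", 9.5), ("d5", 9.0)]
reranked = rerank(baseline, centrality, boost=0.5)
print([doc_id for doc_id, _ in reranked])
```

In this toy example the well-connected document `d2` overtakes `d1`, while `d5`, which has no graph metadata, keeps its baseline score.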

Structure of this Repository

  • data\: Metadata used for the project
  • doc\: Documentation and presentation of the project
  • scripts\: Jupyter notebooks used for the project

Notebook Description

Filename | Description
Experiments.ipynb | Creating the graphs, experimenting with them, and evaluating retrieval performance
Scrape_metadata.ipynb | Scraping relevant metadata using the Semantic Scholar API
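As a rough illustration of the metadata enrichment step, the sketch below builds a request URL for the Semantic Scholar Graph API paper endpoint and turns a response into citation edges. It is not taken from Scrape_metadata.ipynb: the helper names, the chosen `fields`, the example DOI, and the sample record are assumptions, and no network request is made here.

```python
from urllib.parse import quote, urlencode

# Paper lookup endpoint of the Semantic Scholar Graph API.
S2_PAPER_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/"


def build_paper_url(paper_id, fields=("title", "citationCount", "references.paperId")):
    """Build the lookup URL for one paper (hypothetical helper, not from the notebook)."""
    query = urlencode({"fields": ",".join(fields)})
    return f"{S2_PAPER_ENDPOINT}{quote(paper_id, safe=':/')}?{query}"


def extract_reference_edges(record):
    """Turn one parsed JSON record into (citing, cited) pairs, skipping unresolved references."""
    source = record["paperId"]
    return [
        (source, ref["paperId"])
        for ref in record.get("references") or []
        if ref.get("paperId")
    ]


# Hypothetical identifier and hand-written record, used only to show the data shape.
url = build_paper_url("DOI:10.1000/example")
sample = {
    "paperId": "p0",
    "title": "Example paper",
    "citationCount": 3,
    "references": [{"paperId": "p1"}, {"paperId": None}, {"paperId": "p2"}],
}
edges = extract_reference_edges(sample)
print(url)
print(edges)
```

The resulting edge list is the kind of input a citation graph could be built from. An actual scraper would also need to fetch the URL and respect the API's rate limits.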

Sources

cord19
Semantic Scholar
pyterrier
