Citesummary #3

Open
paurkedal opened this issue Jun 15, 2020 · 3 comments

Comments

@paurkedal

Is there an efficient way of extracting what corresponds to the Citesummary of the old site? In particular, we have been using queries like http://old.inspirehep.net/search?ln=en&ln=en&p=find+cn+atlas+and+d+2019&of=hcs&action_search=Search&sf=&so=d&rm=&rg=25&sc=0 to extract the following annual metrics for the ATLAS and ALICE collaborations:

  • "Total number of papers analyzed"
  • "Total number of citations"
  • "Average citations per paper"
  • "hHEP index"

The solution I can see with the documented API is to request the full set of entries and fetch the citations entry of each paper. That may be feasible if we cache time-sliced partial results as we update, though I'm hoping there is a better way.
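To make the approach concrete, here is a minimal sketch of it in Python. It assumes the literature endpoint at `https://inspirehep.net/api/literature`, a `citation_count` field in each record's `metadata`, and a `links.next` URL for paging. The function names, page size, and pause are hypothetical choices for illustration. The h-index computed here is the plain one over the returned counts and may not match the old site's "hHEP index" exactly.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint; check the API documentation before relying on it.
API_URL = "https://inspirehep.net/api/literature"


def fetch_citation_counts(query, page_size=250, pause=1.0):
    """Collect the citation count of every record matching `query`.

    Assumes each hit exposes metadata.citation_count and that the
    response carries a links.next URL while more pages remain.
    """
    params = urllib.parse.urlencode(
        {"q": query, "fields": "citation_count", "size": page_size}
    )
    url = f"{API_URL}?{params}"
    counts = []
    while url:
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        for hit in data["hits"]["hits"]:
            counts.append(hit["metadata"].get("citation_count", 0))
        url = data.get("links", {}).get("next")
        time.sleep(pause)  # stay well below any rate limit
    return counts


def cite_summary(counts):
    """Compute the four Citesummary-style metrics from citation counts."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    # h-index: largest h such that h papers have at least h citations each.
    h_index = sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)
    return {
        "papers": len(ranked),
        "citations": total,
        "average": total / len(ranked) if ranked else 0.0,
        "h_index": h_index,
    }
```

Something like `cite_summary(fetch_citation_counts("find cn atlas and d 2019"))` would then give one year's numbers, assuming the new site accepts the same search syntax as the old one.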

@michamos
Contributor

Currently we don't have a better way, and it will require thousands of requests to compute those stats for such large experiments. We will probably expose the citation summary we're using on the website (as it appears here) through the API at some point, but I can't tell you when that will happen, as it's a bit trickier than anticipated.

@paurkedal
Author

Thanks for the info. The website renders the numbers with JavaScript, so it does not look like we can resurrect our solution of parsing HTML. I might still look into computing it, since the rate of requests should stay limited if we store intermediate results per day, but it's not so urgent that it can't wait a few months.
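One way the per-day storage could look, as a rough sketch only: split the query into time slices, keep the last fetched citation counts for each slice in a JSON file, and refresh only the few stalest slices on each daily run. The slice keys, file layout, and function names below are all hypothetical.

```python
import json
from datetime import date
from pathlib import Path


def load_cache(path):
    """Read the slice cache from a JSON file, or start empty."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_cache(path, cache):
    """Write the slice cache back to disk."""
    Path(path).write_text(json.dumps(cache, indent=2))


def stalest_slices(cache, slices, limit):
    """Pick up to `limit` slices to refresh, never-fetched ones first."""
    return sorted(
        slices, key=lambda s: cache.get(s, {}).get("fetched_on", "")
    )[:limit]


def update_slice(cache, key, counts, today=None):
    """Store fresh citation counts for one slice with today's date."""
    today = today or date.today()
    cache[key] = {"fetched_on": today.isoformat(), "counts": list(counts)}


def all_counts(cache):
    """Flatten the cached slices into one list of citation counts."""
    return [c for record in cache.values() for c in record["counts"]]
```

With `limit` set to a handful of slices per day, the daily number of requests stays bounded, at the cost of some slices being a few days out of date when the totals are computed.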

@paurkedal
Author

As long as the old site is operational, we can still use our current solution though.


2 participants