
A UI-based workflow for managing arXiv papers with Notion.


Xiang-cd/arxiv-workflow


arxiv auto workflow

This is an automatic workflow for managing arXiv papers with Notion.

If you like this project, please give it a star ✨!

background & motivation

  • arXiv has become a popular platform for sharing scientific papers. As an AI researcher, I get almost all of my papers from arXiv.
  • There is no efficient way to manage a paper's files, citations, notes, and other information in one place. Endnote mainly manages citations. Readpaper does a good job with notes, files, and in-place translation, but lacks user-defined fields and efficient search.
  • Some papers are first released on arXiv and are soon accepted by a conference or journal. How do you update the BibTeX and other information if you want to cite them in your own paper?
  • My solution is to build a visual database of papers in Notion, define my own fields and tags, and use Readpaper to read them.

The problem is that when I come across an interesting new paper title, I have to:

  • open a URL to search for it
  • create a new page in Notion
  • copy and paste the title, abstract, and other information manually
  • download the PDF manually and store it in a local directory

This is time-consuming and error-prone.

solution and features

My solution is a workflow: given a title or arXiv id, the program automatically searches arXiv for the paper information, creates a new Notion page with that information, and downloads the PDF to a local directory.
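To make the lookup step concrete, here is a minimal sketch of how paper metadata could be fetched from the public arXiv API and parsed. It is not taken from this repository: the helper names `build_query_url` and `parse_first_entry` are hypothetical, and the sketch only assumes the Atom feed format that the arXiv API returns.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the arXiv API
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(arxiv_id: str) -> str:
    """Build an arXiv API query URL for a single paper id (hypothetical helper)."""
    return ARXIV_API + "?" + urllib.parse.urlencode({"id_list": arxiv_id})


def parse_first_entry(atom_xml: str) -> dict:
    """Extract title, abstract, authors, and PDF link from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    entry = root.find(ATOM + "entry")
    if entry is None:
        raise ValueError("no entry found in arXiv response")
    pdf_url = None
    for link in entry.findall(ATOM + "link"):
        if link.get("title") == "pdf":
            pdf_url = link.get("href")
    return {
        # Collapse the line breaks and indentation arXiv puts inside text fields.
        "title": " ".join(entry.findtext(ATOM + "title", "").split()),
        "abstract": " ".join(entry.findtext(ATOM + "summary", "").split()),
        "authors": [a.findtext(ATOM + "name", "") for a in entry.findall(ATOM + "author")],
        "pdf_url": pdf_url,
    }
```

The XML passed to `parse_first_entry` would come from fetching `build_query_url(...)`, for example with `urllib.request.urlopen`. The resulting dictionary holds the kind of fields that would then be written to a Notion page.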

1. search arxiv and get paper information

When you find a paper at an abs/pdf URL, just modify the URL using the predefined API and the workflow is launched automatically:

(Screenshots: the URL before and after modification.)
Files will be downloaded and the metadata will be uploaded to Notion!
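The exact URL rewrite appears only in the screenshots, so the sketch below covers just one plausible piece of the server side: recovering the arXiv id from an abs or pdf URL path. The function `extract_arxiv_id` and its regular expression are assumptions for illustration, not code from this repository.

```python
import re
from typing import Optional

# New-style ids look like 2301.01234 or 2301.01234v2;
# old-style ids look like hep-th/9901001 or math.GT/0309136.
_ARXIV_ID = re.compile(
    r"/(?:abs|pdf)/((?:\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?)"
)


def extract_arxiv_id(url: str) -> Optional[str]:
    """Return the arXiv id (with version suffix, if present) found in an abs/pdf URL."""
    match = _ARXIV_ID.search(url)
    return match.group(1) if match else None
```

Because the pattern only looks at the `/abs/...` or `/pdf/...` part of the path, it works regardless of which host the URL points to.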

2. auto bibtex refresh

Access 127.0.0.1:8000/bibtex?refresh=true to refresh BibTeX entries via the Semantic Scholar API and update the bibtex field in Notion. Because the Semantic Scholar API is rate limited, the refresh runs in a background thread with a long sleep interval between requests. Access 127.0.0.1:8000/bibtex?refresh=true&all=true to refresh every BibTeX entry in the database, whether or not the item already has one.
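The background refresh with a long, randomized sleep could look roughly like the sketch below. It is an illustration rather than the project's code: `jittered_interval`, `refresh_in_background`, and the `refresh_one` callback are hypothetical names, and the 200 s base with a ±40 s offset mirrors the default described for SS_SLEEP_INTERVAL.

```python
import logging
import random
import threading
import time
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def jittered_interval(base: float = 200.0, jitter: float = 40.0) -> float:
    """Return the base interval plus a random offset in [-jitter, +jitter] seconds."""
    return base + random.uniform(-jitter, jitter)


def refresh_in_background(
    items: Iterable,
    refresh_one: Callable[[object], None],
    base: float = 200.0,
    jitter: float = 40.0,
    sleep: Callable[[float], None] = time.sleep,
) -> threading.Thread:
    """Refresh items one by one in a daemon thread, sleeping between requests."""

    def worker() -> None:
        for item in items:
            try:
                refresh_one(item)  # e.g. query Semantic Scholar, then update Notion
            except Exception:
                # One failed item should not stop the rest of the refresh.
                logger.exception("refresh failed for %r", item)
            sleep(jittered_interval(base, jitter))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Running the loop in a daemon thread lets the HTTP request return immediately while the slow, rate-limited work continues. The random offset avoids hitting the API at a perfectly regular cadence.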

(Screenshots: start refresh, check refresh.)
Check fetch.log to see whether the refresh succeeded.

3. export bibtex file for all your papers

Export a BibTeX file for all your papers by accessing 127.0.0.1:8000/bibtex.
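As a rough idea of what such an export involves, the sketch below joins the BibTeX strings stored per paper into the text of a single .bib file. It assumes each paper's entry is available as a plain string (or is missing). `build_bib_file` is a hypothetical helper, not the project's actual implementation.

```python
from typing import Iterable, Optional


def build_bib_file(entries: Iterable[Optional[str]]) -> str:
    """Join BibTeX entries into one .bib text, skipping empty and duplicate entries."""
    seen = set()
    kept = []
    for entry in entries:
        text = (entry or "").strip()
        if not text or text in seen:
            continue
        seen.add(text)
        kept.append(text)
    # Entries are separated by a blank line; the file ends with a newline.
    return "\n\n".join(kept) + ("\n" if kept else "")
```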

how to use

prepare notion database and notion token

  1. Refer to my released Notion template and add it to your workspace.
  2. Get the database id according to the Notion docs.
  3. Get the Notion access token according to the Notion docs.
  4. Test the Notion API with a curl command:
curl -X GET https://api.notion.com/v1/databases/{database_id} \
  -H "Authorization: Bearer {token}" \
  -H "Notion-Version: 2021-08-16"
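The same check can be done from Python with only the standard library. This sketch mirrors the curl command above (same endpoint and headers); the function names `build_database_request` and `check_database` are illustrative.

```python
import json
import urllib.request


def build_database_request(
    database_id: str, token: str, version: str = "2021-08-16"
) -> urllib.request.Request:
    """Build the GET request that the curl command above sends."""
    return urllib.request.Request(
        f"https://api.notion.com/v1/databases/{database_id}",
        headers={"Authorization": f"Bearer {token}", "Notion-Version": version},
        method="GET",
    )


def check_database(database_id: str, token: str) -> dict:
    """Send the request and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError on a 4xx/5xx response,
    for example when the token or database id is wrong.
    """
    request = build_database_request(database_id, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `check_database(os.environ["NOTION_DATABASE_ID"], os.environ["NOTION_TOKEN"])` (after `import os`) should return the database object if the token has access to it.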

from source code

pip install -r requirements.txt
export NOTION_TOKEN=<your_notion_token>
export NOTION_DATABASE_ID=<your_notion_database_id>
export DOWNLOAD_DIR=<your_download_directory>
export SS_KEY=<your_semanticscholar_api_key> # a Semantic Scholar API key gives a higher rate limit
export SS_SLEEP_INTERVAL=<your_semanticscholar_api_sleep_interval> # default 200s, with a random offset of -40 to 40s
fastapi run server.py
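For reference, a server could read these environment variables along the lines of the sketch below. This is an assumption about how such configuration might be loaded, not the project's code. The `Settings` class and `load_settings` function are hypothetical, and treating the three Notion/download variables as required while the Semantic Scholar ones are optional is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("NOTION_TOKEN", "NOTION_DATABASE_ID", "DOWNLOAD_DIR")  # assumed required


@dataclass(frozen=True)
class Settings:
    notion_token: str
    notion_database_id: str
    download_dir: str
    ss_key: Optional[str]
    ss_sleep_interval: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables exported above, failing early if a required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        notion_token=env["NOTION_TOKEN"],
        notion_database_id=env["NOTION_DATABASE_ID"],
        download_dir=env["DOWNLOAD_DIR"],
        ss_key=env.get("SS_KEY") or None,
        # Fall back to the documented default of 200 seconds.
        ss_sleep_interval=float(env.get("SS_SLEEP_INTERVAL") or 200),
    )
```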

using docker

docker build -t arxiv-workflow .

export NOTION_TOKEN=<your_notion_token>
export NOTION_DATABASE_ID=<your_notion_database_id>
export DOWNLOAD_DIR=<your_download_directory>
export SS_KEY=<your_semanticscholar_api_key>
export SS_SLEEP_INTERVAL=<your_semanticscholar_api_sleep_interval> # default 200s, with a random offset of -40 to 40s

docker run -it --rm -e NOTION_TOKEN=$NOTION_TOKEN \
    -e NOTION_DATABASE_ID=$NOTION_DATABASE_ID \
    -e SS_KEY \
    -e SS_SLEEP_INTERVAL \
    -e DOWNLOAD_DIR=/download \
    -v $DOWNLOAD_DIR:/download \
    -p 8000:8000 \
    arxiv-workflow

TODOs

  • release my Notion database template
  • BibTeX auto refresh
  • export a BibTeX file for all your papers
  • support exporting a BibTeX file for specific papers using aliases you've added
  • REST API documentation and CLI tools, if needed
  • add a config system if the project becomes complex
