ScholarScrape

Octave script to pull PDFs from Google Scholar.

Functionality

  • Pull PDFs from Google Scholar
  • Select the number of result pages to pull PDFs from (this acts as a proxy for the number of articles)
  • Set publication start and end dates
  • (Optional) Filter out citations and patents (the effect of this on the script's output has not been verified)
  • Proper Filenames
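
As a rough illustration of how the page count, date range, and citation/patent filter could map onto search requests, here is a minimal sketch. It is not taken from ScholarScrape itself: `build_scholar_urls` and its arguments are hypothetical, and the query parameters (`start`, `as_ylo`, `as_yhi`, `as_vis`, `as_sdt`) are assumptions based on the URLs Google Scholar shows in the browser, which are undocumented and may change.

```octave
1;  % mark this file as a script so the function below can be defined inline

% Hypothetical helper (not part of ScholarScrape): builds one search URL
% per results page. The URL parameters are assumptions, see the note above.
function urls = build_scholar_urls (query, num_pages, year_start, year_end, skip_extras)
  base = "https://scholar.google.com/scholar";
  q = strrep (strtrim (query), " ", "+");    % naive encoding: spaces only
  urls = cell (1, num_pages);
  for page = 1:num_pages
    offset = (page - 1) * 10;                % assumes 10 results per page
    url = sprintf ("%s?q=%s&start=%d&as_ylo=%d&as_yhi=%d", ...
                   base, q, offset, year_start, year_end);
    if skip_extras
      % assumed meaning: as_vis=1 hides citations, as_sdt=1,5 hides patents
      url = [url "&as_vis=1&as_sdt=1,5"];
    endif
    urls{page} = url;
  endfor
endfunction

% Two pages of results published between 2015 and 2018, filters enabled
urls = build_scholar_urls ("graph neural networks", 2, 2015, 2018, true);
printf ("%s\n", urls{:});
```

Each URL could then be fetched with Octave's `urlread`, and any PDF links found in the returned HTML saved with `urlwrite`.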

Usage

ScholarScrape targets Octave 4.4.1, but it should run on any recent version of Octave. If you run into problems, submit an Issue.

Run the

Future Implementation

  • Citation Scraping
  • Data Parsing
  • Feedback/progress Bar
  • Pull "all" entries option
  • Automatic Directory Creation
  • Ability to run script in GUI mode vs CLI mode
  • Separate the script from the scraping function
    • The script will hold the input definitions; the function will actually pull the files