scrapedin

LinkedIn Scraper (working for the new website 2019)

Scraper for LinkedIn full profile data.
Unlike other scrapers, it works in 2019 with LinkedIn's new website.

Install via the npm package manager: npm i scrapedin

Check your version!

The scraper needs an update whenever LinkedIn changes its website, so please make sure you have the latest version.

  • Latest release: v1.0.8 (16 Jul 2019)

Usage Example:

const scrapedin = require('scrapedin')

// log in with your LinkedIn credentials, then scrape a profile by its URL
const profileScraper = await scrapedin({ email: 'user@example.com', password: 'pass' })
const profile = await profileScraper('https://www.linkedin.com/in/some-profile/')

Documentation:

  • scrapedin(options)

    • options Object:
      • email: LinkedIn login e-mail (required)
      • password: LinkedIn login password (required)
      • isHeadless: run the browser in headless mode, with no visible window (default false)
      • hasToLog: print logs on stdout (default false)
      • puppeteerArgs: Puppeteer launch options object. Very useful: you can pass Chromium flags through its args property, e.g. { args: ['--no-sandbox'] }, as shown in the example after this list (default undefined)
    • returns: Promise of profileScraper function
  • profileScraper(url, waitTimeMs = 500)

    • url string: A LinkedIn profile URL
    • waitTimeMs integer: milliseconds to wait for the page to load before scraping (default 500)
    • returns: Promise of profile Object
  • profile Object:

    {
      profile: {
        name, headline, location, summary, connections, followers
      },
      positions:[
        { title, company, description, date1, date2,
          roles: [{ title, description, date1, date2 }]
        }
      ],
      educations: [
        { title, degree, date1, date2 }
      ],
      skills: [
        { title, count }
      ],
      recommendations: [
        { user, text }
      ],
      recommendationsCount: {
        received, given
      },
      recommendationsReceived: [
        { user, text }
      ],
      recommendationsGiven: [
        { user, text }
      ],
      accomplishments: [
       { count, title, items }
      ],
      volunteerExperience: {
        title, experience, location, description, date1, date2
      },
      peopleAlsoViewed: [
        { user, text }
      ]
    }
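
To tie the options and the profile object together, here is a minimal sketch (the credentials and profile URL are placeholders) that enables logging, passes a Chromium flag through puppeteerArgs, waits 1000 ms for the page, and reads a few fields from the result:

const scrapedin = require('scrapedin')

const profileScraper = await scrapedin({
  email: 'user@example.com',
  password: 'pass',
  hasToLog: true,                            // print logs on stdout
  puppeteerArgs: { args: ['--no-sandbox'] }  // Chromium flag passed to the browser launch
})

// wait 1000 ms for the page to load before scraping
const profile = await profileScraper('https://www.linkedin.com/in/some-profile/', 1000)

console.log(profile.profile.name)
console.log(profile.positions.map(position => position.title))
console.log(profile.skills.map(skill => skill.title))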

Tips

  • We already built a crawler to automatically collect multiple profiles, so check it out: scrapedin-linkedin-crawler

  • Usually on the first run LinkedIn asks for a manual security check. To get past it you should:

    • set isHeadless to false on scrapedin, so you can solve the manual check in the browser.
    • set waitTimeMs to a large number (such as 10000), so you have time to solve the manual check.

    After doing the manual check once, you can go back to the previous isHeadless and waitTimeMs values and start scraping, as sketched below.

    We still don't have a solution for this on remote servers without a GUI; if you have any ideas, please tell us!
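
    For illustration, a minimal sketch of that two-step flow (the credentials, URL, and wait values are placeholders; the functions are the same ones documented above):

    const scrapedin = require('scrapedin')

    // first run: visible browser and a long wait, so the manual check can be solved by hand
    const firstRunScraper = await scrapedin({ email: 'user@example.com', password: 'pass', isHeadless: false })
    await firstRunScraper('https://www.linkedin.com/in/some-profile/', 10000)

    // later runs: back to headless mode and the default wait time
    const profileScraper = await scrapedin({ email: 'user@example.com', password: 'pass', isHeadless: true })
    const profile = await profileScraper('https://www.linkedin.com/in/some-profile/')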

Contribution

Feel free to contribute. Just open an issue to discuss something before creating a PR.

License

Apache 2.0
