from scrapyjs import SplashRequest #47

Closed
podolskyi opened this issue Apr 1, 2016 · 5 comments
podolskyi commented Apr 1, 2016

When I try to import SplashRequest, an error occurs:

from scrapyjs import SplashRequest
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name SplashRequest

but
import scrapyjs
works.

The installed package metadata:

Metadata-Version: 1.1
Name: scrapyjs
Version: 0.2
Summary: JavaScript support for Scrapy using Splash
Home-page: https://github.com/scrapy-plugins/scrapy-splash
Author: Mikhail Korobov
Author-email: [email protected]
License: BSD
Location: /usr/local/lib/python2.7/dist-packages

kmike commented Apr 1, 2016

Hey @podolskyi,

There are two reasons:

  1. If you're using scrapyjs from PyPI (https://pypi.python.org/pypi/scrapyjs) then you're using v0.2, where SplashRequest is not documented and not exposed. The GitHub README is for the master branch.

  2. In the master branch SplashRequest is not exposed as scrapyjs.SplashRequest; it is a bug which is fixed in #45 (Custom Splash responses) among other changes.

Sorry for the inconvenience! There are a lot of changes coming to scrapy-splash, so it is a bit in flux right now. https://pypi.python.org/pypi/scrapyjs/0.2 is a stable release.
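
A quick way to check which release you're actually running is a minimal sketch like the one below; it uses only standard pkg_resources, nothing scrapyjs-specific is assumed:

import pkg_resources
import scrapyjs

# Which scrapyjs release is installed, e.g. '0.2'?
print(pkg_resources.get_distribution('scrapyjs').version)

# Is SplashRequest exposed at the package level?
# This prints False on 0.2 and on pre-#45 master checkouts.
print(hasattr(scrapyjs, 'SplashRequest'))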

podolskyi (Author)

@kmike thanks for the quick response. I tested both versions, from PyPI and from the master branch.

Can you help me with a simple question?

  1. How can I use Crawlera with scrapyjs?

podolskyi (Author)

@kmike I tested the example from PyPI:

import scrapy

class MySpider(scrapy.Spider):

    # ...
    def start_requests(self):
        # Lua script run by Splash's /execute endpoint: load the page,
        # then return the JavaScript document title.
        script = """
        function main(splash)
            assert(splash:go(splash.args.url))
            return splash:evaljs("document.title")
        end
        """
        for url in self.start_urls:
            yield scrapy.Request(url, self.parse_result, meta={
                'splash': {
                    'args': {'lua_source': script},
                    'endpoint': 'execute',
                }
            })

    def parse_result(self, response):
        # /execute returns the script's return value as the response body.
        doc_title = response.body_as_unicode()
        # ...

It works, thanks.

kmike commented Apr 1, 2016

@podolskyi

  1. A basic way to use Crawlera is the proxy argument: add it to 'args'. But this solution has some issues, because Crawlera is not aware that you're sending multiple requests to render a single page. A better way is to follow the example at http://doc.scrapinghub.com/crawlera.html#using-crawlera-with-splash - there is some boilerplate there which you can copy-paste into your Lua script. A sketch of the basic approach follows this list.
  2. Check the example at https://pypi.python.org/pypi/scrapyjs ("Run a simple Splash Lua Script: ...") - you already figured that out!
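
For reference, a minimal sketch of the basic proxy-argument approach from point 1. Nothing below comes from the thread: the spider name and URL are illustrative, <APIKEY> is a placeholder, and it assumes Splash's proxy argument accepts a proxy URL in user:password@host:port form:

import scrapy

class CrawleraSplashSpider(scrapy.Spider):
    name = 'crawlera_splash'  # hypothetical spider name

    def start_requests(self):
        yield scrapy.Request('http://example.com', self.parse_result, meta={
            'splash': {
                'args': {
                    # Splash routes its own outgoing requests through this
                    # proxy; fill in your Crawlera API key as the username.
                    'proxy': 'http://<APIKEY>:@proxy.crawlera.com:8010',
                },
                'endpoint': 'render.html',
            }
        })

    def parse_result(self, response):
        self.logger.info('page title: %s',
                         response.xpath('//title/text()').extract_first())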

kmike commented Apr 11, 2016

scrapy-splash 0.3 is released; the README should now match the package on PyPI, so I'm closing this ticket.
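
With 0.3 the import from the issue title should work; a minimal sketch, assuming the 0.3-era package still installs under the scrapyjs name (spider name and URL are illustrative):

import scrapy
from scrapyjs import SplashRequest  # fails on 0.2, works on 0.3+

class TitleSpider(scrapy.Spider):
    name = 'title'  # hypothetical
    start_urls = ['http://example.com']

    def start_requests(self):
        for url in self.start_urls:
            # SplashRequest wraps the meta={'splash': {...}} boilerplate;
            # args are forwarded to the Splash endpoint.
            yield SplashRequest(url, self.parse_result, args={'wait': 0.5})

    def parse_result(self, response):
        doc_title = response.xpath('//title/text()').extract_first()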

kmike closed this as completed Apr 11, 2016