Commit
Merge remote-tracking branch 'zytedata/main' into google-search-max-requests-int
Showing 29 changed files with 2,928 additions and 206 deletions.
@@ -0,0 +1,43 @@
.. _search-queries:

==============
Search queries
==============

The :ref:`e-commerce spider template <e-commerce>` supports a spider argument,
:data:`~zyte_spider_templates.spiders.ecommerce.EcommerceSpiderParams.search_queries`,
that lets you define search queries, one per line, and turns the input URLs
into search requests for those queries.

For example, given the following input URLs:

.. code-block:: none

    https://a.example
    https://b.example

And the following list of search queries:

.. code-block:: none

    foo bar
    baz

By default, the spider sends 2 initial requests, 1 per input URL, to try to
find out how to build a search request for each website. If it succeeds, it
then sends 4 search requests, 1 per combination of input URL and search
query. For example:

.. code-block:: none

    https://a.example/search?q=foo+bar
    https://a.example/search?q=baz
    https://b.example/s/foo%20bar
    https://b.example/s/baz

The default implementation uses a combination of HTML metadata, AI-based HTML
form inspection and heuristics to find the most likely way to build a search
request for a given website.

If this default implementation does not work as expected on a given website,
you can :ref:`write a page object to fix that <fix-search>`.
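The way input URLs and search queries combine in the example above can be
sketched in plain Python. This is only an illustration of the "1 request per
combination" rule, not the spider's actual code: ``SEARCH_URL_BUILDERS`` and
``build_search_urls`` are hypothetical names, and the two URL patterns are
hard-coded from the example, whereas the spider discovers them at run time.

```python
from itertools import product
from urllib.parse import quote, quote_plus

# Hypothetical, hard-coded search URL patterns taken from the example above.
# The real spider works these out per website; nothing here is its API.
SEARCH_URL_BUILDERS = {
    "https://a.example": lambda q: f"https://a.example/search?q={quote_plus(q)}",
    "https://b.example": lambda q: f"https://b.example/s/{quote(q)}",
}


def build_search_urls(input_urls, search_queries):
    """Return 1 search URL per combination of input URL and search query.

    ``search_queries`` is a multi-line string with 1 query per line;
    blank lines are ignored.
    """
    queries = [line.strip() for line in search_queries.splitlines() if line.strip()]
    return [
        SEARCH_URL_BUILDERS[url](query)
        for url, query in product(input_urls, queries)
    ]


search_urls = build_search_urls(
    ["https://a.example", "https://b.example"],
    "foo bar\nbaz",
)
for search_url in search_urls:
    print(search_url)
```

With 2 input URLs and 2 queries, the sketch yields the same 4 search URLs as
the example, in the same order. The two sites encode the space differently
(``+`` in a query string, ``%20`` in a path segment), which is one reason the
search URL pattern has to be determined per website.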
@@ -0,0 +1,3 @@
[pytest]
filterwarnings =
    ignore:deprecated string literal syntax::jmespath.lexer