Merge pull request #29 from ClericPy/dev
2.3.0
ClericPy authored Jun 21, 2020
2 parents 613a456 + c440d68 commit 57fcf89
Showing 9 changed files with 398 additions and 168 deletions.
147 changes: 86 additions & 61 deletions README.md
@@ -80,7 +80,23 @@
assert (await tab0.current_url) == 'about:blank'
assert (await tab1.current_url) == 'about:blank'
```
1. `connect_tab`
> The easiest way to get a connected tab.
> get an existing tab
```python
async with chrome.connect_tab(0) as tab:
    print(await tab.current_title)
```
> get a new tab and auto close it
```python
async with chrome.connect_tab(None, True) as tab:
    print(await tab.current_title)
```
> get a new tab with given url and auto close it
```python
async with chrome.connect_tab('http://python.org', True) as tab:
    print(await tab.current_title)
```
</details>
@@ -90,102 +106,102 @@
1. `set_url` / `reload`
> navigate to a new url. `reload` is equivalent to `set_url(None)`
1. `wait_event`
2. `wait_event`
> listen for events with the given name, tell them apart from other same-name events with `filter_function`, and finally run `callback_function` with the result.
1. `wait_page_loading` / `wait_loading`
3. `wait_page_loading` / `wait_loading`
> wait for the `Page.loadEventFired` event, or stop loading on timeout. Different from `wait_loading_finished`.
1. `wait_response` / `wait_request`
4. `wait_response` / `wait_request`
> filter the `Network.responseReceived` / `Network.requestWillBeSent` events with `filter_function`, and return the `request_dict`, which can be used by `get_response` / `get_response_body` / `get_request_post_data`. WARNING: a fired `requestWillBeSent` event does not mean the response is ready; use `await tab.wait_request_loading(request_dict)` or `await tab.get_response(request_dict, wait_loading=True)`
1. `wait_request_loading` / `wait_loading_finished`
5. `wait_request_loading` / `wait_loading_finished`
> sometimes `wait_response` returns a `request_dict` while the ajax request is still fetching, so you need to wait for the `Network.loadingFinished` event.
1. `activate` / `activate_tab`
6. `activate` / `activate_tab`
> activate tab with websocket / http message.
1. `close` / `close_tab`
7. `close` / `close_tab`
> close tab with websocket / http message.
1. `add_js_onload`
8. `add_js_onload`
> `Page.addScriptToEvaluateOnNewDocument`, which means this javascript code will be run before page loaded.
1. `clear_browser_cache` / `clear_browser_cookies`
9. `clear_browser_cache` / `clear_browser_cookies`
> `Network.clearBrowserCache` and `Network.clearBrowserCookies`
1. `querySelectorAll`
10. `querySelectorAll`
> get the tag instance, which contains the `tagName, innerHTML, outerHTML, textContent, attributes` attrs.
1. `click`
11. `click`
> click the element queried by given *css selector*.
1. `refresh_tab_info`
12. `refresh_tab_info`
> to refresh the init attrs: `url`, `title`.
1. `current_html` / `current_title` / `current_url`
13. `current_html` / `current_title` / `current_url`
> get the current html / title / url with `tab.js`, or use the `refresh_tab_info` method and the init attrs.
1. `crash`
14. `crash`
> `Page.crash`
1. `get_cookies` / `get_all_cookies` / `delete_cookies` / `set_cookie`
15. `get_cookies` / `get_all_cookies` / `delete_cookies` / `set_cookie`
> some page cookies operations.
1. `set_headers` / `set_ua`
16. `set_headers` / `set_ua`
> `Network.setExtraHTTPHeaders` and `Network.setUserAgentOverride`, used to update headers dynamically.
1. `close_browser`
17. `close_browser`
> send `Browser.close` message to close the chrome browser gracefully.
1. `get_bounding_client_rect` / `get_element_clip`
18. `get_bounding_client_rect` / `get_element_clip`
> `get_element_clip` is an alias for `get_bounding_client_rect`; both methods get the rect of the element queried by the given css selector.
1. `screenshot` / `screenshot_element`
19. `screenshot` / `screenshot_element`
> get the screenshot base64 encoded image data. `screenshot_element` should be given a css selector to locate the element.
1. `get_page_size` / `get_screen_size`
20. `get_page_size` / `get_screen_size`
> size of current window or the whole screen.
1. `get_response`
21. `get_response`
> get the response body with the given request dict.
1. `js`
22. `js`
> run the given js code, return the raw response from sending `Runtime.evaluate` message.
1. `inject_js_url`
23. `inject_js_url`
> inject a js url, like `<script src="xxx/static/js/jquery.min.js"></script>` does.
1. `get_value` & `get_variable`
24. `get_value` & `get_variable`
> run the given js variable or expression, and return the result.
```python
await tab.get_value('document.title')
await tab.get_value("document.querySelector('title').innerText")
```
8. `keyboard_send`
25. `keyboard_send`
> dispatch key event with `Input.dispatchKeyEvent`
9. `mouse_click`
26. `mouse_click`
> dispatch click event on given position
1. `mouse_drag`
27. `mouse_drag`
> dispatch drag event on given position, and return the target x, y. `duration` arg is to slow down the move speed.
1. `mouse_drag_rel`
28. `mouse_drag_rel`
> dispatch drag event on given offset, and return the target x, y.
1. `mouse_drag_rel_chain`
29. `mouse_drag_rel_chain`
> drag with offsets continuously.
```python
await tab.set_url('https://draw.yunser.com/')
walker = await tab.mouse_drag_rel_chain(320, 145).move(50, 0, 0.2).move(
    0, 50, 0.2).move(-50, 0, 0.2).move(0, -50, 0.2)
await walker.move(50 * 1.414, 50 * 1.414, 0.2)
```
1. `mouse_press` / `mouse_release` / `mouse_move` / `mouse_move_rel` / `mouse_move_rel_chain`
30. `mouse_press` / `mouse_release` / `mouse_move` / `mouse_move_rel` / `mouse_move_rel_chain`
> similar to the drag features. These mouse features only dispatch events; they are not real mouse actions.
1. `history_back` / `history_forward` / `goto_history_relative` / `reset_history`
31. `history_back` / `history_forward` / `goto_history_relative` / `reset_history`
> back / forward history
@@ -220,26 +236,25 @@ import asyncio
async def main():
# If there is an existing daemon, such as `python -m ichrome`, the `async with AsyncChromeDaemon` context can be omitted.
async with AsyncChromeDaemon():
# connect to an opened chrome
# connect to an opened chrome, default host=127.0.0.1, port=9222, headless=False
async with AsyncChrome() as chrome:
tab = await chrome.new_tab(url="https://github.com/ClericPy")
# async with tab() as tab:
# and `as tab` can be omitted
async with tab():
# If you need to reuse an existing tab, set index to an int, like 0 for the activated tab, such as `async with chrome.connect_tab(0) as tab:`
async with chrome.connect_tab(index='https://github.com/ClericPy',
auto_close=True) as tab:
await tab.wait_loading(2)
await tab.js("document.write('<h1>Document updated.</h1>')")
await asyncio.sleep(1)
# await tab.js('alert("test ok")')
print('output:', await tab.html)
# output: <html><head></head><body><h1>Document updated.</h1></body></html>
await tab.close()
# will auto_close tab while exiting context
# await tab.close()
# close_browser gracefully, I have no more need of chrome instance
await chrome.close_browser()
if __name__ == "__main__":
asyncio.run(main())
```
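
The `connect_tab(index, auto_close)` call in the example above hands out a tab for the duration of the `async with` block and, when `auto_close` is true, closes it on exit. One plausible way to get that behaviour is an async context manager, sketched below with the standard library only. `FakeTab` and this `connect_tab` are hypothetical stand-ins, not ichrome's classes, and reusing an existing tab by integer index is left out.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeTab:
    """Stand-in for a browser tab (hypothetical, for illustration only)."""

    def __init__(self, url):
        self.url = url
        self.closed = False

    async def close(self):
        self.closed = True


@asynccontextmanager
async def connect_tab(index=None, auto_close=False):
    # Assumption for this sketch: a string index is the URL of a new tab,
    # and None means a blank new tab.
    url = index if isinstance(index, str) else 'about:blank'
    tab = FakeTab(url)
    try:
        yield tab
    finally:
        if auto_close:
            await tab.close()  # runs even if the body raised


async def demo():
    async with connect_tab('http://python.org', True) as tab:
        assert not tab.closed  # still open inside the context
    return tab.url, tab.closed


print(asyncio.run(demo()))
```

Putting the close in `finally` is what makes an explicit `await tab.close()` unnecessary after the block, including when the body fails.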

[More Examples](https://github.com/ClericPy/ichrome/blob/master/examples_async.py)
@@ -277,58 +292,68 @@ usage:
All the unknown args will be appended to extra_config as chrome original args.
Demo:
> python -m ichrome --host=127.0.0.1 --window-size=1212,1212 --incognito
> ChromeDaemon cmd args: {'daemon': True, 'block': True, 'chrome_path': '', 'host': '127.0.0.1', 'port': 9222, 'headless': False, 'user_agent': '', 'proxy': '', 'user_data_dir': None, 'disable_image': False, 'start_url': 'about:blank', 'extra_config': ['--window-size=1212,1212', '--incognito'], 'max_deaths': 1, 'timeout': 2}
> python -m ichrome -H 127.0.0.1 -p 9222 --window-size=1212,1212 --incognito
> ChromeDaemon cmd args: port=9222, {'chrome_path': '', 'host': '127.0.0.1', 'headless': False, 'user_agent': '', 'proxy': '', 'user_data_dir': WindowsPath('C:/Users/root/ichrome_user_data'), 'disable_image': False, 'start_url': 'about:blank', 'extra_config': ['--window-size=1212,1212', '--incognito'], 'max_deaths': 1, 'timeout':1, 'proc_check_interval': 5, 'debug': False}
> python -m ichrome
> ChromeDaemon cmd args: port=9222, {'chrome_path': '', 'host': '127.0.0.1', 'headless': False, 'user_agent': '', 'proxy': '', 'user_data_dir': WindowsPath('C:/Users/root/ichrome_user_data'), 'disable_image': False, 'start_url': 'about:blank', 'extra_config': [], 'max_deaths': 1, 'timeout': 1, 'proc_check_interval': 5, 'debug': False}
Other operations:
1. kill local chrome process with given port:
python -m ichrome -s 9222
python -m ichrome -k 9222
2. clear user_data_dir path (remove the folder and files):
python -m ichrome --clear
python -m ichrome --clean
python -m ichrome -C -p 9222
3. show ChromeDaemon.__doc__:
python -m ichrome --doc
4. crawl the URL, output the HTML DOM:
python -m ichrome --crawl --headless --timeout=2 http://myip.ipip.net/
optional arguments:
-h, --help show this help message and exit
-V, --version ichrome version info
-c CHROME_PATH, --chrome_path CHROME_PATH
-v, -V, --version ichrome version info
-c CONFIG, --config CONFIG
load config dict from JSON file of given path
-cp CHROME_PATH, --chrome-path CHROME_PATH, --chrome_path CHROME_PATH
chrome executable file path, default to null for
automatic searching
--host HOST --remote-debugging-address, default to 127.0.0.1
-H HOST, --host HOST --remote-debugging-address, default to 127.0.0.1
-p PORT, --port PORT --remote-debugging-port, default to 9222
--headless --headless and --hide-scrollbars, default to False
-s SHUTDOWN, --shutdown SHUTDOWN
-s SHUTDOWN, -k SHUTDOWN, --shutdown SHUTDOWN
shutdown the given port, only for local running chrome
--user_agent USER_AGENT
--user-agen, default to 'Mozilla/5.0 (Windows NT 10.0;
WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/70.0.3538.102 Safari/537.36'
--proxy PROXY --proxy-server, default to None
--user_data_dir USER_DATA_DIR
user_data_dir to save the user data, default to
-A USER_AGENT, --user-agent USER_AGENT, --user_agent USER_AGENT
--user-agent, default to Chrome PC: Mozilla/5.0
(Linux; Android 6.0; Nexus 5 Build/MRA58N)
AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/83.0.4103.106 Mobile Safari/537.36
-x PROXY, --proxy PROXY
--proxy-server, default to None
-U USER_DATA_DIR, --user-data-dir USER_DATA_DIR, --user_data_dir USER_DATA_DIR
user_data_dir to save user data, default to
~/ichrome_user_data
--disable_image disable image for loading performance, default to
--disable-image, --disable_image
disable image for loading performance, default to
False
--start_url START_URL
-url START_URL, --start-url START_URL, --start_url START_URL
start url while launching chrome, default to
about:blank
--max_deaths MAX_DEATHS
max deaths in 5 secs, auto restart `max_deaths` times
if crash fast in 5 secs. default to 1 for without
auto-restart
--max-deaths MAX_DEATHS, --max_deaths MAX_DEATHS
restart times. default to 1 for without auto-restart
--timeout TIMEOUT timeout to connect the remote server, default to 1 for
localhost
--workers WORKERS the number of worker processes with auto-increment
port, default to 1
--proc_check_interval PROC_CHECK_INTERVAL
-w WORKERS, --workers WORKERS
the number of worker processes, default to 1
--proc-check-interval PROC_CHECK_INTERVAL, --proc_check_interval PROC_CHECK_INTERVAL
check chrome process alive every interval seconds
--crawl crawl the given URL, output the HTML DOM
--clean, --clear clean user_data_dir
-C, --clean, --clear clean user_data_dir
--doc show ChromeDaemon.__doc__
--debug set logger level to DEBUG
-K, --killall killall chrome launched local with --remote-debugging-
port
```
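
The usage text above relies on two command-line behaviours: several spellings of one option (for example `--user-agent` and `--user_agent`), and unknown arguments being collected into `extra_config` as original chrome arguments. Both can be reproduced with `argparse`, as in the sketch below. This is an assumption about the approach rather than ichrome's actual parser; the `demo` program name and the reduced option set are illustrative only.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='demo')
    parser.add_argument('-H', '--host', default='127.0.0.1')
    parser.add_argument('-p', '--port', type=int, default=9222)
    # Both long spellings store into the same destination.
    parser.add_argument('-A', '--user-agent', '--user_agent',
                        dest='user_agent', default='')
    parser.add_argument('--headless', action='store_true')
    return parser


def parse(argv):
    # parse_known_args returns the options it does not recognise
    # instead of exiting with an error.
    args, extra_config = build_parser().parse_known_args(argv)
    return vars(args), extra_config


known, extra = parse(['-H', '127.0.0.1', '-p', '9222',
                      '--window-size=1212,1212', '--incognito'])
print(known)
print(extra)
```

Here `extra` holds `--window-size=1212,1212` and `--incognito`, which a launcher could then pass through to the browser unchanged.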

## Interactive Debugging
5 changes: 5 additions & 0 deletions examples_async.py
@@ -261,6 +261,7 @@ def on_shutdown(chromed):
assert chrome.get_memory() > 0
await test_chrome(chrome)
# ===================== Tab Test Cases =====================
# Duplicate, use async with chrome.connect_tab(None) instead
tab: Tab = await chrome.new_tab()
await test_tab_ws(tab)
# same as: async with tab.connect():
@@ -291,6 +292,10 @@ def on_shutdown(chromed):
assert await tab.clear_browser_cache()
# close tab
await tab.close()
# test chrome.connect_tab
async with chrome.connect_tab(chrome.server + '/json', True) as tab:
await tab.wait_loading(2)
assert 'webSocketDebuggerUrl' in (await tab.current_html)
# close_browser gracefully, I have no more need of chrome instance
await chrome.close_browser()
# await chrome.kill()
2 changes: 1 addition & 1 deletion ichrome/__init__.py
@@ -5,7 +5,7 @@
from .logs import logger
from .sync_utils import Chrome, Tab

__version__ = "2.2.4"
__version__ = "2.3.0"
__tips__ = "[github]: https://github.com/ClericPy/ichrome\n[cdp]: https://chromedevtools.github.io/devtools-protocol/\n[cmd args]: https://peter.sh/experiments/chromium-command-line-switches/"
__all__ = [
'Chrome', 'ChromeDaemon', 'Tab', 'Tag', 'AsyncChrome', 'AsyncTab', 'logger',