Releases: webrecorder/browsertrix-crawler

Browsertrix Crawler v1.4.1

26 Nov 02:44
fb8ed18

What's Changed

  • package: pin @novnc/novnc to 1.4.0 to prevent accidental upgrades by @ikreymer in #727

Full Changelog: v1.4.0...v1.4.1

Browsertrix Crawler v1.4.0

25 Nov 08:54

What's Changed

  • Support loading custom behaviors from URLs and/or filepaths by @tw4l in #707
  • support custom css selectors for extracting links by @ikreymer in #689
  • Dependency Update by @ikreymer in #718
  • add disable-lazy-loading flag, should fix #699 by @ikreymer in #720
  • Support loading custom behaviors from git repo by @tw4l in #717
  • fix indexing of cookie header by @ikreymer in #714
  • Ensure partial responses are not written by @ikreymer in #721
  • support removing range from query (via wabac.js 2.20.6) by @ikreymer in #724
  • Implemented option for FullPage screenshot after the behaviours have run by @fservida in #656
  • Dependency Update by @ikreymer in #725

Full Changelog: v1.3.5...v1.4.0

Browsertrix Crawler v1.4.0-beta.1

24 Nov 19:28
6bfa7d5
Pre-release

What's Changed

  • support removing range from query (via wabac.js 2.20.6) by @ikreymer in #724
  • Implemented option for FullPage screenshot after the behaviours have run by @fservida in #656
  • Dependency Update by @ikreymer in #725

Full Changelog: v1.4.0-beta.0...v1.4.0-beta.1

Browsertrix Crawler v1.4.0-beta.0

14 Nov 07:29
0b9cd71
Pre-release

What's Changed

Full Changelog: v1.3.4...v1.4.0-beta.0

Browsertrix Crawler v1.3.5

05 Nov 21:47
3187685

What's Changed

  • Quick fix for cookies not being available for replay (a regression from 1.2.x); a more extensive fix is coming in the next version.
  • fix cookie not being passed to replay regression: for now, add x-waba… by @ikreymer in #713

Full Changelog: v1.3.4...v1.3.5

Browsertrix Crawler v1.3.4

31 Oct 21:07
e5bab8e

What's Changed

Full Changelog: v1.3.3...v1.3.4

Browsertrix Crawler v1.3.3

11 Oct 07:19

What's Changed

  • Fix for rare crash (link extraction promise cleanup) by @ikreymer in #701

Full Changelog: v1.3.2...v1.3.3

Browsertrix Crawler v1.3.2

08 Oct 00:26
157ac34

What's Changed

  • ensure extraHops also apply to maxDepth by @ikreymer in #694
  • Tests: disable blockrules test in CI by @ikreymer in #698
  • Add documentation for crawl collections by @tw4l in #695
  • bump puppeteer core to 23.5.1 by @ikreymer in #700
  • fix typo in QA exclude check, which resulted in all URLs being excluded by @ikreymer in #697

Full Changelog: v1.3.1...v1.3.2

Browsertrix Crawler v1.3.1

27 Sep 18:32

What's Changed

  • direct fetch: when cancelling due to redirect, read full body by @ikreymer in #688
  • Include depth in pages JSONL files by @tw4l in #691
  • Additional exception safety by @ikreymer in #692

Full Changelog: v1.3.0...v1.3.1

Browsertrix Crawler v1.3.0

12 Sep 16:30

What's Changed

  • Use isolated Python venv for dependencies installation by @benoit74 in #591
  • Adds warning about crawling with basic auth by @Shrinks99 in #669
  • Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
  • SOCKS5 over SSH Tunnel Support by @ikreymer in #671
  • Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673
  • fix for direct fetch timeouts by @ikreymer in #677
  • WARC writer + incremental indexing fixes by @ikreymer in #679
  • Additional direct fetch improvements by @ikreymer in #678
  • crawler args typing by @ikreymer in #680
  • bump browser to 1.69.162 by @ikreymer in #681
  • cleanup: remove old config files from pywb by @ikreymer in #682
  • eslint: add strict await checking by @ikreymer in #684
  • update current crawl size in redis on each healthcheck call by @ikreymer in #685
  • exit codes: exit with error code 10 if interrupt is caused by unexpected browser exit by @ikreymer in #686

Full Changelog: v1.2.8...v1.3.0