Releases: webrecorder/browsertrix
Browsertrix Cloud 1.5.0 Beta 0
Major Changes
- Switch to new operator controller for CrawlJobs and ProfileJobs
- New integrated workflow / crawling UI, grouping crawls by crawl config
What's Changed
- Add crawl /log API endpoint to stream crawler logs by @tw4l in #682
- Remove CI by @stavares843 in #760
- Upgrade to mongo 6 and use sortArray for workflow crawls (#764) by @ikreymer in #765
- Add crawl timeout nightly test by @tw4l in #762
- Docs: Font additions, style updates, and icon implementation by @Shrinks99 in #758
- Update collections backend API by @tw4l in #759
- backend: add 'lastCrawlStartTime' and 'lastStartedByName' fields to c… by @ikreymer in #753
- Add crawl errors endpoint by @tw4l in #757
- crawlconfig api: add 'currCrawlState' and 'currCrawlTimeStart' to crawlconfig list api (already queried on backend) by @ikreymer in #770
- Make btrix helper work with microk8s by @tw4l in #768
- Fix workflow total size by @tw4l in #783
- nginx: enable worker processes autotune to correctly set the number o… by @ikreymer in #785
- Revert docs body font to Inter by @Shrinks99 in #790
- Refactor to use new operator on backend by @ikreymer in #789
- Backend: App Startup Fixes by @ikreymer in #793
- Frontend crawl workflows rework by @SuaYoo in #775
- Parse JSON-l errors before returning by @tw4l in #799
Full Changelog: v1.4.0...v1.5.0-beta.0
Browsertrix Cloud 1.4.1
Bug fix release: Update to mongo 6 to support the $sortArray operation; don't use JS in mongo due to lack of support by DigitalOcean.
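For context, $sortArray is an aggregation operator introduced in MongoDB 5.2, so it is not available on older servers. The sketch below shows roughly how a pipeline stage using it might look. The field names (`crawls`, `started`) are illustrative assumptions, not taken from the Browsertrix schema, and the pure-Python function only mimics the stage's effect so the example runs without a database.

```python
from typing import Any

# Hypothetical aggregation stage: sort each document's embedded "crawls"
# array by "started", newest first. Field names are assumptions.
SORT_CRAWLS_STAGE: dict[str, Any] = {
    "$set": {
        "crawls": {
            "$sortArray": {"input": "$crawls", "sortBy": {"started": -1}}
        }
    }
}


def sort_crawls_locally(doc: dict[str, Any]) -> dict[str, Any]:
    """Pure-Python stand-in for the stage above, for illustration only."""
    sorted_crawls = sorted(
        doc.get("crawls", []), key=lambda c: c["started"], reverse=True
    )
    # Return a new document; the input document is left unmodified.
    return {**doc, "crawls": sorted_crawls}
```

With a driver such as pymongo, a stage like this would typically be passed in a list to `collection.aggregate(...)`, letting the server do the sorting instead of server-side JavaScript.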
What's Changed
Full Changelog: v1.4.0...v1.4.1
Browsertrix Cloud 1.4.0
Major Features
- Crawl Configs are now mutable: edits update the existing config instead of creating a duplicate config
- Crawl Configs are now renamed to Workflows (workflow revisions are tracked internally but not yet displayed)
- Support for the latest Browsertrix Crawler 0.9.0, which sorts the crawl queue by depth and uses Playwright for crawling
What's Changed
- Add icons to crawl details navigation buttons by @Shrinks99 in #666
- Paginate API list endpoints by @tw4l in #659
- Fix crawl config name in "run now" alert by @SuaYoo in #673
- Fix missing crawl config name by @SuaYoo in #683
- backend: make crawlconfigs mutable! (#656) by @ikreymer in #662
- Limit organization name length by @SuaYoo in #671
- permissions: allow user with 'viewer' permissions to access read-only… by @ikreymer in #687
- Improve crawl queue pagination UX by @SuaYoo in #680
- Migrate crawl config frontend -> workflow by @SuaYoo in #686
- backend: Fix for total crawl time limit. by @ikreymer in #665
- Add lightweight logging mode by @leepro in #668
- exclusions editor fix: by @ikreymer in #692
- Fix saving config limit and browser setting fields by @SuaYoo in #704
- Disable copy tags menu item if no tags by @SuaYoo in #709
- Combine watch crawl with crawl queue by @SuaYoo in #710
- backend: update queue apis to work with new sorted queue apis (also b… by @ikreymer in #712
- Add proofread action CI by @stavares843 in #714
- Add optional description to crawl configs by @tw4l in #707
- Allow users to set workflow description by @SuaYoo in #708
- Remove new issue project automation by @SuaYoo in #718
- Add Playwright UI tests + CI by @stavares843 in #614
- chore(playwright): fix version by @stavares843 in #725
- Fix migration to avoid jobType KeyError by @tw4l in #727
- Leave trailing slash in seed URLs by @SuaYoo in #731
- fix version related to @playwright/test by @stavares843 in #729
- Filter and sort crawl and workflow list API endpoints in backend by @tw4l in #724
- Add README.md related to run playwright tests locally by @stavares843 in #722
- Allow configurable max pages per crawl in deployment settings by @ikreymer in #717
- Add pageSize to pagination output format by @tw4l in #736
- Update nightly test fixtures to use Seed objects by @tw4l in #734
- Max page limit override by @ikreymer in #737
- misc frontend build fixes: playwright version + chunking by @ikreymer in #740
- Set max pages to API default by @SuaYoo in #739
- No More Workflowuration by @ikreymer in #741
- fix: only include finished crawls in crawlCount value for /api/crawlconfigs by @ikreymer in #746
- config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend… by @ikreymer in #742
- Crawls list backend pagination by @SuaYoo in #735
- Frontend Docker build improvements by @SuaYoo in #749
- Allow users to set additional page time limits by @SuaYoo in #744
- Fix additional URLs by @SuaYoo in #752
- Add btrix CLI dev helper by @tw4l in #732
- Configure crawler disk utilization threshold via helm chart by @tw4l in #748
- Docs: adds mkdocs features, adds theming by @Shrinks99 in #728
- Inverts the autoscroll behavior setting to true by default by @Shrinks99 in #756
- Adds inputmode attributes to workflow config fields by @Shrinks99 in #755
Full Changelog: v1.3.1...v1.4.0
Browsertrix Cloud 1.4.0 Beta 2
What's Changed
- Add proofread action CI by @stavares843 in #714
- Add optional description to crawl configs by @tw4l in #707
- Allow users to set workflow description by @SuaYoo in #708
- Remove new issue project automation by @SuaYoo in #718
- Add Playwright UI tests + CI by @stavares843 in #614
- chore(playwright): fix version by @stavares843 in #725
- Fix migration to avoid jobType KeyError by @tw4l in #727
- Leave trailing slash in seed URLs by @SuaYoo in #731
- fix version related to @playwright/test by @stavares843 in #729
- Filter and sort crawl and workflow list API endpoints in backend by @tw4l in #724
- Add README.md related to run playwright tests locally by @stavares843 in #722
- Allow configurable max pages per crawl in deployment settings by @ikreymer in #717
- Add pageSize to pagination output format by @tw4l in #736
- Update nightly test fixtures to use Seed objects by @tw4l in #734
- Max page limit override by @ikreymer in #737
- misc frontend build fixes: playwright version + chunking by @ikreymer in #740
- Set max pages to API default by @SuaYoo in #739
- No More Workflowuration by @ikreymer in #741
- fix: only include finished crawls in crawlCount value for /api/crawlconfigs by @ikreymer in #746
- config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend… by @ikreymer in #742
- Crawls list backend pagination by @SuaYoo in #735
- Frontend Docker build improvements by @SuaYoo in #749
- Allow users to set additional page time limits by @SuaYoo in #744
- Fix additional URLs by @SuaYoo in #752
- Add btrix CLI dev helper by @tw4l in #732
- Configure crawler disk utilization threshold via helm chart by @tw4l in #748
- Docs: adds mkdocs features, adds theming by @Shrinks99 in #728
Full Changelog: v1.4.0-beta.1...v1.4.0-beta.2
Browsertrix Cloud 1.4.0 Beta 1
Crawl Features
- Support for latest Browsertrix Crawler 0.9.0 Beta 1 which sorts crawl queue by depth
What's Changed
- Fix saving config limit and browser setting fields by @SuaYoo in #704
- Disable copy tags menu item if no tags by @SuaYoo in #709
- Combine watch crawl with crawl queue by @SuaYoo in #710
- backend: update queue apis to work with new sorted queue apis (also b… by @ikreymer in #712
Full Changelog: v1.4.0-beta.0...v1.4.0-beta.1
v1.4.0-beta.0
Key Changes
- Initial refactor for changing Crawl Configs -> Workflows. A Workflow keeps a revision history of its crawl configs and is a fixed object: e.g., editing a crawlconfig or its exclusions updates the latest config, versions the previous one, and stays within the same Workflow object.
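The revision behavior described above can be pictured with a small sketch. This is not the Browsertrix data model: the `Workflow` class, its fields, and `update_config` are hypothetical names chosen only to illustrate "update the latest config, version the previous one, keep the same object".

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Workflow:
    """Hypothetical model: one stable object holding the latest config plus history."""

    id: str
    config: dict[str, Any]
    rev: int = 0
    revisions: list[dict[str, Any]] = field(default_factory=list)

    def update_config(self, changes: dict[str, Any]) -> None:
        # Archive the current config as a numbered revision.
        self.revisions.append({"rev": self.rev, "config": deepcopy(self.config)})
        # Apply the edit in place, so the Workflow id never changes.
        self.config = {**self.config, **changes}
        self.rev += 1
```

For example, calling `update_config({"exclude": ["/login"]})` on a workflow leaves its `id` untouched, bumps `rev`, and stores the earlier config in `revisions`.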
What's Changed
- Add icons to crawl details navigation buttons by @Shrinks99 in #666
- Paginate API list endpoints by @tw4l in #659
- Fix crawl config name in "run now" alert by @SuaYoo in #673
- Fix missing crawl config name by @SuaYoo in #683
- backend: make crawlconfigs mutable! (#656) by @ikreymer in #662
- Limit organization name length by @SuaYoo in #671
- permissions: allow user with 'viewer' permissions to access read-only… by @ikreymer in #687
- Improve crawl queue pagination UX by @SuaYoo in #680
- Migrate crawl config frontend -> workflow by @SuaYoo in #686
- backend: Fix for total crawl time limit. by @ikreymer in #665
- Add lightweight logging mode by @leepro in #668
- exclusions editor fix: by @ikreymer in #692
Full Changelog: v1.3.1...v1.4.0-beta.0
Browsertrix Cloud 1.3.1
Frontend fixes and improvements.
What's Changed
- Compute name from seed URLs in UI by @SuaYoo in #644
- Hide file size when crawl is running by @SuaYoo in #648
- Improve tag input keyboard navigation by @SuaYoo in #650
- Improve crawl list rendering by @SuaYoo in #645
- Automate issue project status by @SuaYoo in #660
- Persist "show only mine" across page refresh by @SuaYoo in #661
Full Changelog: v1.3.0...v1.3.1
Browsertrix Cloud 1.3.0
Major Updates
- Improve superadmin UI, including support for managing roles and creating new orgs via admin UI
- Updated Crawl list view
- Support crawl notes
- Support for deleting crawls
What's Changed
- chore(typo): fix typo in read me by @stavares843 in #552
- Add/remove admin node pool according to its configuration by @leepro in #556
- Minor visual / mockup alignment improvements by @Shrinks99 in #551
- Serialize pending invites to return "id" not "_id" by @tw4l in #559
- Fix invite accept in UI by @SuaYoo in #560
- Allow superadmins to create org from UI by @SuaYoo in #563
- health readiness check: more tolerant health check by @ikreymer in #562
- Run unit tests in frontend PR check by @SuaYoo in #569
- Fix text overflow problem on crawl details page by @Shrinks99 in #570
- Allow URL list to have URLs containing commas by @SuaYoo in #572
- Invite token improvements by @tw4l in #564
- Add org-specific delete invite endpoint by @tw4l in #575
- backend: /orgs//remove: return 404 if org user doesn't exist, fix… by @ikreymer in #561
- Manage org member roles and invites by @SuaYoo in #558
- Add notes to crawl and crawl updates by @tw4l in #587
- Make crawlconfig name optional by @tw4l in #588
- Remove non-org-scoped invites from backend by @tw4l in #585
- Improve superadmin invite UI by @SuaYoo in #581
- Crawl details "Crawl Scale" → "Crawler Instances" by @Shrinks99 in #589
- Make all config form help text localizable by @SuaYoo in #593
- Make pending invites expire via TTL index by @tw4l in #568
- Fix doc to build a local image for microk8s by @leepro in #594
- Fix app not rendering with bad auth storage states by @SuaYoo in #597
- Fix POST /orgs/{oid}/crawls/delete by @tw4l in #591
- enable firewalld ports by @kayiwa in #602
- Add back GET /users/invite/{token} used by frontend by @tw4l in #607
- Allow user to delete individual crawls by @SuaYoo in #609
- rocky firewall by @kayiwa in #604
- Remove crawlconfig name from file suffixes by @tw4l in #610
- fix the admin logging doc by @leepro in #612
- Edit crawl notes from crawl detail view by @SuaYoo in #595
- Include firstSeed and seedCount fields in GET crawl API endpoints by @tw4l in #618
- Update crawls list control bar UI by @SuaYoo in #611
- crawler arguments fixes: by @ikreymer in #621
- Make nightly tests run nightly, not monthly by @tw4l in #624
- Dynamically calculate crawl stats for crawlconfig endpoints by @tw4l in #623
- Disable editing crawl config of running crawls by @SuaYoo in #620
- Fix nightly tests by @tw4l in #632
- Fix microk8s CI by @tw4l in #634
- Fix typos by @stavares843 in #640
- Adds h1 page titles, edits heading hierarchy, minor graphical tweaks and fixes by @Shrinks99 in #638
- Chart: split Crawl args into separate variables by @ikreymer in #639
- Rocky firewall by @kayiwa in #635
- Update crawls list styles by @SuaYoo in #630
- CrawlConfig migration and crawl stats query optimization by @tw4l in #633
- rename Information -> Metadata, rebuild localization strings list by @ikreymer in #642
Full Changelog: v1.2.0...v1.3.0
Browsertrix Cloud 1.2.0
Key Features / Changes
- Crawl Config Overhaul, with Seeded Crawl and URL List crawl config types
- Archives -> Orgs Rename
- Profile Page Update
- User role permission fixes
- Support for tags on crawl configs and crawls
- Docker Swarm support removed
- New Docs via Mkdocs (hosted at: https://docs.browsertrix.cloud/)
What's Changed
- fix link by @edsu in #404
- CI: Add K3D CI test by @ikreymer in #405
- Remove Code and Configs for Swarm/podman support by @ikreymer in #407
- New create crawl config user workflow by @SuaYoo in #391
- Minor docs style updates by @Shrinks99 in #409
- docs: CHANGES: fix typo, begin changelist for 1.2.0 by @ikreymer in #410
- Add single crawl info api at /crawls/{crawl_id} by @ikreymer in #418
- Disable replay-web-page HTTP caching by @SuaYoo in #419
- Crawl config detail view & edit workflow UI updates by @SuaYoo in #415
- Compute crawl config name from seed URLs by @SuaYoo in #435
- Frontend archives -> teams migration by @SuaYoo in #429
- Persist currently selected team/archive by @SuaYoo in #441
- Fix app not loading on older Safari by @SuaYoo in #436
- Have ingress for signer only when it is enabled by @leepro in #446
- Use archive_viewer_dep permissions to GET crawls by @tw4l in #443
- Always sub-navigation bar for selected team by @SuaYoo in #444
- Sticky the crawl config progress indicator position by @SuaYoo in #445
- VNC-Based Profile Browser by @ikreymer in #433
- Backend lint check by @ikreymer in #451
- quickfix: pydantic / lint fix by @ikreymer in #452
- backend: initial tags api support (addresses #365): by @ikreymer in #434
- API filters by user + crawl collection ids by @ikreymer in #462
- Filter crawls, configs, browser profiles by user ID by @SuaYoo in #463
- add digital ocean documentation by @kayiwa in #421
- Fix skipping to confirm when duplicating crawl config by @SuaYoo in #454
- Crawl config tag editor UI by @SuaYoo in #422
- Run frontend formatter on pre-commit hook by @SuaYoo in #461
- Add default organization by @tw4l in #465
- Copy tags from crawlconfig to crawl by @ikreymer in #467
- ansible: digitalocean tweaks: by @ikreymer in #469
- email sending tweaks: by @ikreymer in #470
- backend: add 'allow_dupe_invites' option to allow re-inviting users. … by @ikreymer in #471
- backend: registration: by @ikreymer in #472
- backend: password related fixes: by @ikreymer in #479
- Allowed URL Prefixes → Extra URLs In Scope by @Shrinks99 in #477
- Crawl config frontend fixes by @SuaYoo in #482
- Fix localization build by @SuaYoo in #488
- Crawl config form panel UX enhancement & fix by @SuaYoo in #489
- Improve "Show Only Mine" switch visibility by @SuaYoo in #494
- Add frontend build check by @SuaYoo in #498
- Add locale codes to version control by @SuaYoo in #501
- Add all localization files to source control by @SuaYoo in #502
- Rename archives/teams -> orgs in codebase + add db migration by @tw4l in #486
- backend: add default behavior time to /api/settings (part of #321) by @ikreymer in #499
- Add path filter to GH workflows by @SuaYoo in #500
- Improve Page Time Limit UX by @SuaYoo in #503
- Autocomplete tag options by @SuaYoo in #505
- Rename remaining crawl templates -> crawl configs by @SuaYoo in #509
- Rename api / nginx settings -> backend / frontend, set pull policy job images by @ikreymer in #504
- Improve new config navigation UX by @SuaYoo in #508
- Add logging service by @leepro in #442
- Fix logic for creating pidfile parent dir by @tw4l in #512
- Add API endpoints to remove users from orgs and delete invites by @tw4l in #511
- Add new /users/me-with-orgs API endpoint by @tw4l in #510
- Allow admin users to change org name by @SuaYoo in #506
- Only drop/recreate indexes on app startup only if migrations have been run by @tw4l in #515
- Handle DuplicateKeyError on org rename requests by @tw4l in #514
- Improve org routing & performance by @SuaYoo in #520
- chore(typos): fix typos by @stavares843 in #524
- Updates crawl type descriptions by @Shrinks99 in #526
- Update org settings & org invite UI by @SuaYoo in #528
- Add support for tags to update_crawl_config API endpoint by @tw4l in #521
- browser api: return additional data in profile /browser/ endpoint by @ikreymer in #537
- [FIX] Add ingress class for admin logging by @leepro in #532
- Fix issue where users are added to default org as admin by @tw4l in #534
- Add org role to /users/me-with-orgs by @tw4l in #536
- Fix browser profile origins sidebar overlap by @SuaYoo in #530
- CI: Setup manual workflow for dev deployment by @ikreymer in #540
- Deploy to Dev Cluster Fixes by @ikreymer in #542
- Make API updates for member updates by @tw4l in #541
- Reformat backend for black 23.1.0 by @tw4l in #548
- Add API endpoint to update crawl tags by @tw4l in #545
- Update crawl tags from detail view by @SuaYoo in #539
- Support additional seed URLs and custom scope type by @SuaYoo in #543
- Update tab access by user role by @SuaYoo in #549
- Add admin addons options to DigitalOcean by @leepro in #529
New Contributors
- @leepro made their first contribution in #446
- @tw4l made their first contribution in #443
- @stavares843 made their first contribution in #524
Full Changelog: 1.1.0...v1.2.0
Browsertrix Cloud 1.1.0
Key Features
- Viewing crawl queue and dynamically adding exclusions to crawl queue while crawl is running.
- Adding exclusions to crawl config screen.
- Setting initial browser language on crawl config.
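As a rough illustration of what adding an exclusion to a running crawl's queue involves, the sketch below filters pending URLs against a set of patterns. It assumes exclusions are regular expressions matched anywhere in the URL; the function name and return shape are hypothetical and do not reflect the actual backend or crawler code.

```python
import re


def apply_exclusions(
    queue: list[str], exclusions: list[str]
) -> tuple[list[str], list[str]]:
    """Split queued URLs into (kept, removed) using regex exclusions."""
    patterns = [re.compile(rx) for rx in exclusions]
    kept: list[str] = []
    removed: list[str] = []
    for url in queue:
        # A URL is dropped from the queue if any exclusion matches it.
        if any(p.search(url) for p in patterns):
            removed.append(url)
        else:
            kept.append(url)
    return kept, removed
```

Returning the removed URLs alongside the kept ones makes it possible to preview what an exclusion would drop before applying it.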
Fixes
- Various UI fixes and improvements
- Fix issues with occasional log outs while using the UI
Documentation
- Mkdocs based documentation added at https://docs.browsertrix.cloud/
- Docker swarm / compose / podman-based deployments now deprecated in favor of Kubernetes
What's Changed
- Upgrade Shoelace 2.0.0-beta.61 -> 2.0.0-beta.83 by @SuaYoo in #358
- Allow users to set crawl config language by @SuaYoo in #377
- Frontend Node version support by @SuaYoo in #382
- Editable exclusion table cells by @SuaYoo in #379
- Fix authentication getting out of sync between tabs by @SuaYoo in #380
- chart / deployment fixes to run on microk8s: (fixes #385) by @ikreymer in #387
- Fix language configuration UI by @SuaYoo in #388
- Local Deployment Work: Support running locally + test cluster on CI by @ikreymer in #396
- mkdocs setup (deploy, dev, user-guide) by @Shrinks99 in #375
- build: increase network timeout for yarn for frontend build for arm64 build by @ikreymer in #401
- README + CHANGES + doc tweaks for 1.1.0 release by @ikreymer in #402
New Contributors
- @Shrinks99 made their first contribution in #375
Full Changelog: 1.1.0-beta.0...1.1.0