Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Diesel 2.x and enable using Debian's libmariadb #1644

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

tommie
Copy link

@tommie tommie commented Dec 27, 2024

Description

The end-goal of this PR is to be able to build for ARM64. The MySQL repo only contains client packages for AMD64.
However, this required an upgrade of Diesel to the latest, in order to have mysqlclient-sys find libmariadb.
That, in turn, required adding RwLock to Diesel connections.
Unfortunately, it also affects MigrationError, which has now become a dyn StdError.

This uncovered some bugs in the MySQL backend, e.g. there can be at most one batch ID per user every 10 ms. It seems Diesel 2 is faster (and libmysqlclient is faster than libmariadb), causing the E2E batch tests to generate batches faster than before. Nothing ever updated the session timestamp in the backend, which suggests there could only really be one batch per user.

When running E2E locally, batches were piling up in the database, because the fixture didn't remove batch_uploads and batch_upload_items. This required implementing in syncserver.

There was a note in Cargo.toml about diesel_logging not working, but this seems to have been fixed in 0.4.0: https://github.com/shssoichiro/diesel-logger/blob/a3a7df54823a3a0c922bf8d9b636045f079cecfe/CHANGELOG.md

syncstorage-rs switched to libmysqlclient in #1435 because libmariadb didn't default to TLS. In a Docker scenario, using TLS seems meaningless. The default remains libmysqlclient.

Testing

Existing integration tests.

There's a new build arg MYSQLCLIENTPKG that can select between libmariadb-dev and libmysqlclient-dev.

Issue(s)

A super-set of #1635

@tommie tommie force-pushed the mariadb branch 3 times, most recently from 296419d to a08b465 Compare December 27, 2024 18:55
@tommie

This comment was marked as outdated.

@tommie tommie changed the title Draft: Allow using Debian's libmariadb for building, upgrade to Diesel 2.2.6 and refactor Dockerfile Draft: Upgrade to Diesel 2.x and enable using Debian's libmariadb Dec 28, 2024
@tommie tommie force-pushed the mariadb branch 4 times, most recently from 3575d68 to b684206 Compare December 28, 2024 09:17
See https://diesel.rs/guides/migration_guide.html.

- This pulls in mysqlclient-sys >=0.4.0, which allows using libmariadb instead of libmysqlclient.
  That allows using Debian's bundled client library.
  MySQL's APT repository only has packages for AMD64.
- Connections are now passed as mut refs, which requires RefCell/RwLock.
- Embedded migrations had a slight API change.
- put_bso_sync was calling update_collection without its transaction, which now causes a deadlock.
  Updated to write inside its transaction.
The global functions were replaced with Server in 0.32.0 and the
Cargo.toml has 1.4 listed (probably due to a mistake when pinning
Diesel at 1.4.)
Cargo audit:

Crate:     idna
Version:   0.5.0
Title:     `idna` accepts Punycode labels that do not produce any non-ASCII when decoded
Date:      2024-12-09
ID:        RUSTSEC-2024-0421
URL:       https://rustsec.org/advisories/RUSTSEC-2024-0421
Solution:  Upgrade to >=1.0.0
@tommie

This comment was marked as resolved.

Otherwise E2E tests that hammer user IDs may see prior data.
Otherwise, tests that hammer user IDs may see old data.
There is no other way for the timestamp to increase in production.
Since the batch_uploads primary key already contains user_id, it does
nothing to increase time resolution. The counter may still see
conflicts with other processes, but the probably is much lower
now. (Spanner uses UUIDv4 for batch IDs.)

The user_id hack was added in e9b455f "to match the Python
implementation". E2E tests are failing because we can now
generate more than one batch within 10 ms.
@tommie tommie changed the title Draft: Upgrade to Diesel 2.x and enable using Debian's libmariadb Upgrade to Diesel 2.x and enable using Debian's libmariadb Dec 29, 2024
@tommie tommie marked this pull request as ready for review December 29, 2024 15:08
Copy link
Member

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank You for doing the leg work on updating diesel! Things look pretty good. I do have a question about adding the batch delete function for spanner, which seems a bit out of scope for this PR.

syncstorage-mysql/src/batch.rs Outdated Show resolved Hide resolved
syncstorage-spanner/src/batch.rs Outdated Show resolved Hide resolved
jrconlin added a commit that referenced this pull request Jan 8, 2025
This work is based on @tommie's commit in #1644 and updates to
diesel 2.2.

Closes SYNC-4568
@tommie
Copy link
Author

tommie commented Jan 8, 2025

I've reverted the batch delete change.

The only question remaining is about the RwLock error handling.

- The RwLock::write errors occur if there is a risk of deadlock.
- The timestamp SQL statement might fail if there is a database connection drop.
- The borrowing of the connection for calc_quota_usage_sync should never fail.
Copy link
Member

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Sorry that I missed the bit about using a Uatomic32, but I think we should probably address an old bug.

I don't know if you saw my earlier comment about signed commits. Unfortunately I can't land this PR as is because there are unsigned commits.

That said, I am wondering if we might want to break this into two different PRs (with signed commits). One that does the diesel 2.x update, and a follow up that does the libmaria update? That might make things a bit cleaner between the two and spotting key differences might be a little easier, since folk won't have to context switch.

Seriously, thank you for all of this work. If you think that breaking this apart is too big of an ask, let me know and I can do that for you (with proper credit, of course).

syncstorage-mysql/src/batch.rs Outdated Show resolved Hide resolved
syncstorage-spanner/src/batch.rs Outdated Show resolved Hide resolved
let batch_id = db.timestamp().as_i64() + (user_id % 10);
// We mix in a per-process counter to make batch IDs (more) unique within
// a timestamp.
let batch_id = db.timestamp().as_i64() + COUNTER.fetch_add(1, Ordering::Relaxed) as i64 % 10;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure that this has enough entropy. For single server machines, it's fine, since the Atomic is scoped, but for multi-node servers, there's a reasonable chance for collision. (Granted, it's technically true with the UserID too, but since that's externally generated, there's a bit less likelihood, since fxa_uids are randomized to begin with. I'd be equally fine with just appending a 2 or 4 digit rand if we want to not rely on external values.
(FWIW, this is fairly old code, so a lot of learning happened since.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For single server machines, it's fine, since the Atomic is scoped, but for multi-node servers, there's a reasonable chance for collision.

AFAICT, batch ID uniqueness only matters within a user-ID. So to me it seems this change is a strict improvement even in the multi-node sense.

We can't use user-ID since that doesn't help against conflicts in (user_id, batch_id). The problem is the E2E tests are now running faster, for some reason. They generate more than one batch within 10 ms, creating sporadic E2E test failures. The Spanner backend uses UUIDv4, so perhaps this should too. (I have a comment earlier in the PR that I marked as outdated, where I was keeping a journal of my struggles.)

Randomization would still leave the E2E tests potentially flaky. Adding both user_id%10 and the counter would work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I didn't know that was the bug you were trying to address. (Sorry, just figured that you were trying to optimize code. FWIW, for some changes like this one, we've left comments in the PR to describe why we're introducing the change. It can help clarify things and help prevent folk from flagging or questioning some changes.)

@tommie
Copy link
Author

tommie commented Jan 9, 2025

I don't know if you saw my earlier comment about signed commits. Unfortunately I can't land this PR as is because there are unsigned commits.

No, but I still don't see any previous comment about signed commits. :\

CONTRIBUTING says "Sign your git commit." I'm guessing you mean a Signed-Off, and not an actual digital signature?

(Since some projects do squashing, and hate it when I force-push in the middle of a review, I tend to avoid rewriting history during reviews. :)

One that does the diesel 2.x update, and a follow up that does the libmaria update?

I've already split it twice already because it became too big, so sure. I saw you had a branch that referenced this PR. I can continue if you like the direction.

@jrconlin
Copy link
Member

jrconlin commented Jan 9, 2025

Huh, weird.
Or maybe I spaced and only commented on the syncstorage-rs PR that also has unsigned commits. My apologies.

@tommie
Copy link
Author

tommie commented Jan 9, 2025

Made #1647 for Diesel 2 (since my branch for this PR is called mariadb.)

@tommie
Copy link
Author

tommie commented Jan 9, 2025

BTW, the libmariadb change is basically just the Dockerfile change. Hence why I didn't think it was worth splitting from the start.

@jrconlin
Copy link
Member

jrconlin commented Jan 9, 2025

Yeah, but it's good to have separation around these things, regardless of the scope. This can be useful to untangle future problems without doing a lot of heavy cherry picking. Basically, it's a good habit to develop (and one I keep having to remind myself of.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants