-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to Diesel 2.x and enable using Debian's libmariadb #1644
base: master
Are you sure you want to change the base?
Conversation
296419d
to
a08b465
Compare
This comment was marked as outdated.
This comment was marked as outdated.
3575d68
to
b684206
Compare
See https://diesel.rs/guides/migration_guide.html. - This pulls in mysqlclient-sys >=0.4.0, which allows using libmariadb instead of libmysqlclient. That allows using Debian's bundled client library. MySQL's APT repository only has packages for AMD64. - Connections are now passed as mut refs, which requires RefCell/RwLock. - Embedded migrations had a slight API change. - put_bso_sync was calling update_collection without its transaction, which now causes a deadlock. Updated to write inside its transaction.
The global functions were replaced with Server in 0.32.0 and the Cargo.toml has 1.4 listed (probably due to a mistake when pinning Diesel at 1.4.)
Cargo audit: Crate: idna Version: 0.5.0 Title: `idna` accepts Punycode labels that do not produce any non-ASCII when decoded Date: 2024-12-09 ID: RUSTSEC-2024-0421 URL: https://rustsec.org/advisories/RUSTSEC-2024-0421 Solution: Upgrade to >=1.0.0
This comment was marked as resolved.
This comment was marked as resolved.
Otherwise E2E tests that hammer user IDs may see prior data.
Otherwise, tests that hammer user IDs may see old data.
There is no other way for the timestamp to increase in production.
Since the batch_uploads primary key already contains user_id, it does nothing to increase time resolution. The counter may still see conflicts with other processes, but the probably is much lower now. (Spanner uses UUIDv4 for batch IDs.) The user_id hack was added in e9b455f "to match the Python implementation". E2E tests are failing because we can now generate more than one batch within 10 ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank You for doing the leg work on updating diesel! Things look pretty good. I do have a question about adding the batch delete function for spanner, which seems a bit out of scope for this PR.
I've reverted the batch delete change. The only question remaining is about the RwLock error handling. |
- The RwLock::write errors occur if there is a risk of deadlock. - The timestamp SQL statement might fail if there is a database connection drop. - The borrowing of the connection for calc_quota_usage_sync should never fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Sorry that I missed the bit about using a Uatomic32
, but I think we should probably address an old bug.
I don't know if you saw my earlier comment about signed commits. Unfortunately I can't land this PR as is because there are unsigned commits.
That said, I am wondering if we might want to break this into two different PRs (with signed commits). One that does the diesel 2.x
update, and a follow up that does the libmaria update? That might make things a bit cleaner between the two and spotting key differences might be a little easier, since folk won't have to context switch.
Seriously, thank you for all of this work. If you think that breaking this apart is too big of an ask, let me know and I can do that for you (with proper credit, of course).
let batch_id = db.timestamp().as_i64() + (user_id % 10); | ||
// We mix in a per-process counter to make batch IDs (more) unique within | ||
// a timestamp. | ||
let batch_id = db.timestamp().as_i64() + COUNTER.fetch_add(1, Ordering::Relaxed) as i64 % 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure that this has enough entropy. For single server machines, it's fine, since the Atomic is scoped, but for multi-node servers, there's a reasonable chance for collision. (Granted, it's technically true with the UserID too, but since that's externally generated, there's a bit less likelihood, since fxa_uids are randomized to begin with. I'd be equally fine with just appending a 2 or 4 digit rand if we want to not rely on external values.
(FWIW, this is fairly old code, so a lot of learning happened since.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For single server machines, it's fine, since the Atomic is scoped, but for multi-node servers, there's a reasonable chance for collision.
AFAICT, batch ID uniqueness only matters within a user-ID. So to me it seems this change is a strict improvement even in the multi-node sense.
We can't use user-ID since that doesn't help against conflicts in (user_id, batch_id)
. The problem is the E2E tests are now running faster, for some reason. They generate more than one batch within 10 ms, creating sporadic E2E test failures. The Spanner backend uses UUIDv4, so perhaps this should too. (I have a comment earlier in the PR that I marked as outdated, where I was keeping a journal of my struggles.)
Randomization would still leave the E2E tests potentially flaky. Adding both user_id%10
and the counter would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! I didn't know that was the bug you were trying to address. (Sorry, just figured that you were trying to optimize code. FWIW, for some changes like this one, we've left comments in the PR to describe why we're introducing the change. It can help clarify things and help prevent folk from flagging or questioning some changes.)
No, but I still don't see any previous comment about signed commits. :\ CONTRIBUTING says "Sign your git commit." I'm guessing you mean a Signed-Off, and not an actual digital signature? (Since some projects do squashing, and hate it when I force-push in the middle of a review, I tend to avoid rewriting history during reviews. :)
I've already split it twice already because it became too big, so sure. I saw you had a branch that referenced this PR. I can continue if you like the direction. |
Huh, weird. |
Made #1647 for Diesel 2 (since my branch for this PR is called mariadb.) |
BTW, the libmariadb change is basically just the Dockerfile change. Hence why I didn't think it was worth splitting from the start. |
Yeah, but it's good to have separation around these things, regardless of the scope. This can be useful to untangle future problems without doing a lot of heavy cherry picking. Basically, it's a good habit to develop (and one I keep having to remind myself of.) |
Description
The end-goal of this PR is to be able to build for ARM64. The MySQL repo only contains client packages for AMD64.
However, this required an upgrade of Diesel to the latest, in order to have mysqlclient-sys find libmariadb.
That, in turn, required adding
RwLock
to Diesel connections.Unfortunately, it also affects
MigrationError
, which has now become adyn StdError
.This uncovered some bugs in the MySQL backend, e.g. there can be at most one batch ID per user every 10 ms. It seems Diesel 2 is faster (and libmysqlclient is faster than libmariadb), causing the E2E batch tests to generate batches faster than before. Nothing ever updated the session timestamp in the backend, which suggests there could only really be one batch per user.
When running E2E locally, batches were piling up in the database, because the fixture didn't remove batch_uploads and batch_upload_items. This required implementing in syncserver.
There was a note in Cargo.toml about diesel_logging not working, but this seems to have been fixed in 0.4.0: https://github.com/shssoichiro/diesel-logger/blob/a3a7df54823a3a0c922bf8d9b636045f079cecfe/CHANGELOG.md
syncstorage-rs switched to libmysqlclient in #1435 because libmariadb didn't default to TLS. In a Docker scenario, using TLS seems meaningless. The default remains libmysqlclient.
Testing
Existing integration tests.
There's a new build arg MYSQLCLIENTPKG that can select between libmariadb-dev and libmysqlclient-dev.
Issue(s)
A super-set of #1635