Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Integrate the bugfixes of Redis 7.2.5 #567

Closed

Conversation

SoulPancake
Copy link

@SoulPancake
Copy link
Author

Can you take a look here @madolson
Since unstable already had all these changes, I decided to accept those during the conflicts while basing of my branch from 7.2

@hwware
Copy link
Member

hwware commented May 28, 2024

@SoulPancake DCO missed, pls sign off

@SoulPancake SoulPancake force-pushed the ab/integrate-bugfixes-7.2 branch from b8dda13 to 443f771 Compare May 28, 2024 18:51
@SoulPancake
Copy link
Author

My bad, done @hwware

@PingXie
Copy link
Member

PingXie commented May 28, 2024

The size of this PR doesn't look right and cluster_legacy.c doesn't exist on 7.2. Did you bring over the unstable changes?

@SoulPancake
Copy link
Author

Yes I think so
So when I was cherry picking b3aaa0a
I accepted unstable changes, as I could see the entire changes there from the bugfix commits
@PingXie

image

@hwware
Copy link
Member

hwware commented May 28, 2024

I think you need to update some test cases executable file name, such as valkey-check-aof etc

@@ -155,24 +119,24 @@ tags {"aof external:skip"} {
}

catch {
exec src/valkey-check-aof $aof_manifest_file
exec src/redis-check-aof $aof_manifest_file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update to valkey-check-aof, otherwise test case fails

@SoulPancake SoulPancake force-pushed the ab/integrate-bugfixes-7.2 branch from 443f771 to 4887bb4 Compare May 29, 2024 04:20
bentotten and others added 7 commits May 29, 2024 10:18
… FAIL instead of PFAIL (#12824)

Fixes issue where a single primary cannot mark a replica as failed in a
single-shard cluster.

Signed-off-by: Anuragkillswitch <[email protected]>
… 4GB (#12955)

Fix #12864

The main reason for this crash is that when replacing a element of a
quicklist packed node with lpReplace() method,
if the final size is larger than 4GB, lpReplace() will fail and returns
NULL, causing `node->entry` to be incorrectly set to NULL.

Since the inserted data is not a large element, we can't just replace it
like a large element, first quicklistInsertAfter()
and then quicklistDelIndex(), because the current node may be merged and
invalidated in quicklistInsertAfter().

The solution of this PR:
When replacing a node fails (listpack exceeds 4GB), split the current
node, create a new node to put in the middle, and try to merge them.
This is the same as inserting a large element.
In the worst case, its size will not exceed 4GB.

Signed-off-by: Anuragkillswitch <[email protected]>
…#13004)

In #11012, we will reprocess command when client is unblocked on keys,
in some blocking commands, for example, in the XREADGROUP BLOCK
scenario,
because of the re-processing command, we will recalculate the block
timeout,
causing the blocking time to be reset.

This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly
let the command know that it is being re-processed, later in
blockForKeys
we will not reset the timeout.

Affected BLOCK cases:
- list / zset / stream, added test cases for each.

Unaffected cases:
- module (never re-process the commands).
- WAIT / WAITAOF (never re-process the commands).

Fixes #12998.

Signed-off-by: Anuragkillswitch <[email protected]>
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).

Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.

```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```

Switch to using ll2string if the number can be safely represented as a
long long.

The problem was introduced in #10587 (Redis 7.2).
closes #13113.

---------

Co-authored-by: Binbin <[email protected]>
Co-authored-by: debing.sun <[email protected]>
Co-authored-by: Oran Agra <[email protected]>
Signed-off-by: Anuragkillswitch <[email protected]>
… --pattern was also used (#13092)

The --count option for redis-cli has been released in redis 7.2.
redis/redis#12042
But I have found in code, that some logic was missing for using this
'count' option.

```
static redisReply *sendScan(unsigned long long *it) {
    redisReply *reply;

    if (config.pattern)
        reply = redisCommand(context, "SCAN %llu MATCH %b COUNT %d",
            *it, config.pattern, sdslen(config.pattern), config.count);
    else
        reply = redisCommand(context,"SCAN %llu",*it);
```

The intention was being able to using scan count.
But in this case, the --count will be only applied when 'pattern' is
declared.
So, I had fix it simply, to be worked properly - even if --pattern
option is not being used.

I tested it simply with time() command several times, and I could see it
works as intended with this commit.
The examples of test results are below:
```
# unstable build

time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null)

real    0m1.287s
user    0m0.011s
sys     0m0.022s

# count is not applied
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null)

real    0m1.117s
user    0m0.011s
sys     0m0.020s

# count is applied with --pattern

time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:*" >/dev/null 2>/dev/null)

real    0m0.045s
user    0m0.002s
sys     0m0.002s
```

```
# fix-redis-cli-scan-count build
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null)

real    0m1.084s
user    0m0.008s
sys     0m0.024s

# count is applied even if --pattern is not declared
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null)

real    0m0.043s
user    0m0.000s
sys     0m0.004s

# of course this also applied
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:*" >/dev/null 2>/dev/null)

real    0m0.031s
user    0m0.002s
sys     0m0.002s
```

Thanks a lot.

Signed-off-by: Anuragkillswitch <[email protected]>
…s MP-AOF (#12958)

The check in fileIsManifest misjudged the manifest file. For example,
if resp aof contains "file", it will be considered a manifest file and
the check will fail:
```
*3
$3
set
$4
file
$4
file
```

In #12951, if the preamble aof also contains it, it will also fail.
Fixes #12951.

the bug was happening if the the word "file" is mentioned
in the first 1024 lines of the AOF. and now as soon as it finds
a non-comment line it'll break (if it contains "file" or doesn't)

Signed-off-by: Anuragkillswitch <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants