Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid index lookup for a NULL key if the condition is known to always be FALSE in this case #8278

Open
AnatolyLoy opened this issue Oct 8, 2024 · 3 comments

Comments

@AnatolyLoy
Copy link

The case contains three linked tables (see script below).
The simple Query No1 and tricked Query No2 produced the same result, but Query No1 slowed it.

It looks like Firebird compares the NULL value of T1.T1_ID with every index value and spend resources...

CREATE TABLE T1 (T1_ID INTEGER NOT NULL, PRIMARY KEY (T1_ID));
CREATE TABLE T2 (T2_ID INTEGER NOT NULL, T1_ID INTEGER, PRIMARY KEY (T2_ID));
ALTER TABLE T2 ADD FOREIGN KEY (T1_ID) REFERENCES T1 (T1_ID);
CREATE TABLE T3 (T3_ID INTEGER NOT NULL, T1_ID INTEGER, PRIMARY KEY (T3_ID));
ALTER TABLE T3 ADD FOREIGN KEY (T1_ID) REFERENCES T1 (T1_ID);
-- Fill data:
-- T1 - 1 000 recorsa
-- T2 - 10 000 records, 1 000 of them referred to T1
-- T3 - 1 000 000 records, 1 000 of them referred to T1
EXECUTE BLOCK
AS
DECLARE VARIABLE L_T1_ID INTEGER;
DECLARE VARIABLE L_T2_ID INTEGER;
DECLARE VARIABLE L_T3_ID INTEGER;
BEGIN
L_T3_ID = 1;
WHILE (L_T3_ID <= 1000000) DO
BEGIN
L_T1_ID = IIF(MOD(L_T3_ID, 1000) = 0, TRUNC(L_T3_ID/1000), NULL);
L_T2_ID = IIF(MOD(L_T3_ID, 100) = 0, TRUNC(L_T3_ID/100), NULL);
IF (L_T1_ID IS NOT NULL) THEN
INSERT INTO T1 (T1_ID) VALUES (:L_T1_ID);
INSERT INTO T3 (T3_ID, T1_ID) VALUES (:L_T3_ID, :L_T1_ID);
IF (L_T2_ID IS NOT NULL) THEN
INSERT INTO T2 (T2_ID, T1_ID) VALUES (:L_T2_ID, :L_T1_ID);
L_T3_ID = L_T3_ID + 1;
END
END;

-- Query No1: Simple query and bad performance
SELECT count(*)
FROM t2
LEFT OUTER JOIN T1 ON T1.T1_ID = T2.T1_ID
LEFT OUTER JOIN T3 ON T3.T1_ID = T1.T1_ID

Plan
PLAN JOIN (JOIN (T2 NATURAL, T1 INDEX (RDB$PRIMARY294)), T3 INDEX (RDB$FOREIGN298))

Adapted Plan
PLAN JOIN (JOIN (T2 NATURAL, T1 INDEX (INTEG_2830)), T3 INDEX (INTEG_2836))

------ Performance info ------
Prepare time = 0ms
Execute time = 44s 797ms
Avg fetch time = 44,797.00 ms
Current memory = 40,174,016
Max memory = 74,196,768
Memory buffers = 2,048
Reads from disk to cache = 1,251
Writes from cache to disk = 0
Fetches from cache = 4,059,140

-- Query No2: Trick query and expected performance
SELECT count(*)
FROM T2
LEFT OUTER JOIN T1 on T1.T1_ID = T2.T1_ID
LEFT OUTER JOIN T3 on T3.T1_ID = COALESCE(T1.T1_ID, -1)

Plan
PLAN JOIN (JOIN (T2 NATURAL, T1 INDEX (RDB$PRIMARY294)), T3 INDEX (RDB$FOREIGN298))

Adapted Plan
PLAN JOIN (JOIN (T2 NATURAL, T1 INDEX (INTEG_2830)), T3 INDEX (INTEG_2836))

------ Performance info ------
Prepare time = 0ms
Execute time = 31ms
Avg fetch time = 31.00 ms
Current memory = 37,804,784
Max memory = 38,739,168
Memory buffers = 2,048
Reads from disk to cache = 1,042
Writes from cache to disk = 0
Fetches from cache = 63,145

@AnatolyLoy AnatolyLoy changed the title Bad performance - Second LEFT JOIN try to compare NULL value with all index values Firebird 5.0 - Bad performance - Second LEFT JOIN try to compare NULL value with all index values Oct 8, 2024
@livius2
Copy link

livius2 commented Oct 15, 2024

This is a typical case, and I suppose the NULL comparison should be removed by the engine here, as this is a simple scenario. In more complex cases, however, it might be harder to detect such situations. Comparisons with NULL should be skipped, as they cannot return any rows. As a result, scanning the index for a NULL value is unnecessary.

@AnatolyLoy
Copy link
Author

This is a typical case, and I suppose the NULL comparison should be removed by the engine here, as this is a simple scenario. In more complex cases, however, it might be harder to detect such situations. Comparisons with NULL should be skipped, as they cannot return any rows. As a result, scanning the index for a NULL value is unnecessary.

Exactly. It's unfortunate.
The simple query for the case mentioned above produces the same result -
The OUTER JOIN compares NULLs 9,000 times with all 1,000,000 index values...
So, it's a spend server resource whenever you try to left-join the non-mandatory column with many NULL values,
Just a full index scan... Every comparison. Brrrrrrr.....

SELECT COUNT(*)
FROM T2 -- 10 000 records
LEFT OUTER JOIN T3 ON T3.T1_ID = NULL -- * 1 000 000 index scans

Plan

PLAN JOIN (T2 NATURAL, T3 INDEX (RDB$FOREIGN300))

Adapted Plan

PLAN JOIN (T2 NATURAL, T3 INDEX (INTEG_2846))

------ Performance info ------
Prepare time = 15ms
Execute time = 49s 625ms
Avg fetch time = 49,625.00 ms
Current memory = 56,801,616
Max memory = 57,807,920
Memory buffers = 3,000
Reads from disk to cache = 0
Writes from cache to disk = 0
Fetches from cache = 4,480,148

@AnatolyLoy AnatolyLoy changed the title Firebird 5.0 - Bad performance - Second LEFT JOIN try to compare NULL value with all index values Firebird 5.0 and 2.5 - Bad performance - LEFT JOIN try to compare NULL value with all index values Oct 20, 2024
@AnatolyLoy AnatolyLoy changed the title Firebird 5.0 and 2.5 - Bad performance - LEFT JOIN try to compare NULL value with all index values Firebird 5.0 and 2.5 - Bad performance - LEFT/INNER JOIN try to compare NULL value with all index values Oct 20, 2024
@AnatolyLoy
Copy link
Author

The Firebird 2.5 makes the same result.
The INNER JOIN makes the same result.
SELECT COUNT(*)
FROM T2 -- 10 000 records
JOIN T3 ON T3.T1_ID = NULL -- * 1 000 000 index scans

Even do not try to run

SELECT COUNT(*)
FROM T2 -- * 10 000 index scans
JOIN T3 ON T3.T1_ID = T2.T1_ID -- 1 000 000 records

The server has chosen the T3 -> T2 plan. and we will have about 1 000 000 attempts to make scans for 1000 index values :(

So, it looks like a non-optimal execution in IndexTableScan...

@dyemanov dyemanov self-assigned this Oct 21, 2024
@dyemanov dyemanov changed the title Firebird 5.0 and 2.5 - Bad performance - LEFT/INNER JOIN try to compare NULL value with all index values Avoid index lookup for a NULL key if the condition is known to always be FALSE in this case Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants