Improve ucaps.h `_find_upper()` and `_find_lower()` performance #93360

aaronp64 · 2024-06-19T17:01:34Z

Updated `_find_upper()` and `_find_lower()` to use a hashmap-like lookup instead of binary search. This reduces the time of functions like `String::findn()`, `String::to_upper()`, and `String::to_lower()` by around 50-80%, depending on the strings used.
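For readers unfamiliar with the idea, here is a minimal sketch of a "hashmap-like" lookup built at compile time. It is not the code from this PR: the five-entry `LOWER_TO_UPPER` table, the `slot_for()` mask hash, and the names `LookupTable`, `build_table()`, and `find_upper()` are all assumptions made for illustration. It only shows how an open-addressing table with at most half its slots filled can replace a binary search over sorted pairs.

```cpp
#include <cstddef>

// Hypothetical miniature mapping table; the real tables in ucaps.h are much larger.
struct CaseEntry {
	char32_t from;
	char32_t to;
};

constexpr CaseEntry LOWER_TO_UPPER[] = {
	{ U'a', U'A' },
	{ U'b', U'B' },
	{ U'c', U'C' },
	{ 0x00E9, 0x00C9 }, // e with acute accent
	{ 0x03B1, 0x0391 }, // Greek alpha
};
constexpr std::size_t ENTRY_COUNT = sizeof(LOWER_TO_UPPER) / sizeof(LOWER_TO_UPPER[0]);

// Smallest power of two that is at least twice the entry count,
// so the table is never more than half full.
constexpr std::size_t table_size(std::size_t count) {
	std::size_t size = 1;
	while (size < count * 2) {
		size *= 2;
	}
	return size;
}

constexpr std::size_t SLOT_COUNT = table_size(ENTRY_COUNT);

struct LookupTable {
	CaseEntry slots[SLOT_COUNT] = {}; // from == 0 marks an empty slot
};

// Assumed hash: the low bits of the code point. A power-of-two size makes this a mask.
constexpr std::size_t slot_for(char32_t c) {
	return static_cast<std::size_t>(c) & (SLOT_COUNT - 1);
}

// Fill the table with linear probing; evaluated once at compile time.
constexpr LookupTable build_table() {
	LookupTable table{};
	for (std::size_t i = 0; i < ENTRY_COUNT; i++) {
		std::size_t slot = slot_for(LOWER_TO_UPPER[i].from);
		while (table.slots[slot].from != 0) {
			slot = (slot + 1) & (SLOT_COUNT - 1);
		}
		table.slots[slot] = LOWER_TO_UPPER[i];
	}
	return table;
}

constexpr LookupTable UPPER_TABLE = build_table();

// Probe from the hashed slot until the key or an empty slot is found.
constexpr char32_t find_upper(char32_t c) {
	std::size_t slot = slot_for(c);
	while (UPPER_TABLE.slots[slot].from != 0) {
		if (UPPER_TABLE.slots[slot].from == c) {
			return UPPER_TABLE.slots[slot].to;
		}
		slot = (slot + 1) & (SLOT_COUNT - 1);
	}
	return c; // no mapping: the character is returned unchanged
}
```

With a well-spread hash, most lookups end after one or two probes, whereas a binary search over N sorted pairs needs about log2(N) comparisons per character.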

Benchmarked using the GDScript below:

```gdscript
extends Node2D

var text := "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."

func _ready() -> void:
	test_findn()
	test_to_lower()
	test_to_upper()

func test_findn():
	var ms_start = Time.get_ticks_msec()
	var to_find := "abc"
	for i in 1000000:
		text.findn(to_find)
	var ms_end = Time.get_ticks_msec()
	print("findn: " + str(ms_end - ms_start) + "ms")

func test_to_lower():
	var ms_start = Time.get_ticks_msec()
	for i in 1000000:
		text.to_lower()
	var ms_end = Time.get_ticks_msec()
	print("to_lower: " + str(ms_end - ms_start) + "ms")

func test_to_upper():
	var ms_start = Time.get_ticks_msec()
	for i in 1000000:
		text.to_upper()
	var ms_end = Time.get_ticks_msec()
	print("to_upper: " + str(ms_end - ms_start) + "ms")
```

Old:

findn: 13111ms
to_lower: 6793ms
to_upper: 5231ms

New:

findn: 2428ms
to_lower: 1435ms
to_upper: 2241ms

AThousandShips · 2024-06-19T17:04:48Z

See also:

[Core] Fix and optimize binary search #90036


MewPurPur · 2024-06-20T07:08:38Z

core/string/ucaps.h

+	size_t size = 1;
+	while (size < count * 2) {
+		size *= 2;
+	}


get_nearest_po2() * 2 for more speed? òwó

aaronp64 · 2024-06-20

`next_power_of_2() * 2` should work here, but it would require changing `next_power_of_2()` to be `constexpr`. The size is only calculated once, at compile time, for each caps_table array, so speed shouldn't be an issue.
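To illustrate why the two approaches are interchangeable, here is a small sketch comparing the loop from the diff with a `constexpr` power-of-two helper. The helper `next_po2_constexpr()` is a hypothetical stand-in written for this example, not the engine's `next_power_of_2()`, and `loop_size()` simply wraps the loop quoted above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constexpr helper: smallest power of two >= x (returns 0 for 0).
constexpr std::uint32_t next_po2_constexpr(std::uint32_t x) {
	if (x == 0) {
		return 0;
	}
	--x;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

// The sizing loop from the diff, wrapped in a function for comparison.
constexpr std::size_t loop_size(std::size_t count) {
	std::size_t size = 1;
	while (size < count * 2) {
		size *= 2;
	}
	return size;
}

// Check that both give the same size for every count in [1, limit].
constexpr bool sizes_match_up_to(std::uint32_t limit) {
	for (std::uint32_t count = 1; count <= limit; count++) {
		if (loop_size(count) != static_cast<std::size_t>(next_po2_constexpr(count)) * 2) {
			return false;
		}
	}
	return true;
}

// Evaluated by the compiler, so neither version costs anything at run time.
static_assert(sizes_match_up_to(1000), "loop and helper disagree");
```

The two differ only for a count of 0, where the loop yields 1 and this helper yields 0; that case would not arise for a non-empty table.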

Ivorforce · 2025-01-05T11:37:02Z

Just to state it here too: I have opened #99971, which yields better performance benefits for Latin texts.
That's not to say there's no value in this PR! #99971 only optimizes Latin (ASCII+) texts. Texts with non-Latin content, such as Chinese, would still benefit from this PR. But we should be aware of this caveat when considering merging this :)
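As a rough illustration of the trade-off described above, a fast path for low code points might look like the sketch below. This is not taken from #99971: the function names are hypothetical, the fast path covers only plain ASCII letters, and `find_upper_slow()` is a placeholder for whatever general lookup (binary search or hash table) handles everything else.

```cpp
// Placeholder for the general table lookup; hypothetical and deliberately tiny.
inline char32_t find_upper_slow(char32_t c) {
	return c == 0x00E9 ? char32_t(0x00C9) : c; // only knows one mapping in this sketch
}

// Assumed fast path: handle ASCII with arithmetic, defer everything else.
inline char32_t find_upper_fast(char32_t c) {
	if (c < 0x80) {
		if (c >= U'a' && c <= U'z') {
			return static_cast<char32_t>(c - 0x20);
		}
		return c; // ASCII digits, punctuation, and uppercase letters are unchanged
	}
	return find_upper_slow(c);
}
```

Text that is mostly ASCII rarely reaches the slow path, so the speed of the general lookup matters mainly for text outside that range.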

aaronp64 requested a review from a team as a code owner June 19, 2024 17:01

AThousandShips added enhancement topic:core performance labels Jun 19, 2024

AThousandShips added this to the 4.x milestone Jun 19, 2024

aaronp64 force-pushed the ucaps_lookup branch from 35f12b6 to 640d08e Compare June 19, 2024 17:07

MewPurPur reviewed Jun 20, 2024


MewPurPur mentioned this pull request Dec 3, 2024

Optimize String _find_upper and _find_lower by handling low-bit characters (including normal latin) explicitly. #99971

Open
