There are several performance tests for each of the functions.
The performance tests are against a dataset of a million rows, generated from sys.all_objects.
These are the performance test results from my development machine, a Windows 7 desktop with a 4-core Skylake CPU and 32GB of RAM.
Each test is run 10 times, and you're seeing the average runtime. You can also look at the raw data if you're interested.
Test Category | Simple | Complex | Null |
---|---|---|---|
Baseline | 0.49 | 0.62 | N/A |
Match | 9.01 | 12.57 | 5.33 |
GroupMatch | 17.54 | 20.79 | |
Matches | 28.09 | 22.14 | |
Replace | 22.96 | 21.10 | |
Split | 28.00 | 23.84 |
Analysis
The SQL CLR functions are slow. Even calling the simplest function with a null value, RegexMatch(), takes ~10 times longer than the baseline query, 5 seconds vs .5 seconds. Relatively simple regular expressions are slow.
I noticed that one of the cores on my computer was consistently pegged at 100%, suggesting that the limitation was CPU. That makes sense; both the CLR and regular expressions are relatively CPU heavy.
Update, 2017-01-21 - Using static methods in the Regex() class boosts performance, up to 35% in some cases.
There are 6 perf test scripts, one per function, plus a 'baseline' script.
Test Setup
To run these tests, you'll need to have sql-server-regex installed onto a SQL Server database.
Running a Test
- Open a SQL file for a test (for example, 'perf-test-match.sql')
- The top of the script has the runtime on my development machine (a desktop with a 4-core Skylake CPU and 16GB RAM).
- Run the script
- Each test should run 10 times, and return the runtime of each.
- If any of the tests error, then something is broken/failing.