Optimize Regex #64329
Is there an API that can optimize/minify the string representation of a regex? Tagging @stephentoub because he seems to be doing a ton of work on the regex source generator.
Replies: 1 comment 1 reply
That's not a valid transformation, as it changes the number of capture groups and what each capture group would capture.

Now, if your question was instead about `aaaaaa1|aaaaaa2|aaaaaa3`, then .NET 7 will optimize that to the equivalent of `aaaaaa[123]`, which you can see by looking at the code generated by the source generator:

    // Description:
    // ○ Match the string "aaaaaa".
    // ○ Match a character in the set [1-3].
    protected override bool FindFirstChar()
    {
        int pos = base.runtextpos, end = base.runtextend;
        global::System.ReadOnlySpan<char> inputSpan = base.runtext;
        if (pos < end - 6)
        {
            int i = global::System.MemoryExtensions.IndexOf(inputSpan.Slice(pos, end - pos), "aaaaaa");
            if (i >= 0)
            {
                base.runtextpos = pos + i;
                return true;
            }
        }
        // No starting position found
        NoStartingPositionFound:
        base.runtextpos = end;
        return false;
    }

    [global::System.Runtime.CompilerServices.SkipLocalsInit]
    protected override void Go()
    {
        global::System.ReadOnlySpan<char> inputSpan = base.runtext;
        int pos = base.runtextpos, end = base.runtextend;
        int original_pos = pos;
        global::System.ReadOnlySpan<byte> byteSpan;
        global::System.ReadOnlySpan<char> slice = inputSpan.Slice(pos, end - pos);
        if ((uint)slice.Length < 7)
        {
            goto NoMatch;
        }
        // Match the string "aaaaaa".
        {
            byteSpan = global::System.Runtime.InteropServices.MemoryMarshal.AsBytes(slice);
            if (global::System.Buffers.Binary.BinaryPrimitives.ReadUInt64LittleEndian(byteSpan) != 0x61006100610061ul ||
                global::System.Buffers.Binary.BinaryPrimitives.ReadUInt32LittleEndian(byteSpan.Slice(8)) != 0x610061u)
            {
                goto NoMatch;
            }
        }
        if ((((uint)slice[6]) - '1' > (uint)('3' - '1'))) // Match a character in the set [1-3].
        {
            goto NoMatch;
        }
        // The input matched.
        pos += 7;
        base.runtextpos = pos;
        base.Capture(0, original_pos, pos);
        return;
        // The input didn't match.
        NoMatch:;
    }

but there's no API nor plans for an API to round-trip that back to a valid string representation.
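To check the equivalence from the outside, here is a minimal sketch that compares `aaaaaa1|aaaaaa2|aaaaaa3` against `aaaaaa[123]` on a few inputs using the public `Regex` API. It does not inspect the internal optimization; it only shows that the two patterns report the same match. The class and method names (`RegexEquivalenceSketch`, `SameFirstMatch`, `GroupCount`, `IsOneToThree`) are hypothetical, and the `(a1)|(a2)` versus `a[12]` pair is an illustrative assumption rather than the pattern from the original question. It is only meant to show how a rewrite can change the number of capture groups.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexEquivalenceSketch
{
    // The two patterns named in the discussion.
    public const string Alternation = "aaaaaa1|aaaaaa2|aaaaaa3";
    public const string Factored = "aaaaaa[123]";

    // True when both patterns agree on success, position, and matched text
    // for the first match in the input.
    public static bool SameFirstMatch(string input)
    {
        Match a = Regex.Match(input, Alternation);
        Match b = Regex.Match(input, Factored);
        return a.Success == b.Success
            && a.Index == b.Index
            && a.Value == b.Value;
    }

    // Number of groups a pattern defines, including the implicit group 0.
    public static int GroupCount(string pattern) =>
        new Regex(pattern).GetGroupNumbers().Length;

    // The same style of unsigned range check that the generated Go() method
    // uses for the set [1-3]: one comparison instead of two.
    public static bool IsOneToThree(char c) =>
        unchecked((uint)(c - '1')) <= (uint)('3' - '1');

    public static void Main()
    {
        foreach (string input in new[] { "aaaaaa1", "xxaaaaaa3yy", "aaaaaa4", "aaaaa1" })
        {
            Console.WriteLine($"{input}: same result = {SameFirstMatch(input)}");
        }

        // Hypothetical pair: collapsing capturing alternatives into a set
        // drops capture groups, so it is not a behavior-preserving rewrite.
        Console.WriteLine(GroupCount("(a1)|(a2)")); // groups 0, 1, 2
        Console.WriteLine(GroupCount("a[12]"));     // group 0 only
    }
}
```

Because neither discussed pattern contains capturing groups, folding the alternation into a character class cannot change what any group captures, which is what distinguishes it from a rewrite that changes the group structure.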