Unicode Matching Indices are potentially incorrect #15

Open
MinusGix opened this issue Jun 18, 2022 · 1 comment

Comments

@MinusGix

use fuzzy_matcher::skim::SkimMatcherV2;
use fuzzy_matcher::FuzzyMatcher;

fn main() {
    let matcher = SkimMatcherV2::default();

    let text = " üäö ";
    // [(0, ' '), (1, 'ü'), (3, 'ä'), (5, 'ö'), (7, ' ')]
    println!("{:?}", text.char_indices().collect::<Vec<_>>());
    println!("{:?}", matcher.fuzzy_indices(text, "ü")); // -> 1 (good)
    println!("{:?}", matcher.fuzzy_indices(text, "ä")); // -> 2 (bad, byte offset is 3)
    println!("{:?}", matcher.fuzzy_indices(text, "ö")); // -> 3 (bad, byte offset is 5)

    let text = "2  üäö ";
    // [(0, '2'), (1, ' '), (2, ' '), (3, 'ü'), (5, 'ä'), (7, 'ö'), (9, ' ')]
    println!("{:?}", text.char_indices().collect::<Vec<_>>());
    println!("{:?}", matcher.fuzzy_indices(text, "ü")); // -> 3 (good)
    println!("{:?}", matcher.fuzzy_indices(text, "ä")); // -> 4 (bad, byte offset is 5)
    println!("{:?}", matcher.fuzzy_indices(text, "ö")); // -> 5 (bad, byte offset is 7)
}

I would have expected the returned indices to be byte offsets into the string (the first element of each char_indices() pair), but they appear to be positions in the chars() iterator, i.e. character counts.
Is this intended behavior (in which case it should probably be documented), or is a + ch.len_utf8() missing where the index into the string is incremented?
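
In the meantime, a caller who needs byte offsets can convert the character positions after the fact. The following is a minimal sketch using only the standard library; the helper name `char_positions_to_byte_offsets` is hypothetical and not part of the crate, and it assumes the returned indices really are positions in `text.chars()`, as the output above suggests.

```rust
/// Convert character positions (indices into `text.chars()`) into byte
/// offsets (indices that can be used to slice `text`).
///
/// Hypothetical helper for illustration; not part of any library API.
/// Positions past the end of `text` are silently dropped.
fn char_positions_to_byte_offsets(text: &str, char_positions: &[usize]) -> Vec<usize> {
    // byte_offsets[i] is the byte offset at which the i-th character starts.
    let byte_offsets: Vec<usize> = text.char_indices().map(|(byte, _)| byte).collect();

    char_positions
        .iter()
        .filter_map(|&pos| byte_offsets.get(pos).copied())
        .collect()
}
```

For the first string in the report, passing the character positions `[1, 2, 3]` yields the byte offsets `[1, 3, 5]`, which agree with the `char_indices()` output shown in the comment. The lookup table costs one pass over `text`, so for many matches against the same string it is worth building once and reusing.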

@RobWalt

RobWalt commented Jun 29, 2024

Maybe there should be an extra option for that behavior? I could imagine that people find the current way of handling indices useful for other use cases.
