Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Off-by-one-Error in train_two_layer implementation #8

Open
Jonas-Heinrich opened this issue Jul 13, 2021 · 0 comments
Open

Off-by-one-Error in train_two_layer implementation #8

Jonas-Heinrich opened this issue Jul 13, 2021 · 0 comments

Comments

@Jonas-Heinrich
Copy link

Hi,

I encountered an assert issue when training RMI on very small (100 items) datasets for testing purposes:

thread '<unnamed>' panicked at 'start index was 100 but end index was 100', [...]/RMI/rmi_lib/src/train/two_layer.rs:27:5

Upon closer investigation, I think I have found an off-by-one error in the train_two_layer function implementation. I did not take a look at the context beyond the function, therefore take what I am about to say with a grain of salt:


Link to source file

  1. The value of split_idx is calculated here:

let split_idx = md_container.lower_bound_by(|x| {

  1. split_idx should be in the interval [0, md_container.len())

if split_idx > 0 && split_idx < md_container.len() {

Now lets look at the case where split_idx == md_container.len() - 1, which is valid per [2.]:

  1. The else branch is taken, since split_idx < md_container.len()

let mut leaf_models = if split_idx >= md_container.len() {

  1. split_idx + 1 (== md_container.len()) is passed to build_models_from as start_idx
|| build_models_from(&md_container, &top_model, layer2_model,
                                   split_idx + 1, md_container.len(),
                                   split_idx_target,
                                   second_half_models)
fn build_models_from<T: TrainingKey>(data: &RMITrainingData<T>,
                                    top_model: &Box<dyn Model>,
                                    model_type: &str,
                                    start_idx: usize, end_idx: usize,
                                    first_model_idx: usize,
                                    num_models: usize) -> Vec<Box<dyn Model>>

Link 4.1

Link 4.2

  1. Assert fails, since md_container.len() > md_container.len() is false
assert!(end_idx > start_idx,
        "start index was {} but end index was {}",
        start_idx, end_idx
);

Link 5


An obvious fix would be to change the condition in [3.] to split_idx >= md_container.len() - 1, however I am not entirely certain whether that leads to issues in other contexts. I guess a similar issue would happen if similar_idx == 0, only for the first call. I changed the condition in my local version and re-ran the tests - it seems to work just fine:

Running test cache_fix_osm
Test cache_fix_osm finished.
Running test cache_fix_wiki
Test cache_fix_wiki finished.
Running test max_size_wiki
Test max_size_wiki finished.
Running test radix_model_wiki
Test radix_model_wiki finished.
Running test simple_model_osm
Test simple_model_osm finished.
Running test simple_model_wiki
Test simple_model_wiki finished.
============== TEST RESULTS ===============
python3 report.py
PASS cache_fix_osm
PASS cache_fix_wiki
PASS max_size_wiki
PASS radix_model_wiki
PASS simple_model_osm
PASS simple_model_wiki

I can open a pull request with that fix if you would like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant