utils.py:FunctionNegativeTripletSelector - 'anchor_positive' referenced before assignment when len(label_indices) < 2 #61

Trotts · 2021-04-06T09:27:10Z

Hi,

First off I'd like to say thank you for making this repo, it has been immensely helpful to my research! I believe I have found a bug in utils.py: FunctionNegativeTripletSelector when working with certain datasets. Specifically, the issue occurs on lines 174-175:

if len(triplets) == 0:
            triplets.append([anchor_positive[0], anchor_positive[1], negative_indices[0]])

When the dataset is small, for example a validation set of an already small dataset, it may not be possible for label_indices defined on line 161 to ever be >= 2. This is then checked on line 159-160:

if len(label_indices) < 2:
                continue

If every label_indices causes the continue to run, then len(triplets) will be 0. This triggers the offending code on lines 174-175. However at this point under the conditions, anchor_positive has not yet been defined, leading to a fatal error.

To test this, I created a training and validation split, with the validation set being of the following size:

unique, counts = np.unique(val_dataset.val_labels, return_counts=True)
print("y_val: len", len(unique), "\n", np.asarray((unique, counts)).T, "\n")

y_val: len 28 
 [[ 1  2]
 [ 2  2]
 [ 3  4]
 [ 4  2]
 [ 7  2]
 [ 8  5]
 [11  6]
 [13  3]
 [14  4]
 [15  3]
 [16  4]
 [18  3]
 [19  2]
 [21  2]
 [22  3]
 [23  2]
 [24  2]
 [25  3]
 [26  3]
 [28  5]
 [29  3]
 [31  2]
 [32  2]
 [34  5]
 [37  2]
 [39  2]
 [41  2]
 [44  2]]

Training occurs as normal, as the offending conditional is never hit:


TRAINING
labels: [ 4  4 34 34 11 11  3  3  2  2 13 13 32 32 16 16 22 22 39 39 29 29 25 25
  8  8 24 24 37 37 41 41 44 44 18 18 21 21 31 31 23 23  1  1 15 15  7  7
 26 26 28 28 14 14 19 19]
label_indices [42 43]
label_indices [8 9]
label_indices [6 7]
label_indices [0 1]
label_indices [46 47]
label_indices [24 25]
label_indices [4 5]
label_indices [10 11]
label_indices [52 53]
label_indices [44 45]
label_indices [14 15]
label_indices [34 35]
label_indices [54 55]
label_indices [36 37]
label_indices [16 17]
label_indices [40 41]
label_indices [26 27]
label_indices [22 23]
label_indices [48 49]
label_indices [50 51]
label_indices [20 21]
label_indices [38 39]
label_indices [12 13]
label_indices [2 3]
label_indices [28 29]
label_indices [18 19]
label_indices [30 31]
label_indices [32 33]
Train: [0/181 (0%)]	Loss: 0.987690	Average nonzero triplets: 28.0

However the validation set falls foul:

VALIDATION
labels: [19 13  4 15 14 31 37 21 11 26  3  1 41 29 34  2  8 23 28 39 44 22 32 18
 24 16 25  7]
label_indices [11]
continue triggered
label_indices [15]
continue triggered
label_indices [10]
continue triggered
label_indices [2]
continue triggered
label_indices [27]
continue triggered
label_indices [16]
continue triggered
label_indices [8]
continue triggered
label_indices [1]
continue triggered
label_indices [4]
continue triggered
label_indices [3]
continue triggered
label_indices [25]
continue triggered
label_indices [23]
continue triggered
label_indices [0]
continue triggered
label_indices [7]
continue triggered
label_indices [21]
continue triggered
label_indices [17]
continue triggered
label_indices [24]
continue triggered
label_indices [26]
continue triggered
label_indices [9]
continue triggered
label_indices [18]
continue triggered
label_indices [13]
continue triggered
label_indices [5]
continue triggered
label_indices [22]
continue triggered
label_indices [14]
continue triggered
label_indices [6]
continue triggered
label_indices [19]
continue triggered
label_indices [12]
continue triggered
label_indices [20]
continue triggered

Traceback (most recent call last):
  File "/trainer.py", line 53, in fit
    val_loss, metrics = test_epoch(val_loader, model, loss_fn, cuda, metrics)
  File "/trainer.py", line 190, in test_epoch
    loss_outputs = loss_fn(*loss_inputs)
  File "/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/losses.py", line 82, in forward
    triplets = self.triplet_selector.get_triplets(embeddings, target)
  File "/utils.py", line 187, in get_triplets
    triplets.append([anchor_positive[0], anchor_positive[1], negative_indices[0]])
UnboundLocalError: local variable 'anchor_positive' referenced before assignment

Note the line numbers above may be different from the repo's code due to the print statements. To me, it seems like currently the only way around this error would be to increase the size of the dataset, increasing the chance of label_indices at some point being >=2. anchor_positive needs to be defined somewhere before the offending conditional to stop the error, but I'm unsure where or how in order to fix.

The text was updated successfully, but these errors were encountered:

bth5032 · 2021-05-04T08:07:56Z

Can you expand on this, I'm running into the same issue, but you seem to be implying the issue is somehow at the dataset level, but the offending code is called for each batch of data.

I don't understand your example, there are way less than 181 labels in

TRAINING
labels: [ 4  4 34 34 11 11  3  3  2  2 13 13 32 32 16 16 22 22 39 39 29 29 25 25
  8  8 24 24 37 37 41 41 44 44 18 18 21 21 31 31 23 23  1  1 15 15  7  7
 26 26 28 28 14 14 19 19]

so why does that code show 0/181?

I'm really confused as to what the entire set has to do with anything because this seems like a scenario where you can have a bad batch, not a bad training/validation set. Can you expand on how many examples are being sent to get_triplets inside of OnlineTripletLoss.forward() per batch in your test cases (i.e. in your 'train' and 'validation' set)??

It looks like you're only sending in one example in the case of the validation set because you only print one 'label' and you always hit the continue. Obviously you can't do contrastive loss with only one example, so I'm guessing I'm misunderstanding something here.

bth5032 · 2021-05-04T08:31:02Z

After looking at the code for a while, I think what's happening is that if you send in a batch where NONE of your entries has a positive example, then you hit that continue statement for every element in the batch. In which case, there are no triplets since you need at least 1 positive and one negative example to make a triplet. That's why len(triplets) == 0.

I think the line you pointed out

if len(triplets) == 0:
            triplets.append([anchor_positive[0], anchor_positive[1], negative_indices[0]])

is just a typo as, anchor_positive is only defined in the nested for loop above. Clearly, the author was trying to have a default return case when there are no triplets in the batch, but I too am unsure exactly what he was trying to do there because there is no case where that if statement should work properly.

Presumably we could just return a constant 1x3 vector in that if statement and the gradient of the loss in that case would be zero and not change the network at all, which is probably the behavior we want.

Trotts · 2021-05-04T14:07:23Z

@bth5032 my hunch as to what was causing the issue may be incorrect, it was simply based on the fact that the code was failing to create batches just for this dataset. I believe the offending line is a typo, have you had any luck with your proposed solution?

bth5032 · 2021-05-04T18:27:08Z

Hi, yeah I ran a test overnight, just returning a constant vector, in my case

if len(triplets) == 0:
            triplets.append([0,0,0])

fixed the crash and the model seems to be training just fine

Trotts · 2021-05-05T08:29:10Z

Great stuff, thank you for taking the time to test that. I ran into this issue but solved it by changing the split of my train-test set. This is a much better solution.

bth5032 · 2021-05-06T08:10:46Z

Hey no problem, thanks for making the post, definitely saved me some time on this!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

utils.py:FunctionNegativeTripletSelector - 'anchor_positive' referenced before assignment when len(label_indices) < 2 #61

utils.py:FunctionNegativeTripletSelector - 'anchor_positive' referenced before assignment when len(label_indices) < 2 #61

Trotts commented Apr 6, 2021 •

edited

Loading

bth5032 commented May 4, 2021

bth5032 commented May 4, 2021 •

edited

Loading

Trotts commented May 4, 2021

bth5032 commented May 4, 2021

Trotts commented May 5, 2021

bth5032 commented May 6, 2021

utils.py:FunctionNegativeTripletSelector - 'anchor_positive' referenced before assignment when len(label_indices) < 2 #61

utils.py:FunctionNegativeTripletSelector - 'anchor_positive' referenced before assignment when len(label_indices) < 2 #61

Comments

Trotts commented Apr 6, 2021 • edited Loading

bth5032 commented May 4, 2021

bth5032 commented May 4, 2021 • edited Loading

Trotts commented May 4, 2021

bth5032 commented May 4, 2021

Trotts commented May 5, 2021

bth5032 commented May 6, 2021

Trotts commented Apr 6, 2021 •

edited

Loading

bth5032 commented May 4, 2021 •

edited

Loading