
TRAK wrapper #101

Closed
dilyabareeva opened this issue Aug 6, 2024 · 3 comments · Fixed by #120
@dilyabareeva
Owner

No description provided.

@gumityolcu
Collaborator

Hello.

Some issues with the current state of the TRAK wrapper:

The TRAKer object (the underlying explainer we are interfacing with) works as follows (see the sketch after this list):

1. We "featurize" the training data given a model with a model_id. Iterating through checkpoints, we 1) load_checkpoint(), 2) featurize() the training dataset, and 3) finalize_features().
2. We then "score" the test data. Iterating through checkpoints, we 1) start_scoring_checkpoint(), 2) score() each test batch, and 3) finalize_scores().
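For concreteness, here is a minimal sketch of that two-phase flow, loosely following the TRAK quickstart (model, checkpoints, train_dataset, test_dataset, and the loaders are placeholders, and exact keyword names may differ between trak versions):

```python
from trak import TRAKer

traker = TRAKer(model=model, task="image_classification",
                train_set_size=len(train_dataset))

# Phase 1: featurize the training data, iterating over checkpoints
for model_id, ckpt in enumerate(checkpoints):
    traker.load_checkpoint(ckpt, model_id=model_id)
    for batch in train_loader:
        traker.featurize(batch=batch, num_samples=batch[0].shape[0])
traker.finalize_features()

# Phase 2: score the test data, again iterating over checkpoints
for model_id, ckpt in enumerate(checkpoints):
    traker.start_scoring_checkpoint(exp_name="test_run", checkpoint=ckpt,
                                    model_id=model_id,
                                    num_targets=len(test_dataset))
    for batch in test_loader:
        traker.score(batch=batch, num_samples=batch[0].shape[0])
scores = traker.finalize_scores(exp_name="test_run")
```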

Once you call finalize_scores(), you cannot use score() again until you call start_scoring_checkpoint(). And you cannot get explanations without finalizing the scores, unless you make a subclass of the underlying TRAK explainer.

Basically, it wants you to go through the whole test dataset and get the explanations at the end, and it is being clever by caching everything.

This causes the following problems:

  • every batch returns the same explanations, because our interface is batched and the whole process is restarted from scratch with each explanation call

  • when you destroy an object and create a new one with the same model_id and cache folder, it will reuse the cached explanations

So we either need some garbage collection, or we need to create new cache folders with new "experiment_name"s (see the sketch below). Every straightforward solution I tried failed: deleting the corresponding cache file, or making small changes in the wrapper logic.
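As a purely illustrative sketch of the "new experiment_name" idea (the naming scheme here is an assumption, not something verified against TRAK's caching behavior):

```python
import uuid

def fresh_experiment_name(prefix: str = "quanda") -> str:
    # a new exp_name per explain() call, so TRAK writes scores to a
    # fresh cache location instead of reusing/overwriting the old one
    return f"{prefix}_{uuid.uuid4().hex[:8]}"
```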

@gumityolcu
Collaborator

That's why I closed PR #106.

@dilyabareeva
Owner Author

@gumityolcu I have investigated the caching issue in more detail. It stems from TRAK using a memory-mapped numpy array saver. When we call start_scoring_checkpoint, a new sample index count is initiated and the results are saved to a specific (on-disk) memory address that depends only on those indices. When we call start_scoring_checkpoint again, the save address remains the same as for the previously calculated batches, so the old results are overwritten on disk. The explanations are returned in this memory-mapped format, referring directly to that disk address, which leads to our issues.
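A toy numpy demonstration of that hazard, independent of TRAK: two memmap handles on the same file share the same disk region, so a later write changes what an earlier caller is still holding.

```python
import numpy as np

# first "scoring round": results are written to disk and returned memmapped
mm = np.memmap("scores.bin", dtype="float32", mode="w+", shape=(4,))
mm[:] = [1, 2, 3, 4]
returned = mm                 # what the caller holds on to

# second "scoring round": same file, same on-disk address
mm2 = np.memmap("scores.bin", dtype="float32", mode="r+", shape=(4,))
mm2[:] = [9, 9, 9, 9]
mm2.flush()

print(returned)               # [9. 9. 9. 9.] -- the old results are gone
```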

Luckily, this is easily resolved if we allocate new memory for the explanations by calling copy.deepcopy 😄
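In wrapper terms, the fix is roughly the following (traker and exp_name as in the sketch above; finalize_scores() returning the scores follows the TRAK quickstart):

```python
import copy

# the scores are backed by the on-disk memmap; deep-copying detaches
# them into ordinary in-memory arrays, so the next scoring round can
# no longer overwrite what we have already returned to the user
scores = traker.finalize_scores(exp_name=exp_name)
explanations = copy.deepcopy(scores)
```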

The TRAK library allows a saver of their AbstractSaver type to be passed to a TRAKer instance. So I think a better long-term solution is to write our own saver implementation that is not memory-mapped. I will open an issue for that.
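A skeleton of that direction (hypothetical: the exact abstract methods AbstractSaver requires depend on the installed trak version, so this only marks where an in-memory implementation would go):

```python
from trak.savers import AbstractSaver


class InMemorySaver(AbstractSaver):
    """Sketch: keep features/scores in ordinary numpy arrays instead of
    np.memmap, so returned explanations are never views into a disk
    region that a later call can overwrite."""
    # override the abstract methods of the installed trak version here
    ...


# usage would look roughly like:
# traker = TRAKer(model=model, task="image_classification",
#                 train_set_size=len(train_dataset), saver=InMemorySaver(...))
```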
