
[WIP] Fix realtime entropy patching #26

Draft: wants to merge 5 commits into main
Conversation

Vectorrent (Contributor) commented:
This is a follow-up to the comment made here.

I want to load and train an entropy model alongside the latent model, rather than loading it from a checkpoint. The current code does not support this, so I added a new PatcherArgs attribute called entropy_model. With it, you can pass a PyTorch module directly instead of a checkpoint path.
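For context, a minimal sketch of the intended usage, assuming PatcherArgs is exposed from bytelatent/data/patcher.py; the patching_mode field and the build() helper are assumptions for illustration, not verified names from the repo:

```python
import torch.nn as nn

from bytelatent.data.patcher import PatcherArgs

# Stand-in entropy model that is trained alongside the latent model.
entropy_model = nn.LSTM(input_size=256, hidden_size=512, batch_first=True)

# Pass the live module directly instead of an entropy_model_checkpoint_dir.
patcher_args = PatcherArgs(
    patching_mode="entropy",      # assumed name of the mode switch
    entropy_model=entropy_model,  # new attribute added by this PR
)
patcher = patcher_args.build()    # assumed construction helper
```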

I also removed the self.output_proj variable, because it crashes when using entropy patching: the entropy_model_checkpoint_dir attribute does not exist on LocalModelArgs. In any case, the variable is unused and probably unnecessary.

Finally, I removed a logger warning that was extremely spammy. The warning is already gated by the BLT_SUPPRESS_ATTN_ERROR environment variable, and I assume you did not intend to emit it on every forward pass. I'm guessing this was an oversight and the extra logging was meant to be removed.

Let me know if you have any questions!

facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Jan 18, 2025
Vectorrent changed the title from "Fix realtime entropy patching" to "[WIP] Fix realtime entropy patching" on Jan 18, 2025
Vectorrent (Contributor, Author) commented on Jan 18, 2025:

I noticed that the Patcher expects a static batch size to be set, but my custom model is more dynamic than that and often uses a variable batch size. I therefore need to adjust the patcher's batch size at runtime, during training, and made that possible with an override in the forward method (a rough sketch follows below).
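Roughly, the override looks like this sketch; it is illustrative only, not the repo's actual Patcher, and (per the next comment) this approach was later reverted in favor of the entropies argument:

```python
import torch

class PatcherSketch:
    """Illustrative stand-in, not the repo's Patcher."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size  # normally fixed when the patcher is built

    def forward(self, tokens: torch.Tensor, batch_size: int | None = None) -> torch.Tensor:
        # Allow the training loop to override the static batch size per call.
        if batch_size is not None:
            self.batch_size = batch_size
        return tokens.view(self.batch_size, -1)
```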

Vectorrent marked this pull request as draft on January 18, 2025 01:48
Vectorrent (Contributor, Author) commented:
I reverted some of the Patcher code to its original style, since I realized I could just use the entropies argument to handle the variable batch size.

I also fixed a warning here, since passing entropies in that way was causing this:

bytelatent/data/patcher.py:535: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  scores = torch.tensor(entropies, dtype=torch.float32)
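The fix amounts to not re-wrapping an existing tensor with torch.tensor(...). A sketch of the kind of change (paraphrased, not the exact patcher.py code):

```python
import torch

def to_scores(entropies) -> torch.Tensor:
    # torch.tensor(existing_tensor) triggers the copy-construct UserWarning;
    # detach().clone() (or torch.as_tensor for list/array inputs) avoids it.
    if isinstance(entropies, torch.Tensor):
        return entropies.detach().clone().to(dtype=torch.float32)
    return torch.as_tensor(entropies, dtype=torch.float32)
```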
