feedback on training with lightning #1083
paulpeyret-biophonia started this conversation in Ideas
Hi @sammlapp
I dove into the Lightning framework and checked out the latest update for training models with Lightning (OPSO v0.11.0).
Many thanks for this great update! 👍
Here are a few comments and thoughts about my usage and possible improvements.
Loggers
Logging with different loggers works fine. There is still some historical wandb table code around; could that be addressed through a Lightning WandbLogger and some dedicated callbacks?
Training function
About the LightningSpectrogramModule: I don't think I will be using the fit_with_trainer() method, since I prefer instantiating the Trainer myself and calling trainer.fit() on the LightningModule directly. I understand that fit_with_trainer() may be easier to approach for people switching from CNN training to Lightning training.
I must admit it might feel a bit confusing at first, but I feel that the Lightning object structure makes a lot of sense for easy and flexible configuration. As you already know, the common Lightning way is to keep the Trainer, Module, Callbacks, and Loggers well isolated and instantiated separately before training. It keeps things cleanly separated in configuration files (or dictionaries or parameters) and avoids passing too many kwargs. That said, there is no limitation today and anyone can do it one way or the other, because LightningSpectrogramModule is still a LightningModule! 😃
Model Testing
Model testing is a bit intricate for now, but still easily done with predict_with_trainer() followed by a call to multi_target_metrics().
I also saw that Lightning offers a Trainer.test() method for evaluating models after training. It requires a test_step() to be defined in the module, so I thought it might be interesting to look into this for evaluating models on various datasets. ❔
Plots
I saw some nice plots in the tutorial with histograms of scores and custom precision-recall curves. Are these functions available somewhere in the opso package or elsewhere? It would be nice to have a collection of functions for plotting common evaluation figures for multilabel / multiclass / binary classification.
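Something along these lines could be built from scikit-learn and matplotlib; this is a sketch of one possible helper (plot_binary_evaluation and the toy data are my own, not from the tutorial), combining a score histogram and a precision-recall curve for a single class:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score
import matplotlib
matplotlib.use("Agg")  # no display needed
import matplotlib.pyplot as plt

def plot_binary_evaluation(scores, labels, class_name="class"):
    """Score histogram + precision-recall curve for one binary/multilabel class."""
    fig, (ax_hist, ax_pr) = plt.subplots(1, 2, figsize=(10, 4))

    # histogram of scores, split by true label
    ax_hist.hist(scores[labels == 1], bins=30, alpha=0.6, label="positives")
    ax_hist.hist(scores[labels == 0], bins=30, alpha=0.6, label="negatives")
    ax_hist.set_xlabel("score")
    ax_hist.set_ylabel("count")
    ax_hist.legend()

    # precision-recall curve with average precision in the title
    precision, recall, _ = precision_recall_curve(labels, scores)
    ap = average_precision_score(labels, scores)
    ax_pr.plot(recall, precision)
    ax_pr.set_xlabel("recall")
    ax_pr.set_ylabel("precision")
    ax_pr.set_title(f"{class_name} (AP={ap:.2f})")
    return fig

# toy scores/labels standing in for real model predictions
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 200)
scores = np.clip(labels * 0.4 + rng.normal(0.3, 0.2, 200), 0, 1)
fig = plot_binary_evaluation(scores, labels, class_name="example")
```

For the multilabel case, the same helper could simply be called once per class column of the score matrix.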
Inference
For inference, the predict_with_trainer() method is very nice and offers a ready-to-use interface that doesn't require much customization! 👍
Preprocessing
This is not specific to Lightning, but I feel like preprocessing often takes up a large share of training time. I am wondering whether it would be feasible to cache the preprocessed files (spectrograms) so that they wouldn't need to be recalculated when training several models with the same dataset and dataloader parameters. Not sure it is easy to do, though... ❔
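One simple way to approach this, sketched below with stdlib + numpy (all names here — cache_key, get_spectrogram, the spec_cache directory, and the fake compute function — are hypothetical, not OPSO API): key the cache on the audio path plus the preprocessing parameters, so changing any parameter transparently invalidates the cached spectrogram.

```python
import hashlib
import json
from pathlib import Path
import numpy as np

CACHE_DIR = Path("spec_cache")  # hypothetical cache location

def cache_key(audio_path, params):
    """Stable key derived from the file path and preprocessing parameters."""
    payload = json.dumps({"file": str(audio_path), "params": params},
                         sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()

def get_spectrogram(audio_path, params, compute_fn):
    """Return the cached spectrogram if present, else compute and store it."""
    CACHE_DIR.mkdir(exist_ok=True)
    path = CACHE_DIR / f"{cache_key(audio_path, params)}.npy"
    if path.exists():
        return np.load(path)          # cache hit: skip preprocessing
    spec = compute_fn(audio_path, params)
    np.save(path, spec)               # cache miss: compute once, store
    return spec

# toy "preprocessing" standing in for real spectrogram computation
fake_compute = lambda p, params: np.zeros((params["n_mels"], 10))
params = {"n_mels": 64, "sample_rate": 22050}

spec1 = get_spectrogram("clip.wav", params, fake_compute)  # computed
spec2 = get_spectrogram("clip.wav", params, fake_compute)  # served from cache
```

In a Dataset's __getitem__, such a lookup would replace the direct spectrogram computation, so a second training run with identical parameters would mostly read .npy files from disk instead of redoing STFT/mel work.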
All the best!