Using strain data segments from a network of ground-based GW detectors, predict whether a GW signal is present in the strain segment.
Binary classification of multivariate time-series data using SOTA deep neural networks.
Source: G2Net Gravitational Wave Detection (https://www.kaggle.com/competitions/g2net-gravitational-wave-detection/overview)
Each data sample (a .npy file) contains 3 time series (one per detector); each series spans 2 seconds and is sampled at 2,048 Hz.
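As a quick sanity check, each file should load to a `(3, 4096)` array (a minimal sketch; the file name below is a placeholder, so adjust the path to match your local download's layout):

```python
import numpy as np

# Load one training sample (placeholder file name -- substitute a real id).
sample = np.load("train/<sample_id>.npy")

# 3 detector channels, each 2 s at 2,048 Hz -> 4,096 points per channel.
assert sample.shape == (3, 4096)
print(sample.dtype, sample.shape)
```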
train/
- the training set files, one .npy file per observation; labels are provided in the file described below
test/
- the test set files; you must predict the probability that each observation contains a gravitational wave
training_labels.csv
- the target values indicating whether the associated signal contains a gravitational wave
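To get oriented, the labels file can be inspected with pandas (a sketch assuming the usual `id`/`target` columns of the competition CSV):

```python
import pandas as pd

labels = pd.read_csv("training_labels.csv")

print(labels.head())                     # sample ids with 0/1 targets
print(labels["target"].value_counts())   # class balance of the training set
```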
- `src`: Scripts folder.
  - `GettingStarted.ipynb`: Start here; it walks through EDA and the modelling pipeline.
  - `configs`: Folder containing the config files used in the training scripts to provide parameters for dataloaders, models, etc.
    - Subfolder `train` holds the config JSON files used to perform experimental runs with `train.py` or `train_pl.py`. File `base.json` contains the basic config (choice of optimizer, number of training epochs, etc.). File `optim.json` contains the parameters for the chosen optimizer. File `stop_early.json` contains the parameters for the early-stopping criteria. File `lr_schd.json` contains the parameters for the chosen learning-rate scheduler. The remaining JSON files correspond to the models, one file per model. All relevant config files are read at the beginning of the training script (see the config-loading sketch after this list).
    - Subfolder `sweep` contains JSON files used to perform hyperparameter sweeps; they mirror the files above, adapted to work with wandb sweeps.
  - `dataloaders`: Folder containing scripts that implement useful functions for loading the data from a local download.
  - `models`: Folder containing implementations of SOTA DL models for time-series classification. It has a `tsai` folder with the source code of TSAI (https://github.com/timeseriesAI/tsai), which provides SOTA model implementations, and a `pytorch` folder for other custom implementations, which may or may not use `tsai` modules (see the model example after this list).
  - `wandb_sweep.py`: Entry-point script for model training and hyperparameter tuning with W&B logging.
  - `train.py`: Entry-point script for a single run of training and evaluation using vanilla PyTorch, with W&B logging. Produces a run directory inside the results directory with test-set eval results and, optionally, model weights.
  - `train_pl.py`: Entry-point script for a single run of training and evaluation using PyTorch Lightning, with W&B logging. Produces a run directory inside the results directory with test-set eval results and, optionally, model weights.
- `results`: Folder to organize run results.
- `environment.yml`: File to create the Python environment.
- `wandb_api_key.txt`: File to hold your wandb API key for logging to your wandb dashboard (see the login sketch after this list). Instructions:
  1. Create a wandb account at wandb.ai.
  2. Create a new project.
  3. Copy your API key and paste it on the first line of this file.
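As referenced in the `configs` item above, here is a minimal sketch of how the train config files might be read; the directory path and the keys shown (`optimizer`, `epochs`) are assumptions for illustration, and the actual schema is whatever the repo's JSON files define:

```python
import json
from pathlib import Path

CONFIG_DIR = Path("src/configs/train")  # assumed location, per the layout above

def load_config(stem: str) -> dict:
    """Read one JSON config file from the train config folder."""
    with open(CONFIG_DIR / f"{stem}.json") as f:
        return json.load(f)

base = load_config("base")             # e.g. optimizer choice, no. of epochs
optim_cfg = load_config("optim")       # parameters for the chosen optimizer
early_cfg = load_config("stop_early")  # early-stopping criteria
sched_cfg = load_config("lr_schd")     # learning-rate scheduler parameters

# Hypothetical keys, for illustration only:
print(base.get("optimizer"), base.get("epochs"))
```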
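As referenced in the `models` item above, the vendored TSAI code provides ready-made architectures suited to this 3-channel binary task. The sketch below uses the upstream `tsai` import path and its `InceptionTime` model; whether this repo wraps the models differently is an assumption:

```python
import torch
from tsai.models.InceptionTime import InceptionTime  # upstream tsai import path

# 3 input channels (one per detector); 2 output logits for binary
# classification (alternatively c_out=1 with a BCE-with-logits loss).
model = InceptionTime(c_in=3, c_out=2)

x = torch.randn(8, 3, 4096)  # (batch, detectors, time steps)
logits = model(x)
print(logits.shape)          # torch.Size([8, 2])
```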
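As referenced in the `wandb_api_key.txt` item above, one way a script might consume the key file is sketched below; how this repo's scripts actually read it is an assumption, though `wandb.login` itself is the standard W&B API:

```python
import wandb

# Read the key from the first line of the file, per the instructions above.
with open("wandb_api_key.txt") as f:
    api_key = f.readline().strip()

wandb.login(key=api_key)  # authenticate this session with W&B
```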
Note: Make sure Anaconda or Miniconda is installed on your system.
- Clone this repository.
- In `environment.yml`, specify an appropriate env name (first line; default: `gwsearchenv`) and path (last line); a standard practice is to set the path to `<path/to/anaconda or miniconda/dir>/envs/<name_of_env>`.
- Create the environment by running `conda env create -f environment.yml`.
- Activate your newly created environment: `conda activate gwsearchenv`.
- Run `GettingStarted.ipynb`.