Commit

update html notebooks
sammlapp committed Nov 17, 2023
1 parent b65b006 commit 4cc8fb9
Showing 7 changed files with 375 additions and 1,190 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,2 +1,2 @@
rana_sierrae_2022/
.DS_Store
.DS_Store
248 changes: 124 additions & 124 deletions 01_explore_annotated_data.html

Large diffs are not rendered by default.

894 changes: 58 additions & 836 deletions 02_prep_training_data.html

Large diffs are not rendered by default.

36 changes: 20 additions & 16 deletions 03_train_cnn.html
@@ -36,6 +36,7 @@
.highlight .m { color: var(--jp-mirror-editor-number-color) } /* Literal.Number */
.highlight .s { color: var(--jp-mirror-editor-string-color) } /* Literal.String */
.highlight .ow { color: var(--jp-mirror-editor-operator-color); font-weight: bold } /* Operator.Word */
.highlight .pm { color: var(--jp-mirror-editor-punctuation-color) } /* Punctuation.Marker */
.highlight .w { color: var(--jp-mirror-editor-variable-color) } /* Text.Whitespace */
.highlight .mb { color: var(--jp-mirror-editor-number-color) } /* Literal.Number.Bin */
.highlight .mf { color: var(--jp-mirror-editor-number-color) } /* Literal.Number.Float */
@@ -14559,12 +14560,12 @@
},
displayAlign: 'center',
CommonHTML: {
linebreaks: {
automatic: true
linebreaks: {
automatic: true
}
}
});

MathJax.Hub.Queue(["Typeset", MathJax.Hub]);
}
}
@@ -14584,6 +14585,9 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<p>Note that training is a stochastic process and will produce slightly different results each time
the script is run. The original model object trained and used in the manuscript is included in
the subfolder <code>./resources/rana_seirrae_cnn.model</code>.</p>
<h3 id="This-takes-a-long-time">This takes a long time<a class="anchor-link" href="#This-takes-a-long-time">&#182;</a></h3><p>Training a CNN deep learning model is computationally expensive and slow. It is much faster on a GPU (OpenSoundscape automatically uses one if it is available), but even then it takes about an hour to run (an estimated 20 hours on a CPU-only machine).</p>
<p>You can proceed through the rest of the notebooks without re-training the model, instead using the original model object trained and used in the manuscript, which is included in
the subfolder <code>./resources/rana_seirrae_cnn.model</code>.</p>
<p>This notebook is part of a series of notebooks and scripts in the <a href="https://github.com/kitzeslab/rana-sierrae-cnn">repository</a>:</p>
<ul>
<li><p><code>01_explore_annotated_data.ipynb</code> Explore annotated dataset of Rana sierrae call types</p>
@@ -14622,7 +14626,7 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[1]:</div>
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[&nbsp;]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="CodeMirror cm-s-jupyter">
<div class=" highlight hl-ipython3"><pre><span></span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
@@ -14653,14 +14657,14 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[2]:</div>
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[&nbsp;]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="CodeMirror cm-s-jupyter">
<div class=" highlight hl-ipython3"><pre><span></span><span class="c1"># Load the training and validation datasets prepared in the notebook 02_prep_training_data.ipynb</span>
<span class="n">train_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s1">&#39;./resources/training_set.csv&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">set_index</span><span class="p">([</span><span class="s1">&#39;file&#39;</span><span class="p">,</span><span class="s1">&#39;start_time&#39;</span><span class="p">,</span><span class="s1">&#39;end_time&#39;</span><span class="p">])</span>
<span class="n">train_df</span><span class="p">[</span><span class="s1">&#39;negative&#39;</span><span class="p">]</span><span class="o">=</span><span class="mi">1</span><span class="o">-</span><span class="n">train_df</span><span class="p">[</span><span class="s1">&#39;rana_sierrae&#39;</span><span class="p">]</span>
<span class="n">val_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s1">&#39;./resources/validation_set.csv&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">set_index</span><span class="p">([</span><span class="s1">&#39;file&#39;</span><span class="p">,</span><span class="s1">&#39;start_time&#39;</span><span class="p">,</span><span class="s1">&#39;end_time&#39;</span><span class="p">])</span>
<span class="n">val_df</span><span class="p">[</span><span class="s1">&#39;negative&#39;</span><span class="p">]</span><span class="o">=</span><span class="mi">1</span><span class="o">-</span><span class="n">val_df</span><span class="p">[</span><span class="s1">&#39;rana_sierrae&#39;</span><span class="p">]</span>
<span class="n">train_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s1">'./resources/training_set.csv'</span><span class="p">)</span><span class="o">.</span><span class="n">set_index</span><span class="p">([</span><span class="s1">'file'</span><span class="p">,</span><span class="s1">'start_time'</span><span class="p">,</span><span class="s1">'end_time'</span><span class="p">])</span>
<span class="n">train_df</span><span class="p">[</span><span class="s1">'negative'</span><span class="p">]</span><span class="o">=</span><span class="mi">1</span><span class="o">-</span><span class="n">train_df</span><span class="p">[</span><span class="s1">'rana_sierrae'</span><span class="p">]</span>
<span class="n">val_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s1">'./resources/validation_set.csv'</span><span class="p">)</span><span class="o">.</span><span class="n">set_index</span><span class="p">([</span><span class="s1">'file'</span><span class="p">,</span><span class="s1">'start_time'</span><span class="p">,</span><span class="s1">'end_time'</span><span class="p">])</span>
<span class="n">val_df</span><span class="p">[</span><span class="s1">'negative'</span><span class="p">]</span><span class="o">=</span><span class="mi">1</span><span class="o">-</span><span class="n">val_df</span><span class="p">[</span><span class="s1">'rana_sierrae'</span><span class="p">]</span>

<span class="c1"># upsample to match the class with the most samples (reuse samples from other classes)</span>
<span class="n">train_df</span> <span class="o">=</span> <span class="n">resample</span><span class="p">(</span><span class="n">train_df</span><span class="p">,</span><span class="n">upsample</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span><span class="n">n_samples_per_class</span><span class="o">=</span><span class="n">train_df</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">max</span><span class="p">())</span>
@@ -14696,10 +14700,10 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<span class="c1"># can skip this non-critical step by passing wandb_session=None to train() and commenting</span>
<span class="c1"># out these lines</span>
<span class="n">wandb_session</span> <span class="o">=</span> <span class="n">wandb</span><span class="o">.</span><span class="n">init</span><span class="p">(</span>
<span class="n">entity</span><span class="o">=</span><span class="s1">&#39;kitzeslab&#39;</span><span class="p">,</span>
<span class="n">project</span><span class="o">=</span><span class="s2">&quot;rana_sierrae_notebooks&quot;</span><span class="p">,</span>
<span class="n">entity</span><span class="o">=</span><span class="s1">'kitzeslab'</span><span class="p">,</span> <span class="c1"># replace this with your WandB "entity", i.e. your group name</span>
<span class="n">project</span><span class="o">=</span><span class="s2">"rana_sierrae_notebooks"</span><span class="p">,</span>
<span class="n">config</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span>
<span class="n">comment</span><span class="o">=</span><span class="s2">&quot;Description: training resnet18 on A &amp; E classes and excluding unknown class X&quot;</span><span class="p">,</span>
<span class="n">comment</span><span class="o">=</span><span class="s2">"Description: training resnet18 on A &amp; E classes and excluding unknown class X"</span><span class="p">,</span>
<span class="p">)</span>
<span class="p">)</span>
</pre></div>
@@ -14726,11 +14730,11 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[4]:</div>
<div class="jp-InputPrompt jp-InputArea-prompt">In&nbsp;[&nbsp;]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="CodeMirror cm-s-jupyter">
<div class=" highlight hl-ipython3"><pre><span></span><span class="c1"># create opensoundscape.CNN object to train a CNN on audio</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">CNN</span><span class="p">(</span><span class="n">architecture</span><span class="o">=</span><span class="s1">&#39;resnet18&#39;</span><span class="p">,</span><span class="n">classes</span><span class="o">=</span><span class="n">train_df</span><span class="o">.</span><span class="n">columns</span><span class="p">,</span><span class="n">sample_duration</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span><span class="n">single_target</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">CNN</span><span class="p">(</span><span class="n">architecture</span><span class="o">=</span><span class="s1">'resnet18'</span><span class="p">,</span><span class="n">classes</span><span class="o">=</span><span class="n">train_df</span><span class="o">.</span><span class="n">columns</span><span class="p">,</span><span class="n">sample_duration</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span><span class="n">single_target</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>

<span class="c1">#modify preprocessing of the CNN:</span>
<span class="c1">#bandpass spectrograms to 300-2000 Hz</span>
@@ -14741,7 +14745,7 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<span class="n">model</span><span class="o">.</span><span class="n">preprocessor</span><span class="o">.</span><span class="n">pipeline</span><span class="o">.</span><span class="n">add_noise</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="n">std</span><span class="o">=</span><span class="mf">0.01</span><span class="p">)</span>

<span class="c1"># decrease the learning rate from the default value</span>
<span class="n">model</span><span class="o">.</span><span class="n">optimizer_params</span><span class="p">[</span><span class="s1">&#39;lr&#39;</span><span class="p">]</span><span class="o">=</span><span class="mf">0.002</span>
<span class="n">model</span><span class="o">.</span><span class="n">optimizer_params</span><span class="p">[</span><span class="s1">'lr'</span><span class="p">]</span><span class="o">=</span><span class="mf">0.002</span>
</pre></div>

</div>
@@ -14777,7 +14781,7 @@ <h2 id="Train-CNN-to-detect-Rana-sierrae-in-audio-recordings">Train CNN to detec
<span class="n">epochs</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span>
<span class="n">num_workers</span><span class="o">=</span><span class="mi">12</span><span class="p">,</span>
<span class="n">save_path</span><span class="o">=</span><span class="sa">f</span><span class="s1">&#39;./resources/&#39;</span><span class="p">,</span>
<span class="n">save_path</span><span class="o">=</span><span class="sa">f</span><span class="s1">'./resources/'</span><span class="p">,</span>
<span class="n">save_interval</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
<span class="n">log_interval</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
<span class="n">validation_interval</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>