scaNpix

General remarks:

• This is a Matlab package for loading either DACQ or neuropixel data into Matlab, for inspection in a GUI and for further analysis

• You will need to use at least MATLAB 2017b for the general code base, and MATLAB 2018b for the GUI. Note that I coded most of it in MATLAB 2019, so there might be dependencies that I have overlooked.

• If you do find a bug or have a request to improve something, please raise an issue on GitHub rather than emailing me personally about it.

• Use this code at your own risk; it will hopefully help you get that Nature paper, but equally likely some bug will screw up your analysis

Extra toolboxes required:

Circular Statistics Toolbox

A) Getting Started

• Download/Clone/Photocopy the package and add to your Matlab path (note that Matlab won’t let you add the +scanpix folder itself, but rather only the parent directory)

1. Some general remarks about the syntax

• Functions are called in the following way: scanpix.FunctionName or scanpix.SubPackage.FunctionName, e.g. if you want to call the function makeRateMaps from the +maps subpackage you would do:

scanpix.maps.makeRateMaps(someInput);

• You can use the Tab key to autocomplete for subpackage and/or function name

2. Create a class object and load some data:

We first create a data object by grabbing some basic parameters (see also the section about the parameter space) and then opening a UI dialogue to select which data type (e.g. DACQ or neuropixels) and what files to load. This will initialise the object in the Matlab workspace, but not load any data from disk yet. To select what data to load we first select a parent directory. Then we will get a list of all data files contained within any subdirectories and from this list we select which data we want to load (so if your parent directory contains a lot of data files the list will be long). This data will then be treated as one experiment.

Syntax:

obj = scanpix.ephys(someInput);

Then we can use the class's load method to load the actual data, like so:

obj.load(loadMode, varargin);

Inputs (type):

  • loadMode (cell array)

    • controls what part(s) of the data will be loaded into the object. Either {'all'} or any combination of {'pos','spikes','lfp'}
  • varargin (comma separated list of strings)

    • comma separated list of data file names (omit extensions) to be loaded. Useful when you might want to reload only some particular part of the data

Examples:

obj.load;                             % load all files listed in obj.trialNames; the data types to load are selected via a UI dialogue
obj.load({'pos','lfp'});              % load position and EEG data for all files listed in obj.trialNames
obj.load({'all'},'SomeDataFileName'); % load all data types for trial 'SomeDataFileName'

Other object methods

  • add more info here

3. Do something exciting with the data you loaded into Matlab

3A Inspect data in GUI

• The GUI(s) were all made with the app designer, so if you want to look at the code or edit something you need to open it in there

• You can start (a) GUI(s) by using a wrapper function (scanpix.GUI.startGUI) or by calling it directly (e.g. mainGUI)

• When launching the main GUI, we will check if there are any objects matching the data type in the base workspace and ask user if they want to import these, but you can also simply load the data from raw within the GUI or load a GUI state that you saved to disk previously.

• In the GUI you can load and inspect multiple datasets/data from multiple experiments

Syntax:

scanpix.GUI.startGUI(GUIType);

or

mainGUI(classType);

Inputs:

  • GUIType (string)

    • 'mainGUI' – start main GUI to inspect data

    • 'lfpBrowser' – start GUI to browse EEG data (Note: Not implemented yet!)

Main GUI:

(Screenshot of the main GUI)

GUI menu bar:

  • File:

    • Load Data - Load Datasets from raw, from a GUI state or reload sorting results
    • Save Data - Save GUI state; either the currently selected dataset(s) or the full GUI content
    • Delete Data - Delete currently selected dataset(s), trial(s) or cell(s)
    • Help - Display help (will display available shortcuts in GUI)
  • Settings:

    • Set Defaults - Set default parameters for all aspects of GUI or restore the built-in defaults. Note that the directory in the GUI defaults is the location on disk where data will be saved to
    • Pause Updates - Pause updates for any plot(s) on the Overview tab in case updating is slow or plot(s) are irrelevant for user
  • Data:

    • Add MetaData - Add meta data to currently selected dataset
    • Filter - Filter data according to cluster label (neuropixel data only), minimum number of spikes in any trial or spatial properties. The latter will prompt user to indicate the threshold value, filter direction ('gt'=greater than or 'lt'=lower than) and trial Index. For the trial index either a single numeric index can be used or any number of valid trial indices can be combined by either Matlab AND & or OR | operators (e.g. to filter for cells that pass threshold in the first 3 trials of a given dataset, set trial index to 1&2&3). Note that currently you cannot combine &'s and |'s. Use the Remove Filter option to undo any filtering
  • Figures:

    • Save As PDF - Save figures generated by GUI as PDFs to disk (either all or custom selection)
    • Neuropixels or DACQ - Generate data type specific figures (currently only for neuropixel data)
    • Maps - Generate time series plots (Plot time series)
  • Analysis:

    • Spatial Props - Generate spatial properties for all cells/trials in the currently selected dataset. The result is output into the base workspace as a cell array named datasetName__Property with cellID in output{:,1} and values as cell-by-trial array in output{:,2}. Note that cells removed by the current filter will be ignored for the analysis output
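
The trial-index logic of the Filter option described above can be illustrated with a few lines of plain Matlab. This is only a minimal sketch with made-up values, not the GUI's own code; the variable names (spatialInfo, thr, passes, keepAND, keepOR) are hypothetical.

```matlab
% Hypothetical cell-by-trial property values (3 cells x 3 trials)
spatialInfo = [0.9 1.2 1.1;    % cell 1
               0.2 1.5 0.1;    % cell 2
               0.1 0.3 0.2];   % cell 3
thr    = 0.5;                  % threshold value
passes = spatialInfo > thr;    % filter direction 'gt'; use < for 'lt'

% Trial index 1&2&3: a cell has to pass in all of trials 1, 2 and 3
keepAND = all(passes(:,[1 2 3]), 2);
% Trial index 1|2|3: a cell has to pass in at least one of these trials
keepOR  = any(passes(:,[1 2 3]), 2);

disp(find(keepAND)');          % row indices of cells kept with AND
disp(find(keepOR)');           % row indices of cells kept with OR
```

With AND only cell 1 survives, whereas OR also keeps cell 2, which passes in trial 2 only.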

GUI shortcuts

  • h - display help
  • CTRL+D - make Dir Maps
  • CTRL+F - save Figures
  • CTRL+L - Load data
  • CTRL+M - make Rate Maps
  • CTRL+S - save GUI state
  • Up/Down - browse through cells
  • Insert/Delete - browse through trials
  • PageUp/PageDown - browse through datasets
  • Wheelscroll - vertical scroll in figures
  • CTRL+Wheelscroll - horizontal scroll in figures

Batch Loading of data

It's possible to load a batch of datasets into the GUI by supplying paths etc via a spreadsheet. For information on the format of the spreadsheet, see scanpix.readExpInfo.

3B Use the code to edit/analyse the data in objects from command window

analysis package

• functions/code for data analysis; e.g. calculation of map properties like spatial information etc. as well as cell properties like gridness etc.

scanpix.analysis.functionName(someInput);

dacqUtils/npixUtils packages

• functions/code that is specific for data type

scanpix.dacqUtils.functionName(someInput);
scanpix.npixUtils.functionName(someInput);

fxchange package

• functions/code from Matlab file exchange

scanpix.fxchange.functionName(someInput);

GUI packages

• app designer code for GUI(s) as well as a few GUI specific functions. Note that when you debug code in App designer while running the GUI, updates of plots in the GUI tend to become quite sluggish

scanpix.GUI.functionName(someInput);

helpers package

• functions/code that e.g. helps with data management/processing in objects

scanpix.helpers.functionName(someInput);

maps package

• functions/code to generate various types of spatial rate maps

scanpix.maps.functionName(someInput);

plot package

• functions/code to make various types of beautiful plots of your data

scanpix.plot.functionName(someInput);

B) Parameter space:

There are two different parameter spaces that are used within scaNpix

1. General

Parameters that are used when loading data and doing some basic pre-processing (e.g. position smoothing). Typically, you will not need to change the majority of these. They are stored in obj.params as a map container. You can access values by using the name of the individual parameter as the key (e.g. obj.params('posFs') will give you the position sample rate and obj.params.keys will give you a list of all parameters in the container). The default values are generated with defaultParamsContainer.m and you should leave these as they are, but you can save your own version to a file (scanpix.helpers.saveParams(obj,'container') writes the current map container in the object to disk). You should store your parameter file in 'PathOnYourDisk\+scanpix\files\YourFile.mat'.
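
For readers unfamiliar with map containers, here is a minimal sketch of how a containers.Map behaves. The variable params is only a stand-in for obj.params, and the three keys and values are a hypothetical subset picked for illustration.

```matlab
% Stand-in for obj.params (hypothetical subset of keys and values)
params = containers.Map('KeyType','char','ValueType','any');
params('posFs')       = 50;       % position sample rate in Hz
params('posSmooth')   = 0.4;      % smoothing window in s
params('posMaxSpeed') = 4;        % in m/s

fs      = params('posFs');        % read a value by its key
allKeys = params.keys;            % cell array of all keys (sorted)
params('posSmooth') = 0.2;        % overwrite an existing value
hasKey  = params.isKey('lfpFs');  % false here, this key was never added
```

Keys are case-sensitive, so the spelling has to match the entry in the container exactly.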

Full List DACQ:

  • ScalePos2PPM – scale position data to this pix/m (default=400). This is particularly useful for keeping rate map sizes in proportion, if you recorded data across different environments that have a different size and/or pix/m setting for their camera setup
  • posMaxSpeed – speed > posMaxSpeed will be assumed tracking errors and ignored (set to NaN); in m/s (default=4)
  • posSmooth – smooth position data over this many seconds (default=0.4)
  • posHead – relative position of head to headstage LEDs (default=0.5)
  • posFs – position data sampling rate in Hz; leave empty as will be read from pos file (50Hz)
  • cutFileType – type of cut file, i.e. 'cut' (Tint) or 'clu' (KlustaKwik); default='cut'
  • cutTag1 – cut file tag that follows base filename but precedes _tetrodeN in filename (default=''); this is something historic (and idiosyncratic) for the data of the original pup replay study, so chances are you can just ignore this
  • cutTag2 – cut file tag that follows _tetrodeN in filename (default='')
  • lfpFs – low sampling rate EEG data sampling rate in Hz; leave empty as will be read from eeg file (250Hz)
  • lfpHighFs – high sampling rate EEG data sampling rate in Hz; leave empty as will be read from eeg file (4800Hz)
  • defaultDir – default directory for UI dialogues where to look for things, e.g. data (default='Path/To/The/+scaNpix/Code/On/Your/Disk')
  • myRateMapParams – 'FileNameOfYourRateMapParams.mat' (default='')

Full List neuropixel data:

  • ScalePos2PPM – scale position data to this pix/m (default=400). This is particularly useful for keeping rate map sizes in proportion, if you recorded data across different environments that have a different size and/or pix/m setting for their tracking
  • posMaxSpeed – speed > posMaxSpeed will be assumed tracking errors and ignored (set to NaN); in m/s (default=4)
  • posSmooth – smooth position data over this many s (default=0.4)
  • posHead – relative position of head to headstage LEDs (default=0.5)
  • posFs – position data sampling rate in Hz (default=50Hz)
  • loadFromPhy – logical flag to indicate which sorting results to use. If true we'll try Phy, otherwise we'll use the raw Kilosort results
  • APFs – sampling rate for single unit neuropixel data (30000Hz)
  • lfpFs – sampling rate for lfp from neuropixel data (2500Hz)
  • defaultDir – default directory for UI dialogues where to look for things, e.g. data (default='Path/To/The/+scaNpix/Code/On/Your/Disk')
  • myRateMapParams – 'FileNameOfYourRateMapParams.mat' (default='')
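
To make the roles of ScalePos2PPM and posMaxSpeed more concrete, the following is a deliberately naive sketch of rescaling positions to a common pix/m value and discarding samples with implausible speed. It is not the package's implementation; all values and variable names (ppmOrg, xyRaw, stepPix) are made up for illustration.

```matlab
% Made-up example values; none of this is taken from the package code
ppmOrg       = 300;   % pix/m of the raw tracking data
scalePos2PPM = 400;   % target pix/m (default of ScalePos2PPM)
posFs        = 50;    % position sample rate in Hz
posMaxSpeed  = 4;     % maximum plausible speed in m/s

xyRaw = [100 100; 102 100; 200 100; 106 100];  % positions in pixels
xy    = xyRaw .* (scalePos2PPM / ppmOrg);      % rescale to common pix/m

% Speed between consecutive samples: pixels -> metres -> m/s
stepPix = sqrt(sum(diff(xy,1,1).^2, 2));
speed   = [0; stepPix] ./ scalePos2PPM .* posFs;

% Treat samples reached with implausible speed as tracking errors.
% Note that this naive version also flags the sample after the jump.
xy(speed > posMaxSpeed, :) = NaN;
```

In this toy example the third sample is a jump of nearly 100 pixels within one frame, so it (and the sample jumping back) is set to NaN, while the first two samples are kept.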

2. Parameters for rate maps.

These are stored as a scalar MATLAB structure in obj.mapParams (note that obj.mapParams is a hidden property) and the default values are generated with scanpix.maps.defaultParamsRateMaps.m – again do not edit anything in there. You can edit these parameters on the fly when generating different kinds of rate maps. If you want to use your own custom values by default you should edit them within the object and then save them to disk using scanpix.helpers.saveParams(obj,'maps') to PathOnYourDisk/+scanpix/files/YourFile.mat. Then go and set obj.params('myRateMapParams')='Path/To/Your/File'.

Full list:

  • General params:

    • speedFilterLimitLow – lower limit for speed in cm/s (default=2.5)
    • speedFilterLimitHigh – upper limit for speed in cm/s (default=400)
  • 2D rate maps:

    • speedFilterFlagRMaps – logical flag if speed filtering for position data is invoked (default = true)
    • binSizeSpat – bin size for spatial rate maps in cm2 (default=2.5)
    • smooth – type of smoothing; 'adaptive' (default) or 'boxcar'
    • alpha – alpha parameter for adaptive smoothing in seconds (default=200; usually shouldn’t be changed)
    • mapSize – final size of map (xy) in bins (default=[ ]); if empty we will use the max extent of sampled positions, but if you had a badly sampled environment you could reconstruct the sampled space in relation to the actual size (you will need a record of e.g. coordinates of diagonal opposite corners in a rectangular environment)
  • directional maps:

    • speedFilterFlagDMaps – logical flag if speed filtering for position data is invoked (default=true)
    • binSizeDir - bin size for directional maps in degrees (default=6)
    • dirSmoothKern – size of smoothing kernel for directional maps in degrees (default=5)
  • linear rate maps:

    • binSizeLinMaps – bin size for linear rate maps in cm (default=2.5)
    • smoothFlagLinMaps – logical flag if maps should be smoothed (default=true)
    • smoothKernelSD – SD of Gaussian smoothing kernel in bins (default=2). Kernel is 5*SD in length (should we make this smaller?)
    • speedFilterFlagLMaps – logical flag if speed filtering for position data is invoked (default=true)
    • posIsCircular – logical flag to indicate that position data is assumed to be circular, as e.g. on a square track (default=false)
    • remTrackEnds – set this many bins to NaN at each end of the linear track (default=0). Don’t use for square track data
  • Parameters for linearisation of position data:

    • minDwellForEdge – minimum dwell of animal in bin at edge of environment in seconds (default=1)
    • durThrCohRun – threshold for minimum duration of run in one direction in seconds (default=2); set to 0 if you don’t want to remove position data of run periods < threshold
    • filtSigmaForRunDir – SD of the Gaussian filter to pre-filter the data before finding CW and CCW runs in seconds (default=3); Kernel is 2*SD in length.
    • durThrJump – threshold for short periods of change of running direction that will be discounted if gradient < gradThrForJumpSmooth; in seconds (default=2)
    • gradThrForJumpSmooth - gradient of the smoothed linear positions in the jump window in cm/s (default=2.5);
    • runDimension – will be estimated from data if left empty (default=[ ]). Only used for linear track data (somewhat historic parameter as could generally be estimated from data)
    • dirTolerance - Tolerance for heading direction in degrees, relative to arm direction, for calculating run direction on track (default=70)
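
As an illustration of how binSizeLinMaps and smoothKernelSD interact, here is a minimal sketch of a linear rate map with Gaussian smoothing. It is not the package's implementation: the position and spike data are made up, the kernel is made odd in length so that it stays centred, and smoothing spike and dwell maps separately before dividing is an assumption.

```matlab
binSizeLinMaps = 2.5;   % bin size in cm
smoothKernelSD = 2;     % SD of Gaussian kernel in bins
posFs          = 50;    % position sample rate in Hz

linPos = [1 3 6 6.5 7 12 18 18.2 24];  % linearised positions in cm (made up)
spkPos = [6.2 6.8 18.1];               % positions at spike times (made up)

edges  = 0:binSizeLinMaps:25;                  % 10 bins of 2.5 cm
dwell  = histcounts(linPos, edges) ./ posFs;   % dwell time per bin in s
spikes = histcounts(spkPos, edges);            % spike count per bin

% Gaussian kernel spanning about 5*SD bins (+1 to keep the length odd)
kLen   = 5*smoothKernelSD + 1;
x      = (1:kLen) - (kLen+1)/2;
kernel = exp(-x.^2 ./ (2*smoothKernelSD^2));
kernel = kernel ./ sum(kernel);                % normalise to unit sum

% Smooth spike and dwell maps separately, then divide to get Hz
rate = conv(spikes, kernel, 'same') ./ conv(dwell, kernel, 'same');
```

Changing binSizeLinMaps changes the spatial extent of the smoothing as well, since the kernel SD is specified in bins rather than in cm.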

C) Class properties ( property(format) ):

General:

  • params (containers.Map) – map container with general params

  • chanMap (struct) – Kilosort channel map (scanpix.npix only)

  • dataPath (char) – FullPathToDataOnDisk

  • dataSetName (char) – unique identifier for dataset (note that for DACQ this will generate a nondescript name, as we don't have a metadata file for Axona data where we could get this information from)

  • trialNames (string array) – list of filenames in dataset

  • cellID (double) – (nCells,3) array. For DACQ this contains cell_ID (column 1), tetrode_ID (column 2) and a simple numeric index (column 3). For neuropixel data this is cluster_ID (column 1), cluster depth (column 2) and the channel closest to the centre of mass of the cluster (column 3)

  • cellLabel (string) – (nCells,1) string array that contains label for clusters from Kilosort ('good' or 'mua') or Phy ('good', 'mua' or 'noise') (scanpix.npix only)

  • metaData (struct) – store metadata in fields metaData.(someString)

  • trialMetaData (struct) – non-scalar structure that contains various trial specific metadata (from e.g. .set or .meta files):

    • DACQ:

      • tracked_spots – n of LEDs
      • xmin – min of camera window X
      • xmax – max of camera window X
      • ymin – min of camera window Y
      • ymax – max of camera window Y
      • sw_version – software version
      • trial_time – start of recording in time of day (as set on machine)
      • ADC_fullscale_mv – scale max for channels at gain=1 in mV (for USB systems should be 1.5V)
      • lightBearing – direction of LEDs in degrees (up to 4 lights)
      • colactive – logic index of active LEDs (probably only relevant if using multi-colour LED tracking in DACQ)
      • gains – nTetrodes x 4 array of channel gains (Note up to 32 tetrodes (128 channels) possible)
      • fullscale – nTetrodes x 4 array of scale max in µV (Note up to 32 tetrodes (128 channels) possible)
      • eeg_channel – nEEGs x 1 array of channels that EEGs were recorded from
      • eeg_recordingChannel – nEEGs x 1 array of channels that were set to EEG in DACQ (this will be same as above if EEG was recorded in mode SIGNAL but different if it was REF)
      • eeg_slot – nEEGs x 1 array of EEG number in DACQ (so .eeg, .eeg2, … , .eegN)
      • eeg_scalemax – nEEGs x 1 array of scale max for EEG channels
      • eeg_filter – nEEGs x 1 array of filter type for EEG (0=DIRECT, 1=DIRECT+NOTCH, 2=HIGHPASS, 3=LOWPASS, 4=LOWPASS+NOTCH)
      • eeg_filtresp – filter type of user defined filter (lowpass, highpass, bandpass, bandstop)
      • eeg_filtkind – filter kind for user defined filter (most likely Butterworth)
      • eeg_filtfreq1 – lower bound for user defined bandpass filter (default=300Hz)
      • eeg_filtfreq2 – upper bound for user defined bandpass filter (default=7kHz)
      • eeg_filtripple – mystery parameter of user defined filter (should be left at 0.1 according to manual)
      • ppm: pixel/m – This will contain the final ppm for the position data, i.e. will be different to original when scaling data to common ppm value
      • ppm__org: pixel/m – We store the actual raw data pix/m value here
      • trackType – 'sqtrack' or 'linear' (default='');
      • trackLength – track length in cm (default=[ ], as it will differ for each type of track). For a square track it should be the length of 1 arm only. This is crucial to make rate map size match across datasets.
    • Neuropixel data:

      • PARAMETERS GO HERE
      • ppm: pixel/m – This will contain the final ppm for the position data, i.e. will be different to original when scaling data to common ppm value
      • ppm__org: pixel/m – We store the actual raw data pix/m value here
      • trackType – 'sqtrack' or 'linear' (default='');
      • trackLength – track length in cm (default=[ ], as it will differ for each type of track). For a square track it should be the length of 1 arm only. This is crucial to make rate map size match across datasets.
  • posData (struct) – scalar structure with position data, with fields:

    • XY_raw – cell arrays of raw animal position in pixels (xy-coordinates)
    • XY – cell arrays of binned (i.e. integers) animal position in pixels (xy-coordinates)
    • direction – cell arrays of head direction of animal in degrees
    • speed – cell arrays of running speed of animal in cm/s
    • linXY – cell arrays of linearised position (will only be generated when making linear rate maps)
    • sampleT - sample times of position frames grabbed off Bonsai data. Not really used for anything (scanpix.npix only)
  • spikeData (struct) – scalar structure with spike data, with fields:

    • spk_times – cell arrays of spike times (in seconds)
    • spk_waveforms – cell arrays of waveforms (in µV); format for each cell is nSpikes-by-nSample-by-nChannel (i.e. nSpikesx50x4 for DACQ)
    • sampleT - timestamps for position frames in neuropixel time. only relevant for scanpix.npix
  • lfpData (struct) – scalar structure with eeg data, with fields:

    • lfp – cell arrays of low sample rate (250Hz) EEG data for DACQ data or the standard lfp data (2.5kHz) for neuropixel recordings (both in µV)
    • lfpHigh – cell arrays of high sample rate (4800Hz) EEG data (in µV) (scanpix.dacq only)
    • lfpTet – cell array of tetrode IDs EEGs were recorded from (scanpix.dacq only)
  • maps (struct) – scalar structure with rate maps

    • rate – cell arrays of standard 2D rate maps
    • spike – cell arrays of 2D spike maps
    • pos – cell arrays of 2D position maps
    • dir – cell arrays of directional maps
    • sACs – cell arrays of spatial autocorrelograms
    • lin – cell(3,1) arrays of linearised rate maps. Format is full rate maps ({1}), rate map for CW ({2}) and CCW ({3}) runs
    • linPos – cell arrays of linearised position maps. Format is same as for lin
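
The column layout of cellID described above lends itself to simple logical indexing. The following sketch uses a mock DACQ-style array rather than a real object; the values and the variable names (onTet2, idxTet2, nPerTet) are hypothetical.

```matlab
% Mock DACQ-style cellID array: [cell_ID, tetrode_ID, numeric index]
cellID = [1 1 1;
          2 1 2;
          1 2 3;
          3 2 4;
          1 4 5];

onTet2  = cellID(:,2) == 2;            % logical index of cells on tetrode 2
idxTet2 = cellID(onTet2, 3);           % their simple numeric indices
nPerTet = accumarray(cellID(:,2), 1);  % number of cells per tetrode ID
```

The numeric indices in idxTet2 could then be used to pick out the matching entries from other per-cell data, assuming those are stored in the same cell order.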

Hidden properties:

  • fileType (char) - identifier for data file type (i.e. .set for DACQ and .ap.bin for neuropixel data)
  • fields2spare (cell array) – cell array of fields that will not be changed when e.g. deleting data from object. Typically, fields that contain only 1 value/dataset (default DACQ={'params','dataSetName','cell_ID'}; default NPIX={'params','dataSetName','cell_ID','cell_Label'})
  • mapParams (struct) – stores default rate map params (as in defaultRateMapParams.m)
  • loadFlag (logical) – flag to indicate if any data has been loaded into object (default=false)