This repository contains all the Python scripts needed to replicate the cross-temporal decoding (CTD) analyses of the paper: Enel, Pierre, Joni D. Wallis, and Erin L. Rich. "Stable and dynamic representations of value in the prefrontal cortex." eLife 9 (2020): e54313.
To replicate the results, you must first download the data set of the paper.
A conda environment file (valuedynamics.yml) is provided to avoid module
requirement issues. You can use it to re-create the environment that was used to
analyze the data. With the conda command line tool:
conda env create -f valuedynamics.yml
There are three scripts to run, each corresponding to a different step. First, ensembles must be optimized; then out-of-sample accuracy and permutations are calculated; finally, the results are plotted.
The directories must be set in parameters.py so that the scripts can find the
data and know where to save the results. Open that file and set the variable
unitfolder to the folder where the 3 data set files can be found, and resfolder
to a folder of your choice where all the results generated by these scripts will
be saved. The data and analysis parameters are the ones that were used in the
published article. They can be modified; however, we suggest not modifying the
EVT_WINS parameters.
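As a minimal sketch of the folder settings described above, the snippet below shows what the relevant part of parameters.py might look like. The paths are placeholders, the use of pathlib is an assumption (the actual file may expect plain strings), and check_folders is a hypothetical helper that is not part of the published scripts.

```python
from pathlib import Path

# Placeholder paths: replace them with locations on your own machine.
unitfolder = Path("/path/to/dataset")  # folder holding the 3 data set files
resfolder = Path("/path/to/results")   # folder where results will be written


def check_folders(unit_dir, res_dir):
    """Hypothetical helper: fail early if the data folder is missing and
    create the results folder if it does not exist yet."""
    unit_dir, res_dir = Path(unit_dir), Path(res_dir)
    if not unit_dir.is_dir():
        raise FileNotFoundError(f"Data folder not found: {unit_dir}")
    res_dir.mkdir(parents=True, exist_ok=True)
    return unit_dir, res_dir
```

Calling check_folders(unitfolder, resfolder) before launching a long computation would surface a wrong path immediately rather than after hours of processing.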
Ensemble searching and permutation testing are computationally intensive. To
spread the computational load across many CPU cores, we used the parallelization
tool dask. It distributes jobs across cores on a local machine or on a cluster
of remote machines. To obtain the published results, we used a cluster of 128
cores so that the results could be computed in a reasonable amount of time,
especially for the OFC data set, which contained around 800 neurons. If you wish
to reproduce the analysis on a smaller data set, less computational power may be
sufficient. Tutorials on how to set up a dask cluster can be found online. In
the parameters.py file, two variables can be set for parallelization:
DISTCLUSTER, which specifies the IP address and port of a distributed cluster,
and NLOCALWORKERS, which specifies the number of cores to use if local
processing is chosen instead. The choice between distributed, local, and no
parallelization is made in each individual script to facilitate debugging.
Ensemble searching is performed with the script finding_ensembles.py. There
are analysis and parallelization parameters. The analysis parameters monkeys,
regions, taskvars, subspaces and stables specify which conditions should be
explored. Each of these variables is a list, and each element of a list
corresponds to one aspect of a condition that will be explored in combination
with all possible states of the other variables (except where they are not
compatible, e.g. subspace == True and stable == False).
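The way these lists combine into conditions can be illustrated with a short sketch. It is not the code used in finding_ensembles.py: the list values are hypothetical examples, the helper names are invented for illustration, and only the one incompatibility mentioned above is encoded.

```python
from itertools import product

# Hypothetical example values; use the ones relevant to your data set.
monkeys = ["A", "B"]
regions = ["OFC"]
taskvars = ["value"]
subspaces = [True, False]
stables = [True, False]


def is_compatible(subspace, stable):
    """Only the incompatibility given as an example in the text is encoded:
    subspace == True cannot be combined with stable == False."""
    return not (subspace and not stable)


def list_conditions(monkeys, regions, taskvars, subspaces, stables):
    """Return every compatible combination of the analysis parameters."""
    return [
        combo
        for combo in product(monkeys, regions, taskvars, subspaces, stables)
        if is_compatible(subspace=combo[3], stable=combo[4])
    ]


conditions = list_conditions(monkeys, regions, taskvars, subspaces, stables)
print(len(conditions))  # 6: 2 monkeys x 3 compatible (subspace, stable) pairs
```

Listing the conditions this way before a run gives a quick estimate of how many jobs a given set of lists will generate.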
For parallelization, there is a parallel parameter that can be set to True to
use a dask cluster or to False to process all data sequentially on a single
core (relevant for debugging). If parallel is set to True, the parameter
cluster specifies whether a 'local' or 'distributed' cluster should be used. If
'distributed', the IP address of the dask cluster specified in parameters.py
will be used.
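The three modes can be sketched with the dask.distributed API as follows. This is an illustration under assumptions, not the code from the scripts: make_client is a hypothetical helper, its arguments n_local_workers and dist_address stand in for NLOCALWORKERS and DISTCLUSTER, and the "ip:port" string format for the scheduler address is assumed.

```python
def make_client(parallel, cluster="local", n_local_workers=4,
                dist_address="127.0.0.1:8786"):
    """Return a dask Client, or None for sequential processing.

    The default values are placeholders; in practice they would come from
    NLOCALWORKERS and DISTCLUSTER in parameters.py.
    """
    if not parallel:
        return None  # run everything sequentially on a single core
    if cluster not in ("local", "distributed"):
        raise ValueError("cluster must be 'local' or 'distributed'")

    # Imported lazily so that the sequential mode does not require dask.
    from dask.distributed import Client, LocalCluster

    if cluster == "local":
        # Start worker processes on the local machine.
        return Client(LocalCluster(n_workers=n_local_workers))
    # Connect to an already running scheduler at the given address.
    return Client(dist_address)
```

With a client in hand, jobs are typically submitted with client.submit or client.map, and the client is closed with client.close() once all results have been gathered.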
The performance for each ensemble explored in each condition is saved in the folder specified in parameters.py, with one file for each cross-validation fold of each condition. When all folds of a condition have been computed, the files are combined and the individual fold files are deleted.
Testing and permutations are performed with the CTD_test_and_permutations.py
script. The parameters are much the same as for ensemble searching, except for
two additional parameters: ensembles, which specifies whether optimized
ensembles are used to compute the CTD, and permutation, which specifies whether
the computations correspond to permutation testing or to testing with the
original data (None for original data testing, or an integer to specify the
number of permutations). Note that the stable variable can be either True or
False, but in the case of subspace == False and ensemble == False, stable can
only be True and corresponds to performing CTD on the original data. All
incompatible combinations of variables are ignored and signaled by the script.
The results are saved in the result folder specified in parameters.py.
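The rules for permutation and stable can be summarized in a small check. The function below is a hypothetical helper written for illustration, not part of CTD_test_and_permutations.py; it raises an error on an invalid setting, whereas the script itself ignores and signals incompatible combinations. It encodes only the constraint stated in this paragraph.

```python
def check_test_settings(subspace, ensemble, stable, permutation):
    """Return a short description of the run, or raise ValueError."""
    if permutation is not None:
        # bool is a subclass of int in Python, so it is excluded explicitly.
        is_int = isinstance(permutation, int) and not isinstance(permutation, bool)
        if not is_int or permutation < 1:
            raise ValueError("permutation must be None or a positive integer")
    if not subspace and not ensemble and not stable:
        raise ValueError(
            "with subspace == False and ensemble == False, stable can only be True"
        )
    if permutation is None:
        return "test on original data"
    return f"{permutation} permutations"


print(check_test_settings(False, False, True, None))
print(check_test_settings(False, True, False, 1000))
```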
Plotting is done with the script plot_CTD_results.py by specifying the condition that one wants to plot at the end of the script.
Don't hesitate to contact me at pierre.enel@gmail.com if you have trouble running the scripts or if you find bugs in the code.