Our fMRI dataset is managed under version control using `datalad`.
This repository holds the 'structure' of the data, along with the smaller files, but if you clone the repository you will notice that the big files show up on your machine as 'empty' symlinks.
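To see which files are still 'empty' on your machine, you can look for symlinks whose target does not exist. The sketch below assumes the dataset uses git-annex's default symlink mode (a file without local content is a dangling symlink); `list_missing_content` is a hypothetical helper name, not a datalad command.

```shell
# List annexed files whose content has not been downloaded yet.
# Assumption: files without local content are dangling symlinks
# (git-annex's default mode). Helper name is illustrative only.
list_missing_content() {
    # $1: directory to scan (defaults to the current directory).
    # Skip the .git directory, then print every symlink whose target is missing.
    find "${1:-.}" -path '*/.git' -prune -o -type l ! -exec test -e {} \; -print
}
```

For example, `list_missing_content bids/sub-p002` should print nothing once the content for that subject has been fetched.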
The real files are chopped up into little pieces and stored elsewhere (currently this Dropbox folder). The idea behind datalad is that you only pull down the data you need for the analyses you want to run (otherwise you'd end up with over 200 GB on your machine). To do so, you'll do something like this (essentially following this guide):
datalad clone git@github.com:hawkrobe/tangrams-fmri-data.git
- Install `rclone`
- Install `git-annex-remote-rclone`
- Run `rclone config` in the cloned directory and use the following settings as you work through the prompts:
$ rclone config
No remotes found - make a new one
n/s/q> n
name> dropbox-fmri # use this exact name
Storage> dropbox
client_id> # leave this line blank
client_secret> # leave this line blank
y/n> n
If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth
Log in and authorize rclone for access
- Run the line you were told to run when you first ran `datalad clone ...`. It should look like this:
datalad siblings -d "<path>" enable -s dropbox-fmri
- Now you can pull down any files you want, e.g. `datalad get bids/sub-p002/*` will download the raw `nii` files associated with subject `p002`.
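The steps above rely on three command-line tools being on your `PATH`. As a rough sanity check before enabling the sibling, something like the following can report which ones are missing; `missing_tools` is a hypothetical helper written for this README, not part of datalad or rclone.

```shell
# Print the name of every listed tool that is not found on PATH.
# `missing_tools` is an illustrative helper, not a datalad or rclone command.
missing_tools() {
    for tool in "$@"; do
        # `command -v` succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools datalad rclone git-annex-remote-rclone` prints one line per missing tool and prints nothing if all three are installed.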
TODO
TODO
- `cleaned_transcripts` contains cleaned-up versions of the raw transcripts, checked by hand and split out by trial
- `cleaned_behavioral` contains cleaned-up versions of the behavioral data, fixing various naming glitches
- `cleaned_audio` contains trimmed recordings split by run (manually checked using Audacity)
- `raw_transcripts` contains the output of `transcribe.sh`, which runs WHISPER on the `cleaned_audio` files
- `raw_audio` contains raw recordings in `.wav` format
- `raw_behavioral` contains raw files exported from Empirica