
Installation and Usage Instructions


Prerequisites

The HCP Pipelines have the following software requirements:

  1. A 64-bit Linux Operating System

  2. The FMRIB Software Library (a.k.a. FSL) version 6.0.2 or greater, installed and with its configuration file properly sourced. The latest version of FSL is recommended.

  3. FreeSurfer version 6.0 available at http://surfer.nmr.mgh.harvard.edu/fswiki/DownloadAndInstall/

    FreeSurfer 7.X is not currently supported due to poor quality surface reconstructions.

    NB: You must create and install a license file for FreeSurfer by submitting the FreeSurfer registration form.

    NB: For now, FreeSurfer 5.3.0-HCP (used for original processing of HCP data) is still supported, but if you use it you must use the FreeSurferPipeline-v5.3.0-HCP.sh pipeline script instead (i.e., modify the ${queuing_command} line in FreeSurferPipelineBatch.sh to use ${HCPPIPEDIR}/FreeSurfer/FreeSurferPipeline-v5.3.0-HCP.sh).

  4. Connectome Workbench version 1.4.2 or later

    The HCP Pipelines scripts use wb_command which is part of the Connectome Workbench. They locate wb_command using an environment variable. Instructions for setting this environment variable are provided below in the Running the HCP Pipelines on example data section.

  5. The HCP version of gradunwarp, version 1.1.0 (gradient_unwarp.py), if gradient nonlinearity correction is to be done.

  6. MSM_HOCR v3.0 (github v1.0), available at https://github.com/ecr05/MSM_HOCR/releases. This is the multi-modal surface matching algorithm used in MSMSulc and MSMAll. The pre-compiled binaries are fine if they work on your operating system.

  7. FSL FIX v1.0.6.14 or later, available at https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FIX. sICA+FIX is used for cleaning spatially specific structured noise from fMRI data. Please see https://github.com/Washington-University/HCPpipelines/blob/master/ICAFIX/README.md for supplemental instructions for sICA+FIX.
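
Once these are installed, a quick sanity check from the shell is to ask the main tools for their versions. This is only a rough sketch; the exact version-reporting options can differ slightly between releases of each tool.

    $ cat ${FSLDIR}/etc/fslversion    # FSL version string
    $ recon-all -version              # FreeSurfer version
    $ wb_command -version             # Connectome Workbench version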


Notes on gradient nonlinearity correction

  1. Gradient Nonlinearity Correction is sometimes also referred to as Gradient Distortion Correction or GDC.

  2. As is true of the other prerequisite pieces of software, the HCP version of gradunwarp has its own set of prerequisites. See the HCP gradunwarp README file for those prerequisites.

  3. In order to run HCP gradunwarp, you will need a gradient coefficients file to use as an input to the gradient distortion correction process. Please see questions 7 and 8 in the HCP Pipelines FAQ for further information about gradient nonlinearity correction and obtaining a gradient coefficients file.

  4. The HCP Pipelines scripts expect to be able to find the main module of the gradunwarp tool (gradient_unwarp.py) within a directory specified in the PATH environment variable.

  5. As distributed, the example scripts that serve as templates for running various types of pipeline processing are set to not run gradient distortion correction. Commented out portions of those scripts illustrate how to change the variable settings to perform gradient distortion correction. These commented out portions assume that you have placed the gradient coefficients file in the standard configuration directory for your installation of HCP Pipelines (the global/config directory within your HCP Pipelines installation directory).
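
For example, for item 4 above you can confirm that gradient_unwarp.py is discoverable from your shell; /path/to/gradunwarp/bin below is a placeholder for wherever you installed it:

    $ command -v gradient_unwarp.py              # should print the full path to gradient_unwarp.py
    $ export PATH=/path/to/gradunwarp/bin:$PATH  # only needed if the command above prints nothing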


Installation

  1. Install the listed prerequisites first.

    • Installation Notes for FSL

      • Once you have installed FSL, verify that you have the correct version of FSL by simply running the $ fsl command. The FSL window that shows up should identify the version of FSL you are running in its title bar.

      • FSL is sometimes installed without the separate documentation package; it is worth the extra effort to install the FSL documentation package as well.

    • Ubuntu Installation Notes for FreeSurfer

      • For Linux, FreeSurfer is distributed in gzipped tarballs for CentOS 4 and CentOS 6.

      • The instructions here provide guidance for installing FreeSurfer on Ubuntu. If you follow those instructions, be sure to download version 6.0 of FreeSurfer rather than the version 5.1.0 they mention.

      • Ubuntu (at least starting with version 12.04 and running through version 14.04 LTS) is missing a library that is used by some parts of FreeSurfer. To install that library enter $ sudo apt-get install libjpeg62.

  2. Download the necessary compressed tar file (.tar.gz) for the HCP Pipelines release.

  3. Move the compressed tar file that you download to the directory in which you want the HCP Pipelines to be installed, e.g.

     $ mv HCPpipelines-4.3.0.tar.gz ~/projects
    
  4. Extract the files from the compressed tar file, e.g.

     $ cd ~/projects
     $ tar xvf HCPpipelines-4.3.0.tar.gz
    
  5. This will create a directory containing the HCP Pipelines, e.g.

     $ cd ~/projects/HCPpipelines-4.3.0
     $ ls -F
     bedpostX/                fMRIVolume/                        LICENSE.md       product.txt          TaskfMRIAnalysis/
     DeDriftAndResample/      FreeSurfer/                        MSMAll/          README.md            version.txt
     DiffusionPreprocessing/  GenerateSpinEchoBiasFieldPrereqs/  MSMConfig/       RestingStateStats/
     Examples/                global/                            PostFreeSurfer/  show_version*
     fMRISurface/             ICAFIX/                            PreFreeSurfer/   Supplemental/
     $
    
  6. This newly created directory is your HCP Pipelines Directory.

    In this documentation, in documentation within the script files themselves, and elsewhere, we will use the terminology HCP Pipelines Directory interchangeably with HCPPIPEDIR, $HCPPIPEDIR, or ${HCPPIPEDIR}.

    More specifically, $HCPPIPEDIR and ${HCPPIPEDIR} refer to an environment variable that will be set to contain the path to your HCP Pipelines Directory.
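
    For example, if you extracted the release under ~/projects as shown above, this environment variable could be set like this (adjust the path if you installed elsewhere):

     $ export HCPPIPEDIR="${HOME}/projects/HCPpipelines-4.3.0"
     $ echo ${HCPPIPEDIR}
     /home/user/projects/HCPpipelines-4.3.0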


Getting example data

Example data for becoming familiar with the process of running the HCP Pipelines and testing your installation is available from the Human Connectome Project.

If you already have (or will be obtaining) the gradient coefficients file for the Connectome Skyra scanner used to collect the sample data and want to run the pipelines including the steps which perform gradient distortion correction, you can download a zip file containing example data here. The download requires a ConnectomeDB account and signing of the data use terms as described on p.24 of the HCP Release Manual.

In that case, you will need to place the obtained gradient coefficients file (coeff_SC72C_Skyra.grad) in the global/config directory within your HCP Pipelines Directory.

If you do not have and are not planning to obtain the gradient coefficients file for the Connectome Skyra scanner used to collect the sample data and want to run the pipelines on files on which gradient distortion correction has already been performed, you should download a zip file containing example data here. The download requires a ConnectomeDB account and signing of the data use terms as described on p.24 of the HCP Release Manual.

The remainder of these instructions assume you have extracted the example data into the directory ~/projects/Pipelines_ExampleData. You will need to modify the instructions accordingly if you have extracted the example data elsewhere.


Running the HCP Pipelines on example data

Structural preprocessing

Structural preprocessing is subdivided into 3 parts (Pre-FreeSurfer processing, FreeSurfer processing, and Post-FreeSurfer processing). These 3 steps should be executed in the order specified, and each of these 3 parts is implemented as a separate bash script.

Pre-FreeSurfer processing

In the ${HCPPIPEDIR}/Examples/Scripts directory, you will find a shell script for running a batch of subject data through the Pre-FreeSurfer part of structural preprocessing. This shell script is named: PreFreeSurferPipelineBatch.sh. You should review and possibly edit that script file to run the example data through the Pre-FreeSurfer processing.

StudyFolder

The setting of the StudyFolder variable near the top of this script should be verified or edited. This variable should contain the path to a directory that will contain data for all subjects in subdirectories named for each of the subject IDs.

As distributed, this variable is set with the assumption that you have extracted the sample data into a directory named projects/Pipelines_ExampleData within your login or "home" directory.

   StudyFolder="${HOME}/projects/Pipelines_ExampleData"

You should either verify that your example data is extracted to that location or modify the variable setting accordingly.

Subjlist

The setting of the Subjlist variable, which comes immediately after the setting of the StudyFolder variable, should also be verified or edited. This variable should contain a space delimited list of the subject IDs for which you want the Pre-FreeSurfer processing to run.

As distributed, this variable is set showing the syntax for running the script on two subjects in a space delimited list "100307 100610". Under the assumption that you will run the processing only for the single example subject which has a subject ID of 100307, as provided in the example data, you need to change this variable to list only that subject:

   Subjlist="100307"

Using this value in conjunction with the value of the StudyFolder variable, the script will look for a directory named 100307 within the directory ${HOME}/projects/Pipelines_ExampleData. This is where it will expect to find the data it is to process.

You should either verify that your example data is in that location or modify the variable setting accordingly.
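
A quick way to confirm that the data is where the script will look for it (assuming the example locations used here) is:

    $ ls ${HOME}/projects/Pipelines_ExampleData
    100307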

EnvironmentScript

The EnvironmentScript variable should contain the path to a script that sets up the environment variables that are necessary for running the Pipeline scripts.

As distributed, this variable is set with the assumption that you have installed the HCP Pipelines in the directory ${HOME}/projects/Pipelines (i.e. that your HCP Pipelines directory is ${HOME}/projects/Pipelines) and that you will use the example environment setup provided in the Examples/Scripts/SetUpHCPPipeline.sh script.

You may need to update the setting of the EnvironmentScript variable to reflect where you have installed the HCP Pipelines.
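
With the distributed assumptions (HCP Pipelines installed in ${HOME}/projects/Pipelines and the example setup script in use), the setting looks similar to:

   EnvironmentScript="${HOME}/projects/Pipelines/Examples/Scripts/SetUpHCPPipeline.sh"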

QUEUE

The QUEUE variable should be set to the processing queue to be used if submitting to a job scheduler. The queue names available to you will depend on your cluster configuration and you may need to consult with your cluster admin to choose the appropriate queue.

As distributed, the script sets QUEUE="", meaning no job scheduler queue is used and processing runs locally. If you have a scheduler queue, uncomment the QUEUE="hcp-priority.q" line and change it to the name of your queue.
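
The relevant lines in the batch script look something like this:

   QUEUE=""                 # as distributed: no scheduler queue, run locally via fsl_sub
   #QUEUE="hcp-priority.q"  # uncomment and change to the name of your cluster queue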

GradientDistortionCoeffs

Further down in the script, the GradientDistortionCoeffs variable is set. This variable should be set to contain either the path to the gradient coefficients file to be used for gradient distortion correction or the value NONE to skip over the gradient distortion correction step.

As distributed, the script sets the variable to skip the gradient distortion correction step.

You will need to update the setting of this variable (and comment out the current setting) if you have a gradient coefficients file to use and want to perform the gradient distortion correction.
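
A sketch of the two alternatives, assuming the Connectome Skyra coefficients file has been placed in the global/config directory described earlier:

   GradientDistortionCoeffs="NONE"   # as distributed: skip gradient distortion correction
   #GradientDistortionCoeffs="${HCPPIPEDIR}/global/config/coeff_SC72C_Skyra.grad"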

HCPPIPEDIR and the SetUpHCPPipeline.sh script

The script file referenced by the EnvironmentScript variable in the PreFreeSurferPipelineBatch.sh file (by default the SetUpHCPPipeline.sh file in the Examples/Scripts folder) does nothing but establish values for all the environment variables that will be needed by various pipeline scripts.

Many of the environment variables set in the SetUpHCPPipeline.sh script are set relative to the HCPPIPEDIR environment variable.

As distributed, the setting of the HCPPIPEDIR environment variable is blank. Modify this to export HCPPIPEDIR=${HOME}/projects/Pipelines, or to whatever directory you have installed the HCP Pipelines in.

As distributed, the SetUpHCPPipeline.sh script assumes that you have:

  • properly installed FSL
  • set the FSLDIR environment variable
  • sourced the FSL configuration script
  • properly installed FreeSurfer
  • set the FREESURFER_HOME environment variable
  • sourced the FreeSurfer setup script

Example statements for setting FSLDIR, sourcing the FSL configuration script, setting FREESURFER_HOME, and sourcing the FreeSurfer setup script are provided but commented out in the SetUpHCPPipeline.sh script prior to setting HCPPIPEDIR.
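
Uncommented and filled in, that portion of SetUpHCPPipeline.sh might look roughly like the following. The FSL and FreeSurfer install locations shown here are assumptions; substitute the paths used on your system.

    # FSL (location is an assumption; adjust to your install)
    export FSLDIR=/usr/local/fsl
    source ${FSLDIR}/etc/fslconf/fsl.sh

    # FreeSurfer (location is an assumption; adjust to your install)
    export FREESURFER_HOME=/usr/local/freesurfer
    source ${FREESURFER_HOME}/SetUpFreeSurfer.sh

    # HCP Pipelines
    export HCPPIPEDIR=${HOME}/projects/Pipelines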

MSMBINDIR in the SetUpHCPPipeline.sh script

The MSMBINDIR variable must provide the path to the directory in which to find the MSM_HOCR v3.0/(github v1.0) multi-modal surface matching algorithm used in MSMSulc and MSMAll. As distributed, the MSMBINDIR is set with the assumption that the necessary MSM binary is installed in the ${HOME}/pipeline_tools/MSM directory.

It is very likely that you will need to change the value of the MSMBINDIR environment variable to indicate the location of your installed version of MSM.

MATLAB_COMPILER_RUNTIME in the SetUpHCPPipeline.sh script

If compiled MATLAB is to be used for any of the pipelines that use MATLAB (e.g. ICA+FIX), the MATLAB_COMPILER_RUNTIME variable must provide the path to the directory in which to find the R2022b (v9.13) MCR, which is the version of the MCR used to compile the MATLAB functions specific to the HCPpipelines. If interpreted MATLAB will be used, the setting of this variable does not matter, because the compiled MATLAB functions will not be used.

FSL_FIXDIR in the SetUpHCPPipeline.sh script

The FSL_FIXDIR variable must provide the path to the directory in which to find FSL FIX v1.0.6.14 or later. sICA+FIX is used for cleaning spatially specific structured noise from fMRI data. As distributed, FSL_FIXDIR is set with the assumption that FIX is installed in the /usr/local/fix directory.

It is very likely that you will need to change the value of the FSL_FIXDIR environment variable to indicate the location of your installed version of FSL_FIX.

CARET7DIR in the SetUpHCPPipeline.sh script

The CARET7DIR variable must provide the path to the directory in which to find the Connectome Workbench wb_command. As distributed, the CARET7DIR is set with the assumption that the necessary wb_command binary is installed in the ${HOME}/workbench/bin_linux64 directory.

It is very likely that you will need to change the value of the CARET7DIR environment variable to indicate the location of your installed version of wb_command.
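
Taken together, the tool-location exports in SetUpHCPPipeline.sh might look roughly like the following. The values shown are the assumed defaults mentioned above (plus a hypothetical MCR location); change each one to match where the tools are actually installed on your system.

    export MSMBINDIR=${HOME}/pipeline_tools/MSM                             # directory containing the msm binary
    export MATLAB_COMPILER_RUNTIME=/usr/local/MATLAB/MATLAB_Runtime/v913    # hypothetical path to the R2022b (v9.13) MCR
    export FSL_FIXDIR=/usr/local/fix                                        # FSL FIX installation directory
    export CARET7DIR=${HOME}/workbench/bin_linux64                          # directory containing wb_command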

Running the Pre-FreeSurfer processing after editing the setup script

Once you have made any necessary edits as described above, Pre-FreeSurfer processing can be invoked by commands similar to:

    $ cd ~/projects/Pipelines/Examples/Scripts
    $ ./PreFreeSurferPipelineBatch.sh
    This script must be SOURCED to correctly setup the environment
    prior to running any of the other HCP scripts contained here

    100307
    Found 1 T1w Images for subject 100307
    Found 1 T2w Images for subject 100307
    About to use fsl_sub to queue or run /home/user/projects/Pipelines/PreFreeSurfer/PreFreeSurferPipeline.sh

After reporting the number of T1w and T2w images found, the PreFreeSurferPipelineBatch.sh script uses the FSL command fsl_sub to submit a processing job which ultimately runs the PreFreeSurferPipeline.sh pipeline script.

If your system is configured to run jobs via an Oracle Grid Engine cluster (previously known as a Sun Grid Engine (SGE) cluster), then fsl_sub will submit a job to run the PreFreeSurferPipeline.sh script on the cluster and then return you to your system prompt. You can check on the status of your running cluster job using the qstat command. See the documentation of the qstat command for further information.
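
For example, a typical status check on an SGE-compatible cluster looks like:

    $ qstat -u $USER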

The standard output (stdout) and standard error (stderr) for the job submitted to the cluster will be redirected to files in the directory from which you invoked the batch script. Those files will be named PreFreeSurferPipeline.sh.o<job-id> and PreFreeSurferPipeline.sh.e<job-id> respectively, where <job-id> is the cluster job ID. You can monitor the progress of the processing with a command like:

    $ tail -f PreFreeSurferPipeline.sh.o1434030

where 1434030 is the cluster job ID. Similarly, you can monitor errors with a command like

    $ more PreFreeSurferPipeline.sh.e1434030

If your system is not configured to run jobs via an Oracle Grid Engine cluster or if you have left the QUEUE setting blank, fsl_sub will run the PreFreeSurferPipeline.sh script directly on the system from which you launched the batch script. Your invocation of the batch script will appear to reach a point at which "nothing is happening." However, the PreFreeSurferPipeline.sh script will be launched in a separate process and the standard output (stdout) and standard error (stderr) will have been redirected to files in the directory from which you invoked the batch script. The files will be named PreFreeSurferPipeline.sh.o<process-id> and PreFreeSurferPipeline.sh.e<process-id> respectively, where <process-id> is the operating system assigned unique process ID for the running process.

A similar tail command to the one above will allow you to monitor the progress of the processing.

Keep in mind that depending upon your processor speed and whether or not you are performing gradient distortion correction, the Pre-FreeSurfer phase of processing can take several hours.

FreeSurfer processing

In the ${HCPPIPEDIR}/Examples/Scripts directory, you will find a shell script for running a batch of subject data through the FreeSurfer part of structural preprocessing. This shell script is named: FreeSurferPipelineBatch.sh. You should review and possibly edit that script file to run the example data through the FreeSurfer processing.

The StudyFolder, Subjlist, and EnvironmentScript variables are set near the top of the script and should be verified and edited as indicated above in the discussion of Pre-FreeSurfer processing.

Your environment script (SetUpHCPPipeline.sh) will need to have the same environment variables set as for the Pre-FreeSurfer processing.

Once you have made any necessary edits as described above and the Pre-FreeSurfer processing has completed, then invoking FreeSurfer processing is quite similar to invoking Pre-FreeSurfer processing. The command will be similar to:

    $ cd ~/projects/Pipelines/Examples/Scripts
    $ ./FreeSurferPipelineBatch.sh
    This script must be SOURCED to correctly setup the environment
    prior to running any of the other HCP scripts contained here

    100307
    About to use fsl_sub to queue or run /home/user/projects/Pipelines/FreeSurfer/FreeSurferPipeline.sh

As above, the fsl_sub command will either start a new process on the current system or submit a job to an Oracle Grid Engine cluster. Also as above, you can monitor the progress by viewing the generated standard output and standard error files.

Post-FreeSurfer processing

In the ${HCPPIPEDIR}/Examples/Scripts directory, you will find a shell script for running a batch of subject data through the Post-FreeSurfer part of structural preprocessing. This shell script is named: PostFreeSurferPipelineBatch.sh. This script follows the same pattern as the batch scripts for Pre-FreeSurfer and FreeSurfer processing. That is, you will need to verify/edit the StudyFolder, Subjlist, and EnvironmentScript variables that are set at the top of the script, and your environment script (SetUpHCPPipeline.sh) will need to set environment variables appropriately.

There is an additional variable in the PostFreeSurferPipelineBatch.sh script that needs to be considered. The RegName variable tells the pipeline whether to use MSMSulc for surface alignment. (See the FAQ for further information about MSMSulc.) As distributed, the default in PostFreeSurferPipelineBatch.sh script assumes that you do have access to the msm binary/executable to use for surface alignment. Therefore, the RegName variable is set to "MSMSulc". Note that your environment script (e.g. SetUpHCPPipeline.sh) will need to set an additional environment variable which is used by the pipeline scripts to locate the msm binary/executable. That environment variable is MSMBINDIR and it is set in the distributed example SetUpHCPPipeline.sh file as follows:

    export MSMBINDIR=${HCPPIPEDIR}/MSMBinaries

You will need to either place your msm executable binary file in the ${HCPPIPEDIR}/MSMBinaries directory or modify the value given to the MSMBINDIR environment variable so that it contains the path to the directory in which you have placed your copy of the msm executable. The msm executable/binary file must be named msm.
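
A minimal sketch of the first option, where /path/to/your/msm is a placeholder for wherever you downloaded or built the MSM executable:

    $ mkdir -p ${HCPPIPEDIR}/MSMBinaries
    $ cp /path/to/your/msm ${HCPPIPEDIR}/MSMBinaries/msm   # the file must be named msm
    $ chmod +x ${HCPPIPEDIR}/MSMBinaries/msm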

Alternatively, you can use the FreeSurfer surface alignment by setting the RegName variable to "FS", although that is not the recommended approach.

Once those things are taken care of and FreeSurfer processing is completed, commands like the following can be issued:

    $ cd ~/projects/Pipelines/Examples/Scripts
    $ ./PostFreeSurferPipelineBatch.sh
    This script must be SOURCED to correctly setup the environment
    prior to running any of the other HCP scripts contained here

    100307
    About to use fsl_sub to queue or run /home/user/projects/Pipelines/PostFreeSurfer/PostFreeSurferPipeline.sh

The fsl_sub command used in the batch script will behave as described above and monitoring the progress of the run can be done as described above.

Functional Preprocessing

Functional Preprocessing depends on the outputs generated by Structural Preprocessing. So Functional Preprocessing should not be attempted on data sets for which Structural Preprocessing is not yet complete.

Functional Preprocessing is divided into 2 parts: Generic fMRI Volume Preprocessing and Generic fMRI Surface Preprocessing. Generic fMRI Surface Preprocessing depends upon output produced by the Generic fMRI Volume Preprocessing. So fMRI Surface Preprocessing should not be attempted on data sets for which fMRI Volume Preprocessing is not yet complete.

As is true of the other types of preprocessing discussed above, there are example scripts for running each of the two types of Functional Preprocessing.

Generic fMRI Volume Preprocessing

The GenericfMRIVolumeProcessingPipelineBatch.sh script in the ${HCPPIPEDIR}/Examples/Scripts directory is the starting point for running volumetric functional preprocessing. Like the sample scripts mentioned above, you will need to verify or edit the StudyFolder, Subjlist, EnvironmentScript, and QUEUE variables defined near the top of the batch processing script. Additionally, you will need to verify or edit the GradientDistortionCoeffs variable near the bottom of the script. As distributed, this value is set to "NONE" to skip gradient distortion correction.

In addition to these variable modifications, you should check or edit the contents of the Tasklist variable. This variable holds a space delimited list of the functional tasks that you would like preprocessed. As distributed, the Tasklist variable is set to preprocess the 4 HCP resting state runs (rfMRI_REST1_RL, rfMRI_REST1_LR, rfMRI_REST2_RL, and rfMRI_REST2_LR) and all 7 HCP tasks (2 runs each, e.g. tfMRI_EMOTION_RL, tfMRI_EMOTION_LR, tfMRI_WM_RL, tfMRI_WM_LR, tfMRI_SOCIAL_RL, tfMRI_SOCIAL_LR, etc.). The Tasklist variable can be modified with fewer tasks or with your own list of task and/or rest runs for the data you are processing.
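
For example, to preprocess only the four resting state runs you could set:

   Tasklist="rfMRI_REST1_RL rfMRI_REST1_LR rfMRI_REST2_RL rfMRI_REST2_LR"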

Generic fMRI Surface Preprocessing

The GenericfMRISurfaceProcessingPipelineBatch.sh script in the ${HCPPIPEDIR}/Examples/Scripts directory is the starting point for running surface based functional preprocessing. As has been the case with the other sample scripts, you will need to verify or edit the StudyFolder, Subjlist, and EnvironmentScript variables defined near the top of the batch processing script.

In addition to these variable modifications, you should check or edit the contents of the Tasklist variable. This variable holds a space delimited list of the functional tasks that you would like preprocessed. As distributed, as above, the Tasklist variable is set to preprocess the 4 HCP resting state runs (rfMRI_REST1_RL, rfMRI_REST1_LR, rfMRI_REST2_RL, and rfMRI_REST2_LR) and all 7 HCP tasks. As above in the volume based functional preprocessing, you can process fewer or add other tasks to the list in the Tasklist variable depending on the data to be preprocessed.

Like the Post-FreeSurfer pipeline, you will also need to set the RegName variable to either MSMSulc or FS.
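
In other words, you will have a line similar to:

   RegName="MSMSulc"   # or "FS" to use the FreeSurfer folding-based alignment (not recommended)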


ICA FIX pipeline

fMRI data can be further processed (after Functional Preprocessing is complete) using the FMRIB group's ICA-based X-noiseifier, FIX (ICA FIX). This processing regresses out motion timeseries and artifact ICA components (ICA is run using MELODIC and components are classified using FIX; Salimi-Khorshidi et al., 2014).

See ICAFIX/README for further details and IcaFixProcessingBatch.sh for an example script. As distributed, the fMRINames variable is set to run multi-run FIX with bandpass = 0 on two concatenated groups of task runs with ConcatNames tfMRI_WM_GAMBLING_MOTOR_RL_LR and tfMRI_LANGUAGE_SOCIAL_RELATIONAL_EMOTION_RL_LR. You may want to modify this to include Resting State runs in fMRINames and ConcatNames.

MSMAll pipeline

After Functional Preprocessing and ICA+FIX processing, re-registration of cortical surfaces can be performed in the MSMAll pipeline using cortical folding along with myelin map and resting state functional information (MSMAll).

See MSM FAQ and MSMAll/README for further details and MSMAllPipelineBatch.sh for an example script.

Task Analysis

Task fMRI (tfMRI) data can be further processed (after Functional Preprocessing is complete [required], and after ICA+FIX and temporal ICA (tICA) is complete [recommended]) using the TaskfMRIAnalysisBatch.sh script in the ${HCPPIPEDIR}/Examples/Scripts directory.
The TaskfMRIAnalysisBatch.sh script runs Level 1 and Level 2 Task fMRI Analysis. As has been the case with the other sample scripts, you will need to verify or edit the StudyFolder, Subjlist, and EnvironmentScript variables defined at the top of this batch processing script.

In addition to these variable modifications, you should check or edit the contents of the LevelOneTasksList, LevelOneFSFsList, LevelTwoTaskList, and LevelTwoFSFList variables. As distributed, these variables are configured to perform Level 1 task analysis only on the RL and LR conditions for all 7 HCP tasks and Level 2 task analysis on the combined results of the RL and LR Level 1 analysis for each of the 7 tasks, e.g. EMOTION, SOCIAL, WM, etc. You can add other conditions for Level 1 and Level 2 analysis by altering the settings of these variables. Please be aware that changing these settings will alter the types of analysis that are done for all subjects listed in the Subjlist variable.

Note: If you are starting with already Structurally and Functionally Preprocessed data as supplied by the HCP, rather than starting with unprocessed data and doing Structural Preprocessing and Functional Preprocessing yourself as described in this document, then the FSF files described below that are necessary for both Level 1 and Level 2 Task Analysis have already been created and are supplied in the package of Functionally Preprocessed data provided by the HCP.

Preparing to do Level 1 Task Analysis

Level 1 Task Analysis requires FEAT setup files (FSF files) for each direction of the functional task.

For example, to perform Level 1 Task Analysis for the tfMRI_EMOTION_RL and tfMRI_EMOTION_LR tasks for subject 100307, the following FEAT setup files must exist before running the Task Analysis pipeline:

  • <StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION_LR/tfMRI_EMOTION_LR_hp200_s4_level1.fsf
  • <StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION_RL/tfMRI_EMOTION_RL_hp200_s4_level1.fsf

Templates for these files can be found in the ${HCPPIPEDIR}/Examples/fsf_templates directory. The number of time points entry in the template for each functional task must match the actual number of time points in the corresponding scan.

An entry in the tfMRI_EMOTION_LR_hp200_s4_level1.fsf file might look like:

    # Total volumes
    set fmri(npts) 176

The 176 value must match the number of volumes (number of time points) in the corresponding tfMRI_EMOTION_LR.nii.gz scan file. After you have copied the appropriate template for a scan to the indicated location in the MNINonLinear/Results/<task> directory, you may have to edit the .fsf file to make sure the value that it has for Total volumes matches the number of time points in the corresponding scan file.
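
One way to check the number of volumes in a scan is FSL's fslval utility; for example (the path shown assumes the example study folder used here):

    $ cd ${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION_LR
    $ fslval tfMRI_EMOTION_LR.nii.gz dim4
    176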

There is a script in the ${HCPPIPEDIR}/Examples/Scripts directory named generate_level1_fsf.sh which can be either studied or used directly to retrieve the number of time points from an image file and set the correct Total volumes value in the .fsf file for a specified task.

Typical invocations of the generate_level1_fsf.sh script would look like:

    $ cd ${HCPPIPEDIR}/Examples/Scripts

    $ ./generate_level1_fsf.sh \
    >   --studyfolder=${HOME}/projects/Pipelines_ExampleData \
    >   --subject=100307 --taskname=tfMRI_EMOTION_RL --templatedir=../fsf_templates \
    >   --outdir=${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION_RL

    $ ./generate_level1_fsf.sh \
    >   --studyfolder=${HOME}/projects/Pipelines_ExampleData \
    >   --subject=100307 --taskname=tfMRI_EMOTION_LR --templatedir=../fsf_templates \
    >   --outdir=${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION_LR

This must be done for every direction of every task for which you want to perform task analysis.

Level 1 Task Analysis also requires that E-Prime EV files be available in the MNINonLinear/Results subdirectory for each task on which Level 1 Task Analysis is to occur. These EV files are available in the example unprocessed data, but are not in the MNINonLinear/Results directory because that directory is created as part of Functional Preprocessing. Since Functional Preprocessing must be completed before Task Analysis can be performed, the MNINonLinear/Results folder should exist prior to Task Analysis.

There is a script in the ${HCPPIPEDIR}/Examples/Scripts directory named copy_evs_into_results.sh. This script can be used to copy the necessary E-Prime EV files for a task into the appropriate place in the MNINonLinear/Results directory.

Typical invocations of the copy_evs_into_results.sh script would look like:

    $ cd ${HCPPIPEDIR}/Examples/Scripts

    $ ./copy_evs_into_results.sh \
    >   --studyfolder=${HOME}/projects/Pipelines_ExampleData \
    >   --subject=100307 \
    >   --taskname=tfMRI_EMOTION_RL

    $ ./copy_evs_into_results.sh \
    >   --studyfolder=${HOME}/projects/Pipelines_ExampleData \
    >   --subject=100307 \
    >   --taskname=tfMRI_EMOTION_LR

This must be done for every direction of every task for which you want to perform task analysis.

Preparing to do Level 2 Task Analysis

Level 2 Task Analysis requires a FEAT setup file also. For example, to perform Level 2 Task Analysis for the tfMRI_EMOTION task for subject 100307 (combination data from tfMRI_EMOTION_RL and tfMRI_EMOTION_LR) the following FEAT setup file must exist before running the Task Analysis pipeline:

  • <StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION/tfMRI_EMOTION_hp200_s4_level2.fsf

The template file named tfMRI_EMOTION_hp200_s4_level2.fsf in the ${HCPPIPEDIR}/Examples/fsf_templates directory can be copied, unchanged, to the appropriate location before running the Task Analysis pipeline. You will likely have to create the level 2 results directory, e.g. <StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION (notice that this directory name does not end with _LR or _RL), before you can copy the template into that directory.
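
A sketch of those two steps, using ${StudyFolder} as a stand-in for your study directory:

    $ mkdir -p ${StudyFolder}/100307/MNINonLinear/Results/tfMRI_EMOTION
    $ cp ${HCPPIPEDIR}/Examples/fsf_templates/tfMRI_EMOTION_hp200_s4_level2.fsf \
         ${StudyFolder}/100307/MNINonLinear/Results/tfMRI_EMOTION/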

Diffusion Preprocessing

Diffusion Preprocessing depends on the outputs generated by Structural Preprocessing. So Diffusion Preprocessing should not be attempted on data sets for which Structural Preprocessing is not yet complete.

The DiffusionPreprocessingBatch.sh script in the ${HCPPIPEDIR}/Examples/Scripts directory is much like the example scripts for the 3 phases of Structural Preprocessing. The StudyFolder, Subjlist, and EnvironmentScript variables set at the top of the batch script need to be verified or edited as above.

Like the PreFreeSurferPipelineBatch.sh script, the DiffusionPreprocessingBatch.sh script also needs a variable set to the path to the gradient coefficients file, or to NONE if gradient distortion correction is to be skipped. In the DiffusionPreprocessingBatch.sh script that variable is Gdcoeffs. As distributed, this example script is set up with the assumption that you will skip gradient distortion correction. If you have a gradient coefficients file available and would like to perform gradient distortion correction, you will need to update the Gdcoeffs variable to contain the path to your gradient coefficients file.


A note about resource requirements

The memory and processing time requirements for running the HCP Pipelines scripts are relatively high. To provide a reference point, when the HCP runs these scripts to process data by submitting them to a cluster managed by a Portable Batch System (PBS) job scheduler, we generally request the following resource limits.

  • Structural Preprocessing (Pre-Freesurfer, FreeSurfer, and Post-FreeSurfer combined)

    • Walltime
      • Structural Preprocessing usually finishes within 24 hours
      • We set the walltime limit to 24-48 hours and infrequently have to adjust it up to 96 hours
    • Memory
      • We expect Structural Preprocessing to have maximum memory requirements in the range of 12 GB. But infrequently we have to adjust the memory limit up to 24 GB.
  • Functional Preprocessing (Volume and Surface based preprocessing combined)

    • Time and memory requirements vary depending on the length of the fMRI scanning session
    • In our protocol, resting state functional scans (rfMRI) are of longer duration than task functional scans (tfMRI) and therefore have higher time and memory requirements.
    • Jobs processing resting state functional MRI (rfMRI) scans usually have walltime limits in the range of 36-48 hrs and memory limits in the 20-24 GB range.
    • Jobs processing task functional MRI (tfMRI) scans may have resource limits that vary based on the task's duration, but generally are walltime limited to 24 hours and memory limited to 12 GB.
    • Often tfMRI preprocessing takes in the neighborhood of 4 hours and rfMRI preprocessing takes in the neighborhood of 10 hours.
  • Task fMRI Analysis

    • Walltime limits on Task fMRI Analysis are generally set to 24 hours with the actual expected walltimes to run from 4-12 hours per task.
    • Memory limits are set at 12 GB.
  • Diffusion Preprocessing

    • Time and memory requirements for Diffusion preprocessing will depend upon whether you are running the eddy portion of Diffusion preprocessing using the Graphics Processing Unit (GPU) enabled version of the eddy binary that is part of FSL. (The FMRIB group at Oxford University recommends using the GPU-enabled version of eddy whenever possible.)
    • Memory requirements for Diffusion preprocessing are generally in the 24-50 GB range.
    • Walltime requirements can be as high as 36 hours.

The limits listed above, in particular the walltime limits listed, are only useful if you have some idea of the capabilities of the computer node on which the jobs were run. For information about the configuration of the cluster nodes used to come up with the above limits/requirements, see the description of the equipment available at the Washington University Center for High Performance Computing (CHPC) hardware resources.


Hint for detecting Out of Memory conditions

If one of your preprocessing jobs ends in a seemingly inexplicable way with a message in the stderr file (e.g. DiffPreprocPipeline.sh.e<process-id>) that indicates that your process was Killed, it is worth noting that many versions of Linux have a process referred to as the Out of Memory Killer or OOM Killer. When a system running an OOM Killer gets critically low on memory, the OOM Killer starts killing processes by sending them a -9 signal. This type of process killing immediately stops the process from running, frees up any memory that process is using, and causes the return value from the killed process to be 137. (By convention, this return value is 128 plus the signal number, which is 9. Thus a return value of 128+9=137.)

For example, if the eddy executable used in Diffusion preprocessing attempts to allocate more memory than is available, it may be killed by the OOM Killer and return a status code of 137. In that case, there may be a line in the stderr that looks similar to:

    /home/username/projects/Pipelines/DiffusionPreprocessing/scripts/run_eddy.sh: line 182: 39455 Killed  ...*further info here*...

and a line in the stdout that looks similar to:

    Sat Aug 9 21:20:21 CDT 2014 - run_eddy.sh - Completed with return value: 137

Lines such as these are a good hint that you are having problems with not having enough memory. Out of memory conditions and the subsequent killing of jobs by the OOM Killer can be confirmed by looking in the file where the OOM Killer logs its activities (/var/log/kern.log on Ubuntu systems, /var/log/messages* on some other systems).

Searching those log files for the word Killed may help find the log message indicating that your process was killed by the OOM Killer. Messages within these log files will often tell you how much memory was allocated by the process just before it was killed.
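
For example, on an Ubuntu system a search like the following may surface the relevant messages (the log file location varies by distribution):

    $ grep -i killed /var/log/kern.log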

Note that within the HCP Pipeline scripts, not all invocations of binaries or other scripts print messages to stderr or stdout indicating their return status codes. The above example is from a case in which the return status code is reported. So while this hint is intended to be helpful, it should not be assumed that all out of memory conditions can be discovered by searching the stdout and stderr files for return status codes of 137.


I still have questions

Please review the FAQ. You can also search the hcp-users Google Group or the older hcp-users-archive. If you don't find an answer there, go to the hcp-users Google Group and click Sign In. For instructions on joining without a Google account: hcp-users-join-wiki. You can also use the contact form on the HCP/CCF contact page if you don't want your inquiry to be public.