Problem

You need to execute a task for each record in one or more CSV files.

Solution

Read the CSV file line by line with the splitCsv operator, then use the map operator to return a tuple of the required fields for each row, converting any string path to a file path object with the file function. Finally, use the resulting channel as input for the process.

Code

Given the file index.csv with the following content:

sampleId,read1,read2
FC816RLABXX,reads/110101_I315_FC816RLABXX_L1_HUMrutRGXDIAAPE_1.fq.gz,reads/110101_I315_FC816RLABXX_L1_HUMrutRGXDIAAPE_2.fq.gz
FC812MWABXX,reads/110105_I186_FC812MWABXX_L8_HUMrutRGVDIABPE_1.fq.gz,reads/110105_I186_FC812MWABXX_L8_HUMrutRGVDIABPE_2.fq.gz
FC81DE8ABXX,reads/110121_I288_FC81DE8ABXX_L3_HUMrutRGXDIAAPE_1.fq.gz,reads/110121_I288_FC81DE8ABXX_L3_HUMrutRGXDIAAPE_2.fq.gz
FC81DB5ABXX,reads/110122_I329_FC81DB5ABXX_L6_HUMrutRGVDIAAPE_1.fq.gz,reads/110122_I329_FC81DB5ABXX_L6_HUMrutRGVDIAAPE_2.fq.gz
FC819P0ABXX,reads/110128_I481_FC819P0ABXX_L5_HUMrutRGWDIAAPE_1.fq.gz,reads/110128_I481_FC819P0ABXX_L5_HUMrutRGWDIAAPE_2.fq.gz

This workflow parses the file and executes a process for each line:

params.index = "$baseDir/data/index.csv"

process foo {
    debug true
    input:
    tuple val(sampleId), path(read1), path(read2)

    script:
    """
    echo your_command --sample $sampleId --reads $read1 $read2
    """
}

workflow {
    Channel.fromPath(params.index) \
        | splitCsv(header:true) \
        | map { row-> tuple(row.sampleId, file(row.read1), file(row.read2)) } \
        | foo
}

!!! note
    Relative paths are resolved by the file function against the execution directory. In practice, it is preferable to use absolute file paths.
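
One way to get absolute paths without editing the CSV is to resolve each entry against the directory that holds the index file. The following is a minimal sketch, not part of the original example: it assumes the read files live under the same directory as index.csv, and the variable name indexDir is made up for illustration. It prints the resulting tuples with view instead of running a process, so the resolved paths can be checked first.

```nextflow
params.index = "$baseDir/data/index.csv"

workflow {
    // Assumption: paths in the CSV are relative to the folder containing index.csv
    def indexDir = file(params.index).parent

    Channel.fromPath(params.index) \
        | splitCsv(header:true) \
        | map { row -> tuple(row.sampleId, indexDir.resolve(row.read1), indexDir.resolve(row.read2)) } \
        | view
}
```

Once the printed paths look right, view can be replaced by the process name so that each tuple is passed to the process as in the main example.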

Run it

Use the following command to execute the example:

nextflow run nextflow-io/patterns/process-per-csv-record.nf
