Easily download a list of files from CSV

Scott Hoag edited this page Oct 19, 2021 · 2 revisions

Download from CSV input with PowerShell or Bash and AzCopy

The scripts in this document download the files listed in a CSV file. A sample CSV format is shown below. Each script downloads AzCopy v10 itself and relies on a few standard tools to do so: Start-BitsTransfer and Expand-Archive on Windows, curl and tar on Linux and macOS.

CSV format

AccountName        PartitionKey
mystorageaccount   /container1/myfile1.txt
mystorageaccount   /container2/myfile2.txt
mystorageaccount   /container3/myfile3.txt
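
The Bash script on this page splits each row on a comma and skips the first line, so the file on disk is presumably plain comma-separated text with a header row. A minimal sketch that writes the sample rows to AzCopyInputObjects.csv, assuming that layout (the account, container, and file names are the placeholders from the sample, not real resources):

```shell
#!/bin/bash
# Write the sample rows to AzCopyInputObjects.csv in the current directory.
# The account and blob names are placeholders taken from the sample table.
cat > AzCopyInputObjects.csv <<'EOF'
AccountName,PartitionKey
mystorageaccount,/container1/myfile1.txt
mystorageaccount,/container2/myfile2.txt
mystorageaccount,/container3/myfile3.txt
EOF

# Show the result: one header line plus one line per object.
cat AzCopyInputObjects.csv
```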

Download instructions

  1. Copy your CSV file to the same directory as the AzCopy executable.

  2. Rename your CSV file to AzCopyInputObjects.csv.

  3. Next, determine what type of authorization credential you will use with AzCopy. You can provide authorization credentials by using Azure Active Directory (AD), or by using a Shared Access Signature (SAS) token.

    Use this table as a guide:

    Storage type                            Currently supported method of authorization
    Blob storage                            Azure AD & SAS
    Blob storage (hierarchical namespace)   Azure AD & SAS
    File storage                            SAS only
  4. From the directory that contains your CSV file, run one of the provided scripts for Windows or for Linux/macOS.
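
Whichever authorization method is chosen, both scripts on this page build each object URL by joining the account name, the blob endpoint, the PartitionKey value, and the SAS token with no separators. The sketch below mirrors that concatenation. The helper name `build_object_url` and the token value are made up for illustration; a real SAS token carries more fields and a signature.

```shell
#!/bin/bash
# Hypothetical helper mirroring how the scripts on this page assemble a URL.
blobEndpoint=".blob.core.windows.net"

build_object_url() {
    local acctName="$1" partKey="$2" sasToken="$3"
    # Account + endpoint + path + token, joined with no separators.
    printf 'https://%s%s%s%s\n' "$acctName" "$blobEndpoint" "$partKey" "$sasToken"
}

# Placeholder token, not a working credential.
build_object_url "mystorageaccount" "/container1/myfile1.txt" "?sv=PLACEHOLDER&sig=PLACEHOLDER"

# With Azure AD authorization the token stays empty and the URL is left bare.
build_object_url "mystorageaccount" "/container1/myfile1.txt" ""
```

Because nothing is inserted between the path and the token, a token that lacks its leading `?` would run straight into the blob name.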

Windows PowerShell Script

  1. Save this script with the name AzCopyDownloadFromCSV.ps1 to the same directory that contains your CSV file.

  2. If you are using a SAS token for authorization, set the $sasToken variable in the script to your token. The script appends this value directly to each object URL, so include the token's leading ?.

    Note: If you are using Azure AD for authorization, leave this variable empty and uncomment the line .\azcopy.exe login instead.

  3. Execute the script:

    .\AzCopyDownloadFromCSV.ps1

    Note: The script creates a Downloads directory in the script's directory, plus any subdirectories needed, so that each object is saved under Downloads at the same path that appears in the PartitionKey column of the CSV.

PowerShell Script

# Set the following variable to your SAS token. If you are using Azure AD for authorization, this variable does not need to be set.

$sasToken = ""
$blobEndpoint = ".blob.core.windows.net"

$invocation = (Get-Variable MyInvocation).Value
$directoryPath = Split-Path $invocation.MyCommand.Path  

# Download and extract AzCopy
$azCopyZip = "$directoryPath\AzCopy.Zip"
$azCopyUri = "https://aka.ms/downloadazcopy-v10-windows"

Write-Host "Downloading AzCopy..." -ForegroundColor Green

Start-BitsTransfer -Source $azCopyUri -Destination $azCopyZip

Write-Host "`tExtracting AzCopy..." -ForegroundColor Yellow
Expand-Archive $azCopyZip $directoryPath -Force

Get-ChildItem "$($directoryPath)\*\*" | Move-Item -Destination "$($directoryPath)\" -Force

# Uncomment this line if you are using Azure AD authorization

#.\azcopy.exe login

$directoryPathForDownload = Join-Path $directoryPath "Downloads"
if(!(Test-Path -path $directoryPathForDownload)) {  
    New-Item -ItemType directory -Path $directoryPathForDownload
    Write-Host "Folder path has been created successfully at: $($directoryPathForDownload)" -ForegroundColor Green
} else { 
    Write-Host "The folder path $($directoryPathForDownload) already exists" -ForegroundColor Yellow
}

# Wrap in @() so a CSV with a single data row is still treated as an array
$csv = @(Import-Csv -Path .\AzCopyInputObjects.csv)

Write-Host "Downloading $($csv.Count) objects..." -ForegroundColor Green

foreach ($object in $csv) {
    $objectUrl = [string]::Format("https://{0}{1}{2}{3}",$object.AccountName,$blobEndpoint,$object.PartitionKey,$sasToken)

    Write-Host "`tDownloading $($objectUrl)" -ForegroundColor Yellow

    $partitionKey = $object.PartitionKey
    $partitionPath = $partitionKey.Substring(0, $partitionKey.LastIndexOf("/")) -replace "/", "\"

    $directoryPathForDownloadTemp=$directoryPathForDownload+$partitionPath
    if(!(Test-Path -path $directoryPathForDownloadTemp)) {  
        New-Item -ItemType directory -Path $directoryPathForDownloadTemp
        Write-Host "`tFolder path has been created successfully at: $($directoryPathForDownloadTemp)" -ForegroundColor Green
    } else { 
        Write-Host "`tThe folder path $($directoryPathForDownloadTemp) already exists" -ForegroundColor Yellow
    }

    $downloadPath =  [string]::Format("{0}{1}",$directoryPathForDownload,$object.PartitionKey) -replace "/", "\"
    
    .\azcopy.exe copy "$objectUrl" "$downloadPath" --recursive --from-to=BlobLocal
}

Write-Host "Download complete!" -ForegroundColor Green

Linux/macOS Bash Script

  1. Save this script with the name AzCopyDownloadFromCSV.sh to the same directory that contains your CSV file.

  2. If you are using a SAS token for authorization, set the sasToken variable in the script to your token. The script appends this value directly to each object URL, so include the token's leading ?.

    Note: If you are using Azure AD for authorization, leave this variable empty and uncomment the line ./azcopy login instead.

  3. Execute the script:

    ./AzCopyDownloadFromCSV.sh

    Note: The script creates a Downloads directory in the current working directory, plus any subdirectories needed, so that each object is saved under Downloads at the same path that appears in the PartitionKey column of the CSV.
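
The path mapping described in the note can be sketched as follows. Since each PartitionKey value starts with a slash, appending it to the download root reproduces the container and blob layout locally. The helper name `local_path_for` is hypothetical and exists only for this illustration; the script itself does the concatenation inline.

```shell
#!/bin/bash
# Hypothetical helper: map a PartitionKey value to its local download path.
local_path_for() {
    local downloadRoot="$1" partKey="$2"
    # PartitionKey begins with "/", so plain concatenation keeps the layout.
    printf '%s%s\n' "$downloadRoot" "$partKey"
}

target=$(local_path_for "$PWD/Downloads" "/container1/myfile1.txt")
echo "File:   $target"
# The parent folder is what has to exist before the download starts.
echo "Folder: $(dirname "$target")"
```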

Bash Script

#!/bin/bash

# Set the following variable to your SAS token. If you are using Azure AD for authorization, this variable does not need to be set.

sasToken=""
blobEndpoint=".blob.core.windows.net"

directoryPath=$(pwd)

# Download and extract AzCopy
azCopyZip="$directoryPath/AzCopy.tar.gz"
azCopyUri="https://aka.ms/downloadazcopy-v10-linux"

if [ "$(uname)" == "Darwin" ]; then
    # macOS 
    azCopyUri="https://aka.ms/downloadazcopy-v10-mac"
fi 

echo "Downloading and extracting AzCopy..."

curl -L -o "$azCopyZip" "$azCopyUri"

tar --strip-components=1 -xzf "$azCopyZip"

# Uncomment this line if you are using Azure AD authorization

#./azcopy login

directoryPathForDownload="$directoryPath/Downloads"

# Quote the path so directories containing spaces are handled correctly
if [[ ! -e "$directoryPathForDownload" ]]; then
    mkdir -p "$directoryPathForDownload"
    echo "Folder path has been created successfully at: $directoryPathForDownload"
else
    echo "The folder path $directoryPathForDownload already exists" 1>&2
fi

echo "Downloading objects..."

{
    # Skip the CSV header row
    read -r
    # The || test keeps a final row that has no trailing newline
    while IFS=, read -r acctName partKey || [[ -n $acctName ]]; do
        # Strip carriage returns left by Windows line endings
        partKey=${partKey//$'\r'/}
        objectUrl="https://${acctName}${blobEndpoint}${partKey}${sasToken}"

        echo "Downloading $objectUrl"

        partitionPath=$(dirname "$partKey")
        directoryPathForDownloadTemp="${directoryPathForDownload}${partitionPath}"

        if [[ ! -e "$directoryPathForDownloadTemp" ]]; then
            mkdir -p "$directoryPathForDownloadTemp"
            echo "Folder path has been created successfully at: $directoryPathForDownloadTemp"
        else
            echo "The folder path $directoryPathForDownloadTemp already exists" 1>&2
        fi

        downloadPath="${directoryPathForDownload}${partKey}"

        # force an empty stdin to azcopy - https://github.com/Azure/azure-storage-azcopy/issues/974
        : | ./azcopy copy "$objectUrl" "$downloadPath" --recursive --from-to=BlobLocal
    done
} < "$directoryPath/AzCopyInputObjects.csv"

echo "Download complete!"
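
Before a long run, it may help to check that every CSV row has the shape the download loop expects. The sketch below is not part of the original script and does not call AzCopy. The function name `check_csv` is hypothetical, and it assumes a valid row is an account name followed by a path of the form /container/blob.

```shell
#!/bin/bash
# Hypothetical pre-flight check: count rows the download loop could not use.
check_csv() {
    local csvFile="$1" lineNo=1 bad=0 acctName partKey
    {
        read -r _    # skip the header row
        while IFS=, read -r acctName partKey || [[ -n $acctName ]]; do
            lineNo=$((lineNo + 1))
            partKey=${partKey//$'\r'/}    # tolerate Windows line endings
            # Assumed rule: non-empty account and a /container/blob style path.
            if [[ -z $acctName || $partKey != /*/* ]]; then
                echo "Line $lineNo: expected AccountName,/container/blob" >&2
                bad=$((bad + 1))
            fi
        done
    } < "$csvFile"
    # The number of rejected rows goes to stdout; details go to stderr.
    echo "$bad"
}

# Demo on a throwaway file: the second data row lacks a /container/blob path.
sampleCsv=$(mktemp)
printf 'AccountName,PartitionKey\nmystorageaccount,/container1/myfile1.txt\nmystorageaccount,myfile2.txt\n' > "$sampleCsv"
echo "Rejected rows: $(check_csv "$sampleCsv")"
rm -f "$sampleCsv"
```

Running `check_csv AzCopyInputObjects.csv` from the script directory reports problem rows on stderr, which makes it straightforward to fix the CSV before any download starts.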