Tools to 'simplify' transferring data from NERSC to GridPP storage elements.
- DIRAC tools - tested with version 6.21.10
- GFAL2 (and python bindings) - tested with version 2.15.2
- Access to the relevant systems from the local client machine (e.g. via voms-proxy-init).
During development of this tool, a bug was found in the version of GFAL available in a CernVM via CVMFS. A fix was created and can be sourced as shown below, but the fixed version is incompatible with DIRAC. As things currently stand, it is not possible to have both the GFAL and DIRAC tools in the same environment. The script has therefore been split into two stages (the previous single-script version is retained for posterity and for ease of update if the incompatibility is fixed later on).
If using a CernVM (with SL6/CentOS 6), the following will source the correct version of GFAL:
source /cvmfs/grid.cern.ch/umd-sl6ui-test/etc/profile.d/setup-ui-example.sh
To access DIRAC tools, in a clean environment, use:
source /cvmfs/ganga.cern.ch/dirac_ui/bashrc
In both cases, a valid proxy is required. For ease of use, request the longest validity period allowed:
voms-proxy-init --voms lsst --valid 24:00
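Because a long transfer can outlive the proxy, a script may want to check the remaining lifetime before starting each batch. The following is a minimal sketch, not part of the existing scripts: the helper name `proxy_seconds_left` is hypothetical, and it assumes `voms-proxy-info --timeleft` is on the PATH and prints the remaining lifetime in seconds.

```python
import subprocess


def proxy_seconds_left(command=("voms-proxy-info", "--timeleft")):
    """Return the remaining proxy lifetime in seconds, or 0 if it cannot be determined.

    Hypothetical helper: assumes the command prints a single integer to stdout.
    """
    try:
        output = subprocess.check_output(list(command))
    except (OSError, subprocess.CalledProcessError):
        # Command missing, or it exited with an error (e.g. no proxy found).
        return 0
    try:
        return int(output.decode().strip())
    except ValueError:
        # Output was not a plain integer; treat the proxy as unusable.
        return 0
```

A caller could, for example, stop cleanly when fewer than a few minutes remain rather than failing partway through a file.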
If using a system without CernVM, similar incompatibilities are present. You must still use a separate environment for each stage of the script.
Since the required packages are incompatible, the tool now has two distinct steps and uses a tracking file to record previously transferred files and relevant metadata.
python transfer.py -o <track-file> -s <source-dir> -d <dest-dir>
The script should only transfer files that have not already been transferred correctly. This is useful because transferring large amounts of data can take longer than the proxy is valid for.
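The skip-if-already-transferred behaviour can be illustrated with a small sketch. This is not the actual format used by transfer.py: the JSON layout, the `size` and `ok` fields, and all function names below are assumptions made for illustration only.

```python
import json
import os


def load_track_file(path):
    """Return the tracking records, or an empty dict if the file does not exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path) as handle:
        return json.load(handle)


def save_track_file(path, records):
    """Write to a temporary file, then rename it over the old one.

    On POSIX systems the rename replaces the target in one step, so an
    interrupted run should not leave a half-written tracking file.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(records, handle, indent=2, sort_keys=True)
    os.rename(tmp_path, path)


def files_to_transfer(source_files, records):
    """Pick files with no record, a failed record, or a changed size.

    source_files maps file name -> current size in bytes (assumed layout).
    """
    pending = []
    for name, size in source_files.items():
        record = records.get(name)
        if record is None or not record.get("ok") or record.get("size") != size:
            pending.append(name)
    return sorted(pending)
```

Saving the tracking file after every successful copy, rather than once at the end, means a run cut short by an expired proxy can resume from where it stopped.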
Future improvements:
- Multiprocessing to transfer multiple files at once
- Use a local track-file to speed up batch transfers and avoid checking files through GFAL (which takes a significant amount of time when many files have to be checked)
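One possible shape for the first improvement is sketched below, using a thread pool from the standard library. It is only an outline under stated assumptions: `transfer_many` and the `transfer_one` callable are hypothetical names, the real copy call is left to the caller, and whether a single GFAL2 context can safely be shared between workers has not been checked here (one context per worker may be needed).

```python
from multiprocessing.pool import ThreadPool


def transfer_many(file_names, transfer_one, workers=4):
    """Call transfer_one(name) for each file in parallel.

    Returns a dict mapping each file name to None on success, or to the
    error message if transfer_one raised an exception.
    """

    def attempt(name):
        try:
            transfer_one(name)
            return name, None
        except Exception as exc:
            # Record the failure and carry on, so one bad file
            # does not stop the rest of the batch.
            return name, str(exc)

    pool = ThreadPool(workers)
    try:
        return dict(pool.map(attempt, file_names))
    finally:
        pool.close()
        pool.join()
```

The returned dictionary could be used to update the tracking file, so that failed files are retried on the next run.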
python register.py -i <track-file> -l <LFN-path> -e <storage-element>
Again, this should (in theory) work for files already registered, but extensive testing is still ongoing.