Scripts used for the analysis of MRD TCR clonal sequences in the paper
[paper name](link will be here later).
The project was done in the Laboratory of Comparative and Functional Genomics.
- mrd_tcr_clones.txt
- mrd_tcr_table.txt
build_links.py <file with input files>
An example of such a file is input_files.txt. It consists of blocks of files for each individual:
<path to the folder with individuals 1 and 2>
#Individual_1
file_1_1 pattern
file_1_2 pattern
#Individual_2
file_2_1 pattern
file_2_2 pattern
}
<path to the folder with individuals 3, etc.>
...
The resulting file will be input_files.links.txt.
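The block layout shown above can be read with a short parser. The following is only a sketch and not the code of build_links.py: the function name `parse_input_blocks` is hypothetical, and the rules it applies (a line outside a block is a folder path, a line starting with `#` names an individual, a lone `}` closes the folder block, any other line is a file pattern) are assumptions inferred from the example.

```python
def parse_input_blocks(text):
    """Parse the block layout into {folder: {individual: [file patterns]}}.

    Assumed rules (inferred from the example, not confirmed by the scripts):
    - a line outside any block is a folder path;
    - a line starting with '#' names an individual;
    - a lone '}' closes the current folder block;
    - every other line is a file pattern for the current individual.
    """
    folders = {}
    folder = None
    individual = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line == "}":
            folder, individual = None, None
        elif folder is None:
            folder = line
            folders[folder] = {}
        elif line.startswith("#"):
            individual = line[1:]
            folders[folder][individual] = []
        elif individual is not None:
            folders[folder][individual].append(line)
    return folders


# Hypothetical paths and file names, for illustration only.
example = """/data/run1
#Individual_1
file_1_1
file_1_2
#Individual_2
file_2_1
}
/data/run2
#Individual_3
file_3_1
}
"""
parsed = parse_input_blocks(example)
```

Keeping the result as a nested dictionary makes it easy to iterate over individuals within a folder when building the list of links.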
search_clones.py <file with links> <file with MRD clones> <optional postfix for the output file>
fuzzy_search_clones.py <file with links> <file with MRD clones> <number of max errors (mism / indels)> <optional postfix for the output file>
Output is a number of files with various information about the search results.
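To illustrate what a search with a maximum number of errors (mismatches / indels) means, here is a minimal sketch based on the Levenshtein edit distance. Whether fuzzy_search_clones.py uses exactly this metric is an assumption; the function names `edit_distance` and `fuzzy_matches` and the example sequences are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of substitutions,
    insertions and deletions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # match or mismatch
        prev = curr
    return prev[-1]


def fuzzy_matches(query, candidates, max_errors):
    """Return the candidates within max_errors edits of the query."""
    return [c for c in candidates if edit_distance(query, c) <= max_errors]


# Illustrative sequences only, not taken from the data.
hits = fuzzy_matches("CASSLG", ["CASSLG", "CASTLG", "CAG"], max_errors=1)
```

With `max_errors=0` this reduces to an exact search, which corresponds to what search_clones.py is described as doing.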
extract_seq_vseg.py <search result with "lines" in the name from the previous function>
The output is a set of files, each corresponding to a specific Hamming distance between one of the MRD clonal sequences and a sequence found in the data.
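The grouping by Hamming distance can be sketched as follows. This is not the code of extract_seq_vseg.py; the names `hamming` and `group_by_distance` are hypothetical, and skipping sequences of a different length is an assumption made here because the Hamming distance is only defined for sequences of equal length.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def group_by_distance(mrd_clone, found):
    """Group found sequences by their Hamming distance to one MRD clone.

    Sequences whose length differs from the clone are skipped (assumption).
    """
    groups = defaultdict(list)
    for seq in found:
        if len(seq) == len(mrd_clone):
            groups[hamming(mrd_clone, seq)].append(seq)
    return dict(groups)


# Illustrative sequences only, not taken from the data.
groups = group_by_distance("CASSLG", ["CASSLG", "CATSLG", "CATSLA", "CAS"])
```

Each key of the resulting dictionary would then correspond to one output file.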
montecarlo.py <file with links> <output file name>
From each file, the script takes a number of clones equal to the number of MRD clones and counts in how many individuals they occur. The output is a tab-delimited file with the number of occurrences for each trial.
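The sampling procedure can be sketched as below. This is a simplified illustration, not the code of montecarlo.py: it assumes that each individual's repertoire is already loaded as a set of clone sequences, that clones are drawn uniformly without replacement, and that a fixed seed is acceptable. The names `montecarlo_sharing` and `to_tsv` are hypothetical.

```python
import random


def montecarlo_sharing(repertoires, n_clones, n_trials, seed=0):
    """For each trial, draw n_clones from every repertoire and count in how
    many individuals each drawn clone occurs.

    repertoires: dict mapping an individual to a set of clone sequences.
    Returns one list of counts per trial.
    Note: random.sample raises ValueError if a repertoire holds fewer than
    n_clones clones.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)
    rows = []
    for _ in range(n_trials):
        counts = []
        for name in sorted(repertoires):
            pool = sorted(repertoires[name])  # sorted for a stable order
            for clone in rng.sample(pool, n_clones):
                counts.append(sum(clone in rep for rep in repertoires.values()))
        rows.append(counts)
    return rows


def to_tsv(rows):
    """Format the counts as tab-delimited text, one trial per line."""
    return "\n".join("\t".join(str(c) for c in row) for row in rows)


# Toy repertoires, for illustration only.
shared = {"A": {"x", "y"}, "B": {"x", "y"}, "C": {"x", "y"}}
rows = montecarlo_sharing(shared, n_clones=2, n_trials=3)
```

Comparing these counts with the number of individuals in which the real MRD clones occur gives a rough idea of whether the observed sharing is higher than expected by chance.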
make_neighbors.py <file with MRD clones>
process_neis_prob.py <resulting file from the previous script>