
Matching contigs

luissian edited this page Dec 5, 2018 · 3 revisions

The matching contigs file reports the result of matching each core gene against each sample.

The file is named matching_contigs.tsv.

The file contains one entry for every sample and core gene. It is a tab-separated file with the following header:

Sample Name Contig Core Gene Start Stop Direction Codification
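As a minimal sketch of how such a file could be read, the Python snippet below parses tab-separated text with the standard `csv` module. It assumes the header row uses exactly the seven column names listed above; the sample names, gene names and coordinates in `SAMPLE_TSV` are invented for illustration and do not come from a real run.

```python
import csv
import io

# Hypothetical two-row excerpt; a real file is produced by the tool itself.
SAMPLE_TSV = (
    "Sample Name\tContig\tCore Gene\tStart\tStop\tDirection\tCodification\n"
    "sample_01\t3\tgene_0001\t1200\t2100\t+\tEXACT\n"
    "sample_01\t7\tgene_0002\t450\t1010\t-\tINF\n"
)


def read_matching_contigs(handle):
    """Return the rows of a matching_contigs.tsv-style file as dictionaries."""
    # DictReader takes the first line as the header and splits on tabs.
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_matching_contigs(io.StringIO(SAMPLE_TSV))
for row in rows:
    print(row["Sample Name"], row["Core Gene"], row["Codification"])
```

To read a file on disk instead, the same function can be given a handle opened with `open("matching_contigs.tsv", newline="")`.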

Sample Name is the name of the sample file.

Contig is the number of the contig in the sample.

Core Gene is the name of the gene in the schema.

Start is the position number, inside the contig, where the gene starts in the sample.

Stop is the position number, inside the contig, where the gene stops in the sample.

Direction shows the orientation of the sequence in the sample. A + sign means the forward direction; a - sign means the reverse direction.
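The positional columns can be turned into a small typed record, sketched below. The `ContigMatch` class and `match_from_row` helper are hypothetical names introduced only for this example. The sketch assumes Start and Stop hold integers; rows for loci that were not found may not carry coordinates, and that case is not handled here. Because this page does not state whether positions are 0-based or 1-based, no length is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContigMatch:
    """One row of the file, with numeric positions (hypothetical helper type)."""
    sample_name: str
    contig: str
    core_gene: str
    start: int
    stop: int
    direction: str

    @property
    def is_forward(self) -> bool:
        # "+" marks the forward direction, "-" the reverse direction.
        return self.direction == "+"


def match_from_row(row):
    """Build a ContigMatch from a dict keyed by the column names."""
    direction = row["Direction"].strip()
    if direction not in ("+", "-"):
        raise ValueError(f"unexpected Direction value: {direction!r}")
    return ContigMatch(
        sample_name=row["Sample Name"],
        contig=row["Contig"],
        core_gene=row["Core Gene"],
        start=int(row["Start"]),  # assumes an integer position
        stop=int(row["Stop"]),    # assumes an integer position
        direction=direction,
    )


# Invented example row, not taken from a real run.
example_row = {
    "Sample Name": "sample_01",
    "Contig": "7",
    "Core Gene": "gene_0002",
    "Start": "450",
    "Stop": "1010",
    "Direction": "-",
}
match = match_from_row(example_row)
print(match.core_gene, "forward" if match.is_forward else "reverse")
```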

Codification shows the result code of the match. The possible values are:

  • EXACT when an exact match was found.
  • INF when a new allele was inferred.
  • ALM_DELETE when there was a deletion and the new protein is longer than in the schema.
  • ALM_INSERT when there was an insertion and the new protein is longer than in the schema.
  • ASM_DELETE when there was a deletion and the new protein is shorter than in the schema.
  • ASM_INSERT when there was an insertion and the new protein is shorter than in the schema.
  • PLOT when the matching stops because it reaches the start or the end of the contig.
  • LNF when the locus was not found in the sample.
  • ERROR is an exceptional case, reported when no stop codon is found while building the new protein that results from an insertion or deletion.
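The codes listed above lend themselves to a quick per-sample summary. The following Python sketch counts how often each code appears for each sample and rejects any value outside the list; `tally_codes` is a hypothetical helper, and the rows are invented for illustration.

```python
from collections import Counter, defaultdict

# The nine codes documented on this page.
KNOWN_CODES = {
    "EXACT", "INF",
    "ALM_DELETE", "ALM_INSERT",
    "ASM_DELETE", "ASM_INSERT",
    "PLOT", "LNF", "ERROR",
}


def tally_codes(rows):
    """Count Codification values per sample; raise on an undocumented code."""
    per_sample = defaultdict(Counter)
    for row in rows:
        code = row["Codification"].strip()
        if code not in KNOWN_CODES:
            raise ValueError(f"undocumented Codification value: {code!r}")
        per_sample[row["Sample Name"]][code] += 1
    return per_sample


# Invented rows; only the two columns used by the tally are shown.
rows = [
    {"Sample Name": "sample_01", "Codification": "EXACT"},
    {"Sample Name": "sample_01", "Codification": "EXACT"},
    {"Sample Name": "sample_01", "Codification": "LNF"},
    {"Sample Name": "sample_02", "Codification": "ASM_INSERT"},
]
summary = tally_codes(rows)
for sample, counts in sorted(summary.items()):
    print(sample, dict(counts))
```

Raising on an unknown code is a deliberate choice in this sketch: it makes a malformed or newer-format file fail loudly instead of being silently miscounted.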