Update README.md (#212)
* Update README.md
* Update doc gff_to_bed.md
Juke34 authored Jan 3, 2022
1 parent 9b12ef1 commit 8664d4a
Showing 4 changed files with 23 additions and 17 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -77,9 +77,11 @@ AGAT has the power to check, fix, pad missing information (features/attributes)

| task | tool |
| --- | --- |
| convert any **GTF/GFF** into **tabulated format** | `agat_sp_to_tabulated.pl` |
| convert any **GTF/GFF** into **BED** format | `agat_convert_sp_gff2bed.pl` |
| convert any **GTF/GFF** into **GTF** format | `agat_convert_sp_gff2gtf.pl` |
| convert any **GTF/GFF** into **tabulated format** | `agat_sp_gff2tsv.pl` |
| convert any **BAM** from minimap2 into **GFF** format | `agat_convert_sp_minimap2_bam2gff.pl` |
| convert any **GTF/GFF** into **ZFF** format | `agat_sp_gff2zff.pl` |
| convert any **GTF/GFF** into any **GTF/GFF** (bioperl) format | `agat_convert_sp_gxf2gxf.pl` |
| convert **BED** format into **GFF3** format | `agat_convert_bed2gff.pl` |
| convert **EMBL** format into **GFF3** format | `agat_convert_embl2gff.pl` |
@@ -578,6 +580,7 @@ Some examples of publications that have used AGAT
| Preprint | [Using historical museum samples to examine divergent and parallel evolution in the invasive starling](https://www.biorxiv.org/content/10.1101/2021.08.22.457241v1.full)|
| GBE | [A Chromosome-Level Genome Assembly of the Reed Warbler (Acrocephalus scirpaceus)](https://helda.helsinki.fi/bitstream/handle/10138/336322/evab212.pdf?sequence=1&isAllowed=y)|
| Preprint | [A genome assembly of the Atlantic chub mackerel (Scomber colias): a valuable teleost fishing resource](https://www.biorxiv.org/content/10.1101/2021.11.19.468211v1.full.pdf)|
| Current Protocols | [BUSCO: Assessing Genomic Data Quality and Beyond](https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpz1.323)
| [...] | [...]
</details>

5 changes: 3 additions & 2 deletions docs/gff_to_bed.md
@@ -109,7 +109,7 @@ scaffold625 337817 343277 CLUHART00000008717 0 + 337914 343033 0 4 154,109,111,1
Detailed information can be found here: [https://genome.ucsc.edu/FAQ/FAQformat.html](https://genome.ucsc.edu/FAQ/FAQformat.html)
Below a description of the different fields:

column | feature type | mandatory | comment
column | feature type | mandatory | comment |
-- | -- | -- | -- |
1 | chrom | X | The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
2 | chromStart | X | The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
@@ -126,4 +126,5 @@ column | feature type | mandatory | comment


/!\ location BED format is 0-based, half-open [start-1, end), while GFF is 1-based, closed [start, end].
<img align="center" src="pictures/coordinate_systems.jpg"/>

![](img/coordinate_systems.jpg "coordinate_systems")
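
The coordinate note above (BED is 0-based, half-open; GFF is 1-based, closed) can be illustrated with a minimal Python sketch. The function names `gff_to_bed_interval` and `bed_to_gff_interval` are hypothetical helpers written for this illustration; they are not part of AGAT, and the sketch assumes non-empty intervals only.

```python
from typing import Tuple


def gff_to_bed_interval(gff_start: int, gff_end: int) -> Tuple[int, int]:
    """Convert a 1-based, closed GFF interval [start, end]
    into a 0-based, half-open BED interval [start-1, end)."""
    if gff_start < 1 or gff_end < gff_start:
        raise ValueError("expected 1 <= gff_start <= gff_end")
    # Only the start shifts; the end value is numerically unchanged.
    return gff_start - 1, gff_end


def bed_to_gff_interval(chrom_start: int, chrom_end: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open BED interval back to GFF coordinates.
    Zero-length BED intervals are rejected here (an assumption of this sketch)."""
    if chrom_start < 0 or chrom_end <= chrom_start:
        raise ValueError("expected 0 <= chrom_start < chrom_end")
    return chrom_start + 1, chrom_end


# The first 100 bases of a sequence: GFF 1..100 corresponds to BED 0..100.
print(gff_to_bed_interval(1, 100))
print(bed_to_gff_interval(0, 100))
```

In both systems the feature length is the same: `end - start + 1` in GFF and `chromEnd - chromStart` in BED.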
Binary file added docs/img/coordinate_systems.jpg
30 changes: 16 additions & 14 deletions docs/tools/agat_convert_sp_gff2bed.md
@@ -7,20 +7,22 @@ It will convert level2 features from gff (mRNA, transcripts) into bed features.
If the selected level2 subfeatures (defaut: exon) exist, they will be reported
in the block fields (9-12th colum in bed).

Definintion of the bed format:
\## 1 chrom - The name of the chromosome (e.g. chr3, chrY, chr2\_random) or scaffold (e.g. scaffold10671).
\## 2 chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
\## 3 chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
\########### OPTIONAL fields ##########
\## 4 name - Defines the name of the BED line. This label is displayed to the left of the BED line in the Genome Browser window when the track is open to full display mode or directly to the left of the item in pack mode.
\## 5 score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray).
\## 6 strand - Defines the strand - either '+' or '-'.
\## 7 thickStart - The starting position at which the feature is drawn thickly
\## 8 thickEnd - The ending position at which the feature is drawn thickly
\## 9 itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0). If the track line itemRgb attribute is set to "On", this RBG value will determine the display color of the data contained in this BED line. NOTE: It is recommended that a simple color scheme (eight colors or less) be used with this attribute to avoid overwhelming the color resources of the Genome Browser and your Internet browser.
\## 10 blockCount - The number of blocks (exons) in the BED line.
\## 11 blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
\## 12 blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
Definintion of the bed format:
```
## 1 chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
## 2 chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
## 3 chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
########### OPTIONAL fields ##########
## 4 name - Defines the name of the BED line. This label is displayed to the left of the BED line in the Genome Browser window when the track is open to full display mode or directly to the left of the item in pack mode.
## 5 score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray).
## 6 strand - Defines the strand - either '+' or '-'.
## 7 thickStart - The starting position at which the feature is drawn thickly
## 8 thickEnd - The ending position at which the feature is drawn thickly
## 9 itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0). If the track line itemRgb attribute is set to "On", this RBG value will determine the display color of the data contained in this BED line. NOTE: It is recommended that a simple color scheme (eight colors or less) be used with this attribute to avoid overwhelming the color resources of the Genome Browser and your Internet browser.
## 10 blockCount - The number of blocks (exons) in the BED line.
## 11 blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
## 12 blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
```
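
As a rough illustration of how the block fields (columns 10 to 12) relate to exon coordinates, here is a minimal Python sketch. It is not AGAT's implementation: the helper `bed12_blocks` and the coordinates in the example are hypothetical, and the input is assumed to be 1-based, closed GFF coordinates for one transcript and its exons.

```python
from typing import List, Tuple


def bed12_blocks(transcript_start: int,
                 exons: List[Tuple[int, int]]) -> Tuple[int, str, str]:
    """Derive blockCount, blockSizes and blockStarts from GFF exon coordinates.

    transcript_start: 1-based GFF start of the transcript (level2 feature).
    exons: (start, end) pairs in 1-based, closed GFF coordinates.
    """
    chrom_start = transcript_start - 1          # BED start is 0-based
    ordered = sorted(exons)                     # blocks are listed left to right
    # Closed interval length: end - start + 1.
    sizes = [end - start + 1 for start, end in ordered]
    # Block starts are 0-based offsets relative to chromStart.
    starts = [(start - 1) - chrom_start for start, _ in ordered]
    return (len(ordered),
            ",".join(str(s) for s in sizes),
            ",".join(str(s) for s in starts))


# Hypothetical transcript at GFF 101..500 with two exons.
count, sizes, starts = bed12_blocks(101, [(301, 500), (101, 200)])
print(count, sizes, starts)
```

The first block of a transcript that begins with an exon always has a blockStart of 0, because offsets are measured from chromStart rather than from the start of the chromosome.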

## SYNOPSIS
