[0.11.0] - 2021-11-18

@siebrenf siebrenf released this 18 Nov 10:17
· 110 commits to master since this release

Added

  • extended docstrings
  • GENCODE support (GENCODE gene annotations with UCSC genomes)
    • only contains the main chromosomes, no scaffolds or alternate haplotypes.
    • only contains 4 assemblies (2 mouse, 2 human)
    • excellent annotations for these regions & species though!
  • Ensembl's GRCh37 can now be downloaded through genomepy
  • Local fasta/gtf/gff(3)/bed file support
    • you can install a local genome and/or annotation by providing local path(s) to genomepy install
      • if annotation downloading is requested, but no annotation path is provided,
        a gtf/gff(3) annotation will be sought in the genome's source directory.
  • Annotation.gtf_dict creates a dictionary for any key-value pair in the GTF columns or attribute fields!
    • e.g. Annotation.gtf_dict("seqname", "gene_name")
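
To illustrate the kind of mapping Annotation.gtf_dict("seqname", "gene_name") describes, here is a minimal standalone sketch. It is not genomepy's implementation: the helper names (parse_attributes, gtf_pairs) are hypothetical, and collecting every distinct value per key in a list is an assumption, since these notes do not say how repeated keys are handled.

```python
import re

# The nine standard GTF columns, in order.
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def parse_attributes(field):
    """Parse a GTF attribute string such as 'gene_id "G1"; gene_name "A";'."""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def gtf_pairs(lines, key, value):
    """Map `key` to the distinct `value`s seen with it (hypothetical helper).

    `key` and `value` may each be a GTF column name or an attribute name.
    """
    mapping = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # comment or empty line
        cols = line.rstrip("\n").split("\t")
        if len(cols) != len(GTF_COLUMNS):
            continue  # not a complete GTF record
        record = dict(zip(GTF_COLUMNS, cols))
        record.update(parse_attributes(record["attribute"]))
        if key in record and value in record:
            values = mapping.setdefault(record[key], [])
            if record[value] not in values:
                values.append(record[value])
    return mapping
```

Called as gtf_pairs(lines, "seqname", "gene_name"), this returns one entry per chromosome listing the gene names found on it.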

Changed

  • Genome.track2fasta can now ignore comment lines (starting with #)
  • Genome.track2fasta will skip header lines (a warning will be printed)
  • Genome.track2fasta will ignore regions that cannot be parsed (a warning will be printed)
    • these fixes should improve gimme scan performance and feedback
  • UCSC annotation conversion tool settings tweaked. Better results with source gff files.
  • Ensembl now uses HTTP instead of FTP (in some cases). This improves stability on some servers.
  • tweaked search result alignment for clarity
  • explained UCSC annotations in the README
  • better file path handling (relative paths, user home and variables are expanded)
  • Annotation now accepts a file/directory/genomepy name as first argument.
    • this merges 2 arguments into one.
  • Annotation.map_genes now works without a README file
    • you can now set Annotation.tax_id manually.
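
The path handling item above (relative paths, user home and variables are expanded) can be sketched with the Python standard library. This is only an illustration of the idea, not genomepy's code: normalize_path is a hypothetical helper name.

```python
import os


def normalize_path(path):
    """Expand environment variables and '~', then make the path absolute.

    Hypothetical helper for illustration only.
    """
    expanded = os.path.expandvars(path)      # $VAR and ${VAR}
    expanded = os.path.expanduser(expanded)  # leading ~ or ~user
    return os.path.abspath(expanded)         # resolve relative parts and '..'
```

With a helper like this, "~/genomes", "$HOME/genomes" and "../genomes" all resolve to absolute paths before any file is opened.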

Fixed

  • Ensembl annotations from previous releases can now be downloaded as intended.
  • Genome.track2fasta will skip regions that clearly don't make sense (start > end or start < 0)
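
The Genome.track2fasta behaviour described in these notes (ignore comment lines, skip headers and unparsable or nonsensical regions with a warning) can be sketched as a small filter over BED-like lines. This is an assumption-laden illustration, not genomepy's implementation: iter_valid_regions is a hypothetical name, and only whitespace-separated chrom/start/end lines are handled.

```python
import warnings


def iter_valid_regions(lines):
    """Yield (chrom, start, end) for usable BED-like lines (hypothetical helper)."""
    for lineno, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # comment or empty line: ignored silently
        fields = stripped.split()
        try:
            chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        except (IndexError, ValueError):
            # Header lines and malformed rows both end up here.
            warnings.warn(f"line {lineno}: cannot parse region, skipping: {stripped!r}")
            continue
        if start < 0 or start > end:
            warnings.warn(f"line {lineno}: invalid coordinates, skipping: {stripped!r}")
            continue
        yield chrom, start, end
```

Filtering before sequence extraction means one bad row produces a warning instead of aborting the whole run.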