Minor additions #4

Merged
merged 39 commits into from
Nov 9, 2023
be1e7cc
add docstring and add more flexibility in writting results
JeanMainguy Aug 30, 2023
ef8df0e
fix bin name of union bins
JeanMainguy Aug 30, 2023
b75bda3
write all bin info in debug mode
JeanMainguy Aug 30, 2023
eb8035a
fix prefix and suffix
JeanMainguy Aug 31, 2023
1a78f96
add index to contig file in degub mode
JeanMainguy Aug 31, 2023
162177b
recompute hash when replacing contig by ids
JeanMainguy Sep 22, 2023
b0e5e77
add docstring
JeanMainguy Sep 22, 2023
590126f
add test with real data and external data test repo
JeanMainguy Sep 22, 2023
d791707
Update python-package.yml
JeanMainguy Sep 22, 2023
bbdd13f
update to pyrodigual v3 and fix broke call
JeanMainguy Sep 22, 2023
f7e1c8e
adjust multithreading for pyrodigal
JeanMainguy Sep 22, 2023
c3ce262
add fct test in gh action
JeanMainguy Sep 22, 2023
3e9905e
add a more complicated way of cheking expected results
JeanMainguy Sep 23, 2023
93d32c9
clean binette.py
JeanMainguy Sep 23, 2023
0d43212
update README with conda install
JeanMainguy Sep 23, 2023
de3287e
add conda badges
JeanMainguy Sep 23, 2023
5faf381
Update README.md
JeanMainguy Sep 23, 2023
c4ce47c
remove pycache dir and add gitignore
JeanMainguy Oct 24, 2023
74257cf
Update python-package.yml
JeanMainguy Oct 28, 2023
ee4c887
add results section in readme
JeanMainguy Oct 28, 2023
fb3d043
Update version in binette.yaml
JeanMainguy Oct 28, 2023
1f3c8c5
temporary change in conda recipe action
JeanMainguy Nov 3, 2023
7ed119e
fix check_conda_recipe.yml
JeanMainguy Nov 3, 2023
4adefa5
Update check_conda_recipe.yml
JeanMainguy Nov 3, 2023
c94d9de
improve conda install
JeanMainguy Nov 3, 2023
106a7b8
Update check_conda_recipe.yml
JeanMainguy Nov 3, 2023
267ef1f
install binette from bioconda and then install current version
JeanMainguy Nov 4, 2023
982d670
fix readme title
JeanMainguy Nov 4, 2023
f592ecc
rename gh action
JeanMainguy Nov 4, 2023
5dbbe12
rm pyrodigal 3 requirement in setup
JeanMainguy Nov 4, 2023
65cfe08
fix conda recipe check action
JeanMainguy Nov 4, 2023
00d5899
try install with pyrodigal <3
JeanMainguy Nov 4, 2023
1c48bb1
remove requirements in setup
JeanMainguy Nov 4, 2023
7228473
add shell activation
JeanMainguy Nov 4, 2023
088e9bf
rm typo
JeanMainguy Nov 4, 2023
f6ee247
improve gh action
JeanMainguy Nov 4, 2023
456e102
fix typo
JeanMainguy Nov 4, 2023
7116010
improve gh action
JeanMainguy Nov 4, 2023
8c08913
add support for pyrodigal v2 and v3
JeanMainguy Nov 4, 2023
66 changes: 66 additions & 0 deletions .github/workflows/binette_ci.yml
@@ -0,0 +1,66 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Test Binette

on: [push] #to any branch

jobs:
  build:

    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    strategy:
      fail-fast: false
      matrix:
        python-version: [3.8] #["3.8", "3.9", "3.10"]

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
    - uses: actions/checkout@v3

    # Install requirements
    - uses: conda-incubator/setup-miniconda@v2
      with:
        mamba-version: "*"
        python-version: ${{ matrix.python-version }}
        channels: conda-forge,bioconda,defaults
        environment-file: binette.yaml
        activate-environment: binette

    - name: Install binette
      run: |
        pip install .
        binette -h

    - name: Lint with flake8
      run: |
        mamba install flake8
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

    - name: Test with pytest
      run: |
        mamba install pytest
        pytest

    - name: Download test data
      uses: actions/checkout@v3
      with:
        repository: genotoul-bioinfo/Binette_TestData
        path: test_data

    - name: Run simple test case
      run: |
        cd test_data
        binette -b binning_results/* --contigs all_contigs.fna --checkm2_db checkm2_tiny_db/checkm2_tiny_db.dmnd -v -o test_results

    - name: Compare results with expectation
      run: |
        cd test_data
        head expected_results/final_bins_quality_reports.tsv test_results/final_bins_quality_reports.tsv
        python scripts/compare_results.py expected_results/final_bins_quality_reports.tsv test_results/final_bins_quality_reports.tsv
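
The last step of the workflow above compares the freshly computed `final_bins_quality_reports.tsv` with an expected copy. The real `scripts/compare_results.py` lives in the test data repository and is not shown in this diff, so the following is only a minimal sketch of what such a comparison could look like. The function names `load_report` and `compare_reports`, the choice of `bin_id` as the key, and the tolerance are all assumptions made for illustration.

```python
import csv
import io
import math


def load_report(handle, key="bin_id"):
    """Index the rows of a tab-separated quality report by one column.

    Keying on `bin_id` is an assumption; the real script may match bins differently.
    """
    return {row[key]: row for row in csv.DictReader(handle, delimiter="\t")}


def compare_reports(expected, observed, numeric=("completeness", "contamination"), tol=1e-6):
    """Return a list of human-readable differences between two indexed reports."""
    diffs = []
    for bin_id in sorted(set(expected) | set(observed)):
        if bin_id not in observed:
            diffs.append(f"{bin_id}: missing from observed results")
        elif bin_id not in expected:
            diffs.append(f"{bin_id}: unexpected bin")
        else:
            # Compare numeric columns with a small absolute tolerance.
            for col in numeric:
                e = float(expected[bin_id][col])
                o = float(observed[bin_id][col])
                if not math.isclose(e, o, abs_tol=tol):
                    diffs.append(f"{bin_id}: {col} expected {e}, got {o}")
    return diffs


if __name__ == "__main__":
    # Tiny in-memory reports stand in for the two TSV files.
    header = "bin_id\tcompleteness\tcontamination\n"
    expected = load_report(io.StringIO(header + "1\t90.5\t1.2\n"))
    observed = load_report(io.StringIO(header + "1\t88.0\t1.2\n"))
    for line in compare_reports(expected, observed):
        print(line)
```

In a CI step, a non-empty list of differences would typically be turned into a non-zero exit code so that the job fails.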
42 changes: 42 additions & 0 deletions .github/workflows/check_conda_recipe.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
name: conda recipe

# Controls when the workflow will run
on:
  # Triggers the workflow on schedule but only for the default branch (which is master)
  schedule:
    - cron: '0 7 5,20 * *'
# on: [push] #to any branch

  # Allows you to run this workflow manually from the Actions tab
  # workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "check_recipe"
  check_recipe:
    name: test bioconda recipes on ${{ matrix.os }} with python ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: ['ubuntu-latest','macos-latest']
        python-version: ['3.8']
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v2
      # Setting up miniconda
      - uses: conda-incubator/setup-miniconda@v2
        with:
          mamba-version: "*"
          python-version: ${{ matrix.python-version }}
          channels: conda-forge,bioconda,defaults
          activate-environment: test
      - name: Set up test environment
        shell: bash -l {0}
        run: |
          mamba install -y binette
      - name: check installation
        shell: bash -l {0}
        run: |
          python --version
          binette --version
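
The cron expression `0 7 5,20 * *` in the workflow above reads as minute 0, hour 7, on the 5th and 20th day of every month. GitHub Actions evaluates schedules in UTC. As a rough illustration of that reading, the sketch below lists the next few trigger times after a given moment. The helper `next_runs` is hypothetical and hard-codes this one pattern; it is not a general cron parser.

```python
from datetime import datetime, timedelta, timezone


def next_runs(after, count=3, days=(5, 20), hour=7):
    """List the next `count` run times for 'minute 0, given hour, given days of month'.

    Only covers the shape of '0 7 5,20 * *'; other cron fields are not handled.
    """
    runs = []
    # Start from the scheduled hour on the same calendar day, then walk forward.
    candidate = after.replace(hour=hour, minute=0, second=0, microsecond=0)
    while len(runs) < count:
        if candidate.day in days and candidate > after:
            runs.append(candidate)
        candidate += timedelta(days=1)
    return runs


if __name__ == "__main__":
    start = datetime(2023, 11, 4, 12, 0, tzinfo=timezone.utc)
    for run in next_runs(start):
        print(run.isoformat())
```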
60 changes: 0 additions & 60 deletions .github/workflows/python-package.yml

This file was deleted.

65 changes: 65 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Taken from https://github.com/github/gitignore/blob/main/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so
*.c

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Sphinx documentation
docs/_build/



# Jupyter Notebook
.ipynb_checkpoints


# IPython
profile_default/
ipython_config.py

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/


# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/


# Cython debug symbols
cython_debug/


# Custom folders
# testing
test_data/
65 changes: 36 additions & 29 deletions README.md
@@ -1,4 +1,6 @@
# Overview
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/binette/README.html) [![Anaconda-Server Badge](https://anaconda.org/bioconda/binette/badges/downloads.svg)](https://anaconda.org/bioconda/binette)

# Binette

Binette is a fast and accurate binning refinement tool that constructs high-quality MAGs from the output of multiple binning tools.

@@ -11,12 +13,22 @@ It then uses checkm2 to assess bins quality to finally select the best bins poss

Binette is inspired by the metaWRAP bin-refinement tool and addresses some of its limitations:
- Enhanced Speed: Binette significantly improves the speed of the refinement process. It achieves this by launching the initial steps of checkm2, such as prodigal and diamond runs, only once on all contigs. These intermediate results are then utilized to assess the quality of any given bin, eliminating redundant computations and accelerating the refinement process.
- No Limit on Input Bin Sets: Unlike its predecessor, Binette is not constrained by the number of input bin sets. It can handle and process multiple bin sets simultaneously, accommodating a broader range of data and experimental setups.
- No Limit on Input Bin Sets: Unlike its predecessor, Binette is not constrained by the number of input bin sets. It can handle and process multiple bin sets simultaneously.
<!-- - Bin selection has been improved. It selects the best bins in a more accurate and elegant manner.
- It is easier to use. -->

# Installation

## With Bioconda

Binette can be easily installed with conda:

```bash

conda install -c bioconda binette

```

## From a conda environment

Clone this repository:
@@ -30,33 +42,6 @@ Then create a Conda environment using the `binette.yaml` file:
conda env create -n binette -f binette.yaml
conda activate binette
```
<!--
Binette needs checkm2 to be fully installed with pip.

Follow the CheckM2 installation instructions:

You can install it with git:

```
git clone --recursive https://github.com/chklovski/checkm2.git

pip install checkm2/

```
Or download the archive from github:

```bash
# get the archive
wget https://github.com/chklovski/CheckM2/archive/refs/tags/1.0.2.tar.gz

# decompress
tar -xf 1.0.2.tar.gz
rm 1.0.2.tar.gz

# install
pip install CheckM2-1.0.2/

``` -->

Download checkm2 database

@@ -155,6 +140,28 @@ For example, consider the following two `contig2bin_tables`:

In both formats, the `--contigs` argument should specify a FASTA file containing all the contigs found in the bins. Typically, this file would be the assembly FASTA file used to generate the bins. In these examples, the `assembly.fasta` file should contain at least the five contigs mentioned in the `contig2bin_tables` files or in the bin FASTA files: `contig_1`, `contig_8`, `contig_15`, `contig_9`, and `contig_10`.
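
The requirement above can be checked before launching a run. The following is a minimal sketch, assuming each `contig2bin_tables` file is a header-less, tab-separated table whose first column is the contig name, and that contig identifiers in the FASTA file are the first word of each header line. The helpers `read_fasta_ids`, `read_contig2bin` and `missing_contigs` are hypothetical and are not part of Binette.

```python
def read_fasta_ids(lines):
    """Yield sequence identifiers (first word after '>') from FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                yield header[0]


def read_contig2bin(lines):
    """Collect contig names from a tab-separated contig<TAB>bin table (assumed layout)."""
    contigs = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        contigs.add(line.split("\t")[0])
    return contigs


def missing_contigs(fasta_lines, *tables):
    """Return the sorted contig names used in the tables but absent from the FASTA."""
    known = set(read_fasta_ids(fasta_lines))
    wanted = set().union(*(read_contig2bin(table) for table in tables))
    return sorted(wanted - known)


if __name__ == "__main__":
    fasta = [">contig_1 len=10\n", "ACGT\n", ">contig_8\n", "GGCC\n"]
    table = ["contig_1\tbinA\n", "contig_9\tbinB\n"]
    print(missing_contigs(fasta, table))
```

With real files, the same functions accept open file handles, since they only iterate over lines.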

## Outputs

Binette results are stored in the `results` directory. You can specify a different directory using the `--outdir` option.

In this directory you will find:
- `final_bins_quality_reports.tsv`: This is a TSV (tab-separated values) file containing quality information about the final selected bins.
- `final_bins/`: This directory stores all the selected bins in fasta format.
- `temporary_files/`: This directory contains intermediate files. If you choose to use the `--resume` option, Binette will utilize files in this directory to prevent the recomputation of time-consuming steps.


The `final_bins_quality_reports.tsv` file contains the following columns:
| Column Name | Description |
|---------------------|--------------------------------------------------------------------------------------------------------------|
| **bin_id** | This column displays the unique ID of the bin. |
| **origin** | Indicates the source or origin of the bin, specifying from which bin set it originates or the intermediate set operation that created it. |
| **name** | The name of the bin. |
| **completeness** | The completeness of the bin, determined by CheckM2. |
| **contamination** | The contamination of the bin, determined by CheckM2. |
| **score** | This column displays the computed score, which is calculated as: `completeness - contamination * weight`. You can customize the contamination weight using the `--contamination_weight` option. |
| **size** | Represents the size of the bin in nucleotides. |
| **N50** | Displays the N50 of the bin. |
| **contig_count** | The number of contigs contained within the bin. |
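
The score formula in the table can be reproduced from the other columns, which is handy for re-ranking bins with a different contamination weight without rerunning Binette. The sketch below is illustrative only: `bin_score` and `rank_bins` are hypothetical helpers, and the weight is a required argument because the default value of `--contamination_weight` is not stated here.

```python
import csv
import io


def bin_score(completeness, contamination, weight):
    """Score as described in the table: completeness - contamination * weight."""
    return completeness - contamination * weight


def rank_bins(handle, weight):
    """Read a quality report and return its rows sorted by recomputed score, best first."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        row["recomputed_score"] = bin_score(
            float(row["completeness"]), float(row["contamination"]), weight
        )
    return sorted(rows, key=lambda row: row["recomputed_score"], reverse=True)


if __name__ == "__main__":
    # A tiny in-memory report stands in for final_bins_quality_reports.tsv.
    report = io.StringIO(
        "bin_id\tcompleteness\tcontamination\n"
        "1\t95.0\t10.0\n"
        "2\t90.0\t1.0\n"
    )
    for row in rank_bins(report, weight=2):
        print(row["bin_id"], row["recomputed_score"])
```

A larger weight penalizes contaminated bins more strongly, so the ranking can change with the weight even though completeness and contamination stay the same.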

# Bug reporting and feature requests

2 changes: 1 addition & 1 deletion binette.yaml
@@ -16,4 +16,4 @@ dependencies:
- tqdm
- networkx
- pyfastx
- pyrodigal
- pyrodigal<3
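
The pin above keeps the conda environment on pyrodigal 2, while the last commit of this PR adds support for both major versions. The code of that commit is not shown in this diff, so the following is only a sketch of one way such dual support could be written. It relies on the rename of `OrfFinder` (v2) to `GeneFinder` (v3) in pyrodigal; the helpers `gene_finder_class_name` and `make_gene_finder` are hypothetical.

```python
def gene_finder_class_name(version):
    """Pick the gene finder class name for a pyrodigal version string such as '2.3.0'."""
    major = int(version.split(".")[0])
    # pyrodigal 3 renamed OrfFinder to GeneFinder.
    return "GeneFinder" if major >= 3 else "OrfFinder"


def make_gene_finder(pyrodigal_module, **kwargs):
    """Instantiate the right class from an already imported pyrodigal module.

    Keyword arguments (for example meta=True) are passed through unchanged.
    """
    name = gene_finder_class_name(pyrodigal_module.__version__)
    return getattr(pyrodigal_module, name)(**kwargs)
```

Typical use would be `import pyrodigal` followed by `make_gene_finder(pyrodigal, meta=True)`, so the calling code does not need to know which major version is installed.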