Dorado plugin #344

Merged
merged 19 commits into from
May 9, 2024
26 changes: 19 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ the read in progress and so direct sequencing capacity towards reads of interest

**This implementation of readfish requires Guppy version >= 6.0.0 and MinKNOW version core >= 5.0.0 . It will not work on earlier versions.**

**From MinKNOW core version >=5.9.0 and Dorado server version >=7.3.9, Dorado requires an alternate library, `ont-pybasecall-client-lib`. We have introduced a new `dorado` module to handle this.**


The code here has been tested with Guppy in GPU mode using GridION Mk1 and
NVIDIA RTX2080 on live sequencing runs and an NVIDIA GTX1080 using playback
Expand Down Expand Up @@ -121,9 +123,9 @@ conda env create -f development.yml
conda activate readfish_dev
```

| <h2>‼️ Important! </h2> |
|:---------------------------|
| The listed `ont-pyguppy-client-lib` version will probably not match the version installed on your system. To fix this, Please see this [issue](https://github.com/LooseLab/readfish/issues/221#issuecomment-1381529409) |
| <h2>‼️ Important !! </h2> |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| MinKNOW is transitioning from Guppy to Dorado. Until MinKNOW version 5.9, both Guppy and Dorado used `ont-pyguppy-client-lib`.<br/>As of MinKNOW version 5.9 and Dorado server version 7.3.9 and greater, Dorado requires an alternate library, `ont-pybasecall-client-lib`.<br/>The listed `ont-pyguppy-client-lib` or `ont-pybasecall-client-lib` version may not match the version installed on your system. To fix this, please see this [issue](https://github.com/LooseLab/readfish/issues/221#issuecomment-1381529409), using the appropriate library. |


[ONT's Guppy GPU](https://community.nanoporetech.com/downloads) should be installed and running as a server.
Expand Down Expand Up @@ -333,8 +335,8 @@ Note: The plots here are generated from running readfish unblock-all on an Apple
<details style="margin-top: 10px">
<summary id="testing-basecalling-and-mapping"><h3 style="display: inline;">Testing base-calling and mapping</h3></summary>

To test selective sequencing you must have access to a
[guppy basecall server](https://community.nanoporetech.com/downloads/guppy/release_notes) (>=6.0.0)
To test selective sequencing you must have access to either a
[guppy basecall server](https://community.nanoporetech.com/downloads/guppy/release_notes) (>=6.0.0) or a [dorado basecall server](https://community.nanoporetech.com/downloads/dorado/release_notes)

and a readfish TOML configuration file.

Expand All @@ -348,6 +350,10 @@ NOTE: guppy and dorado are used here interchangeably as the basecall server. Dor
```toml
[mapper_settings.mappy-rs]
```
1. If on MinKNOW core>=5.9.0 and Dorado server version >=7.3.9, edit the `basecaller` section to read:
```toml
[caller_settings.dorado]
```
1. Modify the `fn_idx_in` field in the file to be the full path to a [minimap2](https://github.com/lh3/minimap2) index of the human genome.

1. Modify the `targets` fields for each condition to reflect the naming convention used in your index. This is the sequence name only, up to but not including any whitespace.
Expand Down Expand Up @@ -553,13 +559,19 @@ And for our Awesome Logo please checkout out [@tim_bassford](https://twitter.com

<!-- start-changelog -->
# Changelog
## 2024.2.0
1. Add a dorado base-caller which addressed issue [#347](https://github.com/LooseLab/readfish/issues/347) - chiefly in Dorado 7.3.9 ONT have moved to `ont-pybasecall-client-lib`,
and connections from `ont_pyguppy_client_lib` raise `Connection error. ... LOAD_CONFIG. Reply: INVALID_PROTOCOL` [(#344)](https://github.com/LooseLab/readfish/pull/344)
1. Adds version checking for MinKNOW and Guppy/Dorado, logs if not compatible [(#351)](https://github.com/LooseLab/readfish/pull/351)

## 2024.1.0
1. bug fix type for `--wait-on-ready` type and actual function [(#327)](https://github.com/LooseLab/readfish/pull/327), [(#323)](https://github.com/LooseLab/readfish/pull/323)
1. multiple suffix `.mmi` support [(#330)](https://github.com/LooseLab/readfish/pull/330)
1. Change the default `unblock_duration` on the `Analysis` class to use `DEFAULT_UNBLOCK` value defined in `_cli_args.py`. Change type on the Argparser for `--unblock-duration` to float. [(#313)](https://github.com/LooseLab/readfish/pull/313)
1. Big dog Duplex feature - adds ability to select duplex reads that cover a target region. See pull request for details [(#324)](https://github.com/LooseLab/readfish/pull/324)

## 2023.1.1
1. Fix Readme Logo link 🥳 (#296)
1. Fix bug where we had accidentally started requiring barcoded TOMLs to specify a region. Thanks to @jamesemery for catching this. (#299)
1. Correctly handle overriding a decision in internal statistics tracking. (#299)
2. Fix bug where we had accidentally started requiring barcoded TOMLs to specify a region. Thanks to @jamesemery for catching this. (#299)
3. Correctly handle overriding a decision in internal statistics tracking. (#299)
<!-- end-changelog -->
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/barcoded_human.toml
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_bed_file_selection.toml
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_chr_depletion.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,9 +23,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_chr_selection.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,9 +21,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
10 changes: 7 additions & 3 deletions docs/_static/example_tomls/human_csv_file_selection.toml
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,13 @@
# - https://looselab.github.io/readfish/toml.html.

[caller_settings.guppy]
# Caller Configuration for Guppy Basecaller
config = "dna_r9.4.1_450bps_fast" # Specify the basecaller configuration
address = "ipc:///tmp/.guppy/5555" # Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r9.4.1_450bps_fast"
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
debug_log = "live_reads.fq" # Fastq output for individual reads (Optional, delete line to disable)

[mapper_settings.mappy]
Expand Down
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_minimap2_extra_params.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,9 +23,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
6 changes: 4 additions & 2 deletions docs/_static/example_tomls/human_regions_barcoded.toml
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,11 @@
# Basecaller configuration
[caller_settings.guppy]
# ^^^^^^ - ".guppy" specifies our chosen basecaller
# Guppy base-calling configuration file name
# If using dorado >=7.3.9, this should be ".dorado".
# All other parameters are shared between the two basecallers.
# Guppy/Dorado base-calling configuration file name
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
# Address of the guppy basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
# Address of the guppy/dorado basecaller - The default address for guppy is ipc:///tmp/.guppy/5555.
address = "ipc:///tmp/.guppy/5555"
# Fastq output for individual reads. This is OPTIONAL - as these files can become quite large.
# Remove line to disable.
Expand Down
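
The one-line change that the example TOML comments describe (switching the table header from `.guppy` to `.dorado`) can be sketched as a small text rewrite. This is only an illustration: `use_dorado_caller` is a hypothetical helper and not part of readfish, and it leaves `config` and `address` untouched, so those values should still be checked against your own basecall server.

```python
import re


def use_dorado_caller(toml_text: str) -> str:
    """Rename the [caller_settings.guppy] table header to [caller_settings.dorado].

    Hypothetical helper: only the header line is changed, every other
    line of the TOML text is returned as it was.
    """
    return re.sub(
        r"^\[caller_settings\.guppy\]",
        "[caller_settings.dorado]",
        toml_text,
        flags=re.MULTILINE,
    )


example = """[caller_settings.guppy]
config = "dna_r10.4.1_e8.2_400bps_5khz_fast"
address = "ipc:///tmp/.guppy/5555"
"""

print(use_dorado_caller(example))
```

Editing the header by hand is equivalent; the sketch just makes explicit that the remaining parameters are shared between the two basecallers.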
2 changes: 1 addition & 1 deletion docs/developers-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ conda env create -f docs/development.yml

## Readfish versioning
Readfish uses [calver](https://calver.org/) for versioning. Specifically the format should be
`YYYY.MINOR.MICRO.Modifier`, where `MINOR` is the feature addiiton, `MICRO` is any hotfix/bugfix, and `Modifier` is the modifier (e.g. `rc` for release candidate, `dev` for development, empty for stable).
`YYYY.MINOR.MICRO.Modifier`, where `MINOR` is the feature addition, `MICRO` is any hotfix/bugfix, and `Modifier` is the modifier (e.g. `rc` for release candidate, `dev` for development, empty for stable).
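
As an illustration of the `YYYY.MINOR.MICRO.Modifier` layout, the sketch below splits a version string into its parts. It is a minimal example under stated assumptions: `ReadfishVersion` and `split_calver` are hypothetical names, and the modifier is assumed to follow a fourth dot as the format string suggests, which may differ from how released versions are actually tagged.

```python
from __future__ import annotations

from typing import NamedTuple


class ReadfishVersion(NamedTuple):
    """Hypothetical container for the four calver fields."""

    year: int
    minor: int
    micro: int
    modifier: str  # empty string for a stable release


def split_calver(version: str) -> ReadfishVersion:
    """Split "YYYY.MINOR.MICRO[.Modifier]" into typed fields."""
    parts = version.split(".", 3)
    if len(parts) < 3:
        raise ValueError(f"expected YYYY.MINOR.MICRO, got {version!r}")
    year, minor, micro = (int(part) for part in parts[:3])
    modifier = parts[3] if len(parts) == 4 else ""
    return ReadfishVersion(year, minor, micro, modifier)


print(split_calver("2024.2.0"))
```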

## Changelog

Expand Down
10 changes: 9 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,8 @@ dev = ["readfish[all,docs,tests]", "pre-commit"]
mappy = ["mappy"]
mappy-rs = ["mappy-rs >= 0.0.6"]
guppy = ["ont_pyguppy_client_lib"]
all = ["readfish[mappy,mappy-rs,guppy]"]
dorado = ["ont-pybasecall-client-lib"]
all = ["readfish[mappy,mappy-rs,guppy,dorado]"]

[project.urls]
Documentation = "https://looselab.github.io/readfish"
Expand Down Expand Up @@ -83,3 +84,10 @@ markers = [
"alignment: marks tests which rely on loading or using Mappy or Mappy-rs aligners, used to test with both. (deselect with '-m \"not slow\", select with '-k alignment')",
]
addopts = ["-ra", "--doctest-modules", "--ignore=src/readfish/read_until/base.py"]

[tool.coverage.report]
omit = [
"src/readfish/plugins/dorado.py",
"src/readfish/_read_until_client.py",
"src/readfish/plugins/guppy.py",
]
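
With separate `guppy` and `dorado` extras, it can be useful to see which client library is importable in the current environment. The sketch below is an assumption-laden illustration: `available_callers` is a hypothetical helper, and the import names `pyguppy_client_lib` and `pybasecall_client_lib` are assumed from the package names, so check them against what is actually installed.

```python
from __future__ import annotations

from importlib.util import find_spec

# Assumed top-level import names for each extra (not confirmed by this diff).
CLIENT_MODULES = {
    "guppy": "pyguppy_client_lib",
    "dorado": "pybasecall_client_lib",
}


def available_callers() -> dict[str, bool]:
    """Report, per extra, whether its client library can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in CLIENT_MODULES.items()
    }


print(available_callers())
```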
2 changes: 1 addition & 1 deletion src/readfish/__about__.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
"""__about__.py
Version of the read until software
"""
__version__ = "2024.1.0"
__version__ = "2024.2.0"
103 changes: 103 additions & 0 deletions src/readfish/_compatibility.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
"""_compatibility.py

Contains utilities for checking readfish compatibility with various versions of MinKNOW.

Checks ranges of `readfish` against the `MinKNOW` version

Attributes:
    LATEST_TESTED (str): The latest tested version of MinKNOW.
    MINKNOW_COMPATIBILITY_RANGE (tuple): The compatibility range of MinKNOW versions for this version of readfish.
    DIRECTION (Enum): An enumeration representing upgrade, downgrade, or no change directions.

"""

from __future__ import annotations

from enum import Enum

from minknow_api.manager import Manager
from packaging.version import parse as parse_version
from packaging.version import Version

LATEST_TESTED = "5.9.7"

# The versions of MinKNOW which this version of readfish can connect to
# Format - (lowest minknow version, highest version of minknow supported as an upper bound)
MINKNOW_COMPATIBILITY_RANGE = (
    Version("5.0.0"),
    Version(LATEST_TESTED),
)


class DIRECTION(Enum):
    """
    Represents the direction in which the version of the readfish software should be changed
    to be compatible with the tested version of an external tool (likely MinKNOW).

    Attributes:
        UPGRADE: Indicates that the readfish software version should be upgraded.
        DOWNGRADE: Indicates that the readfish software version should be downgraded.
        JUST_RIGHT: Indicates that the readfish software version is already compatible
            with the tested version of the external tool.
    """

    UPGRADE = "upgrade"
    DOWNGRADE = "downgrade"
    JUST_RIGHT = "do nothing"


def _get_minknow_version(host: str = "127.0.0.1", port: int | None = None) -> Version:
    """
    Get the version of MinKNOW

    :param host: The host the RPC is listening on, defaults to "127.0.0.1"
    :param port: The port the RPC is listening on, defaults to None

    :return: The version of MinKNOW readfish is connected to
    """
    manager = Manager(host=host, port=port)
    minknow_version = parse_version(manager.core_version)
    return minknow_version


def check_compatibility(
    comparator: Version,
    version_range: tuple[Version, Version] = MINKNOW_COMPATIBILITY_RANGE,
) -> DIRECTION:
    """
    Check the compatibility of a given software version against a given range,
    inclusive of both edges.

    :param comparator: Version of the provided software, for example MinKNOW 5.9.7
    :param version_range: A tuple of lowest supported version, highest supported version

    :return: A direction variant indicating if this version of readfish needs to be changed.

    Examples:
    >>> from packaging.version import Version
    >>> check_compatibility(Version("5.9.5"), (Version("5.0.0"), Version("5.9.7")))
    <DIRECTION.JUST_RIGHT: 'do nothing'>
    >>> check_compatibility(Version("5.9.7"), (Version("5.0.0"), Version("5.9.7")))
    <DIRECTION.JUST_RIGHT: 'do nothing'>
    >>> check_compatibility(Version("5.9.8"), (Version("5.0.0"), Version("5.9.7")))
    <DIRECTION.UPGRADE: 'upgrade'>
    >>> check_compatibility(Version("4.9.0"), (Version("5.0.0"), Version("5.9.7")))
    <DIRECTION.DOWNGRADE: 'downgrade'>
    >>> if (action := check_compatibility(Version("6.0.0"), MINKNOW_COMPATIBILITY_RANGE)) in (
    ...     DIRECTION.UPGRADE,
    ...     DIRECTION.DOWNGRADE,
    ... ):
    ...     action
    <DIRECTION.UPGRADE: 'upgrade'>
    """
    (
        lowest_supported_version,
        highest_supported_version,
    ) = version_range
    if comparator < lowest_supported_version:
        return DIRECTION.DOWNGRADE
    return (
        DIRECTION.JUST_RIGHT
        if comparator <= highest_supported_version
        else DIRECTION.UPGRADE
    )
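
One way a caller might turn the returned direction into a log message is sketched below. This is not the logging code from the pull request: `compatibility_message` is a hypothetical helper, and `Direction` is a stand-in that redeclares the enum values from `_compatibility.py` only so that the example runs on its own.

```python
from enum import Enum


class Direction(Enum):
    """Stand-in for DIRECTION in _compatibility.py (same member values)."""

    UPGRADE = "upgrade"
    DOWNGRADE = "downgrade"
    JUST_RIGHT = "do nothing"


def compatibility_message(direction: Direction, minknow_version: str) -> str:
    """Build a human-readable note about what to do with readfish."""
    if direction is Direction.JUST_RIGHT:
        return f"MinKNOW {minknow_version} is within the tested range."
    # UPGRADE and DOWNGRADE both refer to the readfish version, not MinKNOW.
    return (
        f"MinKNOW {minknow_version} is outside the tested range; "
        f"you may need to {direction.value} readfish."
    )


print(compatibility_message(Direction.UPGRADE, "6.0.0"))
```

Keeping the message construction separate from the comparison means the comparison stays easy to doctest, as the examples in the docstring do.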
1 change: 1 addition & 0 deletions src/readfish/_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,7 @@ def load_module(self, override=False):
"""
builtins = {
"guppy": "guppy",
"dorado": "dorado",
"mappy": "mappy",
"mappy_rs": "mappy_rs",
"no_op": "_no_op",
Expand Down
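
The `builtins` mapping in the hunk above pairs a short plugin name from the TOML with a module name. A minimal sketch of how such a lookup could resolve to an importable path follows. It is an assumption, not the body of `load_module`: `resolve_plugin` is a hypothetical name, the `readfish.plugins` package prefix is inferred from the file paths elsewhere in this diff, and treating unknown names as user-supplied module paths is a guess.

```python
from __future__ import annotations

# Mirrors the entries visible in the diff hunk.
BUILTINS = {
    "guppy": "guppy",
    "dorado": "dorado",
    "mappy": "mappy",
    "mappy_rs": "mappy_rs",
    "no_op": "_no_op",
}


def resolve_plugin(name: str, package: str = "readfish.plugins") -> str:
    """Return a dotted module path for a plugin name.

    Built-in names are placed under `package`; any other name is assumed
    to already be a full module path and is returned unchanged.
    """
    if name in BUILTINS:
        return f"{package}.{BUILTINS[name]}"
    return name


print(resolve_plugin("dorado"))
print(resolve_plugin("no_op"))
```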
1 change: 1 addition & 0 deletions src/readfish/_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
from minknow_api.manager import Manager, FlowCellPosition
from minknow_api import Connection


if sys.version_info < (3, 11):
from exceptiongroup import BaseExceptionGroup

Expand Down
1 change: 1 addition & 0 deletions src/readfish/entry_points/stats.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,7 @@
readfish stats --toml tests/static/stats_test/yeast_summary_test.toml --fastq-directory tests/static/stats_test/ --html summary_adaptive

"""

from __future__ import annotations
import argparse
from pathlib import Path
Expand Down