
Add pydustmasker #51118

Merged: 1 commit, Oct 3, 2024

Conversation

apcamargo
Contributor

@apcamargo apcamargo commented Oct 2, 2024

This PR adds the pydustmasker Python library.


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This lets you prevent later changes to a recipe from violating a downstream recipe's expectations about its API, ABI, or CLI.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

| Case | run_exports statement |
|---|---|
| semantic versioning | {{ pin_subpackage("myrecipe", max_pin="x") }} |
| semantic versioning (0.x.x) | {{ pin_subpackage("myrecipe", max_pin="x.x") }} |
| known breakage in minor versions | {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that briefly mentions your evidence for that) |
| known breakage in patch versions | {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that briefly mentions your evidence for that) |
| calendar versioning | {{ pin_subpackage("myrecipe", max_pin=None) }} |

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

| Command | Effect |
|---|---|
| @BiocondaBot please update | Merge the master branch into a PR. |
| @BiocondaBot please add label | Add the please review & merge label. |
| @BiocondaBot please fetch artifacts | Post links to CI-built packages/containers. You can use this to test packages locally. |

Note that the @BiocondaBot please merge command is now deprecated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

Summary by CodeRabbit

  • New Features

    • Introduced the pydustmasker package (version 1.0.0) for nucleotide sequence processing.
    • Added a new build script to automate the package build process.
  • Documentation

    • Included metadata and configuration details in the meta.yaml file, such as package dependencies and project information.

Contributor

coderabbitai bot commented Oct 2, 2024

📝 Walkthrough

Walkthrough

This pull request introduces two new files for the pydustmasker Python package. The build.sh script automates the build process using Rust, handling environment setup and package installation. The meta.yaml file defines the package's metadata, including its name, version, source URL, checksum, build requirements, and additional details for installation and testing.

Changes

| File Path | Change Summary |
|---|---|
| recipes/pydustmasker/build.sh | New script added to automate the build process for the pydustmasker package using Rust. |
| recipes/pydustmasker/meta.yaml | New metadata file created for the pydustmasker package, defining its name, version, source URL, checksum, build requirements, and additional information. |

Possibly related PRs

  • add recipe for sizemeup #51033: The build.sh script in the sizemeup PR shares a similar purpose of automating the installation process for a Python package, akin to the build.sh script in the main PR for pydustmasker. Both scripts involve setting up the environment and managing package installations.

Suggested labels

please review & merge


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (5)
recipes/pydustmasker/build.sh (2)

5-7: Improve macOS-specific setup

The macOS-specific setup is good for cross-platform compatibility. However, there are two minor improvements we can make:

  1. Quote the uname command to prevent word splitting.
  2. Declare and assign the HOME variable separately to avoid masking return values.

Here's the suggested fix:

-if [ `uname` == Darwin ]; then
-  export HOME=`mktemp -d`
+if [ "$(uname)" == Darwin ]; then
+  HOME=$(mktemp -d)
+  export HOME
 fi
🧰 Tools
🪛 Shellcheck

[warning] 5-5: Quote this to prevent word splitting.

(SC2046)


[warning] 6-6: Declare and assign separately to avoid masking return values.

(SC2155)
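
The second warning (SC2155) can be illustrated with a minimal sketch. The variable names below are hypothetical and `false` stands in for any command that might fail, such as `mktemp -d`; this is not taken from the recipe itself.

```bash
#!/usr/bin/env bash
# Combined form: `export` itself succeeds, so the failure of the
# command substitution is masked and the status stays 0.
combined_status=0
export COMBINED="$(false)" || combined_status=$?

# Separate form: a plain assignment returns the status of the
# command substitution, so the failure is visible.
separate_status=0
SEPARATE="$(false)" || separate_status=$?
export SEPARATE

echo "combined=$combined_status separate=$separate_status"
```

In a script that runs under `set -e`, only the separate form would stop the build when the command fails.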


15-17: LGTM: Proper package build and installation

The package build using maturin and the subsequent installation with pip look correct. The use of --no-deps and --ignore-installed flags is appropriate for conda environments.

For consistency with the rest of the script, consider using $PYTHON for the maturin command as well:

-maturin build --interpreter python --release
+$PYTHON -m maturin build --release

This change ensures that the same Python interpreter is used for both building and installing the package.

recipes/pydustmasker/meta.yaml (3)

17-25: LGTM: Requirements are well-defined. Consider pinning Python version.

The requirements section is correctly structured:

  • Build requirements include the C compiler, which the Rust toolchain uses as the linker.
  • Host requirements (pip, maturin, python) are appropriate for building a Rust-based Python package.
  • Run requirements only include python, which is correct for a compiled extension.

However, consider setting a minimum Python version to ensure compatibility:

requirements:
  host:
    - python >=3.7
  run:
    - python >=3.7

This helps prevent potential issues with incompatible Python versions while still allowing for some flexibility.


27-29: LGTM: Basic import test is present. Consider adding more comprehensive tests.

The import test is sufficient to verify that the package can be imported. However, to ensure the package's functionality, consider adding more comprehensive tests. For example:

test:
  imports:
    - pydustmasker
  commands:
    - python -c "import pydustmasker; assert pydustmasker.__version__ == '1.0.0'"
    - python -c "from pydustmasker import dust_mask; assert callable(dust_mask)"

These additional tests would verify the package version and the presence of expected functions.


1-37: Minor style suggestion: Remove extra blank line at the end of the file.

The yamllint tool reported a warning about too many blank lines at the end of the file. Consider removing the extra blank line to adhere to YAML style guidelines.

Note: The syntax error reported by yamllint for the Jinja2 templating syntax ({%) can be safely ignored, as it's a false positive. Jinja2 templating is correctly used in Conda recipes.

🧰 Tools
🪛 yamllint

[warning] 37-37: too many blank lines

(1 > 0) (empty-lines)


[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ecb3d6e and eff2022.

📒 Files selected for processing (2)
  • recipes/pydustmasker/build.sh (1 hunks)
  • recipes/pydustmasker/meta.yaml (1 hunks)

🔇 Additional comments (5)
recipes/pydustmasker/build.sh (1)

1-3: LGTM: Proper script setup

The script starts with the correct shebang and uses set -ex for proper error handling and debugging. This is a good practice for bash scripts.

recipes/pydustmasker/meta.yaml (4)

1-6: LGTM: Package name and version are correctly defined.

The package name and version are properly set using Jinja2 templating, which is the recommended approach for Conda recipes. The version "1.0.0" matches the information provided in the PR objectives.



12-15: LGTM: Build section is correctly configured.

The build section is well-defined:

  • Build number is set to 0, which is appropriate for a new package.
  • The run_exports section uses the recommended pin_subpackage macro with max_pin="x.x", ensuring compatibility up to minor version changes. This aligns with the PR objectives regarding API, ABI, and CLI stability.

31-36: LGTM: About section is comprehensive and well-defined.

The about section provides all necessary metadata:

  • Home URL is correctly specified.
  • License information (MIT) is provided along with the license file location.
  • The summary clearly describes the package's functionality.

This information is crucial for users and complies with Bioconda's requirements.


8-10: Verify the source URL structure.

The source section is well-defined, with a SHA256 checksum and the following URL:

url: https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/{{ name }}-{{ version }}.tar.gz

The package name appears twice in the URL path ({{ name }}/{{ name }}-{{ version }}), once as the directory and once in the file name. This matches the usual layout of PyPI source URLs, but please verify that it resolves rather than returning a 404 error.

To confirm, request the URL and check that it returns a successful HTTP status code.
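
A minimal sketch of such a check, assuming the name and version from this PR (pydustmasker 1.0.0). The helper names `pypi_sdist_url` and `http_status` are hypothetical and not part of the recipe; the network request is left commented out so the snippet runs offline.

```bash
#!/usr/bin/env bash
# Hypothetical helper: render the PyPI source URL the same way the
# recipe's Jinja2 template does ({{ name[0] }}/{{ name }}/{{ name }}-{{ version }}).
pypi_sdist_url() {
  local name="$1" version="$2"
  printf 'https://pypi.io/packages/source/%s/%s/%s-%s.tar.gz\n' \
    "${name:0:1}" "$name" "$name" "$version"
}

# Hypothetical helper: print the final HTTP status code after
# following redirects, discarding the response body.
http_status() {
  curl -sSL -o /dev/null -w '%{http_code}' "$1"
}

url=$(pypi_sdist_url pydustmasker 1.0.0)
echo "$url"

# To query the server (requires network access):
#   http_status "$url"
```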

Comment on lines +9 to +11
curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly --profile=minimal -y

export PATH="$HOME/.cargo/bin:$PATH"

🛠️ Refactor suggestion

Consider using stable Rust toolchain

The Rust installation and PATH update look good. However, using the nightly toolchain might lead to instability in the build process. Unless there's a specific requirement for nightly features, it's generally recommended to use the stable toolchain.

Consider changing the Rust installation command to use the stable toolchain:

-curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly --profile=minimal -y
+curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain stable --profile=minimal -y

If nightly is required, please add a comment explaining why.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly --profile=minimal -y
+curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain stable --profile=minimal -y
 export PATH="$HOME/.cargo/bin:$PATH"


export PATH="$HOME/.cargo/bin:$PATH"

export CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER="$CC"

🛠️ Refactor suggestion

Improve cross-platform linker setup

The current linker setup is specific to the x86_64 Linux target. To improve cross-platform support, consider making this setup more generic.

Here's a suggestion for a more generic linker setup:

-export CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER="$CC"
+if [ -n "$CC" ]; then
+  # Get the target triple from rustc
+  TARGET_TRIPLE=$(rustc -vV | sed -n 's|host: ||p')
+  # Cargo expects the triple upper-cased, with dashes replaced by underscores
+  TARGET_ENV=$(printf '%s' "$TARGET_TRIPLE" | tr 'a-z-' 'A-Z_')
+  # Set the linker for the current target
+  export "CARGO_TARGET_${TARGET_ENV}_LINKER=$CC"
+fi

This change will set the appropriate linker for the current target, regardless of the platform.
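
As a rough illustration of the naming rule involved: Cargo reads the linker from `CARGO_TARGET_<TRIPLE>_LINKER`, where the target triple is upper-cased and its dashes are replaced by underscores. The helper name `cargo_linker_var` below is hypothetical, and the two triples are only examples.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a Rust target triple to the name of the
# environment variable Cargo reads for that target's linker.
cargo_linker_var() {
  local triple="$1"
  # Upper-case the letters and turn dashes into underscores.
  printf 'CARGO_TARGET_%s_LINKER\n' \
    "$(printf '%s' "$triple" | tr 'a-z-' 'A-Z_')"
}

linux_var=$(cargo_linker_var x86_64-unknown-linux-gnu)
macos_var=$(cargo_linker_var aarch64-apple-darwin)
echo "$linux_var"
echo "$macos_var"

# In a build script the triple would come from rustc, for example:
#   triple=$(rustc -vV | sed -n 's|host: ||p')
#   export "$(cargo_linker_var "$triple")=$CC"
```

Replacing only the dashes without upper-casing would produce a variable name that Cargo does not look up, since environment variable names are case-sensitive on Linux and macOS.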

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
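-export CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER="$CC"
+if [ -n "$CC" ]; then
+  # Get the target triple from rustc
+  TARGET_TRIPLE=$(rustc -vV | sed -n 's|host: ||p')
+  # Cargo expects the triple upper-cased, with dashes replaced by underscores
+  TARGET_ENV=$(printf '%s' "$TARGET_TRIPLE" | tr 'a-z-' 'A-Z_')
+  # Set the linker for the current target
+  export "CARGO_TARGET_${TARGET_ENV}_LINKER=$CC"
+fi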

@apcamargo
Contributor Author

@BiocondaBot please add label

@BiocondaBot BiocondaBot added the "please review & merge" label (set to ask for merge) Oct 2, 2024
Contributor

@mencian mencian left a comment


LGTM, thanks @apcamargo!

@mencian mencian merged commit 9e26ba1 into bioconda:master Oct 3, 2024
7 checks passed
@apcamargo apcamargo deleted the pydustmasker branch October 16, 2024 03:56
@coderabbitai coderabbitai bot mentioned this pull request Oct 18, 2024