Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update filtering and subsampling docs #222

Merged
merged 5 commits into from
Aug 20, 2024

Conversation

victorlin
Copy link
Member

@victorlin victorlin commented Aug 15, 2024

preview

Description of proposed changes

Prepping for upcoming weighted sampling docs. These changes are applicable to current augur filter features.

Related issue(s)

Checklist

  • Checks pass

@victorlin victorlin self-assigned this Aug 15, 2024
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 704f86d to b9ec0bb Compare August 15, 2024 22:18
@joverlee521 joverlee521 self-requested a review August 16, 2024 16:22
joverlee521
joverlee521 previously approved these changes Aug 16, 2024
Copy link
Contributor

@joverlee521 joverlee521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement! I only left non-blocking suggestions, but I think all the additions are very helpful.

src/guides/bioinformatics/filtering-and-subsampling.rst Outdated Show resolved Hide resolved
src/guides/bioinformatics/filtering-and-subsampling.rst Outdated Show resolved Hide resolved
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 555a3e1 to c334b31 Compare August 16, 2024 21:34
@victorlin victorlin dismissed joverlee521’s stale review August 16, 2024 21:36

significant changes to "Filtering" section added

@victorlin victorlin changed the title Update subsampling docs Update filtering and subsampling docs Aug 16, 2024
@victorlin victorlin mentioned this pull request Aug 16, 2024
3 tasks
Reword some text and add an example for --query.
Copy link
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice additions -- the added "notes" are especially helpful

src/guides/bioinformatics/filtering-and-subsampling.rst Outdated Show resolved Hide resolved
src/guides/bioinformatics/filtering-and-subsampling.rst Outdated Show resolved Hide resolved
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 5962ddb to 65f8fe4 Compare August 19, 2024 21:54
@victorlin
Copy link
Member Author

After working on #223 and finding it hard to fit weighted sampling docs nicely, I've come back here and replaced c334b31 with 65f8fe4. This seems much better.

@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 65f8fe4 to b5cb803 Compare August 19, 2024 22:57
Comment on lines +11 to +16
Overview
========

``augur filter`` provides the flexibility to choose different subsets of input
data for various types of analysis. There are several options which can be
categorized based on the information source and selection method.
Copy link
Member Author

@victorlin victorlin Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new overview portion looks more like reference material that should go on the CLI page. But also, the rest of this page relies heavily on the terminology laid out here. Maybe fine to keep it here and leave the CLI page as a reference for what the options are.

Copy link
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! Thanks for humouring my ideas

src/guides/bioinformatics/filtering-and-subsampling.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@genehack genehack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some non-blocking suggestions.

@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch 2 times, most recently from 96f68da to 62e01e5 Compare August 20, 2024 23:29
victorlin and others added 3 commits August 20, 2024 16:34
Note that I'm introducing new terminology here: "preliminary" vs.
"subsampling" vs. "force-inclusive" filtering options. These are clearly
distinct in the order of operations, making these labels helpful for
explaining that process.

For "preliminary", I had considered a term such as "exclusive" to better
contrast with "force-inclusive". However, the expression syntax used for
options in this category can be both exclusive (--exclude-where
region!=Asia) and inclusive (--min-date 2012). This is also why
"inclusive" is not a sufficient name for the "force-inclusive" category.

Co-authored-by: James Hadfield <hadfield.james@gmail.com>
Reword the subsampling introduction with *what* it is, followed by
examples on *why* paired with *how*.

This also allows future sampling methods such as weighted sampling to be
added by simply including a new section.
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 62e01e5 to 5779a70 Compare August 20, 2024 23:35
@victorlin victorlin merged commit ab86278 into master Aug 20, 2024
3 of 4 checks passed
@victorlin victorlin deleted the victorlin/update-subsampling-docs branch August 20, 2024 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants