
[prs] Distance-based PRS with corrected betas #565

Merged

Conversation

akotlar (Collaborator) commented Aug 21, 2024

  • Corrects beta adjustments for prevalence
  • Adds a distance-based clustering approach, used by default since it is slightly more accurate on big_daly
  • Verified to be nearly identical to the output of Dave's Perl implementation (to be PR'd separately) for identical SNPs

I've attached Dave's output alongside what I believe is ours. I will double-check by re-running, but our PRS values are within 10% of Dave's at most, and the remaining differences could potentially come down to numerical precision. I spent a good deal of time tracking down the discrepancy, and it appears to arise in the final step, when adjusting dosages (the ancestry-adjusted allele frequencies were identical, as were the genotypes). I will look into this further, but this PR is a strict improvement and likely correct.
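For readers unfamiliar with distance-based pruning, here is a minimal sketch of the general idea. This is not the PR's actual prune_by_window implementation; the CHROM/POS/P column names and the greedy keep-most-significant strategy are assumptions for illustration.

```python
import pandas as pd

def prune_by_window_sketch(scores: pd.DataFrame, ld_window_bp: int) -> pd.DataFrame:
    """Greedy distance-based pruning sketch: keep the most significant SNP
    first, then drop any SNP within ld_window_bp of an already-kept SNP on
    the same chromosome. Assumes CHROM, POS, and P columns.
    """
    kept_positions: dict[str, list[int]] = {}
    keep_idx = []
    # Visit SNPs in order of increasing p-value (most significant first)
    for idx, row in scores.sort_values("P").iterrows():
        positions = kept_positions.setdefault(row["CHROM"], [])
        if all(abs(row["POS"] - p) > ld_window_bp for p in positions):
            positions.append(row["POS"])
            keep_idx.append(idx)
    return scores.loc[keep_idx]
```

Under these assumptions, two SNPs 50 kb apart with a 100 kb window would collapse to the more significant of the two, while SNPs on different chromosomes are never pruned against each other.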

Archive 2.zip

Summary by CodeRabbit

  • New Features

    • Introduced streaming functionality for proteomics data analysis.
    • Enhanced polygenic risk score calculations to include new parameters for better accuracy and flexibility.
    • Added detailed logging for the PRS job listener to improve monitoring and diagnostics.
    • Updated data processing logic to handle new parameters effectively and enhance the output format.
    • Added structured management for covariate data through new classes.
  • Bug Fixes

    • Addressed an error related to variable initialization within the proteomics notebook.
  • Documentation

    • Improved docstrings and documentation clarity for functions and parameters across modules, ensuring better guidance for users.

coderabbitai bot (Contributor) commented Aug 21, 2024

Walkthrough

The changes add polygenic risk score calculation functionality and streaming-based data handling to the proteomics analysis codebase. Documentation and logging have been expanded for clarity and monitoring, and the data classes have been restructured to better manage covariates, improving robustness and accuracy in genetic data processing.

Changes

Files and change summaries:

  • python/python/bystro/examples/...Query.ipynb: Updates to execution counts, introduction of streaming functionality, added user credentials, and an identified variable-initialization bug.
  • python/python/bystro/prs/handler.py, python/python/bystro/prs/messages.py, python/python/bystro/prs/prs_types.py: Enhanced calculate_prs_scores function with new parameters, and expanded PRSJobData attributes to improve genetic data processing.
  • python/python/bystro/prs/listener.py: Added logging functionality for improved tracking of PRS job operations.
  • python/python/bystro/prs/preprocess_for_prs.py: Refactored code for clarity, added documentation, introduced constants for data handling, and improved PRS calculation logic.
  • python/python/bystro/utils/covariates.py: New file defining structured data classes for covariates and experiment mappings to enhance data integrity and management.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant API
    participant PRSHandler
    participant PRSJobData
    participant Logger
    
    User->>API: Initiate PRS Job
    API->>PRSHandler: Call calculate_prs_scores()
    PRSHandler->>PRSJobData: Extract parameters
    PRSJobData->>PRSHandler: Return parameter data
    PRSHandler->>Logger: Log job initiation
    PRSHandler->>PRSHandler: Process data
    PRSHandler->>Logger: Log processing details
    PRSHandler->>API: Return results
    User->>API: Receive PRS results

🐇 In the fields where we hop and play,
New features bloom, bright as day.
Data streaming flows like a stream,
Together we build, we share a dream.
With logs to guide us, we leap and bound,
In coding joy, our hearts are found! 🌼



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

Outside diff range, codebase verification and nitpick comments (2)
python/python/bystro/prs/preprocess_for_prs.py (2)

512-555: Improve inline comments for clarity in prune_by_window.

The inline comments could be improved for better clarity and understanding.

Consider revising the comments to provide more context on the pruning process.


Line range hint 632-954: Enhance logging in generate_c_and_t_prs_scores.

The logging statements could be more descriptive to aid in debugging.

Consider adding more context to the logging messages to provide better insights during execution.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a805717 and ac79fd5.

Files selected for processing (7)
  • python/python/bystro/examples/ProteomicsProxiedQuery.ipynb (4 hunks)
  • python/python/bystro/prs/handler.py (3 hunks)
  • python/python/bystro/prs/listener.py (2 hunks)
  • python/python/bystro/prs/messages.py (3 hunks)
  • python/python/bystro/prs/preprocess_for_prs.py (16 hunks)
  • python/python/bystro/prs/prs_types.py (1 hunks)
  • python/python/bystro/prs/tests/test_preprocess_for_prs.py (10 hunks)
Additional comments not posted (13)
python/python/bystro/prs/prs_types.py (1)

11-11: Enhancement: New attributes added to PRSJobData.

The addition of covariates_path and training_populations enhances the flexibility of the PRSJobData class. The default values ensure backward compatibility.

Also applies to: 14-14

python/python/bystro/prs/listener.py (1)

5-5: Enhancement: Logging functionality added.

The logging setup with a DEBUG level is appropriate for tracking operations and diagnosing issues. This addition will aid in monitoring the execution of the PRS job listener.

Also applies to: 17-23

python/python/bystro/prs/messages.py (1)

22-23: Enhancement: Expanded PRSJobData attributes.

The addition of new attributes such as training_populations, covariates_path, and others enhances the flexibility and specificity of the data being processed. The removal of populations_path and the update to index_name reflect a more streamlined approach to handling population data.

Also applies to: 31-38, 47-57

python/python/bystro/prs/handler.py (1)

4-4: Enhancement: Expanded functionality in calculate_prs_scores.

The integration of new parameters and covariate data into the scoring process enhances the function's accuracy and flexibility. The changes to the output file naming convention improve clarity and traceability.

Also applies to: 26-33, 42-47, 53-65, 69-71

python/python/bystro/prs/tests/test_preprocess_for_prs.py (1)

26-37: Enhancement: Updated test suite for preprocess_for_prs.

The addition of new fixtures and updates to existing ones ensure that the tests align with the current requirements of the codebase. The removal of a test indicates a potential refactor or removal of functionality.

Also applies to: 50-54, 67-73, 160-171, 269-360

python/python/bystro/prs/preprocess_for_prs.py (6)

1-17: Docstring improvements look good!

The docstring provides a clear overview of the expected input format and the purpose of the file.


172-173: Function _load_association_scores looks good!

The function is well-structured and uses constants for column names and types, improving maintainability.


223-238: Function _convert_rsid_to_query_format looks good!

The function is straightforward and correctly handles empty input with a ValueError.


Line range hint 244-286: Function _extract_af_and_loci_overlap looks good!

The function is well-structured, and the use of constants for assembly checks improves readability.


Line range hint 288-310: Function _calculate_ancestry_weighted_af looks good!

The function is well-implemented, and the use of DataFrame operations enhances performance.


313-344: Function _preprocess_scores looks good!

The function is well-structured, and the use of constants for column names improves maintainability.

python/python/bystro/examples/ProteomicsProxiedQuery.ipynb (2)

25-26: Import changes look good!

The addition of stream_file from bystro.api.streaming is appropriate for the streaming functionality.


75-78: Streaming functionality looks good!

The use of a generator to stream data line by line is efficient for handling large datasets.

Comment on lines +387 to 390
  if row[ALLELE_COMPARISON_COMPUTED_COLUMN] == REVERSE_MATCH_COMPUTED_VALUE:
      for col in row.index:
-         if col != "allele_comparison" and row[col] != -1:
+         if col != ALLELE_COMPARISON_COMPUTED_COLUMN and row[col] != -1:
              row[col] = 2 - row[col]
Optimize dosage adjustment using DataFrame operations.

The loop can be optimized by using DataFrame operations for better performance.

Consider using DataFrame operations to adjust dosages instead of iterating through rows.
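A vectorized version of the loop above might look like the following sketch. The column constants and the `flip_reverse_match_dosages` name are hypothetical, not taken from the actual module.

```python
import pandas as pd

# Hypothetical constants, mirroring (but not copied from) preprocess_for_prs.py
ALLELE_COMPARISON_COL = "allele_comparison"
REVERSE_MATCH = "reversed match"

def flip_reverse_match_dosages(df: pd.DataFrame) -> pd.DataFrame:
    """Vectorized equivalent of the row loop: for rows flagged as a
    reverse match, replace each non-missing dosage d with 2 - d,
    leaving -1 (missing) values untouched."""
    out = df.copy()
    dosage_cols = [c for c in out.columns if c != ALLELE_COMPARISON_COL]
    mask = out[ALLELE_COMPARISON_COL] == REVERSE_MATCH
    block = out.loc[mask, dosage_cols]
    # where() keeps values where the condition holds (-1 missing markers)
    # and substitutes the flipped dosage everywhere else
    out.loc[mask, dosage_cols] = block.where(block == -1, 2 - block)
    return out
```

This replaces a Python-level loop over rows and columns with two masked DataFrame operations, which is typically much faster on large genotype matrices.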

Comment on lines +558 to +617
def _prune_scores(
scores: pd.DataFrame,
assembly: str,
p_value_threshold: float,
min_abs_beta: float,
max_abs_beta: float,
ld_window_bp: int | None = None,
distance_based_cluster: bool = True,
training_populations: list | None = None,
) -> pd.DataFrame:
"""
Prune scores based on p-value threshold
"""
# prune preprocessed_scores to only include loci with p-values below the threshold
# and filter down to sites with beta values within range
scores = scores[scores[P_COLUMN] <= p_value_threshold]
scores = scores[
(scores[BETA_COLUMN].abs() >= min_abs_beta) & (scores[BETA_COLUMN].abs() <= max_abs_beta)
]

if distance_based_cluster:
if ld_window_bp is None:
raise ValueError("If distance_based_cluster is True, ld_window_bp must be provided")

# if debugging against Dave Cutler's PRS calculator, uncomment to keep only those snps that
# are in the cutler_snp_set, based on the VARIANT_ID_COLUMN
# preprocessed_scores = preprocessed_scores[preprocessed_scores[VARIANT_ID_COLUMN].isin(_cutler_snp_set)] # noqa
scores = prune_by_window(scores, ld_window_bp)

# TODO: check for non-direct match and warn
return _preprocess_scores(scores)

if training_populations is None:
raise ValueError("If distance_based_cluster is False, training_populations must be provided")

if len(training_populations) > 1:
raise ValueError(
(
"PRS training data from only one superpopulation "
f"or population is currently supported: found {training_populations}."
)
)
sumstat_population = training_populations[0]

if sumstat_population in SUPERPOP_TO_POP_MAP:
sumstat_population = SUPERPOP_TO_POP_MAP[sumstat_population]

try:
sumstat_ld_map_path = get_map_file(assembly, sumstat_population)
except ValueError as e:
raise ValueError(f"{sumstat_population} is likely not supported. Failed to get map file: {e}")

# TODO: check for non-direct match and warn
preprocessed_scores = _preprocess_scores(scores)

scores_after_c_t, _loci_and_allele_comparison = ld_clump(
preprocessed_scores, str(sumstat_ld_map_path)
)

return preprocessed_scores.loc[scores_after_c_t.index]
Improve error handling in _prune_scores.

The error messages could be more descriptive to provide better user feedback.

Consider providing more context in the error messages to help users understand the issue.

@@ -40,12 +51,33 @@
   }
  ],
  "source": [
-   "user = auth.login('email', 'password', 'https://bystro-dev.emory.edu')\n",
+   "user = auth.login('akotlar@bu.edu', 'gonxug-xubram-nUbny6', 'https://bystro-dev.emory.edu')\n",
Avoid hardcoding credentials.

Hardcoding sensitive information like credentials is a security risk.

Consider using environment variables or a secure vault to manage credentials securely.
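One way to follow this advice in the notebook is sketched below. The BYSTRO_EMAIL and BYSTRO_PASSWORD variable names are hypothetical, not part of the Bystro API.

```python
import os

def get_credentials() -> tuple[str, str]:
    """Read notebook login credentials from the environment instead of
    hardcoding them in a committed cell."""
    email = os.environ.get("BYSTRO_EMAIL")
    password = os.environ.get("BYSTRO_PASSWORD")
    if not email or not password:
        raise RuntimeError(
            "Set BYSTRO_EMAIL and BYSTRO_PASSWORD before running this notebook"
        )
    return email, password

# Hypothetical usage in the notebook cell:
# user = auth.login(*get_credentials(), 'https://bystro-dev.emory.edu')
```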

Comment on lines 612 to 627
{
"cell_type": "code",
"execution_count": 9,
"id": "458ab2e0-3203-4d72-a697-7e0d37d65a4d",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'joined_df' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[9], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mjoined_df\u001b[49m\n",
"\u001b[0;31mNameError\u001b[0m: name 'joined_df' is not defined"
]
Fix NameError for joined_df.

The variable joined_df is referenced before being defined, indicating a potential bug.

Verify the initialization and usage of joined_df to ensure it is defined before use.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between ac79fd5 and 5237ffd.

Files selected for processing (3)
  • python/python/bystro/prs/messages.py (3 hunks)
  • python/python/bystro/prs/preprocess_for_prs.py (16 hunks)
  • python/python/bystro/prs/prs_types.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • python/python/bystro/prs/messages.py
  • python/python/bystro/prs/prs_types.py
Additional comments not posted (13)
python/python/bystro/prs/preprocess_for_prs.py (13)

Line range hint 17-66: Imports and constants look good.

The imports and constants are well-organized and relevant to the file's functionality.


Line range hint 168-173: Function _load_association_scores is well-implemented.

The function correctly loads and returns GWAS summary statistics with harmonized columns and types.


Line range hint 175-193: Function _load_genetic_maps_from_feather is robust and well-implemented.

The function includes error handling and logging, enhancing its robustness.


Line range hint 212-221: Function _convert_loci_to_query_format is well-structured.

The function correctly converts loci to a query format and includes error handling for empty input.


223-238: Function _convert_rsid_to_query_format is well-structured.

The function correctly converts rsids to a query format and includes error handling for empty input.


Line range hint 244-284: Function _extract_af_and_loci_overlap is comprehensive and consistent.

The function handles unsupported assemblies and uses constants for allele frequency fields.


Line range hint 288-310: Function _calculate_ancestry_weighted_af is well-implemented.

The function calculates ancestry-weighted allele frequencies and saves intermediate results for debugging.


313-344: Function _preprocess_scores is clear and well-structured.

The function handles strand issues by generating SNPID columns for effect and reference alleles.


Line range hint 347-368: Function _preprocess_genetic_maps is straightforward and correct.

The function correctly processes genetic maps and returns a dictionary of upper bound values.


Line range hint 371-382: Function compare_alleles is clear and readable.

The function uses a custom enum for return values, enhancing readability.


387-390: Optimize dosage adjustment using DataFrame operations.

The loop can be optimized by using DataFrame operations for better performance.

Consider using DataFrame operations to adjust dosages instead of iterating through rows.


558-617: Improve error handling in _prune_scores.

The error messages could be more descriptive to provide better user feedback.

Consider providing more context in the error messages to help users understand the issue.


Line range hint 632-954: Function generate_c_and_t_prs_scores is comprehensive and well-documented.

The function includes detailed docstrings and logging for debugging. The changes align with the PR objectives.

However, ensure that all function calls to generate_c_and_t_prs_scores match the new signature.

Run the following script to verify the function usage:

Verification successful

Function usage matches the new signature.

The generate_c_and_t_prs_scores function is correctly used in the codebase, with parameters matching the updated function signature.

  • Files:
    • python/python/bystro/prs/handler.py
    • python/python/bystro/prs/tests/test_preprocess_for_prs.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `generate_c_and_t_prs_scores` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'generate_c_and_t_prs_scores'

Length of output: 82


Script:

#!/bin/bash
# Description: Verify all function calls to `generate_c_and_t_prs_scores` match the new signature.

# Correcting the previous command by removing the incorrect file type specification.
rg -A 5 'generate_c_and_t_prs_scores'

Length of output: 3039

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (3)
python/python/bystro/utils/covariates.py (3)

27-35: Clarify the documentation for from_path.

The documentation should specify the expected format of the file more explicitly, including the delimiter and any assumptions about the data.

Consider adding more details to the docstring for clarity.


39-45: Enhance error handling for header parsing.

The error message for a missing sample_id could be more informative by including the expected header format.

Consider enhancing the error message to guide the user on the correct format.


55-58: Improve error message for empty sample IDs.

The error message could be more informative by suggesting possible reasons for an empty sample_id.

Consider enhancing the error message to provide more context.
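A sketch of what a more informative header error might look like; the expected header list, message wording, and function name are assumptions, not the actual covariates.py code.

```python
# Hypothetical expected header for a tab-delimited covariates file
EXPECTED_HEADER = ["sample_id", "sex", "phenotype", "population", "superpopulation"]

def validate_header(fields: list[str], path: str) -> None:
    """Raise a ValueError that tells the user what was found and what the
    expected tab-delimited header looks like, rather than just reporting
    that sample_id is missing."""
    if not fields or fields[0] != "sample_id":
        raise ValueError(
            f"{path}: first header column must be 'sample_id'; "
            f"got {fields[:1] or '<empty>'}. Expected header: "
            + "\t".join(EXPECTED_HEADER)
        )
```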

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5237ffd and 5c07be6.

Files selected for processing (1)
  • python/python/bystro/utils/covariates.py (1 hunks)
Additional comments not posted (4)
python/python/bystro/utils/covariates.py (4)

4-14: LGTM!

The Covariates class is well-defined with appropriate use of msgspec.Struct and clear documentation.


17-20: LGTM!

The ExperimentMapping class is well-defined with appropriate use of msgspec.Struct and clear attribute definitions.


23-25: LGTM!

The ExperimentMappings class is well-structured with a clear purpose and appropriate use of msgspec.Struct.


60-71: Verify covariate field matching.

Ensure that the covariate fields in the file match those defined in the Covariates class.

Run the following script to verify the covariate field matching:

Verification successful

Covariate Field Matching Verified

The fields in the Covariates class (sex, phenotype, population, superpopulation) match those expected in the code snippet. The code correctly checks these fields against the header values. No discrepancies found.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify covariate fields in the file match those in the `Covariates` class.

# Test: Search for covariate fields in the file. Expect: All fields match those in `Covariates`.
ast-grep --lang python --pattern $'class Covariates {
  $$$
}'

Length of output: 512


Script:

#!/bin/bash
# Description: Verify covariate fields in the file match those in the `Covariates` class.

# Find the Covariates class and extract its fields
ast-grep --lang python --pattern 'class Covariates: $$$' --json | jq -r '.[].captures[].text' | while read -r class_def; do
  # Extract fields from the Covariates class
  echo "$class_def" | ast-grep --lang python --pattern 'def __init__(self, $fields): $$$' --json | jq -r '.[].captures[].text'
done

Length of output: 115


Script:

#!/bin/bash
# Description: Locate the Covariates class definition and its fields in the codebase.

# Search for the Covariates class definition in the codebase
rg 'class Covariates' -A 10 --json

Length of output: 2922

Comment on lines +36 to +37
with open(path, "r") as f:
lines = f.readlines()
Optimize file reading.

Using readlines() can be memory-intensive for large files. Consider using a generator or iterating over the file object directly.

Use this diff to optimize file reading:

- with open(path, "r") as f:
-     lines = f.readlines()
+ with open(path, "r") as f:
+     lines = (line for line in f)

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (1)
python/python/bystro/examples/ProteomicsProxiedQuery.ipynb (1)

Line range hint 604-624: Fix NameError for joined_df.

The variable joined_df is referenced before being defined, indicating a potential bug. Verify the initialization and usage of joined_df to ensure it is defined before use.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5c07be6 and fc29676.

Files selected for processing (1)
  • python/python/bystro/examples/ProteomicsProxiedQuery.ipynb (16 hunks)
Additional comments not posted (5)
python/python/bystro/examples/ProteomicsProxiedQuery.ipynb (5)

35-36: Verify the necessity of the stream_file import.

Ensure that the stream_file import is used in the notebook. If not, consider removing it to avoid unnecessary imports.


Line range hint 86-98: Query logic looks good.

The query for variants within 100kb of gene transcription start sites is well-implemented.


96-98: Display logic is correct.

The use of query_result_df.head() to display the dataframe is appropriate.


646-646: Dosage set creation is correct.

The creation of a set from joined_df['dosage'] is appropriate, assuming joined_df is correctly initialized.


675-688: Visualization logic is correct.

The plot of mean normalized sample intensity by dosage is well-implemented and informative.

Comment on lines +41 to 44
"execution_count": 3,
"id": "66258792-32fa-4daf-a136-c2b37738066c",
"metadata": {},
"outputs": [
Avoid hardcoding credentials.

Hardcoding sensitive information like credentials is a security risk. Consider using environment variables or a secure vault to manage credentials securely.

austinTalbot7241993 (Collaborator) left a comment

LGTM

@austinTalbot7241993 austinTalbot7241993 merged commit df384ad into bystrogenomics:master Aug 21, 2024
8 checks passed