Run QC for WTS alongside WGS #597

alexiswl · 2023-06-06T06:51:15Z

`cba7478`

Updates by file

data_processors.pipeline.domain.workflow.py

Added abstract workflow type DRAGEN_WGTS_QC.
Added labmetadata rule method must_be_wgts

data_processors.pipeline.lambdas.dragen_wgs_qc.py

Handle WTS data and add enable_rna option. True for WTS data, false for WGS data

data_processors.pipeline.orchestartion.dragen_wgs_qc_step.py

Set batcher runstep parameter to DRAGEN_WGTS_QC abstract workflow type
Replace must_be_wgs labmetadata rule with must_be_wgts
Update warning to use DRAGEN_WGTS_QC abstract workflow type

`5eb62f3`

Update orchestration logic

Now dragen wts happens after QC is complete.
Instead of performing dragen_wts after bclconvert, we first run dragen_wgts_qc (which now includes dragen_wts samples).

Updates by file: * Added abstract workflow type DRAGEN_WGTS_QC. * Added labmetadata rule method must_be_wgts * Handle WTS data and add enable_rna option. True for WTS data, false for WGS data * Set batcher runstep parameter to DRAGEN_WGTS_QC abstract workflow type * Replace must_be_wgs labmetadata rule with must_be_wgts * Update warning to use DRAGEN_WGTS_QC abstract workflow type

Now dragen wts happens after QC is complete. Instead of performing dragen_wts after bclconvert, we first run dragen_wgts_qc (which now includes dragen_wts samples).

alexiswl · 2023-06-06T06:54:46Z

Run WTS QC workflow through Portal before running Dragen WTS Tumor Only Workflow

victorskl

Looking good most of it, Alexis. Just that need to attend comment on Orchestrator on triggering cue.

data_processors/pipeline/domain/workflow.py

data_processors/pipeline/lambdas/dragen_wgs_qc.py

data_processors/pipeline/lambdas/orchestrator.py

\## workflow.py Added in dragen_wts_qc and dragen_wgts_qc into from_value class method \## dragen_wgs_qc.py Moved labmetadata improt to below try / except statement. \## orchestrator.py Split out WGS_QC and WTS_QC notification handling.

Dragen WTS now runs after dragen wts qc. Use similar logic to t/n runs (that work off a metadata list) rather than a batcher (as used in the qc runs). Updates by file \## orchestration/dragen_wts_step \### Perform function * Updated perform function to check if all dragen_wts_qc workflows have finished for a given sequencing run * Invoke prepare_dragen_wts_jobs with a metadata list instead of a batcher object. \### prepare dragen wts jobs function Use similar (but simpler) logic to preparing of tn workflows Collect fastq list rows by the library id input (grouped by library-id / subject-id). And create the job dict with just the subject id, library id and fastq list rows object Rather than use a labmetadatarule we use a query set (see metadata_srv changes for more info). \## orchestration/tumor_normal_step * Move handle rerun function to liborca * Also fixed return in docstring \## tools/liborca * Collected handle rerun function from tumor normal step \## services/metadata_srv * Created queryset for collecting wts metdata by the wts qc runs \## pipeline/lambdas/dragen_wts * Handle new job input * Remove all references to sequence_run and batch_run * Assume subject Id is in the event dict

victorskl

LGTM, few more minor comments to attend.

data_processors/pipeline/orchestration/dragen_wts_step.py

data_processors/pipeline/orchestration/tumor_normal_step.py

…c-pipeline Resolved merge conflicts from arriba mem update

victorskl

Hey Alexis;

Could you please change PR to ready for review from draft, if you are ready..? This looks good so far from my side, minor comments to attend.

If that's all good I'll fix up the tests (assuming I've broken a few with these changes)

Do you still have bandwidth to follow up tests..?
If so, few pointers as follows...

Each python module file that has made changes; should have their test file counterpart in tests package.
First, you run each test file for unit test cases and fix where it broke, accordingly.
Optionally, run integration test cases there, if applicable and/or see fit.
Finally, run smoke test make test on the whole suite and, it must pass.

Here are starter, based on file changes in PR.

Test domain logic code changes

python manage.py test data_processors.pipeline.domain.tests.test_workflow

Test front controller (Lambda) entrypoints changes

python manage.py test data_processors.pipeline.lambdas.tests.test_dragen_wgs_qc
python manage.py test data_processors.pipeline.lambdas.tests.test_dragen_wts
python manage.py test data_processors.pipeline.lambdas.tests.test_orchestrator

Test orchestration step module middleware changes

python manage.py test data_processors.pipeline.orchestration.tests.test_dragen_wgs_qc_step
python manage.py test data_processors.pipeline.orchestration.tests.test_dragen_wts_step
python manage.py test data_processors.pipeline.orchestration.tests.test_tumor_normal_step

Test service layer module changes

python manage.py test data_processors.pipeline.services.tests.test_metadata_srv

data_processors/pipeline/lambdas/dragen_wgs_qc.py

data_processors/pipeline/lambdas/orchestrator.py

data_processors/pipeline/orchestration/__init__.py

alexiswl · 2023-07-26T05:40:42Z

Thanks Victor, I'll work on this tomorrow if you don't mind?

victorskl · 2023-07-26T06:16:10Z

Of course, go for it, please..!

data_processors/pipeline/lambdas/dragen_wgs_qc.py

data_portal/tests/factories: * Made new WtsQC and WgtsQC factories * lambdas/test/test_dragen_wgs_qc.py * Use TestConstant for library id since we need matches on labmetadata types * lambdas/test/test_dragen_wts.oy * Include subject id in event. * Remove seq run id and seq name, not immediately downstream of bclconvert * lambdas/tests/test_orchestrator.py * Exclude DRAGEN_WTS_STEP from tests post bclconvert run * lambdas/dragen_wgs_qc.py * Check labmetadata getter returns not none, otherwise raise value error * lambdas/dragen_wts.py * Use link_library_runs_with_x_seq_workflow over library_runs_with_workflow * lambdas/orchestartor.py * Set DRAGEN_WGTS_QC_STEP, DRAGEN_WGS_QC_STEP and DRAGEN_WTS_QC_STEP in the same 'skip-list' context * orchestration/tests/test_dragen_wts_step.py * Overhaul of test_dragen_wts_step since it is now invoked post qc

…owed list

alexiswl · 2023-08-01T11:42:28Z

Getting close!

======================================================================
ERROR: test_perform (data_processors.pipeline.orchestration.tests.test_dragen_wts_step.DragenWtsStepUnitTests.test_perform)
python manage.py test data_processors.pipeline.orchestration.tests.test_dragen_wts_step.DragenWtsStepUnitTests.test_perform
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/alexiswl/UMCCR/GitHub/data-portal-apis/data_processors/pipeline/orchestration/tests/test_dragen_wts_step.py", line 95, in test_perform
    self.assertEqual(results['submitting_subjects'][0], wts_mock_subject_id)
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

======================================================================
ERROR: test_perform (data_processors.pipeline.orchestration.tests.test_somalier_extract_step.SomalierExtractStepUnitTests.test_perform)
python manage.py test data_processors.pipeline.orchestration.tests.test_somalier_extract_step.SomalierExtractStepUnitTests.test_perform
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/alexiswl/UMCCR/GitHub/data-portal-apis/data_processors/pipeline/orchestration/tests/test_somalier_extract_step.py", line 88, in test_perform
    self.assertIn('index', results['somalier_extract_step'][0])
                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

----------------------------------------------------------------------
Ran 315 tests in 46.693s

FAILED (errors=2, skipped=40)
Destroying test database for alias 'default'...
make: *** [Makefile:65: test] Error 1

alexiswl · 2023-08-02T04:07:34Z

@victorskl would you mind having a look at these? I couldn't figure out the changes needed to get the test perform working and I think the orchestration logic might not be handing the DRAGEN_WGTS_QC step?

victorskl · 2023-08-02T05:19:31Z

Ok, I will pull and check in tick.

victorskl · 2023-08-03T07:14:57Z

Hi Alexis; pls see #607 for

python manage.py test data_processors.pipeline.orchestration.tests.test_dragen_wts_step.DragenWtsStepUnitTests.test_perform

TL;DR - it just need mock WTS tumor Lab meta info <> LibraryRun <> Workflow has to link up / match them up well for test condition.

victorskl · 2023-08-03T07:16:23Z

I will check for somalier step, next..

victorskl · 2023-08-03T23:28:20Z

Ah this one is easy

python manage.py test data_processors.pipeline.orchestration.tests.test_somalier_extract_step.SomalierExtractStepUnitTests.test_perform

Just roll back changes here

data-portal-apis/data_processors/pipeline/orchestration/tests/test_somalier_extract_step.py

Line 67 in 55642c0

mock_wgc_qc_workflow: Workflow = DragenWgtsQcWorkflowFactory()

... to

mock_wgc_qc_workflow: Workflow = DragenWgsQcWorkflowFactory()

Explain

We don't really save to db with a workflow type WorkflowType.DRAGEN_WGTS_QC yet
It is still a "placeholder" or "common" workflow type to process together both WGS/WTS alignment QC
So, we should just still keep this unittest case as is. Which is, originally unit testing WorkflowType.DRAGEN_WGS_QC scenario and, still valid test case.

See #597 (comment) for more info

alexiswl · 2023-08-07T06:09:17Z

Woo!

...
----------------------------------------------------------------------
Ran 315 tests in 48.655s

OK (skipped=40)
Destroying test database for alias 'default'...

victorskl

Go for it, Alexis..!

alexiswl added 2 commits June 6, 2023 16:46

Updated orchestration for WTS

5eb62f3

Now dragen wts happens after QC is complete. Instead of performing dragen_wts after bclconvert, we first run dragen_wgts_qc (which now includes dragen_wts samples).

alexiswl requested a review from victorskl June 6, 2023 06:51

victorskl assigned alexiswl Jun 6, 2023

victorskl added the feature New feature label Jun 6, 2023

victorskl added this to the Release 2.2 milestone Jun 6, 2023

victorskl reviewed Jun 6, 2023

View reviewed changes

data_processors/pipeline/lambdas/orchestrator.py Outdated Show resolved Hide resolved

alexiswl added 2 commits June 7, 2023 10:48

Miscell updates per PR review

1186883

\## workflow.py Added in dragen_wts_qc and dragen_wgts_qc into from_value class method \## dragen_wgs_qc.py Moved labmetadata improt to below try / except statement. \## orchestrator.py Split out WGS_QC and WTS_QC notification handling.

victorskl reviewed Jun 8, 2023

View reviewed changes

data_processors/pipeline/orchestration/dragen_wts_step.py Outdated Show resolved Hide resolved

data_processors/pipeline/orchestration/dragen_wts_step.py Outdated Show resolved Hide resolved

data_processors/pipeline/orchestration/tumor_normal_step.py Outdated Show resolved Hide resolved

alexiswl added 2 commits June 8, 2023 20:41

Fixed typos and moved _handle_rerun function to orchestration init

7f4f03c

Merge remote-tracking branch 'origin/dev' into enhancement/wts-auto-q…

ce0c746

…c-pipeline Resolved merge conflicts from arriba mem update

victorskl reviewed Jul 25, 2023

View reviewed changes

data_processors/pipeline/lambdas/dragen_wgs_qc.py Show resolved Hide resolved

data_processors/pipeline/lambdas/orchestrator.py Show resolved Hide resolved

data_processors/pipeline/orchestration/__init__.py Show resolved Hide resolved

victorskl reviewed Aug 1, 2023

View reviewed changes

data_processors/pipeline/lambdas/dragen_wgs_qc.py Show resolved Hide resolved

alexiswl added 2 commits August 1, 2023 21:22

Added return statement after TN step, added DRAGEN_WTS_QC step to all…

55642c0

…owed list

Fixed unittest for orchestration WTS step changes

8f7527f

Updated somalier extract step to use just WGSQCFactory

dd8ac9a

See #597 (comment) for more info

alexiswl marked this pull request as ready for review August 7, 2023 06:08

victorskl approved these changes Aug 7, 2023

View reviewed changes

alexiswl merged commit bffcb8d into dev Aug 7, 2023
1 check passed

victorskl deleted the enhancement/wts-auto-qc-pipeline branch August 8, 2023 05:18

victorskl mentioned this pull request Jun 12, 2024

Can not re-trigger Alignment QC from the orchestrator batch run point #691

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Run QC for WTS alongside WGS #597

Run QC for WTS alongside WGS #597

alexiswl commented Jun 6, 2023

alexiswl commented Jun 6, 2023

victorskl left a comment

victorskl left a comment

victorskl left a comment

alexiswl commented Jul 26, 2023

victorskl commented Jul 26, 2023

alexiswl commented Aug 1, 2023

alexiswl commented Aug 2, 2023

victorskl commented Aug 2, 2023

victorskl commented Aug 3, 2023

victorskl commented Aug 3, 2023

victorskl commented Aug 3, 2023

alexiswl commented Aug 7, 2023

victorskl left a comment

Run QC for WTS alongside WGS #597

Run QC for WTS alongside WGS #597

Conversation

alexiswl commented Jun 6, 2023

cba7478

data_processors.pipeline.domain.workflow.py

data_processors.pipeline.lambdas.dragen_wgs_qc.py

data_processors.pipeline.orchestartion.dragen_wgs_qc_step.py

5eb62f3

alexiswl commented Jun 6, 2023

victorskl left a comment

Choose a reason for hiding this comment

victorskl left a comment

Choose a reason for hiding this comment

victorskl left a comment

Choose a reason for hiding this comment

alexiswl commented Jul 26, 2023

victorskl commented Jul 26, 2023

alexiswl commented Aug 1, 2023

alexiswl commented Aug 2, 2023

victorskl commented Aug 2, 2023

victorskl commented Aug 3, 2023

victorskl commented Aug 3, 2023

victorskl commented Aug 3, 2023

alexiswl commented Aug 7, 2023

victorskl left a comment

Choose a reason for hiding this comment

`cba7478`

`5eb62f3`