-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KPIS] Fix test for KPI45 sporadically failing #352
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Seen on STAGING (merged by @anchit-chandran 5 minutes and 24 seconds ago) Please check your changes! |
anchit-chandran
added a commit
that referenced
this pull request
Nov 16, 2024
Signed-off-by: anchit-chandran <anchit97123@gmail.com>
anchit-chandran
added a commit
that referenced
this pull request
Nov 18, 2024
* add indiv pt kpi types Signed-off-by: anchit-chandran <anchit97123@gmail.com> * update vscode ignore to just settings.json Signed-off-by: anchit-chandran <anchit97123@gmail.com> * implements vscode debugger Signed-off-by: anchit-chandran <anchit97123@gmail.com> * refactor to new individual pts kpi results Signed-off-by: anchit-chandran <anchit97123@gmail.com> * very basic refactor to implement new results obj Signed-off-by: anchit-chandran <anchit97123@gmail.com> * separate debugpy into completely separate override so we don't accidentally mix anything with prod. need slightly different env vars + server run cmds Signed-off-by: anchit-chandran <anchit97123@gmail.com> * add fix for csv heading pdu field no longer having model Signed-off-by: anchit-chandran <anchit97123@gmail.com> * fix csv fn Signed-off-by: anchit-chandran <anchit97123@gmail.com> * skip single pt test for now Signed-off-by: anchit-chandran <anchit97123@gmail.com> * unused imports Signed-off-by: anchit-chandran <anchit97123@gmail.com> * apply same fix from #352 for test kpi 44 Signed-off-by: anchit-chandran <anchit97123@gmail.com> * forgot to remove old Median class, instead calc using same fix as calc 45 Signed-off-by: anchit-chandran <anchit97123@gmail.com> * `string` is the dtype in pandas https://pandas.pydata.org/docs/user_guide/text.html Signed-off-by: anchit-chandran <anchit97123@gmail.com> * handle converting to right dtype with null vals Signed-off-by: anchit-chandran <anchit97123@gmail.com> * bp vals will exceed bounds of Int8 (signed bounds -128 to 127) Signed-off-by: anchit-chandran <anchit97123@gmail.com> * working csv fn Signed-off-by: anchit-chandran <anchit97123@gmail.com> * refactor mutex check to enable test Signed-off-by: anchit-chandran <anchit97123@gmail.com> * adds csv generator upload integration test Signed-off-by: anchit-chandran <anchit97123@gmail.com> * mock remote services Signed-off-by: anchit-chandran <anchit97123@gmail.com> * adds doc strings Signed-off-by: anchit-chandran <anchit97123@gmail.com> * refactor single pt kpi outcomes table Signed-off-by: anchit-chandran <anchit97123@gmail.com> * dont actually need nhs number here as for this object to have been made, requires a Patient object. so just use that Patient object input Signed-off-by: anchit-chandran <anchit97123@gmail.com> * rm nhs number Signed-off-by: anchit-chandran <anchit97123@gmail.com> * single pt table tweaks Signed-off-by: anchit-chandran <anchit97123@gmail.com> * update types Signed-off-by: anchit-chandran <anchit97123@gmail.com> * ASSERT TYPES Signed-off-by: anchit-chandran <anchit97123@gmail.com> * refactor into own module Signed-off-by: anchit-chandran <anchit97123@gmail.com> * adds test calc kpis single pt Signed-off-by: anchit-chandran <anchit97123@gmail.com> * rename test Signed-off-by: anchit-chandran <anchit97123@gmail.com> * adds docs re debugger Signed-off-by: anchit-chandran <anchit97123@gmail.com> * rm breakpoint Signed-off-by: anchit-chandran <anchit97123@gmail.com> --------- Signed-off-by: anchit-chandran <anchit97123@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Signed-off-by: anchit-chandran anchit97123@gmail.com
Overview
A few things were going on here, and was a fun rabbit hole!
TL/DR : main things contributing to the random failing were:
Patient.diagnosis_date
All have been fixed. Run the test many times as it should now be fine (importantly, no random values are being used now, so one pass = deterministic!).
Fun details on above:
1. Floating point precision
We have our
Visit.hba1c
field defined as aDecimalField
. The actual kpi calculation method querying this outputs a pythonDecimal( )
object. My test was just usingfloat( )
objects. Good article on difference: https://www.laac.dev/blog/float-vs-decimal-python/ but generally:float
s are good for speed & convenience, use the classicIEEE 754
binary representation, can lead to rounding precision errorsDecimal
, somehow usesbase-10
representation`, but means it can accurately represent decimal numbers without the traditional floating point errorsThis was partly leading to errors as tests are comparing different types of values. Fixed by converting expected values into
Decimal
. The output of thecalculate_kpi_45_median_hba1c
still returns afloat
though for convenience2. SQL median is weird
There is no standard implementation of median in Django's ORM because it's more complicated to calculate. I created a custom SQL class for it but that ended up being harder to reason with. Instead, did a simpler approach of just using ORM to retrieve a list of hba1c values, and then calculate the median in Python. Much more straightforward.
3. Accidentally using random values for
Patient.diagnosis_date
Rookie error! In the tests, just forgot to set
PatientFactory.diagnosis_date
. Expected behavior of thePatientFactory
is to generate random valid values for any not given (to make it easier to createPatient
objects).Fixed by just defining.
Code changes
Decimal()
values for hba1c instead offloat
diagnosis_date
, thus not relying on thePatientFactory()
generating a randomdiagnosis_date
calculate_median
implementation moved into Python instead of SQL, simplifying and circumventing harder-to-reason precision errorscalculate_kpi_45_median_hba1c
method more straightforward because of using a Python way of calculating mediancalculate_kpi_
methods toKPIResult
Related Issues
https://github.com/orgs/rcpch/projects/13?pane=issue&itemId=85211949&issue=rcpch%7Cnational-paediatric-diabetes-audit%7C338
Mentions
@mentions of the person or team responsible for reviewing proposed changes.