Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow the timestep to be set from the model specification #48

Merged
merged 7 commits into from
Jan 9, 2020

Conversation

jzuhone
Copy link
Collaborator

@jzuhone jzuhone commented Dec 14, 2017

Most models have a dt of 328 seconds as the default. The ACIS DEA Housekeeping telemetry (most importantly the focal plane temperature) has a much higher time resolution than this.

Currently in xija even if a smaller dt is specified in the model specification it is not changed when the model is run. This PR makes some modest changes to allow the timestep to be ingested from the model specification. In most cases, models will set the dt to 328 seconds, and the code here ensures that the default does not change.

@jzuhone
Copy link
Collaborator Author

jzuhone commented Dec 14, 2017

@taldcroft, @jeanconn, @matthewdahmer I'm not able to request you as reviewers for some reason, but would like you to have a look at this one.

@jeanconn
Copy link
Contributor

@jzuhone Do you have ideas for testing? Should there be a xija module test that uses a new dt value?

@taldcroft
Copy link
Member

@jzuhone - we need to be careful. It's been a long while, but I seem to recall that there were some subtle (or maybe not so subtle) problems when changing the time step. There was certainly a reason that I put "do not change" in the docstring, and Jean is correct that this needs careful testing using all of the models.

The original idea behind the 328 seconds was that thermally speaking nothing significant happens on time scales faster than that. That is true for all our existing models, but I suppose if you are talking about temperatures on a single electronics board then that might not be the case.

The 328 is also tied to the use of 5min (which is really 328 sec) eng archive telemetry in calibration.

@jzuhone
Copy link
Collaborator Author

jzuhone commented Sep 5, 2018

@taldcroft @jeanconn can we revisit this discussion? The current paradigm of turning off all of the FEPs in the DPA is resulting in the temperatures of the FEPs decreasing pretty rapidly for those short periods, and our models can't seem to reproduce those drops. I'm wondering that since these are single boards with better time resolution it would help if we could set dt to a smaller value.

The default would still be 328 seconds, so hopefully that should not affect any existing models. What kind of testing should we do?

@jzuhone
Copy link
Collaborator Author

jzuhone commented May 2, 2019

Hi @taldcroft, @jeanconn, @matthewdahmer:

So I think we should test this by implementing a smaller timestep for the DPA model, interpolating the model outputs to the 328-second values from an otherwise identical model run, comparing the results, and ensuring that the differences are within some accepted tolerance. The idea is that 328 seconds should be the default regardless, so we'd just need to make sure that for the other models the timestep and thus the previous answers do not change.

If someone (say me) would like to update the timestep for any other model in the future, I would propose that the onus is on them to demonstrate that reducing the timestep does not introduce undesirable effects, and that this would require bringing the new model forward for TWG approval. So it would only make sense to do this in the context of a more general recalibration.

It may have been that your note about not changing the timestep was that so it wouldn't be set to something larger than 328 seconds, perhaps? Just a thought. I can also add in a check for a dt greater than 328 seconds and make it throw a warning or even an error.

What do you all think?

@taldcroft
Copy link
Member

@jzuhone - sounds like a good starting point. There is also the converse of showing that reducing the step size actually does make a change where it is expected, in particular for a case of zero FEPs for a short (sub-300 sec) interval.

I think it would be worth trying shorter time steps in a number of models during testing for this patch so we can establish in general if there are any gotchas. I'm agreed approval for actually using a model operationally with a non-standard step size would require specific validation for that model.

@jzuhone
Copy link
Collaborator Author

jzuhone commented May 2, 2019

@taldcroft what was the week where this issue with the zero FEPs came up again originally for the DPA model?

@taldcroft
Copy link
Member

@jzuhone - Sorry, I don't remember, maybe Gregg does. There was a problem with loads failing thermal after a small change in timing.

@jeanconn
Copy link
Contributor

jeanconn commented May 2, 2019

I thought it came up in the context of APR0819T?

@jzuhone
Copy link
Collaborator Author

jzuhone commented May 4, 2019

What models do you both suggest I test against smaller timesteps besides ACIS models?

@matthewdahmer
Copy link
Contributor

Sorry about the late reply to this thread. The APR0819T vs APR0819A loads are the cases that brought this issue to the forefront for us recently. In the T loads, there was 161 seconds of cooling that was not being accounted for, but was in the A loads, due to the timing of the cooling dwell. In the T loads, it fell entirely within the 328 second window, so it wasn't picked up. In the A loads it crossed a 328 second boundary, so it was counted as 328 seconds.

I am still a little confused as to why the time step needs to be built into the model spec itself, unless it serves as a baseline value to interpret the other parameters, similar to a bias (i.e. if the solarheat values would need to be scaled for different time steps). Ideally, the time step should be able to be user modified, independent of the model definition, without significantly changing the overall performance of the model over long-ish time periods, excluding cases with short events. I do see the need to explicitly set a maximum time step to ensure the time step isn't set too large and to ensure this isn't used as a tool to make a schedule work when it shouldn't. Perhaps the time step used for each model could be a reported metric included in the load products/email, subject to review for each schedule.

Given the case highlighted by the APR0819 loads is probably more common than we have realized, I'd like to use a much smaller time step for all models if it doesn't significantly affect tool performance. I can run 6 months of dwell history very quickly (in a pure python environment), so hopefully a 10x increase in resolution would not significantly affect Matlab FOT Tools performance, but perhaps it would for reasons I don't grasp at the moment.

As for other models that could benefit from smaller time steps, I'd suggest the MUPS models as they move the fastest. I'm having trouble refitting these models to account for the rapid cooling that occurs in the deep PLINE region so I'd like to see if a reduced time step could help. I am inclined to believe it won't, but it is worth a check.

@matthewdahmer
Copy link
Contributor

matthewdahmer commented May 6, 2019

I figured out why I was having trouble fitting the MUPS models in the deep PLINE region, Xija clips pitch to the 45 to 170 degree range. I submitted an issue: #58. I have a code fix on my machine and am in the process of testing it. This also solves a minor mystery I observed when fitting the pline models, where the eclipse errors seemed off.

@taldcroft
Copy link
Member

I think a good reason for having the dt set in the model spec is because the "best" dt is dependent on the model. E.g. the IPS tank is very slow in all respects so there is no need for a short time step, while DPA can respond on time scales of a minute. So this should be encoded in the production model spec by the model developer so downstream apps / users don't need to remember. Of course users are still free to override the model default for whatever reason.

@jzuhone
Copy link
Collaborator Author

jzuhone commented Dec 6, 2019

@taldcroft @matthewdahmer I'm coming back to this. Aside from making the 328 seconds a module constant (which I did here), I am having trouble remembering what else we said was needed to push this forward.

I looked at the notes in the twiki for that meeting (TWG 8/20/19), but the only two things I found that might be related to action items are these:

  • There are significant differences between the 328s and 60s timestep results, with a ballpark rms of 0.2
  • Use multiples of 8.2s for this time step

The first seems like something to check, but I am not sure about the second one. Do we enforce that timesteps should be multiples of 8.2 s in xija or do we just ask modelers to do this?

@taldcroft
Copy link
Member

@jzuhone - I would say that xija should enforce this, and issue a warning if the "corrected" dt is different from the requested one by more than something small (1e-8).

@matthewdahmer
Copy link
Contributor

matthewdahmer commented Dec 9, 2019

@jzuhone - I don't believe the prediction differences are a surprise, so this isn't an issue. I agree that multiples of 8.2s should be enforced by Xija since it seems reasonable to always have predictions regularly line up with the 328 second standard time step. That being said, in order to have predictions regularly line up with the 328 second mark every 328 seconds, an additional check to ensure 328 is evenly divisible by the time step is also required, as simply enforcing a multiple of 8.2 seconds won't guarantee this.

In fact, this means that time steps can only be 8.2, 16.4, 32.8, 41.0, 65.6, 82.0, 131.2, 164.0, or 328.0 seconds (8.2 times all the combinations of factors of 40, since 328/8.2 is 40).

@taldcroft, @jzuhone - If you don't believe ensuring predictions regularly line up at the 328 second step mark every 328 seconds is a requirement, then please let me know.

@taldcroft
Copy link
Member

taldcroft commented Dec 9, 2019

In fact, this means that time steps can only be 8.2, 16.4, 32.8, 41.0, 65.6, 82.0, 131.2, 164.0, or 328.0

I can't say for sure if there is a problem with times other than these. All things being equal, maybe let's just start off requiring:

  • dt is a multiple of 8.2
  • 328 / dt is an integer

By my calculation that gives 328. , 164. , 82. , 65.6, 41. , 32.8, 16.4, 8.2 as allowed timesteps. (Same as Matt's list without 131.2).

@jzuhone
Copy link
Collaborator Author

jzuhone commented Jan 9, 2020

I've implemented this such that only certain timesteps can be chosen, namely the ones that @taldcroft mentioned above. Then I ran thermal models for a bunch of these timesteps and compared the results:

https://gist.github.com/471c1d8eac25aa098226ed4c58d22aeb

Per the discussion earlier this week I don't think that we should expect the model temps to exactly match on the 328 s intervals given that the odd/even issue is going to be different for the different steps.

I think this is good to go.

@matthewdahmer
Copy link
Contributor

I know this note is a bit late, but with respect to the 131.2 possible time step, this was included in error. @taldcroft thank you for pointing out that it does not divide into 328 evenly.

xija/model.py Outdated Show resolved Hide resolved
@taldcroft taldcroft merged commit 80fa150 into sot:master Jan 9, 2020
@taldcroft
Copy link
Member

@jeanconn - while we're at it with the mongo 2020.1 ska3 release, shall we toss this in?

@jzuhone
Copy link
Collaborator Author

jzuhone commented Jan 9, 2020

Also, I just realized--what do we have to do now to set the version? Is this done by tagging now?

@jeanconn
Copy link
Contributor

jeanconn commented Jan 9, 2020

Yes, the tags control the version... though preferred to make the tag with a github release.

@jeanconn
Copy link
Contributor

jeanconn commented Jan 9, 2020

Regarding the PR, I had an open question about testing. Do we still need some xija tests with different dts and then are the ACIS model runs enough for integration testing?

@jzuhone jzuhone deleted the dt_from_spec branch January 10, 2020 16:21
@jeanconn jeanconn mentioned this pull request Jan 10, 2020
22 tasks
@jeanconn
Copy link
Contributor

Not really an issue, but I note that there's a lot of printing of the dt during the module tests:

xija/tests/test_models.py::test_dpa Using dt = 328 s.
PASSED
xija/tests/test_models.py::test_plotdate Using dt = 328 s.
PASSED
xija/tests/test_models.py::test_data_types Using dt = 328 s.
Using dt = 328 s.
Using dt = 328 s.
Using dt = 328 s.
Using dt = 328 s.
Using dt = 328 s.
Using dt = 328 s.
PASSED
xija/tests/test_models.py::test_minusz Using dt = 328 s.
PASSED
xija/tests/test_models.py::test_pftank2t Using dt = 328 s.
PASSED
xija/tests/test_models.py::test_multi_solar_heat_values Using dt = 328 s.
Using dt = 328 s.
PASSED

@jzuhone
Copy link
Collaborator Author

jzuhone commented Jan 24, 2020

@jeanconn we can send this message to the logger instead in the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants