-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow the timestep to be set from the model specification #48
Conversation
@taldcroft, @jeanconn, @matthewdahmer I'm not able to request you as reviewers for some reason, but would like you to have a look at this one. |
@jzuhone Do you have ideas for testing? Should there be a xija module test that uses a new |
@jzuhone - we need to be careful. It's been a long while, but I seem to recall that there were some subtle (or maybe not so subtle) problems when changing the time step. There was certainly a reason that I put "do not change" in the docstring, and Jean is correct that this needs careful testing using all of the models. The original idea behind the 328 seconds was that thermally speaking nothing significant happens on time scales faster than that. That is true for all our existing models, but I suppose if you are talking about temperatures on a single electronics board then that might not be the case. The 328 is also tied to the use of |
@taldcroft @jeanconn can we revisit this discussion? The current paradigm of turning off all of the FEPs in the DPA is resulting in the temperatures of the FEPs decreasing pretty rapidly for those short periods, and our models can't seem to reproduce those drops. I'm wondering that since these are single boards with better time resolution it would help if we could set The default would still be 328 seconds, so hopefully that should not affect any existing models. What kind of testing should we do? |
fe5b160
to
5473c20
Compare
fe5b160
to
5473c20
Compare
Hi @taldcroft, @jeanconn, @matthewdahmer: So I think we should test this by implementing a smaller timestep for the DPA model, interpolating the model outputs to the 328-second values from an otherwise identical model run, comparing the results, and ensuring that the differences are within some accepted tolerance. The idea is that 328 seconds should be the default regardless, so we'd just need to make sure that for the other models the timestep and thus the previous answers do not change. If someone (say me) would like to update the timestep for any other model in the future, I would propose that the onus is on them to demonstrate that reducing the timestep does not introduce undesirable effects, and that this would require bringing the new model forward for TWG approval. So it would only make sense to do this in the context of a more general recalibration. It may have been that your note about not changing the timestep was that so it wouldn't be set to something larger than 328 seconds, perhaps? Just a thought. I can also add in a check for a dt greater than 328 seconds and make it throw a warning or even an error. What do you all think? |
@jzuhone - sounds like a good starting point. There is also the converse of showing that reducing the step size actually does make a change where it is expected, in particular for a case of zero FEPs for a short (sub-300 sec) interval. I think it would be worth trying shorter time steps in a number of models during testing for this patch so we can establish in general if there are any gotchas. I'm agreed approval for actually using a model operationally with a non-standard step size would require specific validation for that model. |
@taldcroft what was the week where this issue with the zero FEPs came up again originally for the DPA model? |
@jzuhone - Sorry, I don't remember, maybe Gregg does. There was a problem with loads failing thermal after a small change in timing. |
I thought it came up in the context of APR0819T? |
What models do you both suggest I test against smaller timesteps besides ACIS models? |
Sorry about the late reply to this thread. The APR0819T vs APR0819A loads are the cases that brought this issue to the forefront for us recently. In the T loads, there was 161 seconds of cooling that was not being accounted for, but was in the A loads, due to the timing of the cooling dwell. In the T loads, it fell entirely within the 328 second window, so it wasn't picked up. In the A loads it crossed a 328 second boundary, so it was counted as 328 seconds. I am still a little confused as to why the time step needs to be built into the model spec itself, unless it serves as a baseline value to interpret the other parameters, similar to a bias (i.e. if the solarheat values would need to be scaled for different time steps). Ideally, the time step should be able to be user modified, independent of the model definition, without significantly changing the overall performance of the model over long-ish time periods, excluding cases with short events. I do see the need to explicitly set a maximum time step to ensure the time step isn't set too large and to ensure this isn't used as a tool to make a schedule work when it shouldn't. Perhaps the time step used for each model could be a reported metric included in the load products/email, subject to review for each schedule. Given the case highlighted by the APR0819 loads is probably more common than we have realized, I'd like to use a much smaller time step for all models if it doesn't significantly affect tool performance. I can run 6 months of dwell history very quickly (in a pure python environment), so hopefully a 10x increase in resolution would not significantly affect Matlab FOT Tools performance, but perhaps it would for reasons I don't grasp at the moment. As for other models that could benefit from smaller time steps, I'd suggest the MUPS models as they move the fastest. I'm having trouble refitting these models to account for the rapid cooling that occurs in the deep PLINE region so I'd like to see if a reduced time step could help. I am inclined to believe it won't, but it is worth a check. |
I figured out why I was having trouble fitting the MUPS models in the deep PLINE region, Xija clips pitch to the 45 to 170 degree range. I submitted an issue: #58. I have a code fix on my machine and am in the process of testing it. This also solves a minor mystery I observed when fitting the pline models, where the eclipse errors seemed off. |
I think a good reason for having the |
@taldcroft @matthewdahmer I'm coming back to this. Aside from making the 328 seconds a module constant (which I did here), I am having trouble remembering what else we said was needed to push this forward. I looked at the notes in the twiki for that meeting (TWG 8/20/19), but the only two things I found that might be related to action items are these:
The first seems like something to check, but I am not sure about the second one. Do we enforce that timesteps should be multiples of 8.2 s in xija or do we just ask modelers to do this? |
@jzuhone - I would say that xija should enforce this, and issue a warning if the "corrected" |
@jzuhone - I don't believe the prediction differences are a surprise, so this isn't an issue. I agree that multiples of 8.2s should be enforced by Xija since it seems reasonable to always have predictions regularly line up with the 328 second standard time step. That being said, in order to have predictions regularly line up with the 328 second mark every 328 seconds, an additional check to ensure 328 is evenly divisible by the time step is also required, as simply enforcing a multiple of 8.2 seconds won't guarantee this. In fact, this means that time steps can only be 8.2, 16.4, 32.8, 41.0, 65.6, 82.0, 131.2, 164.0, or 328.0 seconds (8.2 times all the combinations of factors of 40, since 328/8.2 is 40). @taldcroft, @jzuhone - If you don't believe ensuring predictions regularly line up at the 328 second step mark every 328 seconds is a requirement, then please let me know. |
I can't say for sure if there is a problem with times other than these. All things being equal, maybe let's just start off requiring:
By my calculation that gives 328. , 164. , 82. , 65.6, 41. , 32.8, 16.4, 8.2 as allowed timesteps. (Same as Matt's list without 131.2). |
I've implemented this such that only certain timesteps can be chosen, namely the ones that @taldcroft mentioned above. Then I ran thermal models for a bunch of these timesteps and compared the results: https://gist.github.com/471c1d8eac25aa098226ed4c58d22aeb Per the discussion earlier this week I don't think that we should expect the model temps to exactly match on the 328 s intervals given that the odd/even issue is going to be different for the different steps. I think this is good to go. |
I know this note is a bit late, but with respect to the 131.2 possible time step, this was included in error. @taldcroft thank you for pointing out that it does not divide into 328 evenly. |
@jeanconn - while we're at it with the mongo 2020.1 ska3 release, shall we toss this in? |
Also, I just realized--what do we have to do now to set the version? Is this done by tagging now? |
Yes, the tags control the version... though preferred to make the tag with a github release. |
Regarding the PR, I had an open question about testing. Do we still need some xija tests with different dts and then are the ACIS model runs enough for integration testing? |
Not really an issue, but I note that there's a lot of printing of the dt during the module tests:
|
@jeanconn we can send this message to the logger instead in the next release. |
Most models have a
dt
of 328 seconds as the default. The ACIS DEA Housekeeping telemetry (most importantly the focal plane temperature) has a much higher time resolution than this.Currently in xija even if a smaller
dt
is specified in the model specification it is not changed when the model is run. This PR makes some modest changes to allow the timestep to be ingested from the model specification. In most cases, models will set thedt
to 328 seconds, and the code here ensures that the default does not change.