Replies: 9 comments
-
it's important to distinguish between obs which are not assimilated and obs which are assimilated but have no impact on the model state. in your discussion you conflate the two. i believe dart currently correctly marks with a QC any obs which are not assimilated. whether it is worth doing anything for obs which are assimilated but have no impact on the state is a discussion topic. however it needs to be consistent for all obs which have no impact on the state, not just a selected subset.
-
@nancycollins, please point to the comments that conflate the two and I'll try to fix them.
-
this right here. if the obs is processed in the SEQUENTIAL_OBS loop, it's being assimilated.
-
i'll add a last comment and then give up here. i think the best way to handle this is to do as wrf and mpas do, which is to preprocess the obs_seq file before running filter. in those preprocess programs they do many things, such as: remove obs outside the model area, increase obs errors for obs near the boundaries to lessen their impact on the state, superob obs which are too dense, and anything else that improves the observations passed into filter. i think removing the high obs before filter sees them is the best way to handle this. giving them to filter to assimilate and then saying "no, just kidding" doesn't make sense. i don't see this as missing functionality in filter given there is an existing solution which follows what other models already do.
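A rough illustration of the kind of preprocessing described above, written as a small Python sketch rather than as any actual WRF, MPAS, or DART preprocessor. The `Obs` record, the `preprocess` function, and all of its parameters are hypothetical names made up for this example; superobbing is left out to keep it short.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Obs:
    """Hypothetical, simplified observation record (not DART's obs_seq format)."""
    lon: float
    lat: float
    pressure_pa: float  # vertical location; smaller pressure means higher up
    value: float
    error_var: float


def preprocess(obs: Iterable[Obs], lon_min: float, lon_max: float,
               lat_min: float, lat_max: float, buffer_deg: float,
               inflate_factor: float, min_pressure_pa: float) -> List[Obs]:
    """Drop obs outside a lat/lon box or above a pressure level, and
    inflate the error variance of obs close to the box edge.

    Assumes a simple box that does not wrap around the date line.
    """
    kept = []
    for ob in obs:
        # Remove obs outside the model area.
        if not (lon_min <= ob.lon <= lon_max and lat_min <= ob.lat <= lat_max):
            continue
        # Remove "high" obs before filter ever sees them.
        if ob.pressure_pa < min_pressure_pa:
            continue
        # Distance (in degrees) to the nearest edge of the box.
        edge_dist = min(ob.lon - lon_min, lon_max - ob.lon,
                        ob.lat - lat_min, lat_max - ob.lat)
        # Increase obs error near the boundary to lessen the impact.
        if edge_dist < buffer_deg:
            ob = replace(ob, error_var=ob.error_var * inflate_factor)
        kept.append(ob)
    return kept
```

With this approach the excluded obs never reach filter, so there is no question of what QC they should get.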
-
I agree that a preprocessor is a fine solution to this example. It seems to me that an exclusion by a preprocessor namelist would be robust. My hope was that this broad overview issue will help people be aware of the options.
-
Yes it is, despite the user trying to exclude it using a variable named "no_obs_assim_above_level".
-
very very good point about changing the name of that cam namelist item and the internal cam model_mod variable. it used to make the forward operator fail, which did prevent assimilation. now that the functionality has moved to get_close, it prevents impact. your suggested new name is very apt and should be changed in a pull request along with the code change to where high obs are handled. otherwise namelists are irrelevant to anything else in this discussion. there's no reason a model_mod shouldn't use them anywhere they wish. good variable names are always a plus, however! :)

if you think additional documentation is needed i'd suggest this paragraph be added to the end of the docs for both get_close_state() and get_close_obs():

To prevent an individual assimilated observation from impacting the model state, get_close_state can return 0 close items. For consistent results both get_close_state and get_close_obs should return 0 for the same observations. Returning 0 close items will have no effect on the DART Quality Control (QC) value. For additional ways to control the assimilation and impact of observations see the documentation for assimilate_these_obs_types and evaluate_these_obs_types in the obs_kind_nml.

(i intentionally did not say the QC will be 0. it could be set to a different value if posterior FO fails or if the obs value fails the outlier threshold, etc. also, i have no idea how to insert links to the other pages in the docs, sorry!)
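To make the "assimilated but no impact" distinction concrete, here is a toy Python sketch. It is not DART code: the function names only echo get_close_state, and the height-based cutoff, the scalar increment, and the tuple layout are all assumptions made for illustration. The point it shows is that returning 0 close items leaves the state untouched while the QC carried by the obs is passed through unchanged.

```python
from typing import List, Tuple


def get_close_state(obs_height: float, state_heights: List[float],
                    cutoff: float, no_impact_above: float) -> List[int]:
    """Toy stand-in: return indices of state elements close to the obs."""
    if obs_height > no_impact_above:
        # 0 close items: the obs is still processed, but it moves nothing.
        return []
    return [i for i, h in enumerate(state_heights)
            if abs(h - obs_height) <= cutoff]


def sequential_loop(state: List[float], state_heights: List[float],
                    observations: List[Tuple[float, float, int]],
                    cutoff: float, no_impact_above: float):
    """Process every obs in turn; each obs is (height, increment, qc).

    The QC is assumed to have been set earlier (e.g. by the forward
    operator), so nothing in this loop changes it.
    """
    state = list(state)
    qcs = []
    for obs_height, increment, qc in observations:
        close = get_close_state(obs_height, state_heights,
                                cutoff, no_impact_above)
        for i in close:
            state[i] += increment
        qcs.append(qc)  # unchanged, whether or not anything was close
    return state, qcs
```

In this toy setup an obs above `no_impact_above` goes through the loop exactly like any other obs and keeps whatever QC it came in with, which is why obs space diagnostics cannot tell it apart from an obs that happened to cause no increments.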
-
i will add this comment here so there is a record of it. during one of the cam model top discussions, jeff mentioned that if we aren't impacting the state at the model top, then should the interpolate routine really return expected obs values for high locations? if the answer is no then it really complicates forward operator routines which need to query for a column of values. in the past we've always done it by looping over model levels from 1-N. for cam level 1 is at the model top. if it fails there's no way to know the first "good" model level to start at. for now i think the interpolate routine has to return good expected values for the entire model domain but there is a science question here. then there is a difficult implementation question on how to make column values work if we choose to fail interp at some model levels.
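One possible way to think about the column problem, sketched in Python under loud assumptions: `interp_level` is a hypothetical callback that returns `None` when interpolation fails at a level, and levels are numbered 1..N with level 1 at the model top, as in CAM. This is only one option for discussion, not how any DART forward operator currently works.

```python
from typing import Callable, List, Optional, Tuple


def column_values(interp_level: Callable[[int], Optional[float]],
                  n_levels: int) -> Tuple[Optional[int], List[float]]:
    """Scan levels 1..N from the top, skipping leading failures.

    Returns (first_good_level, values from that level down). If every
    level fails, returns (None, []). A failure below a good level is
    treated as a real error, since it is not a model-top exclusion.
    """
    first_good = None
    values: List[float] = []
    for level in range(1, n_levels + 1):
        v = interp_level(level)
        if v is None:
            if first_good is not None:
                raise ValueError(f"interpolation failed at level {level}, "
                                 f"below first good level {first_good}")
            continue  # still above the first usable level
        if first_good is None:
            first_good = level
        values.append(v)
    return first_good, values
```

The cost of this design is that the forward operator has to accept a shortened column, which is part of the implementation question raised above.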
-
Preprocessor to remove high obs for cam-fv added to dart in #480
-
Summary of the Issue
This discussion stems from #401, which focuses more on possibly incorrect or misleading QC values ending up in obs_seq.final files. Here the focus is on the benefits and drawbacks of methods that let users exclude obs which are present in the obs_seq.out files they want to use. The focus is not on observations which the user intends to use but which cannot be used for reasons that are not apparent until the assimilation is run. These methods include excluding obs:
Summary of the discussion (@kdraeder view, as of 2022-10-18)
+ means an advantage
- means a disadvantage
1. When creating obs_seq.out files
+ Completely clear which obs the creator might like to be used.
+ Smaller obs_seq.out files.
+ Use obs_sequence_tool (and ... ?) to exclude many classes of obs.
+ Can exclude based on horizontal location (limited area models) and many other characteristics.
- More sets of obs_seq.out files, with varying amounts of redundant content.
- Will require more scripts (or scripts with more flexible options) to make the larger collection of obs_seq sets. This raises the question of how much we will support and advertise these scripts.
- obs_sequence_tool may not have all of the options a user wants. The rest will need to be done by scripting (?).
2. By preprocessing obs_seq.out to exclude obs (before a job or as part of the cycle)
+ Starts with a pre-existing, comprehensive obs_seq.out file, so obs types don't need to be gathered each time a variant set of observations is needed.
+ The preprocessor program can use methods 3 and 4 (below).
+ Can exclude based on horizontal location (limited area models) and many other characteristics.
+ Avoids partially redundant sets of obs_seq files created in 1.
+ But the variants could be saved for repeated use.
+ Can be done by fortran and scripting, whichever is most efficient.
- If run as part of the assimilation cycle, it adds to the run time. We expect this to be small, but will vary by case. It may cause more queue waiting time for the biggest jobs.
- It may recreate obs_seq files that have been created before (if they weren't saved).
- If an exclusion is done by fortran, changes require recompiling (and testing) the preprocessor.
3. By namelist variable (typically in model_mod.nml)
+ No recompilation is needed to change the exclusion, which is useful during experiment development and debugging.
+ It's convenient for users (if it's already installed).
+ It's recorded in the archived namelist.
- May not be able to pass the reason (QC) for the exclusion to filter or obs_seq.final.
This is already true for many or most kinds of model-specific exclusions.
This may distort the obs space diagnostics (e.g. showing that some obs, which the user intended to exclude,
appear to have been assimilated, but caused no increments).
One exception is obs_kind_nml: assimilate_these_obs_types and evaluate_these_obs_types.
Obs types which are not listed in either of those are excluded and given QC=5, which is written to the obs_seq.final.
Some other exclusions generate "failed forward operator" (QC=4), which is also written to obs_seq.final.
- Adding exclusions requires code modification and recompiling (and testing). Removing an exclusion can be done using a special value.
4. By fortran code (typically in model_mod.f90)
+ In some cases the exclusion may generate a meaningful QC value which is passed to obs_seq.final, provided it happens at a point where data is in sync, e.g. in model_interpolate. The QC value probably won't identify exactly why the exclusion happened.
+ Documentation may be easier; the exclusion is always done in the same way.
- Changing the exclusion requires recompiling (and testing).
- In some cases the exclusion may not be able to pass a QC value to obs_seq.final, namely when it's implemented at a point in the code where there has been no sync of the tasks since distributed calculations were done, e.g. in get_close_obs.
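The QC bookkeeping described under methods 3 and 4 can be sketched in Python as follows. This is a simplified illustration, not DART's implementation: `assign_qc` and its arguments are hypothetical, QC=4 and QC=5 are the values quoted above, and QC=0 (assimilated) and QC=1 (evaluated only) follow my understanding of the DART QC table rather than anything stated in this discussion. The obs type names are only examples.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def assign_qc(obs_type: str, forward_operator_ok: bool,
              assimilate_types: Set[str], evaluate_types: Set[str]) -> int:
    """Toy QC assignment for one observation."""
    if obs_type not in assimilate_types and obs_type not in evaluate_types:
        return 5  # type not listed in either obs_kind_nml list
    if not forward_operator_ok:
        return 4  # failed forward operator
    # Assumed values: 0 = assimilated, 1 = evaluated only.
    return 0 if obs_type in assimilate_types else 1


def tally_qc(obs: Iterable[Tuple[str, bool]],
             assimilate_types: Set[str],
             evaluate_types: Set[str]) -> Counter:
    """Count QC values over (obs_type, forward_operator_ok) pairs."""
    return Counter(assign_qc(t, ok, assimilate_types, evaluate_types)
                   for t, ok in obs)
```

Note what is missing: an exclusion implemented in get_close_obs never passes through a step like `assign_qc`, so those obs are counted under QC=0 along with obs that really did update the state.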
Nancy adds the idea: