Filters not included for all channels #157
There was a hardcoded assumption in get_inventory_from_df that only one valid value of channel instance should be added to the list of returned channels from a station. This is not only false in the case of wildcards (e.g. *F*, *Q*), it is also false when a channel has been broken into multiple runs. CAS04 is a good example of this. See mth5 issue #157, and aurora 277 for more details.
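To illustrate the difference, here is a minimal sketch in plain Python. It is not the actual get_inventory_from_df code: the ChannelEpoch record, the dates, and both helper functions are hypothetical, and they only contrast keeping the first matching channel instance with keeping every instance.

```python
from collections import namedtuple

# Hypothetical stand-in for one channel epoch in a station inventory.
ChannelEpoch = namedtuple("ChannelEpoch", ["code", "start", "end"])


def first_instance_only(epochs):
    """Behaviour as described for the old code: one entry per channel code."""
    seen = {}
    for epoch in epochs:
        seen.setdefault(epoch.code, epoch)  # later epochs of the same code are dropped
    return list(seen.values())


def all_instances(epochs):
    """Keep every distinct (code, start, end) epoch, preserving order."""
    kept = []
    for epoch in epochs:
        if epoch not in kept:
            kept.append(epoch)
    return kept


# Made-up epochs: one channel code split into two runs, one unsplit channel.
epochs = [
    ChannelEpoch("LQE", "2020-06-02", "2020-06-12"),
    ChannelEpoch("LQE", "2020-06-12", "2020-07-01"),
    ChannelEpoch("LFN", "2020-06-02", "2020-07-01"),
]

print(len(first_instance_only(epochs)), len(all_instances(epochs)))
```

With the first helper, the second LQE epoch never reaches the returned list, which is the kind of loss that would leave a later run without its channel metadata.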
This change resulted in obtaining TFs that are in good agreement with SPUD on the CAS04 tests as well as no rows with zero filters in the channel_summary, so I think we want to keep the change. @laura-iris : Could you please also take a look at the logic in Some notes are in issue #157 and aurora issue 277
This reminds me a lot of some work I did last year, and when I was looking through my code and notes I see that what appears to be the exact same function used to live in make_mth5.py instead of fdsn.py. I had worked on looping over the epochs and simplifying some of the logic at that point. It seems like the changes weren't translated over to fdsn.py. @kkappler can you take a look at this old merge request and let me know if it also looks like the same code? #106
@laura-iris This does look like the same code (cleaned up and fixed already!) It looks like it makes sense to try and replace the existing I don't completely understand it, but here is what it looks like happened: In an effort to generalize make_mth5.py so that it can get data from clients other than earthscope/IRIS, @kujaku11 modified make_mth5.py to use a "client plugin model". In the current implementation there are two client plugins, The factoring of fdsn.py out of make_mth5.py was started on Feb 4, 2022, on the However, your fix to issue #105 was merged into master on August 19, 2022 (while clients branch was still in development). At this point in time however, It looks like what happened is that somehow during this generalization process an older version of the data wrangling wound up in fdsn. It also looks like this had already gone pete tong on
@laura-iris I took a look at the modifications you made to make_mth5 and have blended those into fdsn.py, together with some clean up I did as well. I kept your structure of using I'll commit to
Re-add methods:
- _loop_stations
- _run_010
- _run_020
- _process_list
These functions were lost in a merge sometime last year. Some additional notes are in issue #157
There is a bug in
Substantially modified the code (but not the ultimate approach) in mth5/clients/fdsn.py. Replaced nested loop logic with alternative approach
This should be much easier to debug in future. The only complexity is this idea that you can have namespace clashes with @laura-iris Can you take a look at my latest commit and see if it makes sense to you?
@laura-iris and I discussed this today. The logic reproduces the old code, BUT it still does not generically support wildcards. Consider the case where a network code has multiple instances in different time periods: we then only receive the zeroth instance. This should be handled in a later SOW, and the logic for handling it should be either in validate dataframe or in the network handling (build_network_dict), where multiple start-times for a network (station) would be handled.
@kujaku11 This seems to be because we need an additional operation in fdsn.py I am not sure where the best place to put it is ... here is a snapshot of the main flow of make_mth5_from_fdsn_client In this case, the inventory has only 5-channels (they never changed): but the streams imply two runs:
So we need to do one of the following:
I think that we can take a tack like (B) with a function called: repair_missing_filters(m)
@kujaku11 almost, but both the first and second runs are named 001. Almost certain we just need to pop the "id" key from the metadata dict before overwriting -- working on a fix now
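The idea of popping "id" before overwriting can be sketched with plain dictionaries. This is only an illustration under assumptions: overwrite_run_metadata, the "sample_rate" key, and the example values are hypothetical, and the actual metadata objects in mth5 may not be plain dicts.

```python
def overwrite_run_metadata(run_metadata, new_values):
    """Return a copy of run_metadata updated with new_values, keeping its own id."""
    updates = dict(new_values)  # copy so the caller's dict is left untouched
    updates.pop("id", None)     # drop the incoming id so it cannot clobber ours
    merged = dict(run_metadata)
    merged.update(updates)
    return merged


# Hypothetical metadata taken from the first run and applied to the second.
template = {"id": "001", "sample_rate": 1.0}
second_run = overwrite_run_metadata({"id": "002"}, template)
print(second_run)
```

Without the pop, the update would copy the id "001" onto the second run as well, which matches the symptom of both runs being named 001.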
Fixed with update of FDSN
This relates to aurora issue 252 (Wide scale testing on Earthscope).
In an effort to better understand disagreements in TFs between aurora and SPUD, a closer look at station CAS04 was undertaken. CAS04 does result in an incorrect TF, and the reason appears to be that not all channel-runs are associated with filters in the mth5.
I verified that this problem can be reproduced by running the Jupyter notebook:
mth5/docs/examples/notebooks/make_mth5_driver_v0.2.0.ipynb
I modified the notebook on the fc branch, adding two cells. Right after the channel_summary_df is created, an iterrows operation is performed, and for each row of the channel_summary, the number of filters is counted and recorded in a new column "n_filters".
Three entries in that table have no filters: channel ey in run c, and channels ex and ey in run d. We may have encountered this before in aurora issue 31; however, I believe we worked around it with a custom XML provided by Tim. The metadata at IRIS look OK now though.
Using IRIS data services, we get this xml, for which I manually checked that there do appear to be filters associated with every channel.
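The n_filters check described above can be sketched roughly as follows. The notebook works on a pandas DataFrame with iterrows; this version uses plain dictionaries so that it runs without mth5 or pandas. The row layout, the filters_by_channel_run lookup, the placeholder filter names, and add_n_filters are all hypothetical, and the toy values simply mirror the three filterless entries reported here.

```python
# Hypothetical rows standing in for channel_summary records.
channel_summary = [
    {"run": "c", "component": "ex"},
    {"run": "c", "component": "ey"},
    {"run": "d", "component": "ex"},
    {"run": "d", "component": "ey"},
]

# Hypothetical lookup of the filter names attached to each (run, component).
filters_by_channel_run = {
    ("c", "ex"): ["placeholder_filter_1", "placeholder_filter_2"],
    ("c", "ey"): [],
    ("d", "ex"): [],
    ("d", "ey"): [],
}


def add_n_filters(rows, filter_lookup):
    """Return copies of the rows with an added 'n_filters' count."""
    counted = []
    for row in rows:
        key = (row["run"], row["component"])
        n_filters = len(filter_lookup.get(key, []))
        counted.append({**row, "n_filters": n_filters})
    return counted


summary = add_n_filters(channel_summary, filters_by_channel_run)
unfiltered = [(r["run"], r["component"]) for r in summary if r["n_filters"] == 0]
print(unfiltered)
```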
So, something is likely going wrong in the mth5 build. As an additional hint, when I build the "dataless" mth5 for CAS04, rather than having the expected number of rows (20, 5ch x 4 runs) there are 17 rows, and the missing rows correspond to exactly the rows that have no filters in the mth5.
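A quick way to pin down which rows are absent is to compare the expected run/channel combinations against the ones actually present. The sketch below is an assumption-laden illustration: the run names a to d and the component names ex, ey, hx, hy, hz are guesses consistent with "5ch x 4 runs", and the observed set is a toy constructed from the three filterless entries rather than read from a real mth5.

```python
from itertools import product

runs = ["a", "b", "c", "d"]                  # assumed run names
components = ["ex", "ey", "hx", "hy", "hz"]  # assumed channel names

# Every (run, component) pair we would expect in the dataless summary.
expected = set(product(runs, components))

# Toy "observed" set: everything except the three pairs reported without filters.
observed = expected - {("c", "ey"), ("d", "ex"), ("d", "ey")}

missing = sorted(expected - observed)
print(len(expected), len(observed), missing)
```

In a real check, observed would be built from the (run, component) columns of the dataless channel summary.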