Frequency concatenation yields wrong data #1102
Data are available on Google Drive: https://drive.google.com/drive/folders/1BzNo8fa2yeS8kbO7MoNSPY8pY1FS_j8U
So after a bit of poking around with the data, I was able to replicate the issue and, I think, get to the heart of what is at play. The long and short of this particular case is that there was a mix of concats across both the baseline-time and frequency axes, and after concatenating across the baseline-time axis, the ordering of the baselines was not necessarily the same as it was prior to adding the two objects. This actually leads to a more significant issue: I would have expected that combining along the frequency axis would require the baseline-time ordering of the two objects to match.
I think adding a check for this would be relatively simple: it could throw an error when the ordering of these items is different. One alternative would be to reorder the data automatically before combining.
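The check proposed above could be as simple as comparing the baseline-time axes of the two objects row by row before concatenating along frequency. A minimal numpy sketch, assuming the times and baseline numbers are available as flat arrays (this is a hypothetical helper, not the pyuvdata API):

```python
import numpy as np

def check_blt_order_match(time1, bl1, time2, bl2):
    """Return True when two datasets share an identical baseline-time
    ordering. Frequency concatenation is only valid row-for-row when the
    (time, baseline) axes of both objects line up exactly."""
    if len(time1) != len(time2):
        return False
    return bool(np.array_equal(time1, time2) and np.array_equal(bl1, bl2))

# Matching ordering: safe to concatenate along frequency.
t = np.repeat([2459000.1, 2459000.2], 3)   # two integrations (JD)
bl = np.tile([101, 102, 103], 2)           # three baseline numbers each
assert check_blt_order_match(t, bl, t, bl)

# One file missing its last row (dropped data): rows no longer line up.
assert not check_blt_order_match(t, bl, t[:-1], bl[:-1])
```

Raising a `ValueError` when this returns `False` would turn the silent data corruption into a hard failure.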
Does the baseline ordering vary between input datasets, or do the datasets contain different sets of baselines?
@david-macmahon -- the ordering is in fact not different between them, but some files contain different subsets of times or baselines (my guess being the former), which causes the combined object to end up with a mismatched ordering.
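One quick way to confirm which subsets differ is to compare the unique integration times of two files directly. A small diagnostic sketch, assuming the times are JD floats pulled from each file (hypothetical helper, not part of pyuvdata):

```python
import numpy as np

def time_axis_diff(times1, times2):
    """Report which integration times appear in one dataset but not the
    other; a mismatch here explains differing Nblts between files."""
    s1, s2 = set(np.unique(times1)), set(np.unique(times2))
    return {
        "only_in_first": sorted(s1 - s2),
        "only_in_second": sorted(s2 - s1),
        "n_shared": len(s1 & s2),
    }

# The second file is missing one integration entirely.
d = time_axis_diff([1.0, 1.0, 2.0, 2.0, 3.0], [1.0, 2.0])
assert d["only_in_first"] == [3.0] and d["n_shared"] == 2
```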
@kartographer, your guess is exactly right. The first clue was that the file sizes are significantly different; further examination of the two files bears this out. Not sure if this is originally caused by a "dropped/missing data" problem in the data capture code or mismatched scan lengths between files. If the former, then the software correlator code should be more robust to dropped/missing data (and the data capture code improved, of course!). In any case, it would be good for pyuvdata to either "do the right thing" (whatever that is in this case?) or refuse to combine these mismatched datasets. Either would be preferable to outputting invalid datasets.
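For the "dropped/missing data" case, a cheap heuristic is that a regularly gridded dataset should have every baseline present at every time, so the row count should equal the product of the baseline and time counts. A sketch of that check (an assumption about the data layout, not a pyuvdata built-in):

```python
def has_dropped_rows(nblts, nbls, ntimes):
    """Heuristic: on a full baseline-time grid, Nblts == Nbls * Ntimes.
    Any shortfall suggests integrations or baselines were dropped."""
    return nblts != nbls * ntimes

assert not has_dropped_rows(nblts=12, nbls=3, ntimes=4)   # full grid
assert has_dropped_rows(nblts=11, nbls=3, ntimes=4)       # one row missing
```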
Thanks for this report @wfarah, we definitely need to fix this. @kartographer I think we should add the check you're suggesting, and we should also check to see if this problem is present in the other objects.
@bhazelton -- sounds good, I'll have a PR for someone to review in just a few minutes...
@bhazelton -- taking a brief look through the other objects, it looks like they have the same problem.
Ok, I'll make issues for those objects so they can be tracked separately.
Using `pyuvdata==2.2.4`, I'm trying to concatenate `uvh5` files in frequency, and I'm simply using `uv = uv1 + uv2 + uv3 + ...`, then writing out an `.ms` file and parsing it with `casa`. This worked perfectly well all the time I used it, except for now, where I started seeing some weirdness: the phase of the autocorrelation for some of the subbands appeared to be non-zero, which is not expected. What I'm doing is the following:
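The frequency concatenation described here can be pictured as stacking the channel axes of two `(Nblts, Nfreqs)` visibility arrays, which is only safe when the baseline-time rows of both inputs already line up. A minimal numpy sketch of that operation (illustrative only; not the actual pyuvdata `+` implementation):

```python
import numpy as np

def concat_freq(data1, freqs1, data2, freqs2):
    """Naively concatenate two (Nblts, Nfreqs) visibility arrays along
    frequency, sorting channels, assuming identical row ordering."""
    data1, data2 = np.asarray(data1), np.asarray(data2)
    if data1.shape[0] != data2.shape[0]:
        raise ValueError("baseline-time axes differ; refusing to concatenate")
    freqs = np.concatenate([freqs1, freqs2])
    order = np.argsort(freqs)
    data = np.concatenate([data1, data2], axis=1)[:, order]
    return data, freqs[order]

# Two 2-channel subbands covering the same 3 baseline-time rows.
d1 = np.ones((3, 2)) * 1.0
d2 = np.ones((3, 2)) * 2.0
data, freqs = concat_freq(d1, [100e6, 110e6], d2, [120e6, 130e6])
assert data.shape == (3, 4) and freqs[0] == 100e6
```

If the two inputs have mismatched rows (the situation diagnosed in this thread), the only safe behaviors are to raise, as above, or to explicitly realign rows first.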
I then read the `ant_1_names` and `ant_2_names` from all the `uv` files, and I could see that the first integrations in the concatenated `uv` files are not the autocorrelations, unlike the individual subbands.

[Screenshots: some cross/auto correlations respectively, viewed in casa.]

I also extracted the visibilities straight from the uvh5 files, and I see the same thing.