Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sync index column order to match order from astropy 4.0 #215

Merged
merged 6 commits into from
Feb 10, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 8 additions & 5 deletions Ska/engarchive/update_client_archive.py
Original file line number Diff line number Diff line change
Expand Up @@ -476,13 +476,14 @@ def update_full_h5_files(dat, logger, msid_files, msids, opt):
def get_full_data_sets(ft, index_tbl, logger, opt):
# Iterate over sync files that contain new data
dats = []
for date_id, filetime0, filetime1, row0, row1 in index_tbl:
for row in index_tbl:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And sure 'row' is fine. I probably overthought it... I avoided 'row' as the var just because the lines of the index table defined row spaces.


# Limit processed archfiles by date
if filetime0 > DateTime(opt.date_stop).secs:
if row['filetime0'] > DateTime(opt.date_stop).secs:
break

# File names like sync/acis4eng/2019-07-08T1150z/full.npz
ft['date_id'] = date_id
ft['date_id'] = row['date_id']

# Read the file with all the MSID data as a hash with keys like {msid}.data
# {msid}.quality etc, plus an `archive` key with the table of corresponding
Expand Down Expand Up @@ -587,9 +588,11 @@ def _sync_stat_archive(opt, msid_files, logger, content, stat, index_tbl):
def get_stat_data_sets(ft, index_tbl, last_date_id, logger, opt):
# Iterate over sync files that contain new data
dats = []
for date_id, filetime0, filetime1, row0, row1 in index_tbl:
for row in index_tbl:
date_id = row['date_id']

# Limit processed archfiles by date
if filetime0 > DateTime(opt.date_stop).secs:
if row['filetime0'] > DateTime(opt.date_stop).secs:
logger.verbose(f'Index {date_id} filetime0 > date_stop, breaking')
break

Expand Down
4 changes: 2 additions & 2 deletions Ska/engarchive/update_server_sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -196,9 +196,9 @@ def get_row_from_archfiles(archfiles):
# date like 2019-02-20T2109z, human-readable and Windows-friendly (no :) for a unique
# identifier for this set of updates.
date_id = get_date_id(DateTime(archfiles[0]['filetime']).fits)
row = {'filetime0': archfiles[0]['filetime'],
row = {'date_id': date_id,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And sure. Removing the explicit list in file_defs and replacing with this strategy to get the columns in the right order seems fine. You have a better handle on astropy development to know which would be more stable in the long-term.
And we presently don't see a driver for keeping the column order in file_defs for reference (which you might want if you thought this could vary based on environment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our code should always be written to be tolerant of changes to column order or potentially new columns. My original implementation was the root problem, nothing more.

In astropy we strive for API stability, but sometimes allow cleaning up legacy design mistakes (or in these cases, designs that became mistakes due to language changes).

'filetime0': archfiles[0]['filetime'],
'filetime1': archfiles[-1]['filetime'],
'date_id': date_id,
'row0': archfiles[0]['rowstart'],
'row1': archfiles[-1]['rowstop']}
return row
Expand Down