You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
Satpy and the li_l2_nc reader currently does not offer a simple way to extract single repeat cycles/timesteps from LI L2 accumulated products (AF/AFA/AFR). These products contain 20 timesteps covering 10 minutes of sensing time.
When loading the main variables in the accumulated_flash_area, flash_accumulation, flash_radiance with the default reader kwarg with_area_definition=True, the reader will currently return a 2-d array with the summation of the entire sensing period of the file (particularly after #2867). This way, the archived products behave the same as the normal dissemination products.
However, some users need to be able to access each timestep individually, or might want to extract a specific range of timesteps.
Describe the solution you'd like
Ideally, an easy interface to select single timesteps returned as a 2-d array, and ranges of timesteps/all timesteps organised in a 3-d xarray with a time coordinate. It could look like
scn.load(["accumulated_flash_area"]) # returns summed-up array as already currentlyscn.load(["accumulated_flash_area"], start_time=[datetime(2024, 8, 8, 12, 0, 0)]) # returns 2d array for given timestampscn.load(["accumulated_flash_area"], start_time='all') # returns 3d array with time coordinate and all timestepsscn.load(["accumulated_flash_area"], start_time=datetime(2024, 8, 8, 12, 0, 0), end_time=datetime(2024, 8, 8, 12, 1, 0)) # returns 3d array with time coordinate with all timesteps within the given range
Or similar. Whether the above is implemented as a load kwarg to be handled by the FileYamlReader, or via the DataID/DataQuery mechanism, it's to be defined.
Describe any changes to existing user workflow
No, this is a new feature.
Additional context
There is currently a workaround/hack to access single timesteps and get them into the gridded format.
First, this is the nominal, recommended code to read the accumulated data as 2-d arrays. It will give you the summed-up 2-d array for the entire timespan of the file.
fromsatpyimportScenefromglobimportglobimportosimportnumpyasnppath_to_testdata='/tcenas/fbf/mtg/RollingBuffer/OPE/l2pf/LI-L2/'# initialise LI L2 acc product Scene to produce 2-d array output (recommended, does not require resampling)scn_2d_adef=Scene(filenames=glob(os.path.join(path_to_testdata, '*AFA*--ARC-*T_0030_0001.nc')), reader='li_l2_nc')
scn_2d_adef.load(['accumulated_flash_area'])
afa_directfrom2d=scn_2d_adef['accumulated_flash_area']
Now the workaround if you need the single datasets, while we wait for a better way to be used with the mode above:
# in alternative, you can initialise it the Scene produce 1-d outputs by setting with_area_definition Falsescn_1d=Scene(filenames=glob(os.path.join(path_to_testdata, '*AFA*--ARC-*T_0030_0001.nc')), reader='li_l2_nc',
reader_kwargs={'with_area_definition': False})
scn_1d.load(['accumulated_flash_area'])
# The array is now 1-d, and we need to resample it to the correct grid (!) and using the bucket_sum resampler (!) as belowscn_1d_r=scn_1d.resample('mtg_fci_fdss_2km', resampler='bucket_sum', empty_bucket_value=np.nan)
# now, thisafa_resampledfrom1d=scn_1d_r['accumulated_flash_area']
# is a 2-d array (practically) the same as afa_directfrom2d as computed in the previous example# note: possible remaining spurious differences are related to resampling close to the disk edge# WORKAROUND/HACK FOR ISSUE# Now, we can use the fact that we can reconstruct the 2-d array from the scn_1d Scene to access single time steps# to do that, we need to load the accumulation_offsets variable, that contains the indices for each time step in the 1-d arrayscn_1d.load(['accumulation_offsets'])
# let's extract the third time step as an exampletimestep=3start_index=int(scn_1d['accumulation_offsets'][timestep-1])
end_index=int(scn_1d['accumulation_offsets'][timestep])
# let's create a new dataset, but in principle we could also overwrite accumulated_flash_area directly if we wanted to, and composites would then workds_name_new='accumulated_flash_area_'+str(timestep)
# and here it comes: we slice the 1-d dataset using the extracted start and end indicesscn_1d[ds_name_new] =scn_1d['accumulated_flash_area'][start_index:end_index]
# we need to do the same for the coordinatesscn_1d[ds_name_new].attrs['area'] =scn_1d['accumulated_flash_area'].attrs['area'].copy()
scn_1d[ds_name_new].attrs['area'].lats=scn_1d[ds_name_new].attrs['area'].lats[start_index:end_index]
scn_1d[ds_name_new].attrs['area'].lons=scn_1d[ds_name_new].attrs['area'].lons[start_index:end_index]
# then let's resample this to the correct grid (!) and using the bucket_sum resampler (!) as belowscn_1d_r=scn_1d.resample('mtg_fci_fdss_2km', resampler='bucket_sum', empty_bucket_value=np.nan)
# and here's our extracted 2-d array for one timestepafa_resampled1d_fortimestep=scn_1d_r[ds_name_new]
note that you need an updated pyresample (>=1.29.0) for the bucket_sum to work as intended here (due to pytroll/pyresample#602)
The text was updated successfully, but these errors were encountered:
Feature Request
Is your feature request related to a problem? Please describe.
Satpy and the
li_l2_nc
reader currently does not offer a simple way to extract single repeat cycles/timesteps from LI L2 accumulated products (AF/AFA/AFR). These products contain 20 timesteps covering 10 minutes of sensing time.When loading the main variables in the
accumulated_flash_area
,flash_accumulation
,flash_radiance
with the default reader kwargwith_area_definition=True
, the reader will currently return a 2-d array with the summation of the entire sensing period of the file (particularly after #2867). This way, the archived products behave the same as the normal dissemination products.However, some users need to be able to access each timestep individually, or might want to extract a specific range of timesteps.
Describe the solution you'd like
Ideally, an easy interface to select single timesteps returned as a 2-d array, and ranges of timesteps/all timesteps organised in a 3-d xarray with a time coordinate. It could look like
Or similar. Whether the above is implemented as a load kwarg to be handled by the FileYamlReader, or via the DataID/DataQuery mechanism, it's to be defined.
This has been discussed a bit on the slack on 08/08/2024: https://pytroll.slack.com/archives/C0LNH7LMB/p1723118806005139
Describe any changes to existing user workflow
No, this is a new feature.
Additional context
There is currently a workaround/hack to access single timesteps and get them into the gridded format.
First, this is the nominal, recommended code to read the accumulated data as 2-d arrays. It will give you the summed-up 2-d array for the entire timespan of the file.
Now the workaround if you need the single datasets, while we wait for a better way to be used with the mode above:
note that you need an updated pyresample (>=1.29.0) for the
bucket_sum
to work as intended here (due to pytroll/pyresample#602)The text was updated successfully, but these errors were encountered: