High memory usage generating composites from ABI/AHI #1902
In my experience with GK-2A data, setting … helps. Besides, I remember that a long time ago satpy didn't have this issue: you could get a pretty quick result with acceptable memory usage.

@Plantain Can you try passing …?

If you look at the attached script, that is already set.
@Plantain That is a very good data point to have. This shows very clearly that, for some reason, dask (or something in this processing) is holding on to the memory before writing. I think I'm seeing the same thing in my own testing, but I am still trying to narrow down what is allocating what and to understand everything that the dask diagnostics are telling me. What I'm currently working on and what I have figured out are: …
My Script:

```python
import os

os.environ["OMP_NUM_THREADS"] = "1"
os.environ["PYTROLL_CHUNK_SIZE"] = "2200"

from datetime import datetime
from glob import glob

from dask.diagnostics import CacheProfiler, ResourceProfiler, Profiler, visualize
import numpy as np
import dask
import dask.array as da
from satpy import Scene


class PassThroughStore:
    def __init__(self, shape, dtype):
        self._store = np.empty(shape, dtype=dtype)

    def __setitem__(self, key, value):
        self._store[key] = value


def run():
    # name = "true_color"
    # name = "true_color_nocorr"
    # name = "overview"
    name = 0.65
    scn = Scene(reader="ahi_hsd", filenames=glob("/data/satellite/ahi/hsd/2330/*FLDK*.DAT"), reader_kwargs={"mask_space": False})
    # scn.load([name], calibration="counts", pad_data=False)
    scn.load([name], calibration="counts", pad_data=True)
    new_scn = scn
    # new_scn = scn.resample(resampler="native")
    print(new_scn[name].shape, new_scn[name].data.numblocks)
    final_store = PassThroughStore(new_scn[name].shape, new_scn[name].dtype)
    # new_scn[name].compute()
    # new_scn[name].data[:4000, :4000].compute()
    # block_view = new_scn[name].data.blocks
    # block_view[0, 0].compute()
    # new_scn[name].data.visualize("profile_ahi_visualize_noresample.svg")
    da.store(new_scn[name].data, final_store, lock=False)
    # da.store(new_scn[name].data[:18000, :18000], PassThroughStore(), lock=False)
    print("Done storing")


def run_profiled():
    with dask.config.set(num_workers=4), CacheProfiler() as cprof, ResourceProfiler(0.4) as rprof, Profiler() as prof:
        run()
    filename = f"profile_ahi_{datetime.now():%Y%m%d_%H%M%S}.html"
    visualize([prof, rprof, cprof], filename=filename, show=False)
    cwd = os.getcwd()
    print("file://" + os.path.join(cwd, filename))


run_profiled()
# run()
```
Just a small addition: the IR segments are 550 lines; only band 3 has 2200-line chunks. So maybe setting a chunk size of 550 would be more optimal, as that can be shared by all bands?
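A minimal sketch of that suggestion, using the same environment-variable approach as the script above (the variable has to be set before satpy is imported; the data path is illustrative):

```python
# Hedged sketch: pick a chunk size matching the 550-line IR segments so one
# dask chunk lines up with one file segment for every band (2200 for band 3
# is an even multiple of 550). Must be set before importing satpy.
import os
os.environ["PYTROLL_CHUNK_SIZE"] = "550"

from glob import glob
from satpy import Scene

scn = Scene(reader="ahi_hsd", filenames=glob("/data/satellite/ahi/hsd/2330/*FLDK*.DAT"))
scn.load([0.65])
print(scn[0.65].data.chunksize)  # expect 550-line chunks
```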
I've spent a lot of today updating my test script so I can be a little more flexible with what I load and can save various profile HTML files or performance reports. I've decided to switch to the full … composite. The first 80% of that task stream (top plot) is the angle generation. After that it starts generating/loading the band data from the input files. It seems that once it does that, it can finally start removing some of the angle information, as it has actually been applied to the data and is no longer needed.

What's really weird, though, is that I then tried running the same thing but forced the input data chunks so there should have been one chunk per file (each file is a segment). With that, the profiles now look like this: it took much less time and used much less memory, and the memory doesn't continuously increase; it goes up and down. My best guesses for why this is are: …
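For the performance reports mentioned above, a minimal sketch using `dask.distributed` (the threaded-scheduler profilers in the script above are the alternative; the workload here is a stand-in, not the actual test):

```python
# Hedged sketch: save an HTML performance report with dask.distributed.
# Worker counts and the workload are illustrative; in practice the body of
# the `with` block would be the run() function from the test script above.
import dask.array as da
from dask.distributed import Client, performance_report

if __name__ == "__main__":
    client = Client(n_workers=4, threads_per_worker=1)
    with performance_report(filename="ahi_performance_report.html"):
        # stand-in workload roughly the size of an AHI band-3 full disk
        da.random.random((22000, 22000), chunks=2200).sum().compute()
```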
Still not completely sure, but I do want to point out that the situation in the first screenshot is the same thing that happens with the ABI reader, just with less memory used.
So I think my local changes have improved this, and we have some other small things we could do. As mentioned above, I started noticing that dask was scheduling all the angle-related operations first and wasn't even getting to the other calculations (loading the data from the input files) until all (or almost all) of them were finished. So I played around a bit and tried hacking the modifiers so the rayleigh correction would just make an adjustment of …

I then refactored the angle generation after I noticed that the cos(SZA) generated for the sunz correction wasn't using the same angle generation that the rayleigh correction uses. This didn't really change much (and it shouldn't) in my current case of having only the sunz correction and no rayleigh. So then I updated the rayleigh correction to not call pyspectral but still use the angles …

The increased memory usage overall makes sense to me: you're including 3 additional angle arrays (sat azimuth, sat zenith, solar azimuth), and the rayleigh correction uses all the angles + the red band + the data being corrected in the same function. That's a lot.

TODO: …
Edit: I'll try to make a pull request later tonight.
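A minimal sketch of the angle-sharing idea described above (the cache and function names are illustrative, not satpy's actual API): if both the sunz and rayleigh corrections fetch cos(SZA) through the same function, they end up pointing at the same dask tasks instead of building two identical graphs.

```python
# Hedged sketch, assuming a simple per-area cache; not satpy's real cache.
import dask.array as da

_cos_sza_cache = {}  # hypothetical cache keyed by area id


def get_cos_sza(area_id, compute_sza):
    """Return cos(solar zenith angle) as a dask array, built at most once per area.

    compute_sza is a callable returning solar zenith angles in degrees
    as a dask array.
    """
    if area_id not in _cos_sza_cache:
        _cos_sza_cache[area_id] = da.cos(da.deg2rad(compute_sza()))
    return _cos_sza_cache[area_id]

# Both modifiers would then call, e.g.:
# cos_sza = get_cos_sza(area.area_id, lambda: my_sza_dask_array)
```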
As GitHub shows, I made #1909 and #1910, which reduce the memory usage a bit. I've now made the biggest improvement by changing how lons/lats are generated in pyresample. It completes in about the same amount of time, but I went from a ~30-35GB memory peak to ~5.5GB.

The main idea with the fix (I'll be making a pull request this afternoon) is that pyresample generates lons/lats for an AreaDefinition by first generating the x/y coordinate vectors (so 1D) and then uses …

I'm going to eat some lunch, do some meetings, and then make a pull request on pyresample for this. WARNING: I have not actually tried generating geotiffs with the changes I've made and have no idea if they are even still correct... but none of the tests fail.
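A rough sketch of that per-chunk idea (illustrative projection parameters and grid sizes; not pyresample's actual implementation): only one chunk's 2D coordinate grid is ever materialized at a time, instead of two full-size 2D arrays up front.

```python
# Hedged sketch: build 2D lon/lat arrays lazily from 1D x/y vectors, so each
# task only ever materializes one chunk's worth of 2D coordinates.
import numpy as np
import dask.array as da
from pyproj import Proj

# Illustrative AHI-like geostationary projection and full-disk extents.
proj = Proj(proj="geos", h=35785831.0, lon_0=140.7, sweep="y")
nx = ny = 5500
x_1d = da.from_array(np.linspace(-5.5e6, 5.5e6, nx), chunks=1100)
y_1d = da.from_array(np.linspace(5.5e6, -5.5e6, ny), chunks=1100)

# Lazy, chunked 2D grids: each (1100, 1100) block exists only while in use.
xx, yy = da.meshgrid(x_1d, y_1d)

# Unproject chunk-by-chunk (done twice here for simplicity of the sketch).
lons = da.map_blocks(lambda xb, yb: proj(xb, yb, inverse=True)[0], xx, yy, dtype=np.float64)
lats = da.map_blocks(lambda xb, yb: proj(xb, yb, inverse=True)[1], xx, yy, dtype=np.float64)
```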
Ooh, that's remarkable, let's hope it works all the way!
That last improvement is really nice, and will provide a boost to all area-based data!
@Plantain Do you have a plot of what it looked like with ABI before my changes?
Pyresample 1.22.2 was released on Friday and includes the major improvements shown here. We plan to release Satpy this week, which will include the other, smaller performance improvements. Closing this, as those improvements are now in satpy.
Using the AHI/ABI readers with Himawari/GOES data to produce composites uses in excess of 24GB of RAM, even with a single worker/thread. I suspect this is more than necessary, and probably more than when the readers were originally written.
To Reproduce
Run https://gist.github.com/Plantain/18afecfc8f6c049aa8fbc7f92e7d8284 , with decompressed Himawari8 full-disk imagery
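The gist isn't reproduced here; a minimal sketch of that kind of reproduction (assumed paths and composite choice; the actual gist may differ):

```python
# Hedged sketch of the reproduction, assuming the gist does roughly this:
# load a full-disk AHI true_color composite and write it out as a geotiff.
from glob import glob
from satpy import Scene

scn = Scene(reader="ahi_hsd", filenames=glob("/data/himawari8/*FLDK*.DAT"))  # path is illustrative
scn.load(["true_color"])
new_scn = scn.resample(resampler="native")  # true_color needs all bands at one resolution
new_scn.save_datasets(writer="geotiff", filename="true_color_{start_time:%Y%m%d_%H%M}.tif")
```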
Expected behavior
I don't know what an appropriate amount of memory usage is, but I suspect it is considerably less than 24GB for a single worker. I understand Dask is meant to enable splitting work into smaller chunks, and something is not working correctly here.
Actual results
Observing memory usage with `top` shows it consistently using ~24GB of RAM.
Environment Info: …
I also tried with dask 2021.03.