
Point Sources Issues in UFS #13

Closed
ytangnoaa opened this issue Sep 19, 2022 · 35 comments · Fixed by #53
Labels
Ops Imp Operational Implementation

Comments

@ytangnoaa

ytangnoaa commented Sep 19, 2022

Currently the FV3-based online CMAQ puts point-source emissions into the lowest layer, the same as area sources. To explicitly treat the plume rise of each point source, one critical issue is I/O. The point-source emissions are list-structured data: each point source, such as a power-plant stack, has its own lat/lon, stack height, exit temperature/velocity, air-mass flow rate, etc. A single model grid cell, depending on its size and location, could contain up to tens of thousands of point-source stacks, and each of them has its own plume-rise profile, which depends on the meteorological conditions and the stack parameters.
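For illustration, one stack record from such a list might carry fields like the following. The names here are hypothetical, chosen only to show the shape of the data; the actual NEI/netCDF variable names differ.

```python
from dataclasses import dataclass, field

@dataclass
class PointSource:
    """One stack record from a point-source inventory.

    Field names are illustrative, not the actual NEI variable names.
    """
    lat: float            # stack latitude [deg]
    lon: float            # stack longitude [deg]
    stack_height: float   # physical stack height [m]
    exit_temp: float      # stack-gas exit temperature [K]
    exit_vel: float       # stack-gas exit velocity [m/s]
    flow_rate: float      # stack-gas volumetric flow rate [m^3/s]
    emis: dict = field(default_factory=dict)  # species name -> emission rate
```

A file of ~1.2 million such records (the NEI2016v1 count mentioned below), each with 63 species, is what makes the I/O design decisive.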

Possible solutions

  1. Keep the ESMF structures. Assign each point source to the nearest grid cell in a preprocessor. Besides the existing I and J dimensions, add a K dimension to represent the number of point sources in each grid cell.
    Issue: the K dimension has to be big enough to hold all the point sources of any grid cell, even though some cells, e.g. over the ocean, may have zero point sources. The intermediate input file could be very big, and its I/O would become a burden.

  2. Skip ESMF and read the PT list file (plain netCDF or CSV) directly. Then assign each point source (PT) to the nearest grid cell and compute the plume rise on each PE.
    Issue: the I/O needs to be restructured.

2.1 The master PET (Persistent Execution Thread of a Virtual Machine) reads the PT file and sends the corresponding subset to each subdomain, based on the PT X,Y indices and the subdomain halo.
Issue: needs a structural change. The load imbalance could be significant: urban areas have many PTs while ocean subdomains have none.

2.2 Split the PT file by PET subdomain, and let each PET or DE (Decomposition Element) read its own piece of the point sources. In the current AQM component, each PET contains one DE, which makes the DE trackable.
Issue: the domain-decomposition information must be known before the model run.

2.3 The master PET or DE reads the PT file, calculates the plume rise, and sends out the risen emissions on the 3D grid.
Issue: the master PE does scientific calculations for millions of PTs while the other PEs idle.

2.4 Each PET or DE reads the PT file individually and skips the sources that do not belong to its subdomain.
Issue: hundreds or thousands of PEs reading the same file simultaneously could cause an I/O bottleneck, and each PE reads a lot of unused data.

We tried 2.4 first, and it worked, but the runtime was 44% longer, since each PET or DE read the whole domain's point sources (1.2 million point sources, 63 emitted species in NEI2016v1). We can also pre-sort the PT emissions and explicitly treat only certain sources, for example:
A. Explicitly treat the 613,870 point sources that could enter the second-lowest layer (plume rise > 45 m), and put the rest into the lowest layer.
Issue: takes 28% longer.
B. Three-tier emission input: 1. area/mobile/biogenic sources and low-level point sources (plume rise < 45 m); 2. point sources that could enter the second-lowest layer (< 97 m), split between the lowest and second-lowest layers; 3. other, higher point sources that need explicit treatment.
Issues: two sets of area sources; mandatory splitting of the tier-2 emissions.

C. Use the existing emission input structure, and explicitly treat only the point sources with the strongest emission intensities (annual/monthly).
Issue: need a way to sort/select point sources by sector, species, and season.

All of this sorting could introduce biases and complexity, and some of it could also take significant time. The best solution in terms of both science and performance is to treat all point sources explicitly (method 2.2).
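For context on the 45 m threshold used in the sorting options above, the textbook Briggs (1969) final-rise formulas relate plume rise to stack parameters and wind. This is only a sketch under neutral conditions, not the layer-resolved, stability-dependent scheme the AQM component actually uses:

```python
G = 9.80665  # gravitational acceleration [m s^-2]

def briggs_final_rise(diam, exit_vel, exit_temp, amb_temp, wind):
    """Briggs (1969) final rise of a buoyant plume, neutral conditions.

    diam      -- stack inner diameter [m]
    exit_vel  -- stack-gas exit velocity [m/s]
    exit_temp -- stack-gas exit temperature [K]
    amb_temp  -- ambient temperature at stack top [K]
    wind      -- wind speed at stack top [m/s]
    """
    # Buoyancy flux parameter F [m^4 s^-3]
    f = G * exit_vel * diam**2 / 4.0 * (exit_temp - amb_temp) / exit_temp
    if f < 55.0:
        return 21.425 * f**0.75 / wind
    return 38.71 * f**0.6 / wind
```

A hot, wide power-plant stack easily clears 45 m even in moderate wind, while a small, cool stack stays in the lowest layer; that is the physical basis for splitting the inventory at a plume-rise threshold.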

@ytangnoaa
Author

ytangnoaa commented Sep 20, 2022

Here are the test kits for method 2.2. They include two parts. The first is the preprocessors for merging and decomposing point sources based on online CMAQ's "config.sh" and "var_defns.sh":

https://github.com/noaa-oar-arl/AQM-utils/tree/feature/decomp-pt

where https://github.com/noaa-oar-arl/AQM-utils/blob/feature/decomp-pt/sorc/decomp-pt-mpi/tests/test-hera.ksh is the testing script on Hera. Users need to change $PDY etc. for their own cases.

The second part is the code change in the AQM component for reading the decomposed PT and calculating the plume rise:

https://github.com/noaa-oar-arl/AQM/tree/feature/pt-source

which reads $PDY/PT/pt-????.nc under the experiment run directory. These hardwired filenames may be changed in the future. So far, the PT emission factors use the fire emissions' factors from the "aqm.rc" file, which will also be changed in the future. To rebuild the executable "bin/ufs_model", under the code build directory:

rm -rf build/src/ufs-weather-model/src/ufs-weather-model-build/AQM
devbuild.sh -p=hera -a=ATMAQ

To test it, one needs to run test-hera.ksh first to generate the PT emission files. After all the other preprocessors before the "run_fcst" step have finished, under the run directory, do

rocotoboot -w FV3LAM_wflow.xml -d FV3LAM_wflow.db -c ${PDY}00 -t run_fcst

@ytangnoaa
Author

ytangnoaa commented Sep 25, 2022 via email

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Sep 25, 2022 via email

@ytangnoaa
Author

ytangnoaa commented Sep 30, 2022 via email

@ytangnoaa
Author

The "feature/pt-source" branch is merged into the "develop" branch of

https://github.com/noaa-oar-arl/AQM

and the corresponding area source without PT emissions can be processed with
https://github.com/noaa-oar-arl/NEXUS/tree/feature/no_pt

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Oct 6, 2022 via email

@ytangnoaa
Author

ytangnoaa commented Oct 6, 2022 via email

@bbakernoaa
Contributor

@JianpingHuang-NOAA @chan-hoo @rmontuoro Has there been any movement on this? I think there are workflow issues that need to be addressed to make this possible.

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Oct 11, 2022 via email

@ytangnoaa
Author

Jianping,
I updated the code/script for WCOSS2 usage with Chan-Hoo's latest workflow. You can find the changes compared to the original online CMAQ:
cd /lfs/h2/emc/physics/noscrub/Youhua.Tang/UFS/ufs-srweather-app
find . -name '*.orig'

The AQM-utils scripts were also updated for WCOSS2 usage:
https://github.com/noaa-oar-arl/AQM-utils/tree/feature/decomp-pt

You can find my test run folders for the big domain in
/lfs/h2/emc/physics/noscrub/Youhua.Tang/expt_dirs/aqm_cold_aqmna13_1day
/lfs/h2/emc/physics/noscrub/Youhua.Tang/nco_dirs

Since the point-source files are very big, I have put them under ptmp for now:
/lfs/h2/emc/ptmp/Youhua.Tang/nei2016v1-pt

We need to find a permanent location for these inputs.

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Oct 14, 2022 via email

@ytangnoaa
Author

ytangnoaa commented Oct 14, 2022 via email

@ytangnoaa
Author

Jianping

Your issue was caused by
/lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/run_fcst.17178779.cbqs01/INPUT/NEXUS_Expt.nc
having no CO emissions. This is strange: https://github.com/noaa-oar-arl/NEXUS/tree/develop turned off the point sources, but it should still include the species CO.

Barry and Patrick, could you help take a look?

Another issue: the point-source directory was not linked to the run directory; this step was missed:

ln -s /lfs/h2/emc/ptmp/jianping.huang/emc.para/com/aqm/v7.0/aqm.v7.0.c14.20221028/00/PT /lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/run_fcst.17178779.cbqs01

Chan-hoo, could you add this step based on /lfs/h2/emc/physics/noscrub/Youhua.Tang/UFS/ufs-srweather-app/scripts/exregional_run_fcst.sh?

cd_vrfy ${DATA}
ln -fs ${INPUT_DATA}/PT PT

@chan-hoo

@ytangnoaa, the workflow has been updated. The link for the 'nco' mode was added to the 'run_fcst' script.

@bbakernoaa
Contributor

There should definitely be CO in the output species. You can see it here, for example: https://github.com/noaa-oar-arl/NEXUS/blob/develop/config/cmaq/NEXUS_Config.rc#L919

And here as a diagnostic entry: https://github.com/noaa-oar-arl/NEXUS/blob/develop/config/cmaq/HEMCO_sa_Diagn.rc#L5

@chan-hoo

chan-hoo commented Nov 1, 2022

@ytangnoaa @JianpingHuang-NOAA , the hash of NEXUS has been updated in Online-CMAQ. The issue @ytangnoaa mentioned above looks resolved.

@JianpingHuang-NOAA
Collaborator

I am reopening this issue since we are seeing over-prediction of PM2.5. Please see Slide 4 from the link.

@bbakernoaa
Contributor

@JianpingHuang-NOAA

Two things:

  1. When you share links, can you open the files up so that anyone at NOAA can see them? I don't have access to the file you linked to.

  2. Can you please be more descriptive when you post here? It would be very helpful for others to get more context on the issue so that we can help. From what you posted, there is no context as to why this is being reopened.

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Nov 14, 2022 via email

@HaixiaLiu-NOAA
Collaborator

@ytangnoaa @bbakernoaa Would you please provide an update on this issue? Based on the discussion (between Raffaele and you) at last Friday's EMC internal FIRE meeting, the current code setup still does not work correctly. Would you please share your plan for solving this issue and an estimated date for the fix? This is the last major scientific issue to be solved before the code freeze on 12/1. Thank you very much!

@bbakernoaa
Contributor

@HaixiaLiu-NOAA This was just brought to us today and the issue was reopened. We have not had a moment to figure out exactly what is happening or why. Little analysis has occurred so far.

@ytangnoaa
Author


The major issue behind the high PM2.5 is related to the overgrowth of secondary organic aerosols, like APCSOJ, which are not directly emitted. CMAQ 5.2.1's chemistry has this known issue. We are investigating a solution.

@HaixiaLiu-NOAA
Collaborator

@ytangnoaa Thank you for the update. What is your plan to solve this and could you please provide me an estimated date to solve this issue? We are really close to the code freeze date now. I can use the date you provide to update the AQM issue spreadsheet. Thank you.

@ytangnoaa
Author


In principle, we can follow the previous solution for the high APCSOJ related to other emissions to solve the same issue for point sources. If it can be done quickly, we may still make the deadline.

@ytangnoaa
Author

Fixed the issues in the point-source code; one file needs to be updated:
https://github.com/noaa-oar-arl/AQM/blob/feature/pt-source/src/model/src/PT3D_DEFN.F

@HaixiaLiu-NOAA
Collaborator

From Youhua:

"Jianping

Thank you for your and Ho-Chun's efforts. I found the bug. You do not need to change "aqm.rc", but you do need to update one file:

https://github.com/noaa-oar-arl/AQM/blob/feature/pt-source/src/model/src/PT3D_DEFN.F

This issue was due to the complex chemical species lists (more than 6) and the corresponding mapping in that file. Some species lists, like GC_EMIS, are not unique, meaning that they contain repeated species, which caused double-counting for fire emissions and anthropogenic point sources. I tested a 24-hour run for the NA 13-km domain, and the results are reasonable.

Thanks

Youhua"

@HaixiaLiu-NOAA
Collaborator

@ytangnoaa If you have any results regarding the impact of this point-source bug fix on the AQM simulation, please post them here. Thank you.

@ytangnoaa
Author


Here are the plots before and after the fix.

[Figure: spatial-apcsoj-before-fix-2019071612]
[Figure: spatial-apcsoj-after-fix-2019071612]

@JianpingHuang-NOAA
Collaborator

Youhua @ytangnoaa

I am testing the latest online-cmaq workflow (3d7069e) and ufs-weather-model (d31ee42) for the 20221121 06z cycle, and I met failures in the first part of the point-source job.

Please see the run log files
/lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20221121
point_source_2022112106.id_1669151465.log.0
point_source_2022112106.id_1669151465.log

Thanks,

Jianping

@JianpingHuang-NOAA
Collaborator

Youhua, is this your updated code ?

https://github.com/noaa-oar-arl/AQM/tree/feature/pt-source

@ytangnoaa
Author

ytangnoaa commented Nov 22, 2022 via email

@ytangnoaa
Author

ytangnoaa commented Nov 23, 2022 via email

@JianpingHuang-NOAA
Collaborator

JianpingHuang-NOAA commented Nov 23, 2022 via email

@ytangnoaa
Author

ytangnoaa commented Nov 23, 2022

Please try the workflow version of the python script that Chan-Hoo just created:
https://github.com/NOAA-EMC/AQM-utils/blob/develop/python_utils/stack-pt-merge.py

The point sources alone should only slightly increase the running time. I tested the 24-hr big-domain run: without PT, 3295 seconds; with PT, 3308 seconds.

If you included other changes in your C18, you may want to check their impacts.

I checked your log file of C15
/lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/expt_dirs/ufs_na_rt_v15/log/run_fcst_2022112800.log
RESOURCE STATISTICS**************
The total amount of wall time = 1176.943923
The total amount of time in user mode = 823.191897
The total amount of time in sys mode = 331.450321
The maximum resident set size (KB) = 1352664

vs C19
/lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/expt_dirs/ufs_na_rt_v19/2022112800/log/run_fcst_2022112800.log
RESOURCE STATISTICS**************
The total amount of wall time = 1025.594372
The total amount of time in user mode = 828.622798
The total amount of time in sys mode = 362.926589

C19 is faster (1025 vs 1176).

For the 72-hr run, C15 log /lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20221127/run_fcst_2022112706.id_1669724069.log

RESOURCE STATISTICS**************
The total amount of wall time = 12909.519732
The total amount of time in user mode = 11215.873828
The total amount of time in sys mode = 4236.213294

vs C18 log
/lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20221127/run_fcst_2022112706.id_1669795963.log
RESOURCE STATISTICS**************
The total amount of wall time = 13513.010937
The total amount of time in user mode = 15072.950922
The total amount of time in sys mode = 3038.977186

C18 used only about 4% more wall time (13513 vs 12909 seconds).

(Quoted email from Jianping Huang, Nov 23, 2022:)
Hi Youhua, the real-time test seems to pass, but we met another failure with the point-source code run for all the cases on August 1-7, 2019. Please see the example run log files for 20190801 at 12z: /lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20190801, point_source_2019080112.id_1669177702.log.0 and point_source_2019080112.id_1669177702.log. Meanwhile, for the NRT run C18, the forecast model runtime is much longer than the case without point sources: about a 36% increase, 15729 seconds (4.37 hours) vs. 11556 seconds (3.21 hours), for the 72-hr forecast over the large domain. Please see both log files, log.launch_FV3LAM_wflow, at /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/expt_dirs/ufs_na_rt_v15 and /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/expt_dirs/ufs_na_rt_v19. Did you see such a large increase in runtime? Thanks, Jianping

(Earlier email from Youhua Tang, Nov 22, 2022:)
Jianping, it is a holiday bug in the python script or the NEI inventory. Thanksgiving Day of 2016 (for NEI2016) should be 2016-11-24; for unknown reasons, the EPA used 2016-11-25. It also affects the days before and after Thanksgiving. The python scripts stack-pt-merge.py and stack-pt-merge-wcoss2.py have been updated to fit the NEI dataset: https://github.com/noaa-oar-arl/AQM-utils/tree/feature/decomp-pt/python_utils. You can also find them in /lfs/h2/emc/physics/noscrub/Youhua.Tang/UFS/AQM-utils/python_utils. Thanks, Youhua
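The holiday bug boils down to which date counts as Thanksgiving: the fourth Thursday of November 2016 falls on Nov 24, while the NEI2016 inventory used Nov 25. A quick standard-library check:

```python
import datetime

def thanksgiving(year):
    """Fourth Thursday of November (US Thanksgiving)."""
    nov1 = datetime.date(year, 11, 1)
    # weekday(): Monday=0 ... Thursday=3
    first_thursday = nov1 + datetime.timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + datetime.timedelta(weeks=3)

print(thanksgiving(2016))  # 2016-11-24, one day earlier than the date in the inventory
```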

@HaixiaLiu-NOAA
Collaborator

HaixiaLiu-NOAA commented Jan 9, 2023 via email
