Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix oom() compile time linker error #2238

Closed
9 of 22 tasks
JohnHalleyGotway opened this issue Aug 23, 2022 · 4 comments · Fixed by #2255
Closed
9 of 22 tasks

Fix oom() compile time linker error #2238

JohnHalleyGotway opened this issue Aug 23, 2022 · 4 comments · Fixed by #2255
Assignees
Labels
component: build process Build process issue priority: blocker Blocker requestor: UK Met Office United Kingdom Met Office required: FOR DEVELOPMENT RELEASE Required to be completed in the development release for the assigned project type: bug Fix something that is not working
Milestone

Comments

@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Aug 23, 2022

Describe the Problem

@robdarvell reports via Slack that MET-11.0.0-beta2 fails to compile due to a linker error.

../../../../src/basic/vx_log/libvx_log.a(main.o): In function `set_handlers()':
main.cc:(.text+0x243): undefined reference to `oom()'
collect2: ld returned 1 exit status
make[4]: *** [ensemble_stat] Error 1
make[4]: Leaving directory

This was added in beta2 but introduced a new dependency. The vx_log library now depends on the vx_util library. But I'm sure that vx_util already depends on vx_log. To avoid a circular dependency, recommend main.cc from vx_log into the vx_util library. Listed below is a

The first step here is trying to replicate this linker error. Check for ld options to enforce strict dependency checking. It would be nice for this to be flagged during development rather than leaving it to users to find.

Once we can replicate the error, confirm that moving main.cc into vx_util actually fixes it. Of course, this may impact the linking for other applications.

Here's a more detailed description of the problem:
The oom() (i.e. out-of-memory) handler is defined in vx_util/memory.cc. But it is called by vx_log/main.cc. That means that vx_log has a dependency on vx_util. And that means that we need "-lvx_log" to be followed by "-lvx_util" somewhere in the link in src/tools/core/ensemble_stat/Makefile.am. But this is NOT the case. "-lvx_log" is basically last and "-lvx_util" does not appear after it.

Only need to fix this for MET-11.0.0-beta3. We do not provide bugfixes for beta releases.

Expected Behavior

MET-11.0.0-beta3 should compile/link even when strict library dependencies are enforced.

Environment

Describe your runtime environment:
1. Machine: (e.g. HPC name, Linux Workstation, Mac Laptop)
2. OS: (e.g. RedHat Linux, MacOS)
3. Software version number(s)

To Reproduce

Describe the steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
Post relevant sample data following these instructions:
https://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

2799991

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required: No scientist needed.

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Organization level Project for support of the current coordinated release
  • Select Repository level Project for development toward the next official release or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next bugfix version

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Organization level software support Project for the current coordinated release
    Select: Milestone as the next bugfix version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added type: bug Fix something that is not working component: build process Build process issue priority: blocker Blocker requestor: UK Met Office United Kingdom Met Office alert: NEED ACCOUNT KEY Need to assign an account key to this issue required: FOR DEVELOPMENT RELEASE Required to be completed in the development release for the assigned project labels Aug 23, 2022
@JohnHalleyGotway JohnHalleyGotway added this to the MET 11.0.0 milestone Aug 23, 2022
@JohnHalleyGotway
Copy link
Collaborator Author

@robdarvell this is a temporary fix that should technically work, but is not the real longterm solution:
For Makefile.am:

for file in `find ./ -name "Makefile.am"`; do
  cat $file | sed 's/-lvx_log/-lvx_log -lvx_util/g' > new
  mv new $file
done

For Makefile.in:

for file in `find ./ -name "Makefile.in"`; do
  cat $file | sed 's/-lvx_log/-lvx_log -lvx_util/g' > new
  mv new $file
done

@hsoh-u
Copy link
Collaborator

hsoh-u commented Aug 24, 2022

The solution will be 1) removing the circular dependencies and 2) adjusting link order.

  • vx_log & vx_util
    • main.cc requires vx_util
    • files at vx_util uses logger at vx_log
    • Solution: move main.cc and main.h to vx_util and keep vx_log without dependencies
  • vx_math & vx_util
    • so3.cc at vx_math requires vx_util (ascii_table.h & fix_float.h)
    • data_plane.cc & polyline.cc at vx_util requires vx_math
    • Solution: separate vx_math & vx_maths and vx_util & vx_utils
      • vx_math contains source files which depends vx_log only
      • vx_maths contains source files which depends more than vx_log (only so3.cc/so3.h)
      • vx_util contains source files which do not requires vx_math
        • moving nint.h to vx_util from vx_math ???
      • vx_utils contains other source files from vx_util with dependency of vx_math
  • vx_config & vx_util
    • config_file.cc at vx_config requires poly_line.h from vx_util
    • data_plane.cc at vx_util requires vx_config
    • Solution: working on this
      • should solve without circular dependency between vx_utils and vx_config

Goal: Recommend adding README.md file to top-level library file to define these dependencies and update the Contributor's Guide to point to that README.md.

  • vx_log: no dependency
  • vx_cal: vx_log and vx_util only
  • vx_math: vx_log and vx_cal
  • vx_maths: vx_log & vx_util (or vx_math_util)
  • vx_config: vx_log, vx_math, vx_util
  • vx_util: vx_log only
  • vx_util_math
  • vx_util_config

@TaraJensen TaraJensen removed the alert: NEED ACCOUNT KEY Need to assign an account key to this issue label Aug 24, 2022
@JohnHalleyGotway
Copy link
Collaborator Author

@hsoh-u have you figured out a way to run the linker to actually fail when these circular dependencies exist?
I tried configuring with "CXXFLAGS" to set to defined some "-Xlinker" options but could never get the compilation of beta2 to fail.

If we're able to figure that out, I'd recommend we add those options to our GHA compilation of MET so that we force the compilation to fail when we introduce unresolved dependencies.

@hsoh-u
Copy link
Collaborator

hsoh-u commented Aug 24, 2022

I tried to move where "usage" and "process_command_line" were called (from met_main at various MET applications to main at main.cc). The link errors happened at seneca and was not simple to resolve it.

hsoh-u pushed a commit that referenced this issue Sep 8, 2022
hsoh-u pushed a commit that referenced this issue Sep 8, 2022
hsoh-u pushed a commit that referenced this issue Sep 8, 2022
hsoh-u pushed a commit that referenced this issue Sep 8, 2022
hsoh-u pushed a commit that referenced this issue Sep 8, 2022
@hsoh-u hsoh-u linked a pull request Sep 12, 2022 that will close this issue
15 tasks
@JohnHalleyGotway JohnHalleyGotway changed the title Fix oom() linker error Fix oom() compile time linker error Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: build process Build process issue priority: blocker Blocker requestor: UK Met Office United Kingdom Met Office required: FOR DEVELOPMENT RELEASE Required to be completed in the development release for the assigned project type: bug Fix something that is not working
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants