Skip to content
This repository has been archived by the owner on Sep 19, 2024. It is now read-only.

Python version of flt_envs #168

Closed
wants to merge 16 commits into from
Closed

Python version of flt_envs #168

wants to merge 16 commits into from

Conversation

jeanconn
Copy link
Contributor

@jeanconn jeanconn commented Jun 15, 2018

Replace the primary environment-setting script in the ska runtime environment (flt_envs) with a Python script.

The current flt_envs requires Perl to use the CXC::Envs::Flight package which uses sysarch for OS and architecture detection. The new Python script uses uname to get something reasonable for the arch directories. The new arch names look like their conda equivalents ('linux-64').

The will remove Perl dependence when installing the Python-only parts of the environment. It will also help to support newer machines that don't work with the current sysarch (CentOS 7).

@taldcroft

This comment has been minimized.

Tom Aldcroft (taldcroft@cfa.harvard.edu)

Based on mst_envs by Diab Jerius (djerius@cfa.harvard.edu)
#!/usr/bin/env python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the recursion, maybe the right answer is to call /usr/bin/python here. I think that is going to exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure which platforms we were supporting to know if we'll always want /usr/bin/python or just whatever is in the user's path already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current targets (HEAD linux, GRETA linux, MacOSX) all have /usr/bin/python (at least version 2.4). Windows is a different beast but does not have /usr/bin/env anyway. I'm pretty sure Mark has a custom env setup already. The user Python could be anything, so it means we could never accurately test this. With system Python we can actually try it out on supported OS versions and confirm it works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll see if /usr/bin/python fixes the recursion.

I was under the impression that "user Python could be anything" might be something we could address via docs (Use /usr/bin/python if you've got some crazy Python in your path) while still giving flexibility, but am fine with /usr/bin/python. unagi appears to have 2.4 /usr/bin/python as you note. I'd been testing to start on fido (2.6) which is where I noted I'd appear to need to number my modern format strings; looks like those aren't going to work for 2.4 anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops good point about formatting in old python.

@jeanconn

This comment has been minimized.

@taldcroft
Copy link
Member

I think that any environment with /usr/bin/env will have /usr/bin/python. But honestly not sure about Windows.

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally good.

parser.add_argument('-shell', type=str,
default='bash',
help='Shell name (csh|tcsh|bash|sh|ksh) default=bash')
parser.add_argument('-arch', type=str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had 1 test use case, in it could let this modern flt_envs use one of our previous ska environments without needing to make a bunch of sym links. I suppose there is also the in-the-weeds operational use case that it gives you an override if uname is weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ska3/flight and test we will probably want to make that link anyway, right? If some actual case comes up that requires this option then we can put it back in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'm confused about "bunch". I thought just one is needed.

paths = os.environ['PATH'].split(':')
ska_bin = envs['SKA_BIN']
while ska_bin in paths:
# I've forgotten why this is needed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep the behavior? In that case we should probably also remove any previous SKA_ARCH_OS bin from that path as well. Otherwise we can just prepend and hope for the best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general let's make this Python port be faithful to the original Perl except where the original is very obviously wrong or out of date. For instance we really don't need CUDA nor handling of SunOS arch. If this will be applied to Ska2 then everything should be (almost) exactly as it now is, warts and all.

We can then tweak for skare3 as desired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, though not quite seeing why you'd want to remove SKA/bin but accidentally keep an old SKA_ARCH_OS/bin in the path while you were sourcing a new environment (could this just predate SKA_ARCH_OS/bin?). I generally try not to overlap/overlay runtimes anyway so this shouldn't be an issue either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I understand what's happening. CXC::Envs::Flight.pm relies on a add_unique_path function which has been ignored here. You need to bring that into the Python port.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I figured you'd decided it was overkill in the pre-first-draft of this, but I suppose it can't hurt.

uname = args.arch
if uname is None:
uname_tuple = platform.uname()
uname = "{0}-{1}".format(uname_tuple[0], uname_tuple[4]).lower()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment describing the two tuple components with examples for linux / mac.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 0 and 1 appeared necessary on Python 2.6 . I wasn't sure how far back to go when I started this (and still am not sure).

@jeanconn
Copy link
Contributor Author

Regarding /usr/bin/python, I wasn't concerned that any supported platform wouldn't have /usr/bin/python, I had just already run across systems where another system Python (/usr/local/bin/python for example) seemed a better choice.

# I've forgotten why this is needed
paths.remove(ska_bin)
paths.insert(0, os.path.join(ska_arch_os, 'bin'))
paths.insert(0, ska_bin)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in general, do we want SKA_ARCH_OS/bin first or second in the path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as it is now, with $SKA/bin first and $SKA_ARCH_OS/bin second.

In skare3 I believe we can take $SKA/bin out of the path entirely.

@taldcroft
Copy link
Member

I had just already run across systems where another system Python (/usr/local/bin/python for example) seemed a better choice.

Better in what way?

@jeanconn
Copy link
Contributor Author

Python 2.7 on unagi instead of 2.4.

@taldcroft
Copy link
Member

Why is Python 2.7 better than 2.4 for this purpose? The old-style format strings are good enough.

@jeanconn
Copy link
Contributor Author

I just hadn't gotten to wanting to redo

print(shell_cmd.format(name, val))

with the % formats.

import platform

parser = optparse.OptionParser(description='Set flight environment vars')
parser.add_option('-s', '--shell', type=str,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OptionParser wasn't happy with "-shell", so I suppose I need to decide if we want to just do it manually (without an option/argument parser).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's annoying. I think the best here (again driven by putting this into Ska2) is stay with -shell and using manual argv parsing. It doesn't have to be very robust against bad input since that basically won't occur.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I was thinking that this just had to work with the other wrappers in template/bin, but there are other wrappers hanging around and we don't want to have to find those and change the syntax. Also looks like most things are supplying a "ska" argument that we don't need (but will make sure the argv parsing doesn't break on it).

cfg/env_vars.cfg Outdated
test -e ${prefix}/bin/flt_envs
install: |
mkdir -p ${prefix}/bin
rsync --archive --verbose --exclude .snv --exclude '*-' template/bin/flt_envs ${prefix}/bin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of work to copy a single file. Can you use cp -p template/bin/flt_envs ${prefix}/bin/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just following our previous convention, though I suppose it didn't make much sense for a single file and it looks like I've got a typo there anyway. And sure, cp -p seems equivalent. Or cp -pv so you get the logging when you are replacing --verbose.

install.py Outdated
@@ -73,6 +74,9 @@ def get_options():
action="store_true",
help="Do not do default env var setup prior to processing",
)
parser.add_option("--platform-arch",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used, remove.

install.py Outdated
@@ -349,6 +350,7 @@ def make_version_file(version_file):
if opt.no_env_vars:
opt.config.pop(0)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra white space.

(options, args) = parser.parse_args()

uname_tuple = platform.uname()
# platform.uname returns a tuple of strings (system,node,release,version,machine,processor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making this match conda designations, which are even simpler.

system, machine = uname_tuple[0].lower(), uname_tuple[4].lower()
system = {'darwin': 'osx'}.get(system, system)
machine = {'x86_64': '64', 'i686': '32'}.get(machine, machine)
uname = '%s-%s' % (system, machine)

In terms of robustness to unexpected uname outputs, this is just as robust. Basically anything that doesn't fit one of our current target platforms will just fall through.

I theory this would need to be replicated in install.py. Note that platform_os_generic is used in a number of defunct binary cfg files and perl_modules.cfg to build PGPLOT. So that would need to be updated for Ska2. (Even without my suggested change, the PGPLOT build would have failed). In fact I think we only need to build PGPLOT on linux-64, right? As far as I can see hw_cpu_generic is not used in any build cfg file.

if libpath in os.environ and os.environ[libpath] != '':
envs[libpath] = ":".join([envs[libpath],
os.environ[libpath]])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra whitespace.

paths = os.environ['PATH'].split(':')
ska_bin = envs['SKA_BIN']
while ska_bin in paths:
# I've forgotten why this is needed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I understand what's happening. CXC::Envs::Flight.pm relies on a add_unique_path function which has been ignored here. You need to bring that into the Python port.

# values in the environment stay
envs['PERL5LIB'] = ":".join([os.path.join(ska, 'lib', 'perl'),
os.path.join(ska, 'lib', 'perl', 'lib')])
envs['LD_LIBRARY_PATH'] = ":".join([os.path.join(ska_arch_os, 'lib'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is missing "$env{SYBASE}/$env{SYBASE_OCS}/lib" compared to CXC::Envs::Flight.

@taldcroft
Copy link
Member

Code-wise this seems good to me. What have you done for testing so far, and thoughts for more tests and deployment (assuming LR approval)?

I'm also wondering about first tossing this into HEAD Ska3 flight to soak for a week or two.

@taldcroft taldcroft changed the title WIP: First/second draft Python flt_envs Python version of flt_envs Jun 26, 2018
@jeanconn
Copy link
Contributor Author

So far for testing I've just installed a new dev ska using this branch and confirmed that sourcing ska_envs.csh works to get me into it. Have not run the testr tests yet etc in the new dev ska to make sure things are getting hit. Have not thought more on testing just yet.

Ska3 flight seems a reasonable place to soak. Could also try to do something to install this in parallel with current flt_envs, but that seems an overcomplicated mess.

Right now, git doesn't seem to be building (don't think that is due to this change) and I'm wondering if we can install git on Ska2 using conda instead of building (which seems broken in tcl land now).

@jeanconn
Copy link
Contributor Author

For deployment, I think we'd just have to do a "make basedirs" and add a link or links in arch. But I'll practice deployment with a dev ska.

@taldcroft
Copy link
Member

Git with conda seems a reasonable idea. In practice I think the only need for that now is Ska test on GRETA (i.e. so there is some git on chimchim). In a logical world we would get the FST to install the RPM.

@jeanconn
Copy link
Contributor Author

Gotcha, I was concerned both that the conda git looks to have a much newer version, and I didn't know if it was set to support the things that were previously working/compiled in the skare git (tcl or whatnot). Will test that separately.

@jeanconn
Copy link
Contributor Author

jeanconn commented Jun 26, 2018

As "make basedirs" copies stuff into arch, the install procedure needs to be to make the symlinks in arch and build first and then do a "make basedirs".

@jeanconn
Copy link
Contributor Author

jeanconn commented Jun 27, 2018

OK, so I was running into this problem when trying to test with starcheck

Traceback (most recent call last):
  File "/data/fido/pinst/bin/flt_envs", line 17, in <module>
    import os
ImportError: No module named os

I think I've narrowed it down to a problem with PYTHONHOME. We override PYTHONHOME to run Inline::Python, and we set it before we run perl.

https://github.com/sot/starcheck/blob/master/sandbox_starcheck#L4

But then the wrappers (sandbox_starcheck and SKA/bin/starcheck) use perl from the current path, so that picks up SKA/bin/perl which is calling the flt_envs wrapper with /usr/bin/python. I think we can't call /usr/bin/python after manually setting PYTHONHOME to an incompatible Python installation.

I think I can make this work with starcheck, though this is one of those other maybe-flt-envs-should-be-bash type issues in the back of my head.

Anyway, I think in the starcheck wrappers we could safely call $SKA_ARCH_OS/bin/perl and skip the flt_envs wrapper after setting PYTHONHOME (we're already in a defined SKA so there's no reason to call the wrapper again at this point). Or we could put SKA_ARCH_OS/bin before $SKA/bin in the path in general and reduce trips through the wrapper.

@jeanconn
Copy link
Contributor Author

And in the flight wrapper we'd probably need to change this up

https://github.com/sot/starcheck/blob/master/starcheck/src/starcheck#L9

so we'd call either the arch python at this point, or just do that step to figure out where starcheck is before setting PYTHONHOME.

@taldcroft
Copy link
Member

What about using the -E option to ignore all PYTHON* env vars:

#!/usr/bin/python -E

@jeanconn
Copy link
Contributor Author

Ooh, that does sound like a better idea. Will test.

@jeanconn
Copy link
Contributor Author

Looks good with sandbox_starcheck. Thanks for the fix! We may want to do those changes to starcheck at some point (to skip extra trips through the wrapper), but it looks like this fixes things.

One issue I'm noting was that even though I had those traceback failures in flt_envs, it looks like $SKA/bin/perl continued. Not sure which piece we should tweak so that if eval of flt_envs fails, the next bit of the wrapper should not just proceed. No big deal in starcheck because we were already in the SKA environment I wanted.

@jeanconn
Copy link
Contributor Author

And I think the continue-even-if-eval-failed issue is not a regression.

@taldcroft
Copy link
Member

I think we can cut much of ska/bin for skare3 and maybe take out of the path.

@jeanconn
Copy link
Contributor Author

jeanconn commented Jul 2, 2018

I had missed that removing sysarch causes problems with a clean Perl build (with the modules that use it). I haven't tried to evaluate the dependency tree determine if those modules can just be removed.

I think for now we should probably leave it sysarch and syspathsubst in bin and also test #159 for CentOS7 support.

@taldcroft
Copy link
Member

I suppose those two files can be conda-packaged and get installed via the perl modules metapackage. Are those perl modules that rely on sysarch from Diab? I couldn't find any of our configured perl modules on GitHub that use it. In any case we can clean up later.

@jeanconn
Copy link
Contributor Author

Right. I put the comment here in skare and not skare3 because right now we have a non-working clean full build of PY2 skare if we merge this approved PR.

@jeanconn
Copy link
Contributor Author

And indeed for skare3 I have them installed via the a perl-cxc-sysarch metapackage.

This seems to steal all processes and die in the Python 2.7 sherpa install.
I assume it gets stuck in some loop.  Probably should have been testing on
Python 3 anyway but both should really just work.

    checking for a Python interpreter with version >= 2.7... python
    checking for python... /data/fido/pinst/bin/python
    checking for python version... /data/fido/pinst/bin/python: fork: retry: Resource temporarily unavailable
    /data/fido/pinst/bin/python: fork: retry: Resource temporarily unavailable
    /data/fido/pinst/bin/python: fork: retry: Resource temporarily unavailable
    /data/fido/pinst/bin/python: fork: retry: Resource temporarily unavailable
    /data/fido/pinst/bin/python: fork: Resource temporarily unavailable
    sh: fork: retry: Resource temporarily unavailable
    sh: fork: retry: Resource temporarily unavailable
@jeanconn
Copy link
Contributor Author

We stopped before actually installing this because it looked like maybe we could just move to ska3, but we need ska2/centos7 support in the interim, especially for eng_archive processing.

I don't know what if anything we need to re-test to just promote the update to flt_envs. The actual installation process in /proj/sot/ska would be to pause all cron jobs, backup the currently used flt_envs and sysarch, install the new files, make needed new links in arch, turn on cron jobs.

While the flt_envs change does hit everything, it is trivial to revert if we end up with unforeseen error.

A "clean" build from this github branch now fails at mysql (which we should probably remove), but if one just continues at that point, the branch is functional. The ska_testr tests look reasonable (fails at all the same places as current flight ska due to old modules). I recall that in original testing I also practiced "updating" a ska; I don't think we need to do that again but could do so if required.

I don't know when we are going to go with ska3 promotion, and don't want to confuse anybody, but it seems like this could go to LR this week or next or get out-of-cycle approval as needed.

@jeanconn
Copy link
Contributor Author

And unfortunately, it looks like the code at the tip of this PR is broken on the clean test skare on centos7. Here I've tried re-running flt_envs because it needs to get called to run the wrapped python.

village: source /proj/sot/ska/test_centos7_redo/bin/ska_envs.csh
village: flt_envs
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
ImportError: No module named site
village: /proj/sot/ska/test_centos7_redo/arch/linux-64/bin/ska_version 
2.18-r836-78d803e

I don't remember what is in /proj/sot/ska/test_centos7 but it looks to have a perl version of flt_envs that was working (I guess that had the "just fix sysarch branch").

It looks like /usr/bin/python even with "-E" is not happy in this environment.

@jeanconn
Copy link
Contributor Author

jeanconn commented Oct 11, 2018

OK. It looks like the issue here is that the CentOS7 system python is a 2.7 python. (CentOS6 was 2.6, CentOS5 was whatever).

So to be more clear on the issue:

  1. Source into SKA by calling SKA/bin/ska_envs.(c)sh
  2. try to call SKA/bin/python
  3. SKA/bin/python is a wrapper that calls flt_envs and then SKA_ARCH_OS/bin/python
  4. flt_envs calls /usr/bin/python -E and breaks

The issue is that when we call ska_envs.csh to get into the environment the first time, we set LD_LIBRARY_PATH such that it includes the conda Python 2.7 lib.

Then when we call /usr/bin/python -E

in the flt_envs wrapper, it doesn't matter that the "-E" is ignoring PYTHONHOME, because the real issue is LD_LIBRARY_PATH . /usr/bin/python basically just loads libpython2.7.so so it is now loading our conda one.

And then I think there's a mismatch between some path information on load of /usr/bin/python and the rest of our python in the conda area.

I'm starting to think again that bash or updated perl would be a better idea on the ska2 side for flt_envs.

@taldcroft
Copy link
Member

Oh my. Well do what you have to do.

@jeanconn
Copy link
Contributor Author

For python 2 do we really care? I'm inclined to just scuttle this PR and just update sysarch. This would leave the python 2 environment calling Perl CXC::Envs::Flight and sysarch.

If you don't like the "grabbed a bunch of updates from Diab and tweaked it myself" version of sysarch I can make a minimal update to sysarch to get it to work with CentOS7.

There are a bunch of ways to make this work, but the "forward thinking" approach we took here looks a bit like a dead end for the installed environment. The python flt_envs approach looks to be ok in the skare3 conda environment because we're just using the that conda python to run it (and don't get this mismatch).

@taldcroft
Copy link
Member

I'm OK now with the grabbed a bunch of updates and made it work approach now.

@jeanconn
Copy link
Contributor Author

OK. I'll close this as not workable in CentOS7 and retest #159 in clean test ska, and also double-check the update process.

I'm not sure how I missed for so long that this PR wasn't working on CentOS7. I must just have looked at module tests from ipython early on (works fine) and hadn't tried to run ska2 ska_testr pieces until this week.

Thanks @taldcroft !

@jeanconn jeanconn closed this Oct 11, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants