
Checkout repos at latest tag by default #2

Merged
merged 7 commits into from
Jun 28, 2018

Conversation

jeanconn
Contributor

Instead of checking out just the tip of master, check out the most recent tag.

jeanconn added 2 commits June 23, 2018 11:24
This sets the git clone pieces to check out the repos at the latest tag (by commit date)
by default.  It should also take a tag as an option for the individual
clone/fetches.  Also adds a few lines to try to get the repos via ssh from the
sot org if possible.
@jeanconn
Contributor Author

For an idea of which packages need some tag updates, here's the output with my silly print statements.

In [1]: run scripts/clone_ska_sources all
Cloning source Ska.Shell.
Auto-checked out at 3.3.1 NOT AT tip of master
Cloning source Ska.File.
Auto-checked out at 3.4.1 NOT AT tip of master
Cloning source pyyaks.
Auto-checked out at 3.3.4 NOT AT tip of master
Cloning source ska_path.
Auto-checked out at 3.1 NOT AT tip of master
Cloning source testr.
Auto-checked out at 3.2 which is also tip of master
Cloning source Ska.tdb.
Auto-checked out at 3.5.1 NOT AT tip of master
Cloning source Chandra.Time.
Auto-checked out at 3.20.1 NOT AT tip of master
Cloning source Ska.ParseCM.
Auto-checked out at 3.3.1 which is also tip of master
Cloning source Ska.DBI.
Auto-checked out at 3.8.2 NOT AT tip of master
Cloning source Ska.ftp.
Auto-checked out at 3.4.3 NOT AT tip of master
Cloning source Ska.Numpy.
Auto-checked out at 3.8.1 NOT AT tip of master
Cloning source Quaternion.
Auto-checked out at 3.4.1 which is also tip of master
Cloning source Ska.engarchive.
Auto-checked out at 3.43 NOT AT tip of master
Cloning source kadi.
Auto-checked out at 3.15.2 NOT AT tip of master
Cloning source Ska.Matplotlib.
Auto-checked out at 3.11.2 NOT AT tip of master
Cloning source Ska.quatutil.
Auto-checked out at 3.3.1 NOT AT tip of master
Cloning source Ska.Sun.
Auto-checked out at 3.5 which is also tip of master
Cloning source Chandra.Maneuver.
Auto-checked out at 3.7 which is also tip of master
Cloning source cmd_states.
Auto-checked out at 3.14 which is also tip of master
Cloning source xija.
Auto-checked out at 3.9 which is also tip of master
Cloning source maude.
Auto-checked out at 3.1 NOT AT tip of master

# I suppose we could also use github to get the most recent release (not tag)
if tag is None:
    tags = sorted(repo.tags, key=lambda t: t.commit.committed_datetime)
    repo.git.checkout(tags[-1].name)
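The sort-by-commit-date idea in this snippet can be illustrated with stand-in objects (an assumption: only `.name` and `.commit.committed_datetime` are used by the sort; a real repo would supply gitpython tag references):

```python
from datetime import datetime
from types import SimpleNamespace

def latest_tag_name(tags):
    # Same key as the snippet above: order tags by the commit date of
    # the commit each tag points at, then take the newest.
    return sorted(tags, key=lambda t: t.commit.committed_datetime)[-1].name

# Stand-in tag objects with hypothetical versions/dates.
tags = [
    SimpleNamespace(name="3.3", commit=SimpleNamespace(committed_datetime=datetime(2018, 1, 1))),
    SimpleNamespace(name="3.4", commit=SimpleNamespace(committed_datetime=datetime(2018, 6, 1))),
]
print(latest_tag_name(tags))  # → 3.4
```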
Contributor Author

This is obviously just one way of going about this. I'm also wondering now if the {{ GIT_DESCRIBE_TAG }} method can break if the tag isn't formatted in a way that conda will like. I note their docs say "Conda acknowledges PEP 440" but I don't really know what acknowledges means in that context. We can obviously just update tags to work if they are broken, so that's just something to keep in mind.
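For sanity-checking tag formats up front, a rough sketch (the regex below is a simplified subset of PEP 440, not the full grammar, and the helper name is made up):

```python
import re

# Simplified subset of PEP 440: a dotted release segment plus an optional
# pre-release suffix like "a1", "b2", or "rc1".  Good enough to flag tags
# such as "v3.1" or "release-2018" that conda's version handling may reject.
PEP440_ISH = re.compile(r"^\d+(\.\d+)*((a|b|rc)\d+)?$")

def tag_looks_pep440(tag_name):
    return bool(PEP440_ISH.match(tag_name))

for tag in ["3.43.4rc1", "3.1", "v3.1"]:
    print(tag, tag_looks_pep440(tag))
```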

Member

If we have any release tags that are not compliant then they should be fixed. From your previous output it appears this is a non-issue.

@taldcroft
Member

This is going in the right direction. Overall I feel like the original implementation and API can be simplified. @jzuhone maybe had some different uses in mind, but I feel like there basically just needs to be three methods, __init__, _get_repo(self, name, tag=None), and build_packages(self, name=None, tag=None).

_get_repo either clones a fresh copy or uses the existing repo, and takes care of the work of fetching / pulling to make sure the returned repo is at the correct tag (or latest if no tag is supplied). I don't immediately see a need for an API for separately cloning / updating; that just happens as a matter of course when building.

build_packages builds either all the packages (if name is None) or builds the specified package. This can be done neatly with:

if name is None:
    names = [nm.strip() for nm in open(BUILD_LIST, "r")
             if nm.strip() and not nm.strip().startswith('#')]
else:
    names = [name]
for name in names:
    repo = self._get_repo(name, tag)
    # Now do the few steps for building `repo`, probably checking to see if
    # the expected output conda package is already there.  This code is probably
    # short enough to just be in the loop here, but if not could be factored out to
    # self._build_package(repo)
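A runnable version of the list handling above, with the filtering pulled into a helper so it can be exercised directly (the function name and sample file contents are made up):

```python
def package_names(build_list_text, name=None):
    # Build everything in the list when no name is given, skipping blank
    # lines and '#' comments; otherwise build just the named package.
    if name is None:
        return [nm.strip() for nm in build_list_text.splitlines()
                if nm.strip() and not nm.strip().startswith("#")]
    return [name]

text = "# core packages\nSka.Shell\n\nkadi\n"
print(package_names(text))          # → ['Ska.Shell', 'kadi']
print(package_names(text, "xija"))  # → ['xija']
```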

except:
    yml = os.path.join(pkg_defs_path, name, "meta.yaml")
    with open(yml) as f:
        requires = False
Member

I know this is from #1, but what's going on with this pre-processing of the meta.yaml file? What fails if you just start from data = yaml.load(fh)? (PEP 8: avoid using single-letter variable names. My go-to idiom here is fh for filehandle.)

    url = data['about']['home']
    repo = git.Repo.clone_from(url, clone_path)
else:
    repo = git.Repo(clone_path)
Member

Don't you need to do the equivalent of git fetch origin to get new tags?

Contributor Author

Good point! Though I'm wondering if we need to think about the use cases and problems of the "already-existing" dir some more anyway.

Member

We stipulate that the existing directory is created and maintained only via this script. Problems then seem extremely unlikely.

Contributor Author

OK, so do we need a way to build a custom/test package? Or would we just do that outside this tool?

Member

@taldcroft taldcroft Jun 25, 2018

I think we should be able to handle this procedurally by making a release candidate tag such as 3.43.4rc1. Apart from testing the packaging itself, one can do functional/integration testing using pip install from the development git repo. I.e. spin up a dev Ska3, conda uninstall the package in question, pip install the dev version, then test. In that case an RC tag is not needed. But for the full process one can make the tag, build the package locally, and conda install using the local build path. Since "conda acknowledges PEP440", this should work.

But let's defer further specifics of this use case for a future PR if needed, it will be easier based on an already-working system. I'll stub a placeholder into the process wiki.

Contributor Author

So adding git fetch origin seems fine if you want to reuse the repo dirs over some period of time and want to make sure you have the newest tags from origin. We could also just clone the repos fresh for every run of the tool and remove this code path, but that does seem a little annoying.

Member

Yes, I envision an "official" build process for production, done as aca user, that uses a permanent location like /proj/sot/ska3/conda-builds, or something like that. Fetching and pulling is definitely faster than re-cloning the whole thing every time.

# tags = sorted(repo.tags, key=lambda t: t.tag.tagged_date)
# I suppose we could also use github to get the most recent release (not tag)
if tag is None:
    tags = sorted(repo.tags, key=lambda t: t.commit.committed_datetime)
Member

Agreed

@@ -11,32 +11,54 @@ class SkaBuilder(object):

def __init__(self, ska_root=None):
Member

I think this should be conda_build_root='.'. Using ska_root is confusing to me since I think that should be $SKA, which in general is a configured directory. Historically we did build there, but that is considered to have been a mistake. Also, just use the convention of tools that default to writing in the current directory unless told otherwise. It makes debugging and usage by multiple users go more smoothly.

Then there could be a production process that uses one special place, explicitly specified in a cron job.

Contributor Author

I went with build_root instead of conda_build_root but can obviously just edit again.

@@ -11,32 +11,54 @@ class SkaBuilder(object):

def __init__(self, ska_root=None):
Member

Add user='sot', git_repo_path='git@github.com:{user}/{name}.git' kwargs (and set corresponding instance attrs).
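A minimal sketch of how those kwargs might look on the class (attribute names follow the suggestion; the rest of the class body is elided and the `_repo_url` helper is hypothetical):

```python
import os

class SkaBuilder(object):
    def __init__(self, build_root=".", user="sot",
                 git_repo_path="git@github.com:{user}/{name}.git"):
        self.user = user
        self.git_repo_path = git_repo_path
        self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
        self.src_dir = os.path.abspath(os.path.join(build_root, "src"))

    def _repo_url(self, name):
        # Fill in the template with the configured user/org and repo name.
        return self.git_repo_path.format(user=self.user, name=name)

print(SkaBuilder()._repo_url("kadi"))  # → git@github.com:sot/kadi.git
```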

# Try ssh first to avoid needing passwords for the private repos
# We could add these ssh strings to the meta.yaml for convenience
try:
    git_ssh_path = 'git@github.com:sot/' + name + '.git'
Member

repo = git.Repo.clone_from(self.git_repo_path.format(user=self.user, name=name), clone_path)

@taldcroft
Member

With the changes here, it seems this is not so hardwired to building Ska; instead it is more about generally maintaining a list of conda packages. If you move these into the SkaBuilder init then this can really be used to build anything, given a directory of build recipes and a spec of the package order.

pkg_defs_path = os.path.join(ska_conda_path, "pkg_defs")
build_list = os.path.join(ska_conda_path, "build_order.txt")

@taldcroft
Member

The lines if name == "ska" should be replaced with generic "determine if this is a metapackage" code. I'm not sure of the most robust method there, but it seems that if the meta.yaml has no build instructions then that is a good indicator? Or no build.* file in the directory.

@taldcroft
Member

For an idea of which packages need some tag updates, here's the output with my silly print statements.

It turns out most of those NOT AT tip of master are because of the sot-wide update to add license lines to code files. I have done a scrub of that list and added appropriate release tags to the four actual cases where the tag doesn't match what is in skare/pkgs.manifest (py3).

@jeanconn
Contributor Author

At some point we probably also need to figure out build(er) requirements. ska_builder is presently using the git Python module which I don't think is in our current vanilla/non-ska conda env.

@taldcroft
Member

taldcroft commented Jun 25, 2018

At some point we probably also need to figure out build(er) requirements. ska_builder is presently using the git Python module which I don't think is in our current vanilla/non-ska conda env.

No worries. This is now captured in the process doc.

else:
    repo = git.Repo(clone_path)
    repo.remotes.origin.fetch()
    repo.remotes.origin.fetch("--tags")
Contributor Author

@jeanconn jeanconn Jun 25, 2018

As far as I could tell, the tag-fetching behavior seems to differ a bit based on git version, and I'm not sure about gitpython at all. Using both fetch and fetch --tags seemed, at worst, duplication.
I think that fetching from 'origin' will be appropriate in all cases.

Member

We can use conda git for this, right? That should make behavior uniform. git fetch origin alone will always get the tags.

Contributor Author

I think gitpython uses the git in your path, and yes, conda git should probably work as the thing it finds there. It should get added to requirements (and I don't recall which environments/installs had issues with https). When trying to figure out how to do a fetch with gitpython, I had also seen language that "most" tags should be reachable via git fetch, and hadn't figured out the exceptions yet.

Contributor Author

It looks to me that we need "--tags" to be safe if a tag has actually changed on origin. For example, I retagged the skare3 repo and wasn't getting the associated new commit until adding this back again. I know this shouldn't come up (you shouldn't re-use a tag), but...

We could also delete all local tags before the fetch, but that seems more problematic.

self.ska_build_dir, "--no-test",
"--no-anaconda-upload"]
self.build_dir, "--no-test",
"--no-anaconda-upload", "--skip-existing"]
Contributor Author

It looks like the --skip-existing option is smart enough that it skips only if you have an existing build at the requested version (not just any package built with that name), so this seems to just work for our use cases.

Contributor Author

For verification of the behavior, I had just done a single test of this outside this code:

  • built Ska.Shell at 3.3.1
  • tried to build again with --skip-existing, but no build was done
  • checked out the repo at 3.3.2
  • tried to rebuild with --skip-existing and it built 3.3.2

Hopefully there are no gotchas.

Contributor Author

I think this is not going to help conda "know" when any metapackages should be rebuilt, so if we want that automated, we will need to figure out a mechanism.

subprocess.run(cmd_list)

def build_one_package(self, name):
    repo = self._get_repo(name)
    if repo is not None:
        repo.remote().pull()
Contributor Author

I cut the pulls and the pull version checks below in favor of that --skip-existing option.

Member

Yes, pull was never needed as long as you do a fetch and subsequent checkout at the desired tag.

@@ -59,30 +81,20 @@ def _build_package(self, name):
print("Building package %s." % name)
pkg_path = os.path.join(pkg_defs_path, name)
cmd_list = ["conda", "build", pkg_path, "--croot",
self.ska_build_dir, "--no-test",
"--no-anaconda-upload"]
self.build_dir, "--no-test",
Contributor Author

It may take a little longer to build, but I'd prefer to fix the packages as needed so we can actually run the tests if provided.

Member

Can you elaborate? I don't understand this comment.

Contributor Author

I'd prefer to run the tests and fix any tests that need fixing, instead of skipping the tests with "--no-test".

Contributor Author

Oh, and testing or not, I think we should check the status of the conda build command and do something with it (stop or save to report at the end).

Member

As you know I'm not a big fan of build time testing, but we can discuss later when we have the core requirements all in place.

👍 on checking the build command status. Just call with check=True, timeout=120 so it raises an exception if anything went wrong or it takes too long.

Contributor Author

Maybe a test option for the build? Mentioning because we haven't defined a process to make new recipes. If we just plop them in the repo, the easiest way to determine they are complete and correct is to run the tests. For example, by running the build tests, I just discovered that the maude recipe needs to have the requests package added as a runtime dependency/requirement. Of course, I could run the build and build tests outside the ska_builder process, but I think that might be over-complicated.

Contributor Author

Also, that timeout won't work for some builds, so I'm not sure if it would be better to just not define one for now.

Member

OK, make it an option that is disabled by default.
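Pulling the threads of this discussion together, a sketch of what an opt-in test flag plus check=True (and no timeout) might look like; the function names are hypothetical:

```python
import subprocess

def conda_build_cmd(pkg_path, build_dir, run_tests=False):
    # Flags mirror the diff above; build tests are skipped unless the
    # caller opts in, and --skip-existing avoids rebuilding a version
    # that is already in the croot.
    cmd = ["conda", "build", pkg_path, "--croot", build_dir,
           "--no-anaconda-upload", "--skip-existing"]
    if not run_tests:
        cmd.append("--no-test")
    return cmd

def build_package(pkg_path, build_dir, run_tests=False):
    # check=True raises CalledProcessError if the build fails; no timeout,
    # since some package builds legitimately run long.
    subprocess.run(conda_build_cmd(pkg_path, build_dir, run_tests), check=True)
```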

Member

Agreed on not having the timeout.

self.user = user
self.git_repo_path = git_repo_path
self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
self.src_dir = os.path.abspath(os.path.join(build_root, "src"))
Contributor Author

I don't know if there would be any other issues using abspath when trying to work somewhat relatively (using a build root explicitly defined as "." or something else relative), but this seems to work.

Member

Using abspath tends to make logging output cruftier than required, but otherwise should almost always be OK.
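To illustrate the point: abspath resolves a relative build_root against the current working directory once, at init time, so later chdir calls won't change where builds land. A small demonstration (the values are hypothetical):

```python
import os

build_root = "."  # the relative default discussed above
build_dir = os.path.abspath(os.path.join(build_root, "builds"))
src_dir = os.path.abspath(os.path.join(build_root, "src"))

# Both are now absolute, anchored at the cwd in effect when this ran.
print(build_dir)
print(src_dir)
```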

self.git_repo_path = git_repo_path
self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
self.src_dir = os.path.abspath(os.path.join(build_root, "src"))
os.environ["SKA_TOP_SRC_DIR"] = self.src_dir
Contributor Author

Not sure if we want to use this as a somewhat global environment variable, or pass it to conda build as an env in the subprocess.

Member

I think it doesn't matter.

@taldcroft
Member

Do you see any need for the clone_ska_sources script? I don't.

@jeanconn
Contributor Author

It was helpful in testing/modifying the process to have a script separating the two tasks, but it can probably be safely removed when we're done.

@jeanconn
Contributor Author

I think I'd like to merge this and do some other changes in separate/smaller PRs. Were there any outstanding issues you really wanted to see addressed in this PR @taldcroft ?

Member

@taldcroft taldcroft left a comment

👍 for merge!

@jeanconn jeanconn merged commit 9c2c7cd into master Jun 28, 2018
@jeanconn jeanconn deleted the last_tag branch June 28, 2018 19:25
javierggt added a commit that referenced this pull request Jul 21, 2020