Checkout repos at latest tag by default #2
Conversation
This sets the git clone pieces to check out the latest (by commit date) tag for the repos by default. It should also take a tag as an option for the individual clone/fetches. Also adds a few lines to try to get the repos via ssh from the sot org if possible.
For an idea of which packages need some tag updates, here's the output with my silly print statements.
# I suppose we could also use github to get the most recent release (not tag)
if tag is None:
    tags = sorted(repo.tags, key=lambda t: t.commit.committed_datetime)
    repo.git.checkout(tags[-1].name)
This is obviously just one way of going about this. I'm also wondering now if the {{ GIT_DESCRIBE_TAG }} method can break if the tag isn't formatted in a way that conda will like. I note their docs say "Conda acknowledges PEP 440" but I don't really know what acknowledges means in that context. We can obviously just update tags to work if they are broken, so that's just something to keep in mind.
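On the tag-format question, a quick sanity check for whether existing tags look PEP 440-like could be as simple as a regex. This is only a sketch with a deliberately simplified pattern (the full PEP 440 grammar also allows epochs, post/dev releases, local versions, etc.), and `tag_is_pep440` is a hypothetical helper, not part of this PR:

```python
import re

# Simplified "final or pre-release" pattern: matches "3.3.1", "3.43.4rc1",
# etc.  This is NOT the full PEP 440 grammar (no epochs, post/dev releases).
PEP440ISH_RE = re.compile(r"^\d+(\.\d+)*((a|b|rc)\d+)?$")

def tag_is_pep440(tag_name):
    """Return True if a git tag name looks like a simple PEP 440 version."""
    return bool(PEP440ISH_RE.match(tag_name))
```

Running something like this over `repo.tags` would flag tags such as `v3.3.1` (leading "v") that conda's version handling might not like.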
If we have any release tags that are not compliant then they should be fixed. From your previous output it appears this is a non-issue.
This is going in the right direction. Overall I feel like the original implementation and API can be simplified. @jzuhone maybe had some different uses in mind, but I feel like there basically just needs to be three methods.
except:
    yml = os.path.join(pkg_defs_path, name, "meta.yaml")
    with open(yml) as f:
        requires = False
I know this is from #1, but what's going on with this pre-processing of the meta.yaml file? What fails if you just start from data = yaml.load(fh)? (PEP 8: avoid using single-letter variable names. My go-to idiom here is fh for filehandle.)
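For what it's worth, the suggested starting point might look like the sketch below (paths and names are illustrative, and `read_meta` is a hypothetical helper). One caveat worth checking: conda meta.yaml files can contain Jinja2 expressions such as {{ GIT_DESCRIBE_TAG }}, which a plain yaml.load can choke on, so that may be what the pre-processing was for.

```python
import os
import yaml  # PyYAML

def read_meta(pkg_defs_path, name):
    """Load a package's meta.yaml and return it as a dict.

    Hypothetical helper; assumes the file is plain YAML with no
    Jinja2 templating.
    """
    yml = os.path.join(pkg_defs_path, name, "meta.yaml")
    with open(yml) as fh:
        return yaml.load(fh, Loader=yaml.SafeLoader)

# Usage sketch: url = read_meta("pkg_defs", "maude")["about"]["home"]
```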
    url = data['about']['home']
    repo = git.Repo.clone_from(url, clone_path)
else:
    repo = git.Repo(clone_path)
Don't you need to do the equivalent of git fetch origin to get new tags?
Good point! Though I'm wondering if we need to think about the use cases and problems of the "already-existing" dir some more anyway.
We stipulate that the existing directory is created and maintained only via this script. Problems then seem extremely unlikely.
OK, so do we need a way to build a custom/test package? Or would we just do that outside this tool?
I think we should be able to handle this procedurally by making a release candidate tag such as 3.43.4rc1. Apart from testing the packaging itself, one can do functional/integration testing using pip install from the development git repo. I.e., spin up a dev Ska3, conda uninstall the package in question, pip install the dev version, then test. In that case an RC tag is not needed. But for the full process one can make the tag, build the package locally, and conda install using the local build path. Since "conda acknowledges PEP 440", this should work.
But let's defer further specifics of this use case for a future PR if needed, it will be easier based on an already-working system. I'll stub a placeholder into the process wiki.
So adding git fetch origin seems fine if you want to reuse the repos dirs over some period of time and want to make sure you have the newest tags at origin. We could also just clone the repos fresh for every run of the tool and remove this code path, but that does seem a little annoying.
Yes, I envision an "official" build process for production, done as the aca user, that uses a permanent location like /proj/sot/ska3/conda-builds, or something like that. Fetching and pulling is definitely faster than re-cloning the whole thing every time.
# tags = sorted(repo.tags, key=lambda t: t.tag.tagged_date)
# I suppose we could also use github to get the most recent release (not tag)
if tag is None:
    tags = sorted(repo.tags, key=lambda t: t.commit.committed_datetime)
Agreed
ska_conda/ska_builder.py (Outdated)

@@ -11,32 +11,54 @@ class SkaBuilder(object):
    def __init__(self, ska_root=None):
I think this should be conda_build_root='.'. Using ska_root is confusing to me since I think that should be $SKA, which in general is a configured directory. Historically we did build there, but that is now considered to have been a mistake. Also, just use the convention of tools that default to writing in the current directory unless told otherwise. It makes debugging and usage by multiple users go more smoothly.
Then there could be a production process that uses one special place, explicitly specified in a cron job.
I went with build_root instead of conda_build_root but can obviously just edit again.
ska_conda/ska_builder.py (Outdated)

@@ -11,32 +11,54 @@ class SkaBuilder(object):
    def __init__(self, ska_root=None):
Add user='sot', git_repo_path='git@github.com:{user}/{name}.git' kwargs (and set corresponding instance attrs).
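A sketch of what that __init__ might look like with those kwargs (defaults are the ones proposed in these comments; the repo_url helper is hypothetical, added just to show how the template would be used):

```python
import os

class SkaBuilder(object):
    def __init__(self, build_root='.', user='sot',
                 git_repo_path='git@github.com:{user}/{name}.git'):
        self.user = user
        self.git_repo_path = git_repo_path
        # Resolve builds/ and src/ under build_root (default: current dir)
        self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
        self.src_dir = os.path.abspath(os.path.join(build_root, "src"))

    def repo_url(self, name):
        # Fill in the template, e.g. git@github.com:sot/maude.git
        return self.git_repo_path.format(user=self.user, name=name)
```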
ska_conda/ska_builder.py (Outdated)

# Try ssh first to avoid needing passwords for the private repos
# We could add these ssh strings to the meta.yaml for convenience
try:
    git_ssh_path = 'git@github.com:sot/' + name + '.git'
repo = git.Repo.clone_from(self.git_repo_path.format(user=self.user, name=name), clone_path)
With the changes here, it seems this is not so hardwired to building Ska and instead it is more about generally maintaining a list of conda packages. If you move these into the
The lines
It turns out most of those
At some point we probably also need to figure out build(er) requirements. ska_builder is presently using the
No worries. This is now captured in the process doc.
ska_conda/ska_builder.py (Outdated)

else:
    repo = git.Repo(clone_path)
    repo.remotes.origin.fetch()
    repo.remotes.origin.fetch("--tags")
As far as I could tell, the tag-fetching behavior seems to differ a bit based on git version, and I'm not sure about gitpython at all. Using both fetch and fetch --tags seemed, at worst, duplication.
I think that fetching from 'origin' will be appropriate in all cases.
We can use conda git for this, right? That should make behavior uniform. git fetch origin alone will always get the tags.
I think gitpython gets the git in your path. And yes, conda git should probably work as the thing it gets in the path. It should get added to requirements (and I don't recall which environments/installs had issues with https). When trying to figure out how you do a fetch with gitpython, I had also seen the language that "most" tags should be reachable via git fetch, and hadn't figured out the exceptions yet.
It looks to me that we need "--tags" to be safe if a tag has actually changed on origin. For example, I retagged the skare3 repo and I wasn't getting the associated new commit until adding this back again. I know this shouldn't come up (you shouldn't re-use a tag), but...
Could also delete all local tags before the fetch, but that seems more problematic.
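In plain git CLI terms (GitPython shells out to whatever git is on the PATH), the moved-tag case seems to also need --force on newer gits, which otherwise refuse to clobber a tag that changed on origin. A sketch, assuming clone_path is an existing clone with an "origin" remote (refresh_tags is a hypothetical helper):

```python
import subprocess

def refresh_tags(clone_path):
    """Fetch new commits and force-update local tags from origin.

    Sketch only: assumes `git` is on the PATH and clone_path has an
    "origin" remote.
    """
    # Plain fetch for branches/new tags
    subprocess.run(["git", "-C", clone_path, "fetch", "origin"], check=True)
    # --tags --force also updates tags that were moved on origin
    subprocess.run(["git", "-C", clone_path, "fetch", "origin",
                    "--tags", "--force"], check=True)
```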
-    self.ska_build_dir, "--no-test",
-    "--no-anaconda-upload"]
+    self.build_dir, "--no-test",
+    "--no-anaconda-upload", "--skip-existing"]
It looks like the --skip-existing option is smart enough that it skips if you have an existing build at the requested version (not just a package built with the name) so this seems to just work for our use cases.
For verification of the behavior, I had just done a single test of this outside this code:
- built Ska.Shell at 3.3.1
- tried to build again with --skip-existing, but no build was done
- checked out the repo at 3.3.2
- tried to rebuild with --skip-existing and it built 3.3.2
Hopefully there are no gotchas.
I think this is not going to help conda to "know" when any meta packages should be rebuilt, so if we want that automated, we will need to figure out a mechanism.
subprocess.run(cmd_list)

def build_one_package(self, name):
    repo = self._get_repo(name)
    if repo is not None:
        repo.remote().pull()
I cut the pulls and the pull version checks below in favor of that --skip-existing option.
Yes, pull was never needed as long as you do a fetch and subsequent checkout at the desired tag.
@@ -59,30 +81,20 @@ def _build_package(self, name):
     print("Building package %s." % name)
     pkg_path = os.path.join(pkg_defs_path, name)
     cmd_list = ["conda", "build", pkg_path, "--croot",
-                self.ska_build_dir, "--no-test",
-                "--no-anaconda-upload"]
+                self.build_dir, "--no-test",
It may take a little longer to build, but I'd prefer to fix the packages as needed so we can actually run the tests if provided.
Can you elaborate? I don't understand this comment.
I'd prefer to run the tests and fix any tests that need fixing, instead of skipping the tests with "--no-test".
Oh, and testing or not, I think we should check the status of the conda build command and do something with it (stop or save to report at the end).
As you know I'm not a big fan of build-time testing, but we can discuss later when we have the core requirements all in place.
👍 on checking the build command status. Just call with check=True, timeout=120 so it raises an exception if anything went wrong or it takes too long.
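A sketch of that call as a small wrapper (run_build is a hypothetical helper; note the timeout default is off here, anticipating that some builds legitimately run long):

```python
import subprocess

def run_build(cmd_list, timeout=None):
    """Run a build command, raising if it fails.

    Sketch: check=True raises CalledProcessError on a nonzero exit;
    timeout is optional (disabled by default).
    """
    subprocess.run(cmd_list, check=True, timeout=timeout)
```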
Maybe a test option for the build? Mentioning because we haven't defined a process to make new recipes. If we just plop them in the repo, the easiest way to determine they are complete and correct is to run the tests. For example, by running the build tests, I just discovered that the maude recipe needs to have the requests package added as a runtime dependency/requirement. Of course, I could run the build and build tests outside the ska_builder process, but I think that might be over-complicated.
Also, that timeout won't work for some builds, so I'm not sure if it would be better to just not define one for now.
OK, make it an option that is disabled by default.
Agreed on not having the timeout.
self.user = user
self.git_repo_path = git_repo_path
self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
self.src_dir = os.path.abspath(os.path.join(build_root, "src"))
I don't know if there would be any other issues using abspath when trying to work somewhat relatively (using a build root explicitly defined as "." or something else relative), but this seems to work.
Using abspath tends to make logging output cruftier than required, but otherwise should almost always be OK.
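A quick illustration of the behavior in question: with a relative build_root, abspath pins the directories to wherever the process was started, which is what makes the default-to-cwd convention work (the helper name is illustrative):

```python
import os

def resolve_build_dirs(build_root="."):
    """Resolve the builds/ and src/ dirs against the current directory."""
    build_dir = os.path.abspath(os.path.join(build_root, "builds"))
    src_dir = os.path.abspath(os.path.join(build_root, "src"))
    return build_dir, src_dir
```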
self.git_repo_path = git_repo_path
self.build_dir = os.path.abspath(os.path.join(build_root, "builds"))
self.src_dir = os.path.abspath(os.path.join(build_root, "src"))
os.environ["SKA_TOP_SRC_DIR"] = self.src_dir
Not sure if we want to use this as a somewhat global environment variable, or pass it to conda build as an env in the subprocess.
I think it doesn't matter.
Do you see any need for the clone_ska_sources script? I don't.
It was helpful in testing/modifying the process to have a script that separated the two tasks, but it can probably be safely removed when we're done.
I think I'd like to merge this and do some other changes in separate/smaller PRs. Were there any outstanding issues you really wanted to see addressed in this PR @taldcroft ?
👍 for merge!
Instead of checking out just the tip of master, checkout the most recent tag.