Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkout repos at latest tag by default #2

Merged
merged 7 commits into from
Jun 28, 2018
Merged
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 40 additions & 16 deletions ska_conda/ska_builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,32 +11,56 @@ class SkaBuilder(object):

def __init__(self, ska_root=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be conda_build_root='.'. Using ska_root is confusing to me since I think that should be $SKA, which in general is a configured (directory). Historically we did build there, but that is considered to be a mistake. Also, just use convention of tools that default to writing in the current directory unless told otherwise. It makes debugging and usage by multiple users go more smoothly.

Then there could be a production process that uses one special place, explicitly specified in a cron job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with build_root instead of conda_build_root but can obviously just edit again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add user='sot', git_repo_path='git@github.com:{user}/{name}.git' kwargs (and set corresponding instance attrs).

if ska_root is None:
ska_root = "/data/acis/ska3_pkg/"
ska_root = "/tmp/ska3_pkg/"
self.ska_root = ska_root
self.ska_build_dir = os.path.join(self.ska_root, "builds")
self.ska_src_dir = os.path.join(self.ska_root, "src")
os.environ["SKA_TOP_SRC_DIR"] = self.ska_src_dir

def _clone_repo(self, name):
def _clone_repo(self, name, tag=None):
if name == "ska":
return
print("Cloning source %s." % name)
clone_path = os.path.join(self.ska_src_dir, name)
if os.path.exists(clone_path):
print("Source %s exists, skipping." % name)
return
yml = os.path.join(pkg_defs_path, name, "meta.yaml")
with open(yml) as f:
requires = False
while not requires:
line = f.readline().strip()
if line.startswith("path:"):
requires = True
data = yaml.load(f)
url = data['about']['home']
git.Repo.clone_from(url, clone_path)
if not os.path.exists(clone_path):
# Try ssh first to avoid needing passwords for the private repos
# We could add these ssh strings to the meta.yaml for convenience
try:
git_ssh_path = 'git@github.com:sot/' + name + '.git'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repo = git.Repo.clone_from(self.git_repo_path.format(user=self.user, name=name), clone_path)

repo = git.Repo.clone_from(git_ssh_path, clone_path)
assert not repo.bare
except:
yml = os.path.join(pkg_defs_path, name, "meta.yaml")
with open(yml) as f:
requires = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is from #1, but what's going on with this pre-processing of the meta.yaml file? What fails if you just start from data = yaml.load(fh)? (PEP8, avoid using single-letter variable names. My go-to idiom here is fh for filehandle).

while not requires:
line = f.readline().strip()
if line.startswith("path:"):
requires = True
data = yaml.load(f)
url = data['about']['home']
repo = git.Repo.clone_from(url, clone_path)
else:
repo = git.Repo(clone_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to do the equivalent of git fetch origin to get new tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Though I'm wondering if we need to think about the use cases and problems of the "already-existing" dir some more anyway .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We stipulate that the existing directory is created and maintained only via this script. Problems then seem extremely unlikely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so do we need a way to build a custom/test package? Or would we just do that outside this tool?

Copy link
Member

@taldcroft taldcroft Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to handle this procedurally by making a release candidate tag such as 3.43.4rc1. Apart from testing the packaging itself, one can do functional/integration testing using pip install from the development git repo. I.e. spin up a dev Ska3, conda uninstall the package in question, pip install the dev version, then test. In that case an RC tag is not needed. But for the full process one can make the tag, build the package locally, and conda install using the local build path. Since "conda acknowledges PEP440", this should work.

But let's defer further specifics of this use case for a future PR if needed, it will be easier based on an already-working system. I'll stub a placeholder into the process wiki.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So adding get fetch origin seems fine if you want to reuse the repos dirs over some period of time and want to make sure you have the newest tags at origin. We could also just clone the repos fresh for every run of the tool and remove this code path, but that does seem a little annoying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I envision an "official" build process for production, done as aca user, that uses a permanent location like /proj/sot/ska3/conda-builds, or something like that. Fetching and pulling is definitely faster than re-cloning the whole thing every time.

repo.remotes.origin.fetch()
repo.remotes.origin.fetch("--tags")
Copy link
Contributor Author

@jeanconn jeanconn Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I could tell, the fetching of tags behavior seems to differ a bit based on git version, and I'm not sure about gitpython at all. Using both fetch and fetch --tags seemed, at worse, duplication.
I think that fetching from 'origin' will be appropriate in all cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use conda git for this right? That should make behavior uniform. Git fetch origin alone will always get the tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think gitpython gets the git in your path. And yes, conda git should probably work as the thing it gets in the path. Should get added to requirements (and I don't recall which environments/installs had issues with https). Just when trying to figure out how you do a fetch with gitpython, I had also seen the language that "most" tags should be reachable via git fetch and hadn't figured the exceptions yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that we need "--tags" to be safe if a tag has actually changed on origin. For example, I retagged skare3 repo and I wasn't getting the associated new commit until adding this back again. I know this shouldn't come up (you should re-use a tag), but...

Could also delete all local tags before the fetch, but that seems more problematic.

assert not repo.is_dirty()
# I think we want the commit/tag with the most recent date, though
# if we actually want the most recently created tag, that would probably be
# tags = sorted(repo.tags, key=lambda t: t.tag.tagged_date)
# I suppose we could also use github to get the most recent release (not tag)
if tag is None:
tags = sorted(repo.tags, key=lambda t: t.commit.committed_datetime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

repo.git.checkout(tags[-1].name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is obviously just one way of going about this. I'm also wondering now if the {{ GIT_DESCRIBE_TAG }} method can break if the tag isn't formatted in a way that conda will like. I note their docs say "Conda acknowledges PEP 440" but I don't really know what acknowledges means in that context. We can obviously just update tags to work if they are broken, so that's just something to keep in mind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have any release tags that are not compliant then they should be fixed. From your previous output it appears this is a non issue.

if tags[-1].commit == repo.heads.master.commit:
print("Auto-checked out at {} which is also tip of master".format(tags[-1].name))
else:
print("Auto-checked out at {} NOT AT tip of master".format(tags[-1].name))
else:
repo.git.checkout(tag)
print("Checked out at {}".format(tag))

def clone_one_package(self, name):
def clone_one_package(self, name, tag=None):
self._clone_repo(name)

def clone_all_packages(self):
Expand Down