Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some automation for doing periodic mass rebuilds of Fedora packages #864

Merged
merged 197 commits into from
Jan 8, 2025

Conversation

tstellar
Copy link
Collaborator

These mass rebuilds will help test the snapshot builds and catch regressions in real-world projects. The GitHub workflows are currently configured to start a new mass rebuild once per month and then when the rebuild is complete, it will create any issue reporting any regressions since the last build.

The goal is to keep the rebuilder as simple as possible, so it will only rebuild packages that successfully built in the last rebuild, and it makes no attempt to work around packages that hard code gcc as their compiler. As a result, some of the 'passing' builds may actually use gcc instead of clang, but this is an acceptable trade-off to keep the rebuild process simple.

These mass rebuilds will help test the snapshot builds and catch
regressions in real-world projects.  The GitHub workflows are
currently configured to start a new mass rebuild once per month
and then when the rebuild is complete, it will create any issue
reporting any regressions since the last build.

The goal is to keep the rebuilder as simple as possible, so it will
only rebuild packages that successfully built in the last rebuild, and
it makes no attempt to work around packages that hard code gcc as their
compiler.  As a result, some of the 'passing' builds may actually use
gcc instead of clang, but this is an acceptable trade-off to keep
the rebuild process simple.
Copy link
Collaborator

@kwk kwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments and probably need to look at this PR again to fully grasp it.

.github/workflows/rebuilder.py Outdated Show resolved Hide resolved
Comment on lines 121 to 122
# latest is a bit a successful build, but this doesn't mean it failed.
# It could be in progress.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks unfinished.

.github/workflows/rebuilder.py Outdated Show resolved Hide resolved
.github/workflows/rebuilder.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@kwk kwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tstellar I know that the CI system is buggy at the moment but I encourage you to add tests to your script. At least some doc tests which should be carried out as regular tests in our setup. I know this is hard for functions that involve copr and alike.

Copy link
Collaborator

@kwk kwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tstellar. Thank you for accepting so many of my suggestions. I've looked at your changes and I'm puzzled if it really was such a good idea to created CoprBuild and CoprPkg classes just to access properties like with objects. I think that before you had type hints that were simply wrong. It is a long time since I've last touched the Copr APIs but when I researched the responses of the Copr API I noticed that they don't return dicts but Munchs. And that allows for object like property access. For sure it helps to have less generic types like Munch but the mentioned subclassing in one of the comments is properly what I would go with.

class CoprPkg:

def __init__(self, data: dict):
self.data = data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for nitpicking but it is really hard to understand what needs to go into data. From looking in the code an element from the response of this copr_client.package_proxy.get_list method, needs to go here? The HTTP API response is documented here.

Comment on lines 145 to 156
def get_builds_from_copr(
project_owner: str, project_name: str, copr_client: copr.v3.Client
) -> list[dict]:
return copr_client.package_proxy.get_list(
project_owner,
project_name,
with_latest_succeeded_build=True,
with_latest_build=True,
)
) -> list[CoprPkg]:
return [
CoprPkg(p)
for p in copr_client.package_proxy.get_list(
project_owner,
project_name,
with_latest_succeeded_build=True,
with_latest_build=True,
)
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does is make sense to make this a class method of CoprPkg? At least topic wise they both belong together.

Also, the function is currently called get_builds_from_copr but it returns a list of Copr packages. I suggest to rename it get_packages_from_copr.

So something like:

@classmethod
def get_packages_from_copr(cls, project_owner: str, project_name: str, copr_client: copr.v3.Client) -> list["CoprPkg"]:
    return [
        CoprPkg(p)       
        for p in copr_client.package_proxy.get_list(
            project_owner,
            project_name,
            with_latest_succeeded_build=True,
            with_latest_build=True,
         )
    ]

@@ -121,30 +177,28 @@ def get_monthly_rebuild_packages(pkgs: set[str], copr_pkgs: list[dict]) -> set[s
>>> c = {"name" : "c", "builds" : { "latest" : { "id" : 2 } , "latest_succeeded" : { "id" : 1 } } }
>>> d = {"name" : "d", "builds" : { "latest" : { "id" : 2 } , "latest_succeeded" : { "id" : 2 } } }
>>> pkgs = { "b", "c", "d"}
>>> copr_pkgs = [a, b, c, d]
>>> copr_pkgs = [CoprPkg(p) for p in [a, b, c, d]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the more work of having to adjust test cased but it shows how awesome this doc tests really are. The documentation is code that MUST run.

if not latest_succeeded:
pkgs.discard(p["name"])
if not p.latest_succeeded:
pkgs.discard(p.name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I read this code I wonder if all you needed to do was to change the type hint of the copr_pkgs argument. Well, it is a hint so it wouldn't be necessary but I know that the Munch type returned from the Copr API behaves like a dict but you can access elements like in objects.

In other words: The p.name access is possible even without having CoprPkg.

If you still want CoprPkg which I think makes sense you can experiment with:

class CoprPkg(Munch):
    pass

or if you want to really show what properties there are:

class CoprPkg(Munch):
    @property
    def name(self) -> str:
        return self.name

The same applies to CoprBuild. Note that you may need to wrap Munch responses in your own classes to fully work here. But that's worth it, depending on how much your classes actually can do.

continue

# Don't report regressions if there are still builds in progress
if latest["state"] not in [
if p.latest.state not in [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test screems for a propery def is_in_progress(self) -> bool.

Comment on lines 380 to 386
elif args.command == "get-snapshot-date":
project = copr_client.project_proxy.get(project_owner, project_name)
for repo in project['additional_repos']:
match = re.match(r"copr://@fedora-llvm-team/llvm-snapshots-big-merge-([0-9]+)$", repo)
if not match:
continue
print(datetime.datetime.fromisoformat(match.group(1)).isoformat())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prints more than one date so I suggest to rename it to get-snapshot-dates.

@kwk
Copy link
Collaborator

kwk commented Jan 8, 2025

@tstellar The only thing I ask for is that you cleanup the commit history a bit before merging, e.g. squashing the commits. I wouldn't want to get 197 commits in main just from this PR ;)

@tstellar tstellar merged commit 351ff2b into fedora-llvm-team:main Jan 8, 2025
4 of 5 checks passed
kwk added a commit to kwk/llvm-snapshots that referenced this pull request Jan 9, 2025
Just to clean up the naming.

See fedora-llvm-team#864 for the original mass rebuild PR.
kwk added a commit that referenced this pull request Jan 9, 2025
Just to clean up the naming.

See #864 for the original mass rebuild PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants