Build process: use Celery handlers #8815

humitos · 2022-01-17T10:01:24Z

I'm writing some notes about the decisions I've made, problems I've had and some other concerns and things I'm noticing while working on this refactor. Hopefully, we can come back to these notes in the future to keep cleaning up our code. These notes are useful for a better understanding of this PR's changes: https://hackmd.io/4TP3_EizRF-nCDrPvf4_ug

I'm happy to have a call and do some "pair review" for this work because it's quite big. However, I think it won't be too hard to communicate and understand the changes if we are talking to each other.

Also note the important commits are those that are prefixed with Build: on their message. So, I suggest taking a look at them mainly.

Closes #3984
Closes #8762
Closes #7627
Closes #7587
Closes #6452
Closes #5672
Closes #4468
It will help to implement #7660

Initial migration/refactor of the build process to make usage of the Celery handlers (`on_success`, `on_failure`, etc)

We use this environment a lot from our tests, which I don't like. However, for now, I'm keeping the because the change is not small. We should split unit-test from integration test and reduce the complexity of executing "real commands" when calling tests in exchange of more mocked data instead and unit tests for the logic.

Integration tests should not create the environment where to execute the commands nor passing weird attributes to the task. They should mock the responses of the API for the task to grab the correct data instead. This way, we avoid having "extra arguments" on the tasks just for test purposes and also avoid cheating the normal flow by overriding the environment with custom attributes. This creates lot of complexity in our code and we endup with multiple different flows that are not normal and won't ever happen in production but that we need to maintain just for the tests. In particular, we should avoid "integration tests" when they are not strictly required. In this particular case, we could create a unit-test that checks the logic for `SyncRepositoryMixin.validate_duplicate_reserved_versions` which is a lot easier to generate the context required and then an "integration test" with most things mocked (no `git` commands are executed at all) and only checks that function is being called.

We are always using `CLEAN_AFTER_BUILD=True`, so there is no environment to wipe anymore. This is a confusing message to send to the user since it has no effect. I'm also removing the code because it doesn't make sense anymore. If we ever need it again, we can re-implement it correctly and recover it from here.

We are not caching pip packages anymore.

There are lot of tests not really accurate and very complex without reason. I'd like to use the same flow in production than in tests instead of short-circuiting/hacking the process in tests to behave as we want. We should use mocks to avoid executing commands when running tests instead, but making sure the "build process" is following exact the same flow in both environments. If we want to test something in particular outside the flow, we should create unit tests for those functions/methods.

humitos · 2022-01-17T18:52:52Z

Yay! Only 85 tests skipped 💪🏼 at this point.

This way we avoid doing an aditional API call to get the version, since we already have it in memory.

I wasn't able to mock the whole task as I wanted yet, but I follow a similar convention than the one used for the migrated test. However, I've simplify them a little.

agjohnson · 2022-01-18T18:01:57Z

Phew! This seems like a lot of work to wrangle! 💪

One thing that would make review easier on the backend team would be splitting up or rebasing some of this work when you're ready. For instance, one commit to split up project/tasks.py into multiple files, one commit to alter the files. This would make it much easier to follow along with what has changed in the logic there.

But tests are isolated nicely, and most of the other refactors make sense logically so far. 👍 It was mostly just the project tasks that were hard to follow here.

There is no need to have a `base.py` and `BuildTaskBase` because it don't share code with sync repository task.

Skip moving the file if the old artifact path does not exist.

Ideally, we should be able to mock: * API calls * Commands run inside the environment * Git repository * Read the Docs config file This commit mocks most of them. The only missing here is the config file, but there should be code already for that. These mocks allow us to use the exact same flow than in production (e.g. hitting the API to get the Version/Build/Project object) and avoid short-circuiting it. Once the "mocked build" is finished, we can assert all the calls to them.

That way, it executes the `on_success`, `on_failure`, etc.

New test to check that we are updating the build object via the API, we are sending notifications and more when the build fails.

It checks all the API calls done to the webs, and that the functions we expect to be called are effectively called.

Move tests from `test_celery.py` to the "new system" and remove the invalid ones.

agjohnson

I reviewed the changes that happened since our meeting, looking good still! I won't yet call this a +1 yet, but this is mostly me understanding the changes still.

Things I haven't reviewed well yet, and where I'd like to spend some time in the next few days:

Some of the code changes that happened before our meeting, and reviewing the project/build tasks in a bit more depth. I mostly eyeballed the difficult files to understand the logic, but didn't look very deeply at the changes yet. I'm guessing it might help to jump on another call at some point to discuss any remaining questions here, but let me see what questions I come up with before I distract you with this.
Skipped tests and test coverage. I'm still wrapping my head around some of the changes still, but I don't see anything really needing changes. Some of the skipped tests might worry me because I don't trust I can help discover issues more than good test coverage would. I do agree with your instinct on some of the skipped tests so far though, so most likely nothing will come of this.

I'm planning on spending more time with this in the next few days.

agjohnson · 2022-01-27T01:47:28Z

readthedocs/rtd_tests/tests/test_doc_building.py

+#
+# Once we mock the DockerBuildEnvironment properly, we could also translate the
+# new tests from `readthedocs/projects/tests/test_build_tasks.py` to use this
+# mocks.


👍 I see what you mean with the tests above, but they do technically work the same with a more minimal implementation (and no mocking). I suppose I'm not too worried about the Docker environment test case for this as the coverage wouldn't be vastly improved. The exit code handling is slightly different when inspecting docker exec but it does seem like a lot just to test our processing of the Docker exec API return.

agjohnson · 2022-01-27T01:50:03Z

readthedocs/worker.py

@@ -25,4 +25,51 @@ def create_application():
    return application


+def register_renamed_tasks(application, renamed_tasks):


Just to track future work, we'll probably automate pulling TODOs out from this PR after merge.

Suggested change

def register_renamed_tasks(application, renamed_tasks):

# TODO Remove this function when we have fully migrated to the new task paths

def register_renamed_tasks(application, renamed_tasks):

Use a `Task.data` (`readthedocs.projects.task.builds.TaskData` object) to store all the data the task needs to work instead of storing it directly using `self.`. This is to allow us a simpler way to perform a clean _before_ (and/or _after_) starting the execution of a new task and avoid potentially sharing state with a previous task executed that may not be able to perform the cleanup. The only thing we need to keep in mind is that when modifying these Celery tasks, we _always_ have to add any new value inside the `self.data.<my-new-attribute>` and not directly `self.<my-new-attribute>` to avoid this problem. In the future, we could implement this protection at a code level if we want to avoid this mistake. See #8815 (comment) See https://docs.celeryproject.org/en/master/userguide/tasks.html#instantiation

…humitos/celery-handlers

…thedocs.org into humitos/celery-handlers-self

…humitos/celery-handlers

I wasn't able to solve this problem in readthedocs/readthedocs.org#8815 and I don't see there is anything in that PR's code producing a Cyclic Import. The code works properly and I haven't seen an exception on it. I support it may be related with a bug in pylint or something similar.

humitos · 2022-02-07T17:55:44Z

I'm not able to solve the cyclic-import. The module reported with the cyclic import by pylint does not make any sense. So, I'm not sure how to solve this. I spent some time trying to figure it out, but I failed. It seems to be a bug in pylint or something like that 🤷🏼

I opened readthedocs/common#103 to disable the cycle-import check in case we want to follow that path because I ran out of ideas about what it could be and the code is working properly locally and in production (with the instance deployed with this branch).

agjohnson · 2022-02-07T19:39:35Z

pycycle is equally not helpful here 😞

(the cyclic import details are missing, this should tell us where the cyclic import is)

Perhaps something more specific to django cyclic imports would help track this down. If we have two tools hinting at cyclic imports, it might be something to address now

agjohnson · 2022-02-07T20:19:58Z

Tuning pycycle a bit more to narrow this down, the error goes away:

(rtfd38) community/ humitos/celery-handlers* 💀  pycycle --here --ignore user_builds,bower_components,docs,media,common,.ropeproject,dockerfiles,node_modules          
Project successfully transformed to AST, checking imports for cycles..
No worries, no cycles here!
If you think some cycle was missed, please open an Issue on Github.
Finished.

🤷❓

Edit: also, noted in chat, pycycle last release was 2017, so not sure if output is really to be trusted here

I wasn't able to find nor solve the problem with the `cyclic-import` error reported by pylint. So, I tried moving the utils functions used by search tasks out from `readthedocs.projects.tasks.utils` into `readthedocs.projects.tasks.search` instead. This made `prospector` and `pylint` to pass locally without reporting any issue.

…thedocs.org into humitos/celery-handlers-self

agjohnson

I finally got a chance to spend some more time with this PR. I reviewed work since our meeting and re-reviewed project tasks modules that I previously skimmed.

At a high level, all of the changes make sense and the new celery task pattern is really easy to follow. I can't say that I did a great job reviewing the code at a low level though. There was a lot of code that was moved alongside refactoring.

So, this is a +1, but also a little bit of a yolo +1 as well. I think we are mostly confident at this point that no obvious issues were introduced here, but we can probably expect a PR of this size to introduce a few non-obvious issues as well.

@humitos does have notes accumulated here on next steps. We should turn some into issues to return to later.

humitos added 2 commits January 13, 2022 16:27

Build process: use Celery handlers

3b758a6

Initial migration/refactor of the build process to make usage of the Celery handlers (`on_success`, `on_failure`, etc)

Build process: more updates and cleanup

b7e4683

humitos added the PR: work in progress Pull request is not ready for full review label Jan 17, 2022

humitos added 11 commits January 17, 2022 13:48

Test: fix import issues for functions moved

61b5e8b

Test: remove tests for deleted "features"

f80d2c7

Test: comment about not passing environment when required

d3f0dde

Build: add self.environment_class variable to sync_repository_task

f1c960b

Test: more import moves

c72f02c

Test: mock fixes and import paths

a5123cc

Cleanup: remove pip cache method helper

640b3d3

We are not caching pip packages anymore.

humitos added 3 commits January 18, 2022 15:37

Cleanup: remove outdated things

a6053d9

clean_build: pass the version object

7b382dc

This way we avoid doing an aditional API call to get the version, since we already have it in memory.

Test: migrate rtd_tests/test_builds.py to new format

4f3e213

I wasn't able to mock the whole task as I wanted yet, but I follow a similar convention than the one used for the migrated test. However, I've simplify them a little.

humitos added 11 commits January 19, 2022 09:38

Build: re-organize files

38a005a

There is no need to have a `base.py` and `BuildTaskBase` because it don't share code with sync repository task.

Build: the reference to the task is just self

5abb5a7

Build: follow the same pattern for localmedia than the others

fdd884d

Skip moving the file if the old artifact path does not exist.

Typo: remove comma (,) from logging line to match the rest

4eaab60

Test: call the task with .delay

65d9d04

That way, it executes the `on_success`, `on_failure`, etc.

Test: check important calls when the build fails

d3e0bb7

New test to check that we are updating the build object via the API, we are sending notifications and more when the build fails.

Cleanup: small fixes and comments

535834b

Test: full test for successful build

d49ef65

It checks all the API calls done to the webs, and that the functions we expect to be called are effectively called.

Test: migrate/update build tests

2fba199

Move tests from `test_celery.py` to the "new system" and remove the invalid ones.

Test: ignore errors when cleaning up test environment

2263a53

agjohnson reviewed Jan 27, 2022

View reviewed changes

This was referenced Jan 27, 2022

Environmet cache: breaks when using build.tools #8604

Closed

Builds fail trying to create_container and/or remove_container #7583

Closed

Notification on duplicated builds #7177

Closed

humitos mentioned this pull request Feb 1, 2022

Celery: use an internal namespace to store build task's data #8874

Merged

humitos added 2 commits February 1, 2022 10:37

Lint: undefined variable

8b237a3

humitos mentioned this pull request Feb 1, 2022

Design doc: collect data about builds #8124

Merged

humitos added 6 commits February 3, 2022 12:12

Merge branch 'master' of github.com:readthedocs/readthedocs.org into …

699fe96

…humitos/celery-handlers

Merge branch 'humitos/celery-handlers' of github.com:readthedocs/read…

e08c00e

…thedocs.org into humitos/celery-handlers-self

Lint

017df68

Test: check for pinned version of pip

a546f59

Lint

4c89df1

Merge branch 'master' of github.com:readthedocs/readthedocs.org into …

fde6a1a

…humitos/celery-handlers

humitos mentioned this pull request Feb 7, 2022

pylint/prospector: disable cyclic-import check readthedocs/common#103

Closed

humitos force-pushed the humitos/celery-handlers branch from aa25ab8 to f8ab09e Compare February 7, 2022 22:09

Merge branch 'humitos/celery-handlers' of github.com:readthedocs/read…

2491854

…thedocs.org into humitos/celery-handlers-self

humitos enabled auto-merge February 7, 2022 22:26

Merge pull request #8874 from readthedocs/humitos/celery-handlers-self

b68730d

agjohnson approved these changes Feb 7, 2022

View reviewed changes

humitos merged commit 17b225f into master Feb 7, 2022

humitos deleted the humitos/celery-handlers branch February 7, 2022 23:39

humitos mentioned this pull request Feb 8, 2022

Reduce verbose logging on generic command failure #8904

Merged

humitos mentioned this pull request Mar 15, 2022

Refactor BuildCommand and move outside doc_builder/ #3541

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build process: use Celery handlers #8815

Build process: use Celery handlers #8815

humitos commented Jan 17, 2022 •

edited

Loading

humitos commented Jan 17, 2022

agjohnson commented Jan 18, 2022 •

edited

Loading

agjohnson left a comment

agjohnson Jan 27, 2022

agjohnson Jan 27, 2022

humitos commented Feb 7, 2022

agjohnson commented Feb 7, 2022 •

edited

Loading

agjohnson commented Feb 7, 2022 •

edited

Loading

agjohnson left a comment

		@@ -25,4 +25,51 @@ def create_application():
		return application


		def register_renamed_tasks(application, renamed_tasks):

	def register_renamed_tasks(application, renamed_tasks):
	# TODO Remove this function when we have fully migrated to the new task paths
	def register_renamed_tasks(application, renamed_tasks):

Build process: use Celery handlers #8815

Build process: use Celery handlers #8815

Conversation

humitos commented Jan 17, 2022 • edited Loading

humitos commented Jan 17, 2022

agjohnson commented Jan 18, 2022 • edited Loading

agjohnson left a comment

Choose a reason for hiding this comment

agjohnson Jan 27, 2022

Choose a reason for hiding this comment

agjohnson Jan 27, 2022

Choose a reason for hiding this comment

humitos commented Feb 7, 2022

agjohnson commented Feb 7, 2022 • edited Loading

agjohnson commented Feb 7, 2022 • edited Loading

agjohnson left a comment

Choose a reason for hiding this comment

humitos commented Jan 17, 2022 •

edited

Loading

agjohnson commented Jan 18, 2022 •

edited

Loading

agjohnson commented Feb 7, 2022 •

edited

Loading

agjohnson commented Feb 7, 2022 •

edited

Loading