Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document Track Processor Interface #1613

Merged
merged 8 commits into from
Jan 30, 2023

Conversation

DJRickyB
Copy link
Contributor

@DJRickyB DJRickyB commented Nov 7, 2022

Closes #1534

In this PR, I have done the following:

  • Moved advanced topics to their own page
  • Renamed each advanced topic to hint at what problem they are attempting to solve (e.g. Template Language becomes Templating Track Files with Jinja2)
  • Documented the TrackProcessor interface, including the pitfall of not having Allow tracks to enable or disable the DefaultTrackPreparator #1257 yet.

A note for reviewers, the "meat" of the change here is at the bottom of advanced.rst, which by default is collapsed by GH as being too large. The rest is mostly extracted/moved content and some cleaned up typing in doc strings.

@DJRickyB DJRickyB self-assigned this Jan 25, 2023
@DJRickyB DJRickyB requested a review from a team January 25, 2023 20:07
@DJRickyB DJRickyB added the :Docs Changes to the documentation label Jan 25, 2023
@DJRickyB DJRickyB added this to the 2.7.1 milestone Jan 25, 2023
@DJRickyB DJRickyB marked this pull request as ready for review January 25, 2023 21:27
Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good, thank you!

It's basically LGTM from me (I only looked at the advanced.rst after Controlling Task Execution Behavior With Custom Schedulers, I assume the rest are the same). The only additional ask is whether we could have a high level description of DocumentSetPreparator (can also be done in a follow up PR, since this has been in the making for a while).

self.track = None

def on_after_load_track(self, track):
...
Copy link
Contributor

@dliappis dliappis Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the three dots here make me think that we are deliberately skipping code to make the example more readable. The actual codebase as of now though doesn't defined this method at all, so maybe we should just replace ... with pass to prevent other readers of these docs wondering the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, sorry, there is some leftover here. I intended to document the duck-typed version of the default processor, but you can see it is also still subclassing here. I had pickling issues at the point where I tried to make TrackProcessor an ABC, and so partially reverted that change


In this case, you can see by default we do nothing here for ``on_after_load_track`` to mutate the track, but yield a tuple of the ``prepare_docs``
function and its parameters for each corpus in the track ``corpora``. After this is called, these tuples are given to each TrackProcessor worker actor
to be executed in parallel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we explain a bit more what the expected params typically are? e.g. here we see the config object, the track, which are easy to understand, but it's unclear what the DocumentSetPreparator does (passed via preparator). A high level explanation would be beneficial in my opinion.

@DJRickyB DJRickyB requested a review from dliappis January 27, 2023 15:56
Copy link
Contributor

@dliappis dliappis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thank you!

@dliappis dliappis merged commit d5eae56 into elastic:master Jan 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Docs Changes to the documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document Track Processor Interface
2 participants