
job-info: support new update-lookup and update-watch service #5467

Merged: 9 commits into flux-framework:master on Oct 24, 2023

Conversation

@chu11 (Member) commented Sep 22, 2023

per discussion in #5451

Problem: In the future, several services will need to know a job's resources and know the updates that would apply to them.
This would currently require users to read R, watch the eventlog, and then apply resource-update events as they happen. It would be nice if a service did this as there will be multiple users.

Solution: Support a new job-info.update-watch streaming service. It currently supports only the key "R", but can be extended to other keys in the future.

The service will read R and the eventlog for a job and apply all resource-update changes as needed. This initial "R" will be sent back to the caller. If the job has completed, the streaming RPC ends. If not, the eventlog will be watched for future resource-update events. On each new resource-update event, a new R will be streamed back to the caller. This continues until the job ends or the caller cancels the stream.
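For illustration only (not code from this PR), a caller of the streaming service might look roughly like the sketch below. The request payload fields ("id", "key", "flags") are assumptions modeled on other job-info RPCs and may not match the final protocol.

#include <flux/core.h>
#include <stdio.h>

int main (void)
{
    flux_t *h = flux_open (NULL, 0);
    flux_jobid_t id = FLUX_JOBID_ANY;      /* replace with a real jobid */
    flux_future_t *f;
    const char *result;

    if (!h)
        return 1;
    /* FLUX_RPC_STREAMING marks this as a streaming request */
    f = flux_rpc_pack (h, "job-info.update-watch", FLUX_NODEID_ANY,
                       FLUX_RPC_STREAMING,
                       "{s:I s:s s:i}", "id", id, "key", "R", "flags", 0);
    /* First response: R with all prior resource-update events applied.
     * Each later response: a new R after another resource-update event.
     * The loop exits when the job ends or the watch is canceled. */
    while (f && flux_rpc_get (f, &result) == 0) {
        printf ("%s\n", result);
        flux_future_reset (f);             /* wait for the next response */
    }
    flux_future_destroy (f);
    flux_close (h);
    return 0;
}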


Notes:

  • It might be wise to add some type of "caching" of sorts as multiple callers may want to watch the same job's R value. I didn't do that at the moment. I figured that could be an optimization for the future.

  • Per discussion in #5451 ("Idea: job-info: add streaming RPC to "watch" R"), future options could include a "uniq" option as well as a "full history" option.

@chu11 (Member, Author) commented Sep 23, 2023

It might be wise to add some type of "caching" of sorts as multiple callers may want to watch the same job's R value. I didn't do that at the moment. I figured that could be an optimization for the future.

As I think about this more, is the caching critically important? It would definitely save some unnecessary lookups. Lemme see how hard it'll be to add right now.

Note: I think such support would be in a follow-up PR; let's look at this PR as a "first round implementation".

@chu11 chu11 force-pushed the issue5451_job_info_watch_R branch 10 times, most recently from 1b75bba to e01ce52 Compare September 26, 2023 04:31
@chu11 (Member, Author) commented Sep 26, 2023

Although a few parts were a little tricky, it wasn't horrible to add the "multiple watchers under 1 eventlog-watch" support.

The commits could be squashed at a later time, but for now I left the separate commits in so the review can (maybe) be easier.

* to:
* - execution.expiration
*/
if (!streq (key, "execution.expiration"))
Contributor:

Similar to the other PR, the actual key is just expiration

Member Author (chu11):

interesting, I now see the resource-update context is formatted differently in RFC21 than jobspec-update.

Did we have a specific reason for that? It seems possibly unwise long term for both contexts to be formatted differently? On the other hand I can understand that R should have far more limited update possibilities.

Contributor:

Did we have a specific reason for that? It seems possibly unwise long term for both contexts to be formatted differently?

I think it was done this way because jobspec and R are actually quite different - R is not a simple hierarchical document like jobspec. E.g., when grow/shrink or similar are added, it will be more useful to have the added or deleted R and not a wholesale replacement of execution.R_lite.

@garlick (Member) commented Sep 27, 2023

Do users want to see the whole history starting with the original R, or just current R + updates?

If the latter, would it make sense to have this RPC check the streaming bit, and if not set, return the current R and done? Then maybe #5464 and other users like sched hello in #5472 wouldn't need to replay the eventlog on their own?

(I haven't looked at this - just basing this comment on your summary in #5472 - so sorry if it's off base)

Possibly the method name would need adjustment then.

@chu11 (Member, Author) commented Sep 27, 2023

Do users want to see the whole history starting with the original R, or just current R + updates?

The assumption in this PR is the latter. It returns R in its current form (R + updates currently in eventlog), and then streams further updates to R that occur later.

I left whole history as a future option / flag, as it wasn't clear there was a use case at the moment (other than "would like to see it just to see it").

If the latter, would it make sense to have this RPC check the streaming bit, and if not set, return the current R and done? Then maybe #5464 and other users like sched hello in #5472 wouldn't need to replay the eventlog on their own?

That isn't a terrible idea, makes sense. I guess the (original) theory was that whoever called this service would also want to know if R ever changed in the future. But I'm guessing there are a few cases where that's not needed?

And #5464 could totally use this service. I initially just mirrored the "jobspec" equivalent code in #5464 because the idea was to not put too much burden on the broker service by parsing the eventlog a bunch. If we wanted to go down the path where we put the burden on the job-info module, we should probably support jobspec as well.

@chu11 (Member, Author) commented Sep 27, 2023

re-pushed, updating "execution.expiration" to "expiration", as I had misread the RFC

@chu11 (Member, Author) commented Sep 27, 2023

Hmmm, got this in one builder (gcc-12,content-s3,distcheck) with one of my new tests. I have no idea how an EAGAIN could even happen. Will have to think about this.

Edit: noticed a minor && chain issue below. Dunno if there's some raciness with the background processes.

  expecting success: 
  	flux submit --wait-event=start sleep inf > job8.id
  	${UPDATE_WATCH} $(cat job8.id) R > watch8A.out &
  	watchpidA=$! &&
  	wait_update_watchers 1 &&
  	kvspath=`flux job id --to=kvs $(cat job8.id)` &&
  	flux kvs eventlog append ${kvspath}.eventlog resource-update "{\"expiration\": 100.0}" &&
  	${WAITFILE} --count=2 --timeout=30 --pattern="expiration" watch8A.out
  	${UPDATE_WATCH} $(cat job8.id) R > watch8B.out &
  	watchpidB=$! &&
  	wait_update_watchers 2 &&
  	kvspath=`flux job id --to=kvs $(cat job8.id)` &&
  	flux kvs eventlog append ${kvspath}.eventlog resource-update "{\"expiration\": 200.0}" &&
  	${WAITFILE} --count=3 --timeout=30 --pattern="expiration" watch8A.out &&
  	${WAITFILE} --count=2 --timeout=30 --pattern="expiration" watch8B.out
  	${UPDATE_WATCH} $(cat job8.id) R > watch8C.out &
  	watchpidC=$!
  	wait_update_watchers 3
  	${WAITFILE} --count=1 --timeout=30 --pattern="expiration" watch8C.out &&
  	flux cancel $(cat job8.id) &&
  	wait $watchpidA &&
  	wait $watchpidB &&
  	wait $watchpidC &&
  	test $(cat watch8A.out | wc -l) -eq 3 &&
  	test $(cat watch8B.out | wc -l) -eq 2 &&
  	test $(cat watch8C.out | wc -l) -eq 1 &&
  	head -n1 watch8A.out | jq -e ".execution.expiration == 0.0" &&
  	head -n2 watch8A.out | tail -n1 | jq -e ".execution.expiration == 100.0" &&
  	tail -n1 watch8A.out | jq -e ".execution.expiration == 200.0" &&
  	head -n1 watch8B.out | jq -e ".execution.expiration == 100.0" &&
  	tail -n1 watch8B.out | jq -e ".execution.expiration == 200.0" &&
  	tail -n1 watch8C.out | jq -e ".execution.expiration == 200.0"
  
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 0.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 100.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 0.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 100.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 200.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 100.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 200.0, "nodelist": ["fv-az505-771"]}}
  {"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "0"}}], "starttime": 0.0, "expiration": 200.0, "nodelist": ["fv-az505-771"]}}
  update_watch_stream: job-info.update-watch: Resource temporarily unavailable
  update_watch_stream: job-info.update-watch: Resource temporarily unavailable
  not ok 8 - job-info: update watch with multiple watchers works

@grondo (Contributor) commented Sep 27, 2023

Do users want to see the whole history starting with the original R, or just current R + updates?

If the latter, would it make sense to have this RPC check the streaming bit, and if not set, return the current R and done?

Stepping back a bit, there are two use cases here:

  • A user that wants to see the current R + updates. The use case here is a user running flux job info R or a scheduler fetching R after a reload/restart, etc. As noted in #5472 (cache R in the job manager), it would be nice if job-info.lookup just returned this value so we don't even have to update callers...
  • A user that wants to see the current R and then follow updates as they occur. The use case here is any entity that is monitoring R for changes, e.g. for expiration updates. This includes the job-exec module (so it can update the timer at which it terminates the job), the job shell (to properly implement the --signal=SIG@TIME option), and the scheduler of a subinstance (to update the internal expiration of its resource set).

@chu11 chu11 force-pushed the issue5451_job_info_watch_R branch 2 times, most recently from bbca20b to 45f7e68 Compare September 27, 2023 23:19
@chu11 (Member, Author) commented Sep 27, 2023

Per offline discussion, I re-worked this PR.

  • Like before, there is a job-info.update-watch target that returns the R + updates result, then streams R updates as they occur. Multiple watches on the same jobid + R share the same internal eventlog-watch.

  • There is a job-info.update-lookup target, which looks up R and the eventlog and returns the current R result, and that's it. If there is already an update-watch for that jobid, it can return the cached result.

Because of all these changes, I had to squash my previous changes, so unfortunately there's just one giant commit adding update-lookup and update-watch; hopefully it's not too hard to review :P
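As another rough illustration (again not code from this PR, with the same assumed payload fields and the h/id variables from the earlier sketch), the non-streaming update-lookup target would be a one-shot RPC:

/* One-shot fetch of R with all resource-update events applied so far. */
flux_future_t *f;
const char *result;

if (!(f = flux_rpc_pack (h, "job-info.update-lookup", FLUX_NODEID_ANY, 0,
                         "{s:I s:s s:i}", "id", id, "key", "R", "flags", 0)))
    return -1;
if (flux_rpc_get (f, &result) < 0) {
    flux_future_destroy (f);
    return -1;
}
printf ("%s\n", result);                   /* current R; nothing to cancel */
flux_future_destroy (f);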

@chu11 chu11 changed the title job-info: support new update-watch service job-info: support new update-lookup and update-watch service Sep 28, 2023
@chu11 (Member, Author) commented Sep 28, 2023

Hmmm, several builders are consistently failing with:

Run docker run --rm --privileged aptman/qus -s -- -p --credential aarch64
Unable to find image 'aptman/qus:latest' locally
docker: Error response from daemon: Head "https://registry-1.docker.io/v2/aptman/qus/manifests/latest": EOF.
See 'docker run --help'.
Error: Process completed with exit code 125.

under the setup qemu-user-static chunk. Perhaps something happened after #5469 merged?

@grondo (Contributor) commented Sep 28, 2023

My guess is the docker manifest for that repo is corrupted. Same issue in flux-sched.

Problem: In t2231-job-info-eventlog-watch.t, there is a missing
pipe between an echo of a json payload and the 'rpc' helper tool,
so the payload was not actually being sent.  The test was just echoing
the name of the command.

Add a pipe between the echo and the 'rpc' helper tool so the correct
payload is sent and tested.
Problem: A test in t2231-job-info-eventlog-watch.t does not set the
streaming flag in an invalid input test.  This test happens to work
because the code's "is the input valid" check is done before the
"is the streaming flag set" test.  But for full correctness, the
streaming flag should be set.

Solution: Use the 'rpc_stream' tool in the test, which sets the
streaming flag.
Problem: In several headers in job-info, the comment indicating the
end of an ifdef block had the wrong macro name.

Correct the macro name in the comments.
Problem: A local function in job-info was not declared static.

Make it static.
Problem: It would be convenient if the several ctx destroy functions
preserved errno so it doesn't have to be done with a "save_errno"
pattern in cleanup paths.

Preserve errno in ctx destroy functions.  Clean up code in a few
locations where possible.
Problem: In guest_watch.c there is a utility function called
guest_msg_pack() that allows us to create a message using specific
credentials from another message.  It's useful when messages are sent
within the module and you want the original user's request credentials
to be copied into those new messages.

This function could be useful in future work.

Solution: Move the function into a new internal util library and
rename it cred_msg_pack().
@codecov codecov bot commented Oct 24, 2023

Codecov Report

Merging #5467 (9885340) into master (640134f) will decrease coverage by 0.05%.
The diff coverage is 75.39%.

@@            Coverage Diff             @@
##           master    #5467      +/-   ##
==========================================
- Coverage   83.51%   83.46%   -0.05%     
==========================================
  Files         485      487       +2     
  Lines       81543    81918     +375     
==========================================
+ Hits        68099    68375     +276     
- Misses      13444    13543      +99     
Files Coverage Δ
src/modules/job-info/guest_watch.c 73.72% <100.00%> (-1.15%) ⬇️
src/modules/job-info/watch.c 68.93% <88.88%> (+2.66%) ⬆️
src/modules/job-info/job-info.c 75.32% <82.35%> (+1.55%) ⬆️
src/modules/job-info/util.c 66.66% <66.66%> (ø)
src/modules/job-info/update.c 75.69% <75.69%> (ø)

... and 11 files with indirect coverage changes

Problem: In watch and guest_watch, there is some common eventlog
parsing helper code that could be useful in future job-info additions.

Solution: Move the helper function eventlog_parse_next() into
util and add a new helper function eventlog_parse_entry_chunk().
Update callers in watch.c and guest_watch.c appropriately.
Problem: In the future, several services will need to know a job's
resources and know the updates that would apply to them.  This would
currently require users to read R, read the eventlog, and then apply
`resource-update` events to R.  Some other users would also need to
know when there are changes to R, necessitating watching the eventlog
for future resource-update changes.

It would be nice if a service did this as there will be multiple users.

Solution: Support a new job-info.update-lookup service and
job-info.update-watch streaming service.  It currently supports only
the key "R", but can be extended to other keys in the future.

The job-info.update-lookup service will read R and the eventlog for
a job.  It then applies all resource-update changes to R and returns
the result.

The job-info.update-watch service will do the same as the above, but if
the job is not completed, it will continue to watch the eventlog for
future resource-update events.  On each new resource-update event,
a new R will be streamed back to the caller.  This continues until
the job ends or the caller cancels the stream.

Fixes flux-framework#5451
Problem: There is no coverage for the new job-info.update-lookup and
job-info.update-watch services.

Add tests in the new t2233-job-info-update.t file and add helper
tools update_lookup and update_watch_stream.
@grondo (Contributor) left a comment:

I've been basing work to handle R updates in the job-exec service and job shell on this PR, and this seems to be working. My inclination is to merge this PR now and make any needed changes and improvements as we need them.

@garlick (Member) left a comment:

I found a few things, but actually, if this is working, maybe it's fine to merge as is and open an issue to return and clean up the two bigger things I called out:

  • use eventlog class to parse eventlogs
  • use flux_msglist_disconnect() and flux_msglist_cancel() instead of duplicating that functionality.

Since it may take a bit of time to audit that code and make sure those changes don't break anything, merge now+fix later may be more expedient.

Comment on lines +35 to +42
if (!(payload = json_vpack_ex (NULL, 0, fmt, ap)))
goto error;
if (!(payloadstr = json_dumps (payload, JSON_COMPACT))) {
errno = ENOMEM;
goto error;
}
if (flux_msg_set_string (newmsg, payloadstr) < 0)
goto error;
Member:

This function is just moved in this PR, not created, but maybe for later:

The json_vpack_ex() + json_dumps() + flux_msg_set_string() sequence could be replaced with one call to flux_msg_vpack().
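A minimal sketch of that consolidation (not code from this PR; newmsg, fmt, and ap are the variables from the function shown above):

/* Pack the payload directly into the message, replacing the
 * json_vpack_ex() + json_dumps() + flux_msg_set_string() sequence. */
if (flux_msg_vpack (newmsg, fmt, ap) < 0)
    goto error;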

@@ -708,19 +708,6 @@ static int main_namespace_lookup (struct guest_watch_ctx *gw)
return rv;
}

static bool eventlog_parse_next (const char **pp, const char **tok,
@garlick (Member) commented Oct 24, 2023

Three comments on this:

  • use a different function prefix for local helpers since eventlog_ is already used by the external eventlog class
  • would it be better to use eventlog_decode() from that class to parse input into a JSON array and then iterate over that?
  • is a "chunk" the same as an "eventlog entry"? Just call it that if so.

@@ -170,6 +170,8 @@ job_info_la_SOURCES = \
job-info/watch.c \
Member:

Commit message for "job-info: support new update-lookup/watch service": capitalize "it" in the 4th paragraph of the body.

uc->type = type;
uc->id = id;
if (!(uc->key = strdup (key))) {
errno = ENOMEM;
Member:

strdup already sets ENOMEM on failure

Comment on lines +321 to +333
while (eventlog_parse_next (&input, &tok, &toklen)) {
json_t *entry = NULL;
const char *name;
json_t *context = NULL;
if (eventlog_parse_entry_chunk (uc->ctx->h,
tok,
toklen,
&entry,
&name,
&context) < 0) {
errmsg = "error parsing eventlog";
goto error;
}
Member:

Yeah, as noted earlier, it looks like this could be replaced with the following (with appropriate error handling):

eventlog = eventlog_decode (eventlog_str);
json_array_foreach (eventlog, index, entry) {
    eventlog_entry_parse (entry, NULL, &name, &context);
    ...
}
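Filled out slightly (still only a sketch; eventlog_str stands in for whatever the surrounding function actually uses, and errmsg and the error label are borrowed from the code shown above):

json_t *eventlog;
json_t *entry;
json_t *context;
const char *name;
size_t index;

if (!(eventlog = eventlog_decode (eventlog_str))) {
    errmsg = "error decoding eventlog";
    goto error;
}
json_array_foreach (eventlog, index, entry) {
    if (eventlog_entry_parse (entry, NULL, &name, &context) < 0) {
        json_decref (eventlog);
        errmsg = "error parsing eventlog entry";
        goto error;
    }
    /* e.g. apply the context of each "resource-update" event to R here */
}
json_decref (eventlog);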

Comment on lines +654 to +657
if (flux_respond_error (uc->ctx->h, cmpmsg, ENODATA, NULL) < 0)
flux_log_error (uc->ctx->h,
"%s: flux_respond_error",
__FUNCTION__);
Member:

Response should only be sent on cancel, not disconnect.


uc = zlist_first (ctx->update_watchers);
while (uc) {
update_watch_cancel (uc, msg, cancel);
Member:

Maybe it would be better to replace update_watch_cancel() with

if (cancel)
    flux_msglist_cancel (ctx->h, uc->msglist, msg);
else
    flux_msglist_disconnect (ctx->h, uc->msglist, msg);

@grondo (Contributor) commented Oct 24, 2023

I've just set MWP on behalf of @chu11

@mergify mergify bot merged commit fce7220 into flux-framework:master Oct 24, 2023
31 checks passed
@chu11 chu11 deleted the issue5451_job_info_watch_R branch November 20, 2023 01:48
@chu11 chu11 mentioned this pull request Nov 21, 2023