-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plugin: add project validation and annotation #294
Conversation
1e11f56
to
6903551
Compare
1bef807
to
46c3125
Compare
So that was odd! My |
d940727
to
31b1e1a
Compare
31b1e1a
to
6ce8502
Compare
6ce8502
to
9062a94
Compare
9062a94
to
385a8ad
Compare
Just skimming a bit, but should the |
@garlick, ah, you're absolutely right; thanks for catching that! If a user specifies a project on the command line the project will be present in the jobspec already, so there won't be a need to add it manually via |
7160e0b
to
437fe00
Compare
OK, I've pushed up some changes to this PR to only post a |
6a62eee
to
f42f3fd
Compare
@grondo if you have some time later this week or next, could you give this PR a look? |
f42f3fd
to
3670a22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmoussa1 - I made a first pass at this PR. Sorry it took so long!
A few comments inline. I also had some general questions:
- Is the use of projects now a requirement? What happens at a site where no projects are configured at all
- Do we need to add
attributes.system.project
to the official list of system attributes in RFC 14? - Does job-list need to support
project
as a job attribute it can return (and so job-archive can store this attribute)
Sorry if all this has been discussed before, I may need a refresher on the use case for job "project" attribute!
|
||
data = {"data": bulk_p_data} | ||
|
||
flux.Flux().rpc("job-manager.mf_priority.rec_p_update", json.dumps(data)).get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as an FYI - the Python RPC
class will take care of encoding a non-string payload with json.dumps()
if the argument is not string, so the json.dumps(data)
is unnecessary here. It doesn't cause a problem though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I did not know this, thanks for pointing this out! There are other parts in this script that also wrap data
with json.dumps()
. I should open a separate PR later that drops it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it doesn't cause a problem to call json.dumps()
first, but it would probably be good to clean up in the future.
src/plugins/mf_priority.cpp
Outdated
@@ -168,15 +168,24 @@ static int get_queue_info ( | |||
} | |||
|
|||
|
|||
static void split_string (char *queues, struct bank_info *b) | |||
static void split_string (char flag, char *list, struct bank_info *b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might make more sense for this function to take two args, a string to split on ,
, and the vector onto which to push the results, rather than passing the whole struct bank_info *
and using a flag to determine which member will receive the new values.
You might then rename the function split_string_and_push_back()
This way, in the future if there is another list on which you want to use this function, it won't require any changes.
E.g. the new calling pattern would be
split_string_and_push_back (projects, &b->projects);
(actually I may have gotten the C++ wrong, but you get the idea)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, callers should be updated in the same commit if possible to avoid a commit that breaks the build (This is useful if anyone ever needs to use git bisect
on the codebase)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you're saying! Adjusting the parameters of the function to just take a string to split on ,
and the vector to push the results cleans the code up quite considerably (no having to pass funky single-character flags); instead it can just look like this:
split_string_and_push_back (projects, &b->projects);
and the function only needs one while
loop. Thanks for the suggestion!
I can make this change as well as add the callers of this function to the same commit (sorry about that!!).
src/plugins/mf_priority.cpp
Outdated
|
||
// split queues comma-delimited string and add it to b->queues vector | ||
b->queues.clear (); | ||
split_string (queues, b); | ||
split_string ('q', queues, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted before, this change should have gone with the previous commit.
if (flux_respond (h, msg, NULL) < 0) | ||
flux_log_error (h, "flux_respond"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "success" response should be delayed until the processing is successful. That is, I'd move this to right before the return
|
||
if (json_unpack_ex (el, &error, 0, "{s:s}", | ||
"project", &project) < 0) | ||
flux_log (h, LOG_ERR, "mf_priority unpack: %s", error.text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a goto error;
on this failure? Or is the intent to just ignore errors unpacking individual projects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nope, that's my mistake - it should probably respond with an error if it fails to unpack any projects sent by the flux-accounting DB because that could affect job validation with a specified project, so there should be a goto error;
here. I will fix this!
src/plugins/mf_priority.cpp
Outdated
@@ -1046,7 +1104,8 @@ extern "C" int flux_plugin_init (flux_plugin_t *p) | |||
if (flux_plugin_register (p, "mf_priority", tab) < 0 | |||
|| flux_jobtap_service_register (p, "rec_update", rec_update_cb, p) | |||
|| flux_jobtap_service_register (p, "reprioritize", reprior_cb, p) | |||
|| flux_jobtap_service_register (p, "rec_q_update", rec_q_cb, p) < 0) | |||
|| flux_jobtap_service_register (p, "rec_q_update", rec_q_cb, p) | |||
|| flux_jobtap_service_register (p, "rec_p_update", rec_p_cb, p) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you already have req_q_update
but my old eyes are finding it very difficult to see the difference between the p
and q
here, so I was just wondering if perhaps the service should be rec_project_update
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah absolutely, sorry for the confusion! I was trying to stay under the 80 character limit here but it can be hard for me to look at too, so I should just break it up 👍 will fix this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! I can sympathize with that! However, no problem breaking the function across multiple lines!
t/t1015-mf-priority-urgency.t
Outdated
flux.Flux().rpc("job-manager.mf_priority.rec_update", json.dumps(bulk_user_data)).get() | ||
flux.Flux().rpc("job-manager.mf_priority.rec_q_update", json.dumps(bulk_queue_data)).get() | ||
EOF | ||
' | ||
|
||
test_expect_success 'send the user information to the plugin' ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These removed lines look suspiciously like an accident, since this commit is mainly about moving test JSON input to separate files. Just a note to double check that you didn't accidentally remove a test and/or test RPC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking. This should be fine, as the JSON payload that stored all of the user information was moved to a separate file, but I think I made it more confusing because in this commit, I re-ordered the test that sent the user information to the plugin to before sending the queue information. Not sure why I did that. 🤦
I think if I just leave that test that sends user information in the same spot, it should fix the commit from looking like I removed a couple tests. I will fix this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok, thanks for the clarification. I figured everything was good since tests were passing, but just thought I'd check!
src/plugins/mf_priority.cpp
Outdated
// add project name to jobspec | ||
json_t *project_name = json_object (); | ||
if (!project_name) | ||
return -1; | ||
|
||
if (json_object_set_new (project_name, | ||
"attributes.system.project", | ||
json_string (project)) < 0) { | ||
json_decref (project_name); | ||
return -1; | ||
} | ||
|
||
// post jobspec-update event | ||
if (flux_jobtap_event_post_pack (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"jobspec-update", "O", | ||
project_name) < 0) { | ||
json_decref (project_name); | ||
return -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest just using flux_jobtap_event_post_pack()
to pack the project name instead of separately constructing the object, e.g.
// post jobspec-update event
if (flux_jobtap_event_post_pack (p,
FLUX_JOBTAP_CURRENT_JOB,
"jobspec-update",
"{s:{s:{s:s}}}",
"attributes",
"system",
"project", project) < 0) {
return -1;
}
Then you do not have to worry about the memory for the temporary JSON object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh neat! I didn't know you could do this; this cleans up the function a lot. Thanks for the suggestion!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmoussa1: D'oh! I gave bad advice here by not paying attention and forgetting how the jobpsec-update
event works. Sorry! You can ignore this comment. My suggestion should have been:
// post jobspec-update event
if (flux_jobtap_event_post_pack (p,
FLUX_JOBTAP_CURRENT_JOB,
"jobspec-update",
"{s:s}",
"attributes.system.project", project) < 0) {
return -1;
}
I apologize for the confusion 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omg! i think this worked, and it solved the issue of having to place the call to jobspec-update
in a place like job.state.run
and instead it can go in job.validate
where it would make more sense. This feels like a eureka moment! 😆 Thanks for catching this and don't worry about it at all, I'm glad it worked 🙌
I'll make the appropriate changes to the commit and force push in a bit.
src/plugins/mf_priority.cpp
Outdated
if (project == NULL) { | ||
project = const_cast <char *> | ||
(bank_it->second.def_project.c_str ()); | ||
if (update_jobspec_project (p, userid, bank, project) < 0) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, | ||
"failed to update jobspec " | ||
" with project name"); | ||
|
||
return -1; | ||
} | ||
} | ||
} | ||
} | ||
|
||
// if using a default project, add it to jobspec | ||
if (project == NULL) { | ||
if (update_jobspec_project (p, userid, bank, project) < 0) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, "failed to update " | ||
"jobspec with project name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any way to combine this duplicated code? I might need help understanding why update_jobspec_project()
needs to be called two different places like this here, sorry!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're so right! Thanks for pointing this out. I don't know why I have this code duplicated; it doesn't need to be. I think I must have thought that if a job is submitted and held in PRIORITY because a user is waiting for their information to be added to the plugin, it required a special case in priority_cb ()
to add the project name, but that's not the case. Sorry for the confusion!
I'll remove this duplicate code and add a test which shows that the project name gets updated in the case where a user's job is held in PRIORITY as the plugin waits for (and then receives) information from the flux-accounting DB.
A job could enter the PRIORITY state multiple times, so would this mean the We may want to think about how to issue the |
Hm, I don't why the pr-validator thinks all your commits have empty bodies, but I'll look into it and get that action fixed. |
Yes, the first set of tests in test_expect_success 'submit a job under a user before plugin gets updated' '
job0=$(flux python ${SUBMIT_AS} 1003 hostname) &&
test $(flux jobs -no {state} ${job0}) = PRIORITY
'
test_expect_success 'add user to flux-accounting DB and to plugin; job transitions to RUN' '
flux account -p ${DB_PATH} add-user --username=user1003 --userid=1003 --bank=account1 &&
flux account-priority-update -p ${DB_PATH} &&
test $(flux jobs -no {state} ${job0}) = RUN
'
test_expect_success 'check that project gets updated for submitted job' '
flux job info $job0 eventlog > eventlog.out &&
grep "{\"attributes.system.project\":\"\*\"}" eventlog.out &&
flux job cancel $job0
' In the above, a job gets submitted before any data is loaded to the plugin. The user is then added to the flux-accounting DB and the DB info is sent to the plugin, the job transitions to RUN, and the project name is checked in the job's main eventlog to look for the default project name. |
t/t1028-mf-priority-projects.t
Outdated
test_expect_success 'submit a job under a project that does not exist' ' | ||
test_must_fail flux python ${SUBMIT_AS} 1001 --setattr=system.project=projectFOO \ | ||
hostname > project_dne.out 2>&1 && | ||
test_debug "project_dne.out" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed this while trying to run this test with -d -v
-- test_debug()
takes a command to run, not a file. Since there's no command project_dne.out
, this line causes a failure. Change to
test_debug "cat project_dne.out" &&
here and in the other use of test_debug
in this sharness test. (Also perhaps audit the rest of the testsuite)
t/t1028-mf-priority-projects.t
Outdated
flux account-priority-update -p ${DB_PATH} && | ||
test $(flux jobs -no {state} ${job0}) = RUN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a slight race condition here (and actually above in the check for PRIORITY) because of the eventual consistency properties of job-list
. It might be safer to check the eventlog and ensure the expected event has been posted.
E.g. instead of checking that the job is in the PRIORITY state above, check that the last event in the eventlog is depend
.
and instead of checking for the RUN
state here, wait for an alloc
event...
Sorry, still getting clarification here! In this test though the project name of I apologize if I'm still not getting it and I appreciate your patience! |
No need to apologize, I think I am doing a poor job explaining the requirements here, so I'm sorry! Here is the requirement for how each job needs to be associated with a project, taken from conversation with @ryanday36 in #255 (feel free to correct me I am misunderstanding this Ryan): If a user is added to the DB without specifying any projects, a default project If a user is added to the DB with a specified project name, any job submitted without specifying a project name will fall under that project name. Does that explanation make sense? So, I believe you are absolutely right, in the event where a user does have a default project, the plugin has to assign this project to pending jobs while the plugin waits for flux-accounting information. I was able to confirm that this was not yet supported with the most current push of this branch by submitting a job under a user/bank before the plugin gets any data, adding the user with a specific default project, updating the plugin, and seeing that the project name for the user/bank does not get updated to the correct default project. To fix this, I can place a check in I also wrote a set of tests that submits two jobs from two different users before the plugin gets any information (so both jobs are pending). One user gets added with no projects specified, and the other user gets added with one project specified (which becomes their default). Once the plugin gets loaded with the flux-accounting DB information, the jobs get updated. The job from the user/bank with no projects specified keeps its project as Let me know if this explanation does or does not make sense. I'll push up these changes (as well as the changes based on your feedback above) shortly. |
35cef73
to
2ba8c04
Compare
Ah, ok. I guess my naive question would be why not just not set a project when no project is set instead of using |
If I am understanding your question correctly, |
Yeah, sorry I meant just don't set |
Oh, whoops. I misunderstood your question, sorry. My understanding is that we want a default whether a user/bank has a default project or not. Here is a comment from @ryanday36 in #255 when projects were first being added to the flux-accounting DB:
|
Ok, thanks! I didn't read that issue carefully enough! 🤦 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple other comments I had while looking at this again. Not a full review.
src/plugins/mf_priority.cpp
Outdated
while (s_stream.good ()) { | ||
std::string substr; | ||
getline (s_stream, substr, ','); // get string delimited by comma | ||
b->queues.push_back (substr); | ||
vec->push_back (substr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a stray indent here?
src/plugins/mf_priority.cpp
Outdated
b->projects = { "*" }; | ||
b->def_project = "*"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the flux-accounting info is missing, shouldn't the project be treated like the bank info, i.e. be somehow recorded as unknown? That is, if the bank info is missing, we don't know if the default project should be "*"
since the real default should be set in the accounting database. In fact, should the default project be something encoded in the database and left our of the plugin itself (that way other sites could make their own decision what a default project should be, or if there should even be a default)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great question. I can change this assignment to be set to indicate that the project for the user/bank's job is unknown until it gets its information - just re-ran the tests after changing these values to something like "DNE"
, which is the same as the bank information, and I think everything still works as expected. 👍
In the user table in the flux-accounting DB, the default project for any user/bank when being added is *
(if one is not specified at user creation time; the default project can be changed later if desired):
"""
projects tinytext DEFAULT '*' NOT NULL ON CONFLICT REPLACE DEFAULT '*',
default_project tinytext DEFAULT '*' NOT NULL ON CONFLICT REPLACE DEFAULT '*',
"""
and *
is added as a valid project name to the DB when the DB is first created:
conn.execute(
"""
CREATE TABLE IF NOT EXISTS project_table (
project_id integer PRIMARY KEY AUTOINCREMENT,
project tinytext NOT NULL,
usage real DEFAULT 0.0 NOT NULL
);"""
)
conn.execute("INSERT INTO project_table (project) VALUES ('*')")
Since this is the case, maybe we don't even need to assign values for project
and def_project
here? Removing these assignments should also be fine (at least, the the testsuite passes). What do you think? Sorry if I am misunderstanding you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would look more correct to me. If the default is already set in the database, then the code should use that default and not a copy, since that might be redundant and also could cause the values to diverge if someone updates the database and forget to do the same in the plugin. This is why it is usually bad practice to copy a default value into more than one place.
Of course, I'm not all that familiar with the code here, so maybe it made more sense in this situation, so I was just posing the question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right and it was a good question to ask, so I'll go ahead and make this change here. I'll remove the assignment of project
and default_project
. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just pushed up a change (really just dropped this commit) that added this assignment of a project/default project in the case where a user/bank submits a job before any information is loaded to the plugin.
Problem: The split_string () helper function can only accept a list of queues to parse when unpacking user/bank information from the flux-accounting DB, but there also exists a need to be able to parse a list of projects as well. Since both lists can be parsed the same way, this helper function should be expanded. Expand the split_string () helper function to accept either a list of queues or a list of proects to parse through when unpacking user/bank information.
Add an unpack of the "projects" and "def_project" columns for each association when unpacking them in the rec_update_cb () function. Add a new callback to unpack information from the project_table and store it in a vector of projects to be used for project validation and annotation.
Add information from the project_table and the projects column in the association_table to the bulk update script that sends user, bank, and queue information to the multi-factor priority plugin.
Add "active", "projects", and "def_project" key-value pairs to the fake payloads sent to the multi-factor priority plugin in a number of the sharness tests.
Problem: The sample user payloads sent to the plugin grow longer than wanted in the sharness test files with the addition of the "project" and "default_project" fields. For the tests that have these sample payloads, create a new folder in the t/ directory that stores these JSON payloads permanently in a file instead of creating them within the test.
Add project validation to the validate_cb () function, which checks that if a project is specified by an association when a job is submitted, it: 1) is a valid project to submit jobs under, and 2) is a valid project for the association to submit jobs under. If either one of these checks fail, the job is rejected with a message including the project name.
2ba8c04
to
7d01df6
Compare
Add a helper function to mf_priority.cpp that adds the project name for a user/bank's job to the jobspec via jobspec-update under attributes.system.project.
Problem: If a user submits a job under a default project, there is no way to tell which project they submitted their job under because it does not get posted to jobspec. Add a jobspec-update of the user/bank's default project name when the job is in the job.new callback. In the event where a user/bank submits a job before any information is loaded to the priority plugin and their job is held in PRIORITY state, add a jobspec-update of the user/bank's default project name in job.state.priority IF their default project name is not "*" ("*" is the default project name if a user is added to the database with no projects).
Add the projects and def_project fields to the list of information returned for each association in the plugin.query callback. Add the projects and def_project key-value pairs to the expected files for t1019-mf-priority-info-fetch.t.
Problem: There are no tests for project validation and annotation in the multi-factor priority plugin. Add some tests.
Problem: Now that the priority jobtap plugin supports jobspec-updates of both banks and projects, the "jobspec-update" event in the eventlog holds both of these updates on one line, which breaks the assumption in t1029-mf-priority-default-bank.t that the jobspec-update of a user's default bank will be surrounded by brackets {}. Remove the outer brackets from the tests in t1029-mf-priority-default-bank.t that grep for a certain bank.
7d01df6
to
09063e0
Compare
Codecov Report
@@ Coverage Diff @@
## master #294 +/- ##
==========================================
- Coverage 81.04% 80.85% -0.19%
==========================================
Files 20 20
Lines 1419 1499 +80
==========================================
+ Hits 1150 1212 +62
- Misses 269 287 +18
|
I've been able to make some decent progress on adding an external class for user/bank (i.e accounting) information that I think would really clean up this PR, so I think I'll just close this and re-propose something else at a later date. |
Background
Mentioned in #293, there is a requirement to be able to associate jobs with "projects," which allow multiple users who belong in different banks to submit jobs under one entity. Currently, the flux-accounting database supports adding projects and adding associations to certain projects, but this information is not relayed to the multi-factor priority plugin, so there is no way to associate a job with a project.
This PR adds project support in the priority plugin. The two big components that make up this PR are 1) providing some validation for projects if one is specified during
job.validate
, and 2) updating the jobspec with the project name in the PRIORITY event of a job.Projects (along with the default project) for each association are now sent via
flux account-priority-update
to the plugin's internal map, where they are unpacked and stored in each association'sbank_info
object. The project information for each association is also added to the information returned in theplugin.query
callback.When an association specifies a project during job submission with
--setattr=system.project=my_project
, the project name is checked to both exist and be a valid project for the user to submit jobs under. Failure in either of these checks results in a job rejection with an appropriate message as to why the job was rejected.If a user/bank submits a job under a default project, a
jobspec-update
event is posted underattributes.system.project
in the PRIORITY event to add the user/bank's default project. The project name can then be found in the job's eventlog: