-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge task table and task log into a single table #30
Conversation
4c27110
to
f34bc8e
Compare
LOG_ERR("Result table object callback"); | ||
if (reply->type == REDIS_REPLY_STRING) { | ||
/* If we found the object, get the spec of the task that created it. */ | ||
CHECK(reply->len == UNIQUE_ID_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a DCHECK
2ed87a9
to
25a3815
Compare
25a3815
to
7a88268
Compare
return task_result; | ||
} | ||
/* Parse the task struct's fields. */ | ||
int32_t state = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in some places we use int32_t
, but shouldn't we actually use the enum type scheduling_state
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should. I tried to replace them all, but I guess I missed some :)
int32_t state = 0; | ||
node_id node = NIL_ID; | ||
task_spec *spec = NULL; | ||
for (int i = 0; i < num_redis_replies; i = i + 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add DCHECK(num_redis_replies % 2 == 0)
? Or maybe even DCHECK(num_redis_replies == 6)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
spec = malloc(value->len); | ||
memcpy(spec, value->str, value->len); | ||
} else { | ||
LOG_ERR("Found unexpected %s field in task log", key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be fatal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I guess so.
task *task_result; | ||
if (num_redis_replies == 0) { | ||
/* If there was no information about this task, an empty task. */ | ||
task_result = alloc_nil_task(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this case ever happen? If not, then perhaps this should be a fatal error. If yes, then maybe we should return NULL
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, unless we decide to do the update commands to the task and result tables in parallel. I'd like to keep it non-fatal so we can test that the error triggers, but I'll change it to return NULL.
if (reply->type != REDIS_REPLY_ARRAY) { | ||
LOG_ERR("Expected Redis array, received type %d %s", reply->type, | ||
reply->str); | ||
exit(-1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the convention for when we should use CHECK
versus LOG_ERR
versus LOG_ERR
followed by exit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this should probably be CHECK
.
CHECK
is fatal and LOG_ERR
is non-fatal errors. We could probably name these better.
* fail_callback. | ||
* @return Void. | ||
*/ | ||
void task_table_add_task(db_handle *db_handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstances would this be called when the task_id is already in the task table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the task_id
is already in the table, the user should call task_table_update_task
instead. This is steps 3 and 5 in the task table documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the documentation says "this will override anything already in the table", so I'm wondering if there are situations where we expect that to happen. If not, then it's a bug so we can throw a fatal error, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Yeah, it should be an error, but we can't check without more communication to redis. Either we use HSETNX
, which only checks for existence of one hashmap field, not a key, or we use a transaction to atomically get and set the key.
typedef void (*task_table_subscribe_callback)(task *task, void *user_context); | ||
|
||
/** | ||
* Register callback for a certain event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a bit more specific.
return result; | ||
} | ||
|
||
int64_t task_instance_size(task_instance *instance) { | ||
return sizeof(task_instance) - sizeof(task_spec) + task_size(&instance->spec); | ||
task *alloc_nil_task(task_id task_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to understand if this is really necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can get rid of it.
} | ||
|
||
task *alloc_task(task_spec *spec, scheduling_state state, node_id node) { | ||
int64_t size = sizeof(task) - sizeof(task_spec) + task_spec_size(spec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably use the task_size
method, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't know the task_size
without allocating the task first. This is to compute the size that task_size
should return once the task is allocated.
int64_t size = sizeof(task_instance) - sizeof(task_spec) + task_size(spec); | ||
task_instance *result = malloc(size); | ||
bool node_ids_equal(node_id first_id, node_id second_id) { | ||
return UNIQUE_ID_EQ(first_id, second_id) ? true : false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is inconsistent with how object_ids_equal
is defined (that one doesn't use the ternary operator). This way is probably better safer, although I'm not totally sure how the bool
implementation here works.
Each entry in the task log now includes the task spec and the current scheduling information. Updates to the task log set the scheduling information fields in the corresponding entry, instead of appending a new log entry.