Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge task table and task log into a single table #30

Merged
merged 4 commits into from
Nov 11, 2016
Merged

Conversation

stephanie-wang
Copy link
Contributor

Each entry in the task log now includes the task spec and the current scheduling information. Updates to the task log set the scheduling information fields in the corresponding entry, instead of appending a new log entry.

@stephanie-wang stephanie-wang force-pushed the task-log branch 10 times, most recently from 4c27110 to f34bc8e Compare November 8, 2016 06:02
LOG_ERR("Result table object callback");
if (reply->type == REDIS_REPLY_STRING) {
/* If we found the object, get the spec of the task that created it. */
CHECK(reply->len == UNIQUE_ID_SIZE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this a DCHECK

return task_result;
}
/* Parse the task struct's fields. */
int32_t state = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in some places we use int32_t, but shouldn't we actually use the enum type scheduling_state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should. I tried to replace them all, but I guess I missed some :)

int32_t state = 0;
node_id node = NIL_ID;
task_spec *spec = NULL;
for (int i = 0; i < num_redis_replies; i = i + 2) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add DCHECK(num_redis_replies % 2 == 0)? Or maybe even DCHECK(num_redis_replies == 6)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

spec = malloc(value->len);
memcpy(spec, value->str, value->len);
} else {
LOG_ERR("Found unexpected %s field in task log", key);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be fatal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess so.

task *task_result;
if (num_redis_replies == 0) {
/* If there was no information about this task, an empty task. */
task_result = alloc_nil_task(id);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this case ever happen? If not, then perhaps this should be a fatal error. If yes, then maybe we should return NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, unless we decide to do the update commands to the task and result tables in parallel. I'd like to keep it non-fatal so we can test that the error triggers, but I'll change it to return NULL.

if (reply->type != REDIS_REPLY_ARRAY) {
LOG_ERR("Expected Redis array, received type %d %s", reply->type,
reply->str);
exit(-1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the convention for when we should use CHECK versus LOG_ERR versus LOG_ERR followed by exit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this should probably be CHECK.

CHECK is fatal and LOG_ERR is non-fatal errors. We could probably name these better.

* fail_callback.
* @return Void.
*/
void task_table_add_task(db_handle *db_handle,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances would this be called when the task_id is already in the task table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the task_id is already in the table, the user should call task_table_update_task instead. This is steps 3 and 5 in the task table documentation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but the documentation says "this will override anything already in the table", so I'm wondering if there are situations where we expect that to happen. If not, then it's a bug so we can throw a fatal error, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Yeah, it should be an error, but we can't check without more communication to redis. Either we use HSETNX, which only checks for existence of one hashmap field, not a key, or we use a transaction to atomically get and set the key.

typedef void (*task_table_subscribe_callback)(task *task, void *user_context);

/**
* Register callback for a certain event.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a bit more specific.

return result;
}

int64_t task_instance_size(task_instance *instance) {
return sizeof(task_instance) - sizeof(task_spec) + task_size(&instance->spec);
task *alloc_nil_task(task_id task_id) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand if this is really necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get rid of it.

}

task *alloc_task(task_spec *spec, scheduling_state state, node_id node) {
int64_t size = sizeof(task) - sizeof(task_spec) + task_spec_size(spec);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably use the task_size method, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't know the task_size without allocating the task first. This is to compute the size that task_size should return once the task is allocated.

int64_t size = sizeof(task_instance) - sizeof(task_spec) + task_size(spec);
task_instance *result = malloc(size);
bool node_ids_equal(node_id first_id, node_id second_id) {
return UNIQUE_ID_EQ(first_id, second_id) ? true : false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is inconsistent with how object_ids_equal is defined (that one doesn't use the ternary operator). This way is probably better safer, although I'm not totally sure how the bool implementation here works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants