
cylc.rundb: replace state dump #1827

Merged: 1 commit merged into cylc:master on Jul 27, 2016

Conversation

@matthewrmshin (Contributor) commented May 3, 2016

This change replaces state dump functionality with the runtime DB.

Runtime DB changes (a schema sketch follows this list):

  • New table for suite parameters, e.g. run mode, initial/final cycle points.
    • And a similar table to store suite parameters at check points.
  • New table for current tasks in task pool, with their "spawned" status.
    • And a similar table to store task pool status at check points.
  • New table to store broadcast states at check points.
  • Remove duplicated columns from task_states and task_events tables.
    The information can all be found under task_jobs.
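
For illustration, here is a minimal sketch of the kind of SQLite schema these bullet points imply. All table and column names are assumptions made for the sketch, not the exact cylc schema:

# Minimal sketch of the tables described above. Table and column
# names are illustrative assumptions, not the exact cylc schema.
import sqlite3

conn = sqlite3.connect("cylc-suite-private.db")
conn.executescript("""
    -- Suite parameters, e.g. run mode, initial/final cycle points.
    CREATE TABLE IF NOT EXISTS suite_params(
        key TEXT PRIMARY KEY, value TEXT);
    -- Current tasks in the task pool, with their "spawned" status.
    CREATE TABLE IF NOT EXISTS task_pool(
        cycle TEXT, name TEXT, spawned INTEGER, status TEXT,
        PRIMARY KEY(cycle, name));
    -- One row per check point, recording when and why it was taken.
    CREATE TABLE IF NOT EXISTS checkpoint_id(
        id INTEGER PRIMARY KEY, time TEXT, event TEXT);
    -- The same shapes again, keyed by checkpoint id, plus broadcast
    -- states, to record suite state at check points.
    CREATE TABLE IF NOT EXISTS suite_params_checkpoints(
        id INTEGER, key TEXT, value TEXT);
    CREATE TABLE IF NOT EXISTS task_pool_checkpoints(
        id INTEGER, cycle TEXT, name TEXT, spawned INTEGER, status TEXT);
    CREATE TABLE IF NOT EXISTS broadcast_states_checkpoints(
        id INTEGER, point TEXT, namespace TEXT, key TEXT, value TEXT);
""")
conn.commit()
conn.close()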

Restart now loads previous states and other information from the runtime DB only.

• Where relevant, an old DB + state file will be upgraded automatically on restart.

cylc ls-checkpoints: new command to list the task pool checkpoints (a query sketch follows this list):

  • Latest
  • Restarts
  • Before and after reloads.
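
A rough sketch of the kind of query such a listing could be built on, reusing the assumed checkpoint_id table from the schema sketch above; this is hypothetical, not the actual implementation of the command:

# Hypothetical sketch only: list checkpoints by id, event and time.
# The checkpoint_id table and its columns are assumptions.
import sqlite3

conn = sqlite3.connect("cylc-suite-private.db")
for id_, event, time_ in conn.execute(
        "SELECT id, event, time FROM checkpoint_id ORDER BY id"):
    print("%s\t%s\t%s" % (id_, event, time_))
conn.close()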

Closes #421.

See also:

This change breaks backward compatibility on restart (as it no longer reads/writes state dumps), so I'd like to start a conversation on what old features and/or backward compatibility we want to retain (or not).

@matthewrmshin self-assigned this May 3, 2016
@matthewrmshin added this to the soon milestone May 3, 2016
@arjclark (Contributor) commented May 3, 2016

How do we roll back to a previous suite state in your new system?

@matthewrmshin (Contributor, Author):

How do we roll back to a previous suite state in your new system?

It cannot be done at the moment. I want to discuss the main usages of this before putting this functionality back in.

@hjoliver (Member) commented May 3, 2016

I want to discuss the main usages of this before putting this functionality back in.

The current rolling archive of state dumps is probably of little use - by default it's very short, and it isn't easy to figure out which to use if not the most recent (however, I'll ask our operations team if they ever use it). But I think a more deliberate suite check-pointing system might be really useful, e.g. during suite development, which typically involves a lot of re-running - see #1735.

@arjclark (Contributor) commented May 4, 2016

It cannot be done at the moment. I want to discuss the main usages of this before putting this functionality back in.

Main use to date (to my knowledge) has been in operations where items of graphing have gone missing following a restart or reload. The state dumps have allowed us to 1) easily diff the contents of the task pool before and after a reload/restart, and 2) roll back when something has messed up, rather than sitting there trying to manually insert and reset states for a not insignificant number of tasks.

@matthewrmshin (Contributor, Author):

It cannot be done at the moment. I want to discuss the main usages of this before putting this functionality back in.

Main use to date (to my knowledge) has been in operations where items of graphing have gone missing following a restart or reload. The state dumps have allowed us to 1) easily diff the contents of the task pool before and after a reload/restart, and 2) roll back when something has messed up, rather than sitting there trying to manually insert and reset states for a not insignificant number of tasks.

Both of these are better solved by #1735 (thanks to @hjoliver for raising this) instead of the rolling archive:

  1. We can store the task pool before/after a restart/reload. This should allow us to diff the task pool properly. The rolling archive is only effective for diff if the user remembers to hold the suite on restart or reload.
  2. The rolling archive may be too far gone by the time the problem is spotted.

I think it is easy to extend the task_pool table (or add a new task pool snapshot? table) to store snapshots of the task pool at various points. (It is probably not too difficult to implement some housekeeping mechanism for the table as well.) We can then have a command to list possible restart points (and to dump the content of the task pool at each restart point).
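
A rough illustration of the housekeeping idea, assuming the checkpoint tables sketched earlier and a policy of keeping only the N most recent snapshots:

# Sketch of a housekeeping pass over the assumed checkpoint tables:
# drop all but the N most recent checkpoints.
import sqlite3

MAX_CHECKPOINTS = 10

def housekeep(db_path, max_checkpoints=MAX_CHECKPOINTS):
    conn = sqlite3.connect(db_path)
    keep = "SELECT id FROM checkpoint_id ORDER BY id DESC LIMIT ?"
    for table in ("task_pool_checkpoints",
                  "broadcast_states_checkpoints",
                  "suite_params_checkpoints",
                  "checkpoint_id"):
        conn.execute(
            "DELETE FROM %s WHERE id NOT IN (%s)" % (table, keep),
            (max_checkpoints,))
    conn.commit()
    conn.close()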

@arjclark (Contributor) commented May 4, 2016

Both of these are better solved by #1735 (thanks to @hjoliver for raising this) instead of the rolling archive:

  1. We can store the task pool before/after a restart/reload. This should allow us to diff the task pool properly. The rolling archive is only effective for diff if the user remembers to hold the suite on restart or reload.

So long as @hjoliver's proposal and/or point 1 are in the same release, that's fine. I don't want to end up in a position where we've lost the state dumps and have no way of rolling back/checkpointing for a given cylc version.

(point 2 is rarely the case in my experience of our operational problem suites - it's usually pretty clear something's gone horrendously wrong fairly quickly and the task becomes determining what's happened and how to recover)

I think it is easy to extend the task_pool table (or add a new task pool snapshot? table) to store snapshots of the task pool at various points. (It is probably not too difficult to implement some housekeeping mechanism for the table as well.) We can then have a command to list possible restart points (and to dump the content of the task pool at each restart point).

Sounds good to me - as a basic starting point I'd snapshot at initialisation of the suite/restart/warmstart and on (planned) shutdowns, otherwise just using wherever the suite was last up to.

@hjoliver (Member) commented May 5, 2016

Operations here reports they only ever (and infrequently at that) use the "most recent non-corrupted state dump", i.e. (presumably) the most-recent-but-one file in the unlikely event that the suite was killed in the middle of writing the most recent one. So that's fine - the atomic nature of the DB means it should never be corrupted!

@matthewrmshin (Contributor, Author) commented May 6, 2016

"most recent non-corrupted state dump"

(The logic since #825 should mean that the default state file (which is a symbolic link to the latest state file that has been fully written and fsync'ed) is never corrupted.)

I think the only (useful?) functionality missing from this change is the ability to easily inspect the last 10 state dumps (or whatever the state dump rolling archive length is set to in global.rc) as well as the state dumps at previous restarts.

The former is probably not useful. For the latter, I'll add a table to snapshot the task pool and a table to snapshot broadcast states on restart (and possibly before/after reload).
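
For context, the never-corrupted property described above comes from a write-then-swap idiom along these lines; this is a sketch with assumed file names, not the actual #825 code:

# Sketch of the write/fsync/symlink-swap idiom: the default name is
# only repointed once the new file is fully written and flushed, so
# a reader following the symlink never sees a partial file.
import os

def dump_state(content, directory, dump_name):
    path = os.path.join(directory, dump_name)
    with open(path, "w") as handle:
        handle.write(content)
        handle.flush()
        os.fsync(handle.fileno())
    link = os.path.join(directory, "state")  # assumed default name
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(dump_name, tmp_link)
    os.rename(tmp_link, link)  # atomic replacement on POSIX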

@matthewrmshin force-pushed the rundb-state-dump branch 5 times, most recently from cef549e to 495b2cd on May 11, 2016 07:49
@matthewrmshin force-pushed the rundb-state-dump branch 6 times, most recently from 90e1fec to 465f141 on May 12, 2016 11:37
@matthewrmshin force-pushed the rundb-state-dump branch 2 times, most recently from 5bac391 to ad6bd30 on May 12, 2016 13:21
self.pub_dao.add_delete_item(table_name, where_args)
# Record suite parameters and tasks in pool
# Record any broadcast settings to be dumped out
for obj in self, BroadcastServer.get_inst():
A Contributor commented on the diff:

Can you (or do you want me to) do some profiling with a busy suite for this routine? It was a bit of a bottleneck before.

@matthewrmshin (Contributor, Author) replied:

Running my 100-task busy suite for 30 cycles in my environment, process_queued_db_ops has a cumulative run time of 67.2s. This is faster than master, where the equivalent is 63.6s plus a further 10.8s on state file dumping (74.4s in total).
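
For reference, cumulative timings like these can be gathered with the standard library profiler; a sketch, where run_suite() is a hypothetical stand-in for the suite entry point:

# Sketch: profile a run and report cumulative time for the routine
# discussed above. run_suite() is a hypothetical stand-in.
import cProfile
import pstats

cProfile.run("run_suite()", "suite.prof")
stats = pstats.Stats("suite.prof")
stats.sort_stats("cumulative")
stats.print_stats("process_queued_db_ops")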

@benfitzpatrick (Contributor) commented:

Lots of comments but I really like the change.

@benfitzpatrick (Contributor) commented:

I think the most important thing is to profile it against a busy suite.

@matthewrmshin force-pushed the rundb-state-dump branch 2 times, most recently from 1feeb7b to 00badf1 on July 13, 2016 13:02
@matthewrmshin (Contributor, Author) commented:

@benfitzpatrick most comments addressed or fixed. I'll do some profiling.

@matthewrmshin force-pushed the rundb-state-dump branch 3 times, most recently from ec2c162 to 2ae9a80 on July 22, 2016 13:39
@matthewrmshin (Contributor, Author) commented:

Branch squashed and rebased. Profiling using my 100-task busy suite suggests little or no change in performance. The squashed commit message:

New tables for:
* Suite parameters, e.g. run mode, initial/final cycle points.
* Current tasks in task pool, with their "spawned" status.
* Snapshots of above and broadcast states.
Restart loads previous states from run DB.
* Will upgrade DB with old state dump file automatically.
Remove duplicated columns from task_states and task_events tables.
Remove state dump functionality.
cylc ls-checkpoints: new command to list the task pool checkpoints.
Rename cylc-suite.db
* cylc-suite-private.db <- state/cylc-suite.db
* cylc-suite-public.db <- cylc-suite.db
* cylc-suite.db now a symbolic link to cylc-suite-public.db
@matthewrmshin (Contributor, Author) commented:

Branch rebased again.

@benfitzpatrick modified the milestones: next release, soon Jul 27, 2016
@benfitzpatrick merged commit 3199aa5 into cylc:master Jul 27, 2016
@matthewrmshin deleted the rundb-state-dump branch July 27, 2016 07:58
matthewrmshin added a commit to matthewrmshin/rose that referenced this pull request Aug 16, 2016
arjclark added a commit to metomi/rose that referenced this pull request Aug 16, 2016
matthewrmshin added a commit to matthewrmshin/rose that referenced this pull request Aug 22, 2016
A missing `break` means that we were still querying the `task_events`
table instead of the `task_jobs` table for a set of unique job hosts.
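
A minimal illustration of the bug being fixed (the two table names come from the message above; the query, column name, and helper are illustrative assumptions): without the break, the loop falls through to the legacy table even when the new one has already answered.

# Minimal sketch of the missing-break bug; the query and column
# names are illustrative assumptions.
def unique_job_hosts(conn):
    hosts = set()
    for table in ("task_jobs", "task_events"):  # prefer the new table
        try:
            for (host,) in conn.execute(
                    "SELECT DISTINCT user_at_host FROM " + table):
                hosts.add(host)
        except Exception:  # e.g. table absent in an old database
            continue
        break  # this was the missing break
    return hosts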