This repository has been archived by the owner on Oct 17, 2023. It is now read-only.

Enhancement: Add the ability to save and restore the state of an adaptor #33

Closed
jipperinbham opened this issue Dec 29, 2014 · 12 comments · Fixed by #323

Comments

@jipperinbham
Contributor

jipperinbham commented Dec 29, 2014

Currently, if transporter fails, it cannot resume where it left off during a copy/tail. To support this, each adaptor needs to persist the most recent document it has processed. We should be able to support multiple types of persistent stores using a simple interface like so:

type SessionStore interface {
    Set(path string, msg *message.Msg) error
    Get(path string) (string, int64, error)
}

The path would typically be a combination of the Transporter key and the node path. When retrieving the last known state, we will only return the last _id and the timestamp of the operation. It may be necessary in the future to support returning the entire message.Msg.
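A minimal in-memory sketch of the first interface shape, to make the path and return values concrete. `Msg` is a hypothetical stand-in for `message.Msg`, and `SessionPath`, `memoryStore`, `ErrNoState`, and the `/` separator are illustrative assumptions, not anything taken from the adaptor-state branch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Msg is a hypothetical stand-in for transporter's message.Msg.
type Msg struct {
	ID        string
	Timestamp int64
}

// ErrNoState is returned when nothing has been saved for a path (assumed name).
var ErrNoState = errors.New("no state saved for path")

// SessionPath combines the Transporter key and the node path into one key.
// The "/" separator is an assumption made for this sketch.
func SessionPath(key, nodePath string) string {
	return key + "/" + nodePath
}

// memoryStore keeps the last message per path in memory only.
type memoryStore struct {
	mu     sync.Mutex
	states map[string]Msg
}

func newMemoryStore() *memoryStore {
	return &memoryStore{states: make(map[string]Msg)}
}

// Set records msg as the most recent message seen on path.
func (s *memoryStore) Set(path string, msg *Msg) error {
	if msg == nil {
		return errors.New("nil message")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.states[path] = *msg
	return nil
}

// Get returns only the last id and timestamp, as proposed above.
func (s *memoryStore) Get(path string) (string, int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	msg, ok := s.states[path]
	if !ok {
		return "", 0, ErrNoState
	}
	return msg.ID, msg.Timestamp, nil
}

func main() {
	store := newMemoryStore()
	path := SessionPath("my-transporter", "source/sink")
	if err := store.Set(path, &Msg{ID: "doc-42", Timestamp: 1419811200}); err != nil {
		fmt.Println("set failed:", err)
		return
	}
	id, ts, err := store.Get(path)
	fmt.Println(id, ts, err)
}
```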

Ideally, an implementation will not write on every operation but will have the ability to "flush" on an interval which can be defined in the config.yaml.

The beginning of this can be seen in the adaptor-state branch, but it is incomplete. It currently introduces a sessionTicker where the Pipeline will call the Set func for each Node. As of right now, I have not added the ability to retrieve/get the state during startup/initialization.

Thoughts and feedback?

@jipperinbham
Contributor Author

With the changes being made in noid, it will no longer be viable to only store the id and timestamp. We'll need to store the entire Msg.Data now, which will change the interface to:

type SessionStore interface {
    Set(path string, msg *message.Msg) error
    Get(path string) (*message.Msg, error)
}

@jipperinbham
Contributor Author

Another part of this that has yet to be discussed is how to actually configure a SessionStore. My thought is to have something like the following in the config.yaml:

sessions:
  type: "filestore"
  interval: 10s
  uri: "file:///tmp/transporter.state"

If the sessions section is missing, no SessionStore will be used.

@jipperinbham jipperinbham self-assigned this Jan 2, 2015
@jipperinbham
Contributor Author

Made another change to the interface to use a struct instead of just the message.Msg. This will allow adaptors to store additional information while processing documents.

type SessionStore interface {
    Set(path string, state *MsgState) error
    Get(path string) (*MsgState, error)
}

type MsgState struct {
    Msg   *message.Msg
    Extra map[string]interface{}
}

@nstott
Contributor

nstott commented Jan 2, 2015

do we need a map[string]interface{} there?
or can we assume we have a few specific states that we care about and use a type State int to store it?

@jipperinbham
Contributor Author

possibly, the purpose of the map is so adaptors could set things like Extra["progress"] = "copy" or Extra["progress"] = "tail" in the case of the MongoDB adaptor

that could just as easily be solved I guess but didn't want to limit things right away
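For comparison, the `type State int` alternative might be sketched like this. The constant names and the `ParseState` helper are hypothetical; only the "copy" and "tail" values come from the MongoDB example above.

```go
package main

import "fmt"

// State enumerates the adaptor phases worth persisting.
type State int

const (
	StateUnknown State = iota
	StateCopy
	StateTail
)

// String returns the name that would be written to the store.
func (s State) String() string {
	switch s {
	case StateCopy:
		return "copy"
	case StateTail:
		return "tail"
	default:
		return "unknown"
	}
}

// ParseState maps a stored name back to a State; unrecognized names
// become StateUnknown.
func ParseState(name string) State {
	switch name {
	case "copy":
		return StateCopy
	case "tail":
		return StateTail
	default:
		return StateUnknown
	}
}

func main() {
	fmt.Println(StateCopy, ParseState("tail"))
}
```

A fixed type gives compile-time checking of the known states, while the map leaves room for adaptor-specific keys that have not been anticipated yet.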

@methuz

methuz commented Jul 27, 2015

This is the most important feature I need. I also want it to be distributed: allow another node to restore the state if the current node is down, by saving the state on a central server (or using etcd).

@tombray

tombray commented Aug 2, 2015

+1

@Alino
Contributor

Alino commented Aug 3, 2015

+1

@jipperinbham
Contributor Author

@methuz I hope to get the adaptor-state branch going again very soon, but the initial implementation will not involve any distributed aspect to running transporter.

The initial goal is for transporter to save its running state at specified intervals to a SessionStore so that if the transporter process crashes, it can be restarted and pick back up where it left off. In addition, the first version will likely only support 2 SessionStore implementations, File and Redis, but more will be added over time.

@jipperinbham jipperinbham added this to the v.0.1.1 milestone Aug 6, 2015
@jipperinbham jipperinbham removed this from the v.0.1.1 milestone Aug 27, 2015
@ambasta

ambasta commented Sep 4, 2015

Hi,

Since the feature has already been merged into master, how does one go about using it in application.js after configuring the config.yaml?

@SherClockHolmes

+1

@patmanh

patmanh commented Oct 24, 2015

+1

8 participants