-
Notifications
You must be signed in to change notification settings - Fork 882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single binary server (SQLite) and concurrency bugfixes for in memory storages #2114
Single binary server (SQLite) and concurrency bugfixes for in memory storages #2114
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a couple of nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. @wolfy-j , thank you for your contribution.
refCount int | ||
} | ||
|
||
func newConnPool() *connPool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connection pool exists here due to frontend / matching / history / worker (4 services) each will create its own DB connection & causing DB locking issue?
can you plz put a comment here?
we can possibly improve the DI to initialize this DB connection only once (for single binary), then this conn pool management can possibly be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added. Yes, refactoring DI would be the ideal approach, but I'm not sure about the size of the impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also try to turn off cluster, gossip and other services irrelevant for single binary server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, some services open and close connections sequentially, getting DB connection stopped and destroying the database in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also try to turn off cluster, gossip and other services irrelevant for single binary server?
we definitely should (assuming we are talking about the ideal version), plus inter-service communication can by pass network layer.
Also, some services open and close connections sequentially, getting DB connection stopped and destroying the database in memory.
sorry what do you mean? if using DI, DI should only initialize the DB conn once (assuming 1 and only 1 conn)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This use case won't exist with DI, but it exists now. We can probably take a look at it once it will be time to clean up all unnecessary services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also create a new development_sqlite.yaml
file with the static configs?
also run |
I guess it's the question about turning CGO off or on now. |
cmd/server/main.go
Outdated
@@ -41,6 +41,7 @@ import ( | |||
"go.temporal.io/server/common/log/tag" | |||
_ "go.temporal.io/server/common/persistence/sql/sqlplugin/mysql" // needed to load mysql plugin | |||
_ "go.temporal.io/server/common/persistence/sql/sqlplugin/postgresql" // needed to load postgresql plugin | |||
_ "go.temporal.io/server/common/persistence/sql/sqlplugin/sqlite" // needed to load sqlite plugin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what breaks the build. Apparently w/o this line, sqlite
package is not even compiled. With this line it gets compiled but compilation fails because we don't use CGO for our production Linux builds and
Important: because this is a
CGO
enabled package you are required to set the environment variableCGO_ENABLED=1
and have agcc
compile present within your path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to enable CGO and don't know what is the best way to solve this. The only one place where sqlite3
is used. Maybe we can avoid importing this package and use some other approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you already have a driver which utilizes the SQLite CGO package. The alternative is to use a different type of embedded database... or something like this - https://www.reddit.com/r/golang/comments/n1u2b1/new_advanced_cgofree_sqlite_package/
I agree about avoiding CGO, maybe create a different build pipeline? Or it's too complicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also turn off the plugin in server and SQL so DataDog can patch their TemporalLite server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI there was lots of discussion around CGO on the original PR. I did have a non-CGO version working at one point using modernc.org/sqlite
, but we decided to just skip importing the driver by default: #1968 (comment)
I'd suggest doing the same here and follow up on supporting non-CGO builds or dealing with cross-compilation issues introduced by CGO separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the mean time it would be pretty straightforward to add a file to the main
package which uses a build tag to conditionally import the sqlite driver when CGO is enabled. That way anyone wanting to try out the sqlite
driver just needs to compile from source rather than having to write their own Temporal server wrapper or pull in Temporalite.
@@ -112,9 +123,38 @@ func (p *plugin) createDBConnection( | |||
// Maps struct names in CamelCase to snake without need for db struct tags. | |||
db.MapperFunc(strcase.ToSnake) | |||
|
|||
// only suitable for in memory databases! | |||
if _, ok := cfg.ConnectAttributes["setup"]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heads up we pass connection attributes more or less transparently into the sqlite driver here:
temporal/common/persistence/sql/sqlplugin/sqlite/plugin.go
Lines 176 to 187 in c38d169
for k, v := range cfg.ConnectAttributes { | |
key := strings.TrimSpace(k) | |
value := strings.TrimSpace(v) | |
if parameters.Get(key) != "" { | |
return nil, fmt.Errorf("duplicate connection attr: %v:%v, %v:%v", | |
key, | |
parameters.Get(key), | |
key, value, | |
) | |
} | |
parameters.Set(key, value) | |
} |
The current driver may not be picky about it but I think we should avoid passing in non-standard sqlite config options. How about checking if the driver is using memory mode instead? When persisting to a file we can use the same migration patterns we do for postgres/mysql/cassandra backends.
if _, ok := cfg.ConnectAttributes["setup"]; ok { | |
if mode := cfg.ConnectAttributes["mode"]; mode == "memory" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not seen a notification about this question. Ugh. I want to leave the dedicated flag since it makes possible to run bootstrap on sqlite file database (which is handy). But now this flag will be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlegrone Also, I tried https://github.com/mattn/go-sqlite3 and I do not see a quick path to integrate it here without adding a few DB layer abstractions for sqlx compatibility. It might be easier and more effective to build a new DB type for in-memory use.
9bd25fc
to
244df69
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enabled cgo by default in #2396, rebased this PR and tried to run it locally:
$ make bins
Delete old binaries...
Build temporal-server with CGO_ENABLED=1 for linux/amd64...
go build -o temporal-server ./cmd/server
Build tctl with CGO_ENABLED=1 for linux/amd64...
go build -o tctl ./cmd/tools/cli
Build tctl-authorization-plugin with CGO_ENABLED=1 for linux/amd64...
go build -o tctl-authorization-plugin ./cmd/tools/cli/plugins/authorization
Build temporal-cassandra-tool with CGO_ENABLED=1 for linux/amd64...
go build -o temporal-cassandra-tool ./cmd/tools/cassandra
Build temporal-sql-tool with CGO_ENABLED=1 for linux/amd64...
go build -o temporal-sql-tool ./cmd/tools/sql
$ ./temporal-server --zone sqlite start
2022/01/20 17:21:41 Loading config; env=development,zone=sqlite,configDir=config
2022/01/20 17:21:41 Loading config files=[config/base.yaml config/development.yaml config/development_sqlite.yaml]
and it just hangs. I didn't debug it further. Does it work for you locally?
// only suitable for in memory databases! | ||
if mode := cfg.ConnectAttributes["mode"]; mode == "memory" { | ||
// creates temporary DB overlay in order to configure database and schemas | ||
err := p.setupSQLiteDatabase(cfg, db) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call now conflicts with explicit schema creation in sqlite_test.go
. Where it should be removed: here or in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why unit tests are failing on Buildkite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to handle this inline here to simplify usage of the plugin; I don't know of any cases where it wouldn't make sense to initialize the schema with in-memory mode enabled.
In Temporalite we initialize schema when in-memory mode is being used OR when the db file doesn't already exist. We'd be able to skip the setup for in-memory mode here: https://github.com/DataDog/temporalite/blob/de413afe117f4a5117d1148198622f0f54d5edc2/server.go#L55-L64
As an aside, for file persistence mode I think we should aim for consistency with how migrations work with other sql storage drivers, ie. support sqlite via https://github.com/temporalio/temporal/tree/master/tools/sql.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, removing it from tests.
config/development_sqlite.yaml
Outdated
connectAttributes: | ||
mode: "memory" | ||
cache: "private" | ||
setup: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this setup
parameter anymore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's no longer needed.
2d8c7c0
to
00a0426
Compare
My bad, server actually was working. I am so get used to the log console output that we usually get, and I didn't notice that sqlite config has less verbose log level. I am changing the config to match to other SQL implementations. It might be noisy but I would like to keep in sync. |
In short, you can use temporal now as a single binary file with in-memory or file SQLite database.
These changes properly enabled the support for SQLite in the temporal server. The initial implementation did not address the fact that internal Temporal services are highly isolated, each will create at least a single connection to the database violating the SQLite concept of safety only within a single thread. While the base configuration works on any minor load it might and it will lock the database due to concurrent writers from different parts of the system.
The clean solution will be to refactor the Temporal container in order to provide a single database instance to all the services and limit concurrency on the SQL driver level. However, such change will cause a significant impact and will require very careful testing of the server. Instead, the change has been addressed on the plugin level only, since it's directly responsible for providing the valid database instance. Ideally, we also have to separate the database from the matching service and use the most appropriate solution for each service.
The following configuration is valid to start Temporal without any external dependencies or databases:
The temporal database will be fully configured in memory (see connectionAttribute
setup
) and ready to use within a few seconds. You still have to init the namespace.This setup can not be used on production for obvious reasons but can be used for demos, tests, and micro applications which do not experience high loads.
Tested locally with multiple concurrent workflows (1000). Observed performance ~30 workflows per second (hello world with a single activity, on PHP SDK).
Does not affect production usage.
P.S. Please let me know what you want me to change in here.