[v1.0.0] DML migrations (seeders) #687
First drafts of interfaces

insert

exports.migrate = async (db, opt) => {
const { util } = opt.dbm;
await db.insert(
'test',
// can receive an object or an array of equally structured objects
util.insertObj([
{ column1: '1', column2: '2' },
{ column1: '1', column2: '3' },
{ column1: '2', column2: '4' }
])
);
/**
* For efficiency, insertObj shouldn't be used for big operations.
* The previous command essentially expands to:
*/
await db.insert('test', {
columns: ['column1', 'column2'],
data: ['1', '2', '1', '3', '2', '4']
});
/**
* The streaming interface can be accessed by passing a config object
* with a definition of the incoming data and the stream to consume
* from.
*/
const streamConfig = {
columns: ['column1', 'column2'],
stream: myDataStream
};
await db.insert.stream('test', streamConfig);
/**
* Streaming can be used with helper functions such as structured readers.
*/
await db.insert.stream(
'test',
util.readStructured('./data.csv', { delimiter: ';' })
);
/**
* And of course data can also be read from other tables.
*/
await db.insert.from('test', 'readme', {
map: {
source1: 'column1',
source2: 'column2'
},
search: 'ventilation IS NOT NULL'
});
/**
* Instead of the search shorthand, the get property
* gives direct access to get.
*/
await db.insert.from('test', 'readme', {
map: {
source1: 'column1',
source2: 'column2'
},
get: { /* ... */ }
})
};
exports._meta = {
version: 2,
DML: true
};

Dot notation might as well be integrated into the object as a mode property, but not sure; probably both possibilities will be provided (a short sketch of both styles follows the get draft below).

get

exports.migrate = async (db, opt) => {
await db.get('test', {
columns: ['column1']
});
/**
* columns can be aliased
*/
await db.get('test', {
columns: {
column1: 'maxima'
}
});
/**
* We avoid tapping into complicated languages to define
* the where clause. Instead you just use SQL or a mongo query,
* depending on the db you use. If you want genericity, we suggest
* using helper libraries like sql2mongo or vice versa. This allows
* flexibility and also avoids the need to learn too many new things
* and reinvent the wheel just for a translation layer.
*/
await db.get('test', {
columns: {
column1: 'maxima',
column2: 'something'
},
search: 'maxima >= 12'
});
/**
* However, we will support the most simple operations upfront. Everything
* else can use tooling. We will provide a plugin hook so that, if needed,
* this can be tightly integrated into the migrations.
*/
await db.get('test', {
columns: {
column1: 'maxima',
column2: 'something'
},
search: {
maxima: 15
}
});
/**
* When doing more than searching, you want to access it in query mode.
* You will have to write the whole definition by hand. This guarantees
* that we still understand what is happening (type of operation), but
* gives ultimate flexibility.
*
* The table param is optional; when using SQL, for example, it will not be
* needed. Databases like mongodb will need it however, so as a best
* practice it might be recommended not to omit it, depending on how final
* the choice of database is.
*
* Adding LIMITs yourself is possible, but when in doubt leave this to db-migrate.
*/
await db.get('test', {
query: 'DISTINCT value1 FROM test' // select will be forcefully prepended so we always know this will only get data
});
};
exports._meta = {
version: 2,
DML: true
};
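To illustrate the note after the insert draft: the streaming call could either stay in dot notation or be folded into the options object as a mode property. Both shapes below are purely hypothetical sketches; the mode name is an assumption, not part of the draft, and the call reuses the streamConfig/myDataStream names from above.

// dot notation, as drafted above
await db.insert.stream('test', {
  columns: ['column1', 'column2'],
  stream: myDataStream
});

// hypothetical alternative: same call expressed via a mode property
await db.insert('test', {
  mode: 'stream',
  columns: ['column1', 'column2'],
  stream: myDataStream
});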
Calling some of the latest contributors and previous feedback givers :) /cc @BorntraegerMarc @Marsup @rfrm As always, feedback and ideas are welcome.
Thanks for requesting some feedback. First of all I'd like to say that I'm not really a database migration expert and that I'm coming from the mongodb world. In general I'd say a concept like this is very much needed. So here are a couple of points from my side:
@wzrdtales it's a nice thought.
This will be the user controlled version mentioned. However, you will lose any and all benefits of db-migrate's new capabilities in v2 if you do that. I will however see how to mitigate that and keep the flexibility.
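For reference, a minimal sketch of what the user controlled mode could look like with today's API: plain DML through runSql inside an ordinary migration. It assumes runSql returns a promise when no callback is passed and uses MySQL-style ? placeholders; it deliberately does not use any of the new v2 interfaces drafted above.

exports.up = async (db) => {
  // raw DML, everything is in the user's hands
  await db.runSql(
    'INSERT INTO test (column1, column2) VALUES (?, ?), (?, ?)',
    ['1', '2', '1', '3']
  );
};

exports.down = async (db) => {
  // reverse the seed by hand
  await db.runSql(
    'DELETE FROM test WHERE column2 IN (?, ?)',
    ['2', '3']
  );
};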
Yes and no. The migrations would run through completely, but they let your old application keep working by leaving room for other queries to be executed and not locking the table all the time. The reason your applications can't handle those is that you probably never cared about zero downtime yet and did not follow the principles needed to accomplish it. That might be ok in the field you're working in, but most fields today would invest the tiny amount of extra time it takes to accomplish zero downtime and often save millions by that (we are talking about very big tables where those migration processes otherwise mean an hour of downtime or longer). I am a consultant for a lot of big companies, so I know the direct needs of the industry very well; that is mainly the reason why I am so focused on reaching the state of a stable framework providing zero downtime capabilities.
This is the task of the concurrency manager, also a v2 migration feature, but it will be applied to v1.0.0 of db-migrate in general. Please see and read #538 for that. In the future there will be an event emitter, so you can even partially activate new functions in your application depending on which migrations have run already, instead of waiting for all of them to complete.
@jtassin Pretty much like
@wzrdtales I didn't mean my comment on #292 to sound snarky. I understand you (and anyone else who maintains OSS) have lots of other things to do. I wasn't asking for functionality that doesn't exist yet; I was just pointing out that the current state of the documentation is a bit confusing, because it mentions something that doesn't really seem to be implemented. If the feature exists, whatever is there should be fully documented, and if not, that's fine, but IMO it should be removed from the docs until if/when it's ready. I would love to help out if I can find some time. Will keep an eye on this in case I can.
Description
DML migrations, commonly referred to as seeders (although that term is not entirely precise for what DML migrations in db-migrate are), are reversible operations that manipulate the data inside the schema previously defined by the default DDL migrations.
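As a rough illustration only: a reversible DML migration could pair the drafted insert API with a reverse operation. The rollback export and the db.remove call below are assumptions for the sketch, not part of the draft.

exports.migrate = async (db, opt) => {
  const { util } = opt.dbm;
  // seed two rows using the drafted insert API
  await db.insert('test', util.insertObj([
    { column1: '1', column2: '2' },
    { column1: '1', column2: '3' }
  ]));
};

// hypothetical reverse operation removing exactly the seeded rows
exports.rollback = async (db, opt) => {
  await db.remove('test', { search: "column1 = '1' AND column2 IN ('2', '3')" });
};

exports._meta = {
  version: 2,
  DML: true
};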
Implementation Details
There will be two modes with different APIs, one being handled by db-migrate, the other being handled by the user. The user handled one is an atomic wrapped API exposing DML functionality, essentially direct access to the db with runSql or other driver-provided functions. The db-migrate handled one follows the v2 principles of perfect traceability and atomicity. It works hand in hand with the DDL, preparing certain standards on the tables, such as flag fields, and will step by step provide a thorough API capable of transforming not only big, but very large tables.

Big Table Operations
Big tables have the problem that changes made to them directly affect the running systems. Besides the necessity to follow principles like zero downtime mitigations, it is also necessary to operate on the tables in small steps to allow the live system to still operate on the tables in question. This becomes especially interesting for databases that do not provide online schema changes, since this can be mitigated with cheaper DML (like adding notNull false columns instead of defaultValue fields and migrating the data over, or copy actions instead of type changes). These operations will be embedded into the DDL.
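A rough sketch of the "cheaper DML" idea on a big table, using the existing addColumn and runSql calls: add the column as nullable (cheap), then backfill it in small batches so the live system keeps working. It assumes the promise form of the API, the MySQL driver (UPDATE ... LIMIT and an affectedRows result), and an arbitrary batch size.

exports.up = async (db) => {
  // cheap DDL: nullable column instead of a defaultValue that rewrites the table
  await db.addColumn('test', 'flagged', { type: 'boolean', notNull: false });

  // DML backfill in small steps instead of one table-locking UPDATE
  let affected = 0;
  do {
    const result = await db.runSql(
      'UPDATE test SET flagged = false WHERE flagged IS NULL LIMIT 1000'
    );
    affected = result ? result.affectedRows : 0;
  } while (affected > 0);
};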
Streaming continuable operations
This needs tracking of the current execution status, which will be twofold. First, the state manager will carry the current state parameters for currently running seeding operations; secondly, the flag field will indicate successfully updated rows so we do not depend on transactions. All the operations will resolve on completion, but will also provide a streaming interface, since DML in particular will operate on massive amounts of data, which requires the capability to work with partial data instead of the full available data.
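For the streaming side, a small hedged sketch of how a data source might be handed to the drafted db.insert.stream interface: only Readable.from is an existing Node.js API here; everything else is the draft from above.

const { Readable } = require('stream');

exports.migrate = async (db, opt) => {
  // any object-mode readable works as the source; Readable.from is just the simplest
  const myDataStream = Readable.from([
    { column1: '1', column2: '2' },
    { column1: '2', column2: '4' }
  ]);

  // drafted streaming insert consuming the partial data as it arrives
  await db.insert.stream('test', {
    columns: ['column1', 'column2'],
    stream: myDataStream
  });
};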
Base operations
The db-migrate handled feature set will include the following base operations