Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly duplicate empty tables #28

Closed
sevakram opened this issue Oct 23, 2020 · 11 comments
Closed

Correctly duplicate empty tables #28

sevakram opened this issue Oct 23, 2020 · 11 comments
Labels
enhancement New feature or request

Comments

@sevakram
Copy link

I'm just trying out this library with a blank django project. I've a project on heroku and I want to replicate the heroku database locally in sqlite, but it seems that db-to-sqlite is skipping some tables.

Command run -

db-to-sqlite postgres://uid:pwd@host/dbname db.sqlite3 --all -p

Output -

1/10: django_migrations
2/10: django_content_type
3/10: auth_permission
4/10: auth_group
5/10: auth_group_permissions
6/10: auth_user_groups
7/10: auth_user_user_permissions
8/10: django_admin_log
9/10: auth_user
10/10: django_session

Adding 1 foreign key
  auth_permission.content_type_id => django_content_type.id

When tried to run django locally, faced an error that django could not find django_admin_log, so I tried to see if tables have been copied correctly or not-

$ sqlite3 db.sqlite3
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> .tables
auth_permission      django_content_type  django_session
auth_user            django_migrations
sqlite>
@rakbladsvalsen
Copy link

rakbladsvalsen commented Feb 19, 2021

Can also confirm this. The dump seems to be incomplete when using django. In fact, it seems to completely omit some Django auth tables, like auth_groups.

Otherwise it works flawlessly. Hope this gets fixed someday.

Edit: This happened to me when I was trying to convert a full PostgreSQL database. I haven't checked if this happens with MySQL/Mariadb, though.
By doing further testing it seems that the tool is unable to convert some tables even with --table. The resulting sqlite file will be literally empty.

@rakbladsvalsen
Copy link

Just found something which is kinda a baid-aid for this problem. If you're using django, you can do the following:

  1. Migrate your app to an empty sqlite database (python manage.py migrate). This will create an empty database with your schema as defined in models.py
  2. Open the newly created database with sqlite3 db.sqlite3 and type the following:
.out schema.sql
.schema
.quit

This will extract the schema of your empty sqlite database.
3. Use db-to-sqlite to migrate your postgre/mariadb database to sqlite. The resulting file will contain an incomplete dump. Let's assume that the resulting file is called dump.sqlite3
4. Now run sqlite3 dump.sqlite3 < schema.sql. This will create the missing tables or whathever db-to-sqlite missed. You will see error messages, just ignore them. Or remove the conflicting lines.
5. Literally eureka, now your dump.sqlite3 is complete.

I gave db-to-sqlite3 a quick read and it seems that there's no logic to handle empty tables. It could be a sqlalchemy bug too, but idk to be honest.

@simonw
Copy link
Owner

simonw commented Feb 23, 2021 via email

@simonw simonw changed the title Not downloading the complete schema Correctly duplicate empty tables Feb 27, 2021
@simonw simonw added the enhancement New feature or request label Feb 27, 2021
@simonw
Copy link
Owner

simonw commented Feb 27, 2021

sqlite-utils has a db[table].create(...) method that can do thus. The trick will be introspecting the existing schema with SQLAlchemy and mapping the types to int/float/str.

The current code lazily avoids this by relying on sqlite-utils ability to create a schema automatically based on an incoming list of row dictionaries to be inserted.

@simonw
Copy link
Owner

simonw commented Feb 27, 2021

rows = itertools.chain([first], rows)
if progress:
with click.progressbar(rows, length=count) as bar:
db[table].insert_all(bar, pk=pks, replace=True)
else:
db[table].insert_all(rows, pk=pks, replace=True)

@simonw
Copy link
Owner

simonw commented Feb 27, 2021

This SQLAlchemy method will give me details of the columns. Not sure how to make the types though: https://docs.sqlalchemy.org/en/13/core/reflection.html#sqlalchemy.engine.reflection.Inspector.get_columns

@simonw
Copy link
Owner

simonw commented Feb 27, 2021

I think I need to provide a mapping for this list of subclases (or a subset thereof - maybe just the base Text one maps to str for example): https://docs.sqlalchemy.org/en/13/core/type_basics.html#generic-types

@simonw
Copy link
Owner

simonw commented Feb 27, 2021

I can bring across NOT NULL too - I think sqlite-utils got more sophisticated about table creation since I first wrote db-to-sqlite.

@simonw
Copy link
Owner

simonw commented Jun 11, 2021

https://docs.sqlalchemy.org/en/13/core/reflection.html#sqlalchemy.engine.reflection.Inspector.get_columns

Given a string table_name and an optional string schema, return column information as a list of dicts with these keys:

  • name - the column's name
  • type - the type of this column; an instance of TypeEngine
  • nullable - boolean flag if the column is NULL or NOT NULL
  • default - the column's server default value - this is returned as a string SQL expression.
  • autoincrement - indicates that the column is auto incremented - this is returned as a boolean or 'auto'
  • comment - (optional) the comment on the column. Only some dialects return this key
  • computed - (optional) when present it indicates that this column is computed by the database. Only some dialects return this key. Returned as a dict with the keys:
    • sqltext - the expression used to generate this column returned as a string SQL expression
    • persisted - (optional) boolean that indicates if the column is stored in the table
      New in version 1.3.16: - added support for computed reflection.
  • dialect_options - (optional) a dict with dialect specific options

For example:

(Pdb) inspector.get_columns(table)
[{'name': 'id', 'type': INTEGER(), 'default': None, 'comment': None, 'nullable': False, 'autoincrement': True}, {'name': 'name', 'type': VARCHAR(length=32), 'default': None, 'comment': None, 'nullable': False}]

python_type looks useful:

(Pdb) inspector.get_columns(table)[0]['type'].python_type
<class 'int'>

@simonw
Copy link
Owner

simonw commented Jun 11, 2021

Got this error:

  File "/Users/simon/Dropbox/Development/db-to-sqlite/db_to_sqlite/cli.py", line 121, in cli
    for column in inspector.get_columns(table)
  File "/Users/simon/Dropbox/Development/db-to-sqlite/db_to_sqlite/cli.py", line 121, in <dictcomp>
    for column in inspector.get_columns(table)
  File "/Users/simon/.local/share/virtualenvs/db-to-sqlite-Nk_u3PUs/lib/python3.7/site-packages/sqlalchemy/sql/type_api.py", line 416, in python_type
    raise NotImplementedError()
NotImplementedError

It was because of {'name': 'ip', 'type': INET(), 'nullable': False, 'default': None, 'autoincrement': False, 'comment': None}

@simonw
Copy link
Owner

simonw commented Jun 11, 2021

Tried this on my blog and got the following SQLite schema for an empty table:

CREATE TABLE [blog_comment] (
   [id] INTEGER,
   [object_id] INTEGER,
   [body] TEXT,
   [created] TEXT,
   [name] TEXT,
   [url] TEXT,
   [email] TEXT,
   [openid] TEXT,
   [ip] TEXT,
   [spam_status] TEXT,
   [visible_on_site] INTEGER,
   [spam_reason] TEXT,
   [content_type_id] INTEGER,
   FOREIGN KEY([content_type_id]) REFERENCES [django_content_type]([id])
);
CREATE INDEX [idx_blog_comment_content_type_id]
    ON [blog_comment] ([content_type_id]);

@simonw simonw closed this as completed in c3bb614 Jun 11, 2021
simonw added a commit that referenced this issue Jun 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants