
Inserting many items into a SQL Server database #217

Closed
ghost opened this issue Mar 17, 2017 · 5 comments

Comments

@ghost

ghost commented Mar 17, 2017

I'm currently trying to add 100k entries into a table on MS SQL Server 2014 from Linux via SQLAlchemy. Unfortunately, an error occurs whenever I try to insert more than 200 entries per INSERT statement. The error happens with both the latest MS ODBC driver and the FreeTDS driver:

MS ODBC driver with 300 rows (error: COUNT field incorrect or syntax error (0) (SQLExecDirectW)):

    pdm_session.execute(insert)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1107, in execute
    bind, close_with_result=True).execute(clause, params or {})
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
    raise value.with_traceback(tb)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('07002', '[07002] [Microsoft][ODBC Driver 13 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)') 
[SQL: 'INSERT INTO [TableName] ([xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx]) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), 
[...])]

FreeTDS with 300 rows (error: Invalid descriptor index (0) (SQLBindParameter)):

[...]
  File "/home/js/projects/projectname/.venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('07009', '[07009] [FreeTDS][SQL Server]Invalid descriptor index (0) (SQLBindParameter)')
[SQL: 'INSERT INTO [TableName] ([xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx], [xxx]) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?, ?, ?), 
[...]

I'm currently uploading in batches of 200, which works but is slow. The same insert works fine against a Postgres database. A plain INSERT INTO ... VALUES (...) with literal values (no bind parameters) also works with ~500 rows, both in a SQL tool and via session.execute("SQL string") in SQLAlchemy.

I'm a bit lost as to how to debug this further. Since it happens with both FreeTDS and the MS ODBC driver, it looks like it could be an MS SQL Server limitation rather than a driver bug?

Installed packages (ubuntu 16.10):

λ dpkg -l |egrep "freetds|odbc"
ii  freetds-bin                                                 0.91-6.1build2                                  amd64        FreeTDS command-line utilities
ii  freetds-common                                              0.91-6.1build2                                  all          configuration files for FreeTDS SQL client libraries
ii  freetds-dev                                                 0.91-6.1build2                                  amd64        MS SQL and Sybase client library (static libs and headers)
ii  libodbc1:amd64                                              2.3.1-4.1build1                                 amd64        ODBC library for Unix
ii  libodbcinstq4-1:amd64                                       2.3.0-4                                         amd64        Qt-based ODBC configuration library
ii  msodbcsql                                                   13.1.4.0-1                                      amd64        ODBC Driver for Microsoft(R) SQL Server(R)
ii  odbcinst                                                    2.3.1-4.1build1                                 amd64        Helper program for accessing odbc ini files
ii  odbcinst1debian2:amd64                                      2.3.1-4.1build1                                 amd64        Support library for accessing odbc ini files
ii  tdsodbc:amd64                                               0.91-6.1build2                                  amd64        ODBC driver for connecting to MS SQL and Sybase SQL servers
ii  unixodbc                                                    2.3.1-4.1build1                                 amd64        Basic ODBC tools
ii  unixodbc-bin                                                2.3.0-4                                         amd64        Graphical tools for ODBC management and browsing
ii  unixodbc-dev                                                2.3.1-4.1build1                                 amd64        ODBC libraries for UNIX (development files)
@gordthompson
Collaborator

You appear to be hitting SQL Server's limit of 2100 parameters per stored procedure. When pyodbc sends a parameterized query to SQL Server, it ends up being processed as a call to the SQL Server stored procedure sp_prepexec.

SQLAlchemy is producing a parameterized query using a "table value constructor" that in your case adds nine (9) parameters per row. A batch of 200 rows results in an sp_prepexec call with 1800 parameters so it works, but a batch of 300 rows results in a call with 2700 parameters which exceeds the limit of 2100.

So for now it looks like you'll need to keep inserting in batches of ~230 rows or less. Note that pyodbc may be changing the way it handles parameterized queries, so keep an eye on issue #214.
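The batch-size arithmetic above can be sketched in Python. This is a hedged illustration, not pyodbc or SQLAlchemy API: `batch_rows`, `rows`, and `num_columns` are made-up names, and the limits are the ones quoted in this thread.

```python
def batch_rows(rows, num_columns, param_limit=2100, values_limit=1000):
    """Yield slices of `rows` whose total bind-parameter count stays
    within `param_limit` (SQL Server also caps a single VALUES list
    at `values_limit` rows)."""
    max_rows = min(param_limit // num_columns, values_limit)
    for i in range(0, len(rows), max_rows):
        yield rows[i:i + max_rows]

# 1000 rows of 9 columns each, as in the original report
rows = [tuple(range(9)) for _ in range(1000)]
chunks = list(batch_rows(rows, num_columns=9))  # 2100 // 9 == 233 rows/chunk
```

Each chunk can then be passed to a separate multi-row INSERT without tripping the 2100-parameter limit.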

@mkleehammer
Owner

Again, great catch @gordthompson. I'm going to close this for now.

@ghost
Author

ghost commented Mar 20, 2017

@gordthompson Is there any known workaround? I tried generating an INSERT ... VALUES [lots of rows]; statement, but then I got the following exception:

pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Die Anzahl der Zeilenwertausdrücke in der INSERT-Anweisung überschreitet die maximal zulässige Anzahl von 1000 Zeilenwerten. (10738) (SQLExecDirectW)')

[Translation: The number of row value expressions in the INSERT statement exceeds the maximum allowed number of 1000 row values.]

So for me it looks like there is no way to insert more than 1000 rows at a time?

@gordthompson
Collaborator

"So for me it looks like there is no way to insert more than 1000 rows at a time?"

With a single INSERT statement, yes. The relevant MSDN document says:

"The maximum number of rows that can be constructed by inserting rows directly in the VALUES list is 1000."

It goes on to say:

To insert more than 1000 rows, use one of the following methods:

Create multiple INSERT statements

Use a derived table

Bulk import the data by using the bcp utility or the BULK INSERT statement
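The first method (multiple INSERT statements) can be sketched as follows. This is a minimal illustration with a made-up table `TableName` and integer columns `a` and `b`; real code would need proper quoting/escaping for non-numeric values, or should keep using bind parameters in small enough batches instead of building literal SQL.

```python
def insert_statements(rows, chunk=1000):
    """Yield one INSERT per chunk of at most `chunk` rows. Literal
    values are used, so the 2100 bind-parameter limit does not apply,
    only the 1000-row VALUES-list limit per statement."""
    for i in range(0, len(rows), chunk):
        values = ", ".join("(%d, %d)" % r for r in rows[i:i + chunk])
        yield "INSERT INTO TableName (a, b) VALUES %s;" % values

rows = [(n, n * 2) for n in range(2500)]
stmts = list(insert_statements(rows))  # 3 statements: 1000 + 1000 + 500 rows
```

Each generated statement can then be executed in turn (e.g. via cursor.execute) inside one transaction.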

@bobhy

bobhy commented Oct 7, 2017

The BULK INSERT statement should work for you. It can handle large imports, though it has quaint issues around constraint checking, input-file parsing, and reading input files that are remote from the database server.

Here's an example (for SQL Server 2005-2017). My input file is comma-delimited (exported from Excel) with column names in row one, which needs to be skipped.

import pyodbc
import contextlib

with contextlib.closing(pyodbc.connect(
        r'DRIVER={ODBC Driver 13 for SQL Server};SERVER=myserver\SQLEXPRESS;'
        r'DATABASE=test_db;Trusted_Connection=yes;')) as conn:
    with contextlib.closing(conn.cursor()) as cursor:
        # FIRSTROW=2 skips the header row; contextlib.closing
        # closes the connection, so no explicit conn.close() is needed
        cursor.execute(r"""
BULK INSERT test_data
    FROM 'n:\DevWork\samples\test.csv'
    WITH (FIELDTERMINATOR = ',', FIRSTROW = 2);
""")
        conn.commit()
