The docs state that the excluding table names matching command with comma separated table names should be used to exclude tables, but the matching keyword doesn't work at all for MS SQL (also mentioned in #1167). An error is produced indicating that the matching keyword is not recognized.
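For reference, the shape I understand from the docs is roughly the following (the table names here are placeholders of my own, not taken from the docs):
excluding table names matching 'some_table', 'some_other_table'
This is the form that gets rejected on an MS SQL source.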
The excluding table names like command does work, but only for a single table-schema pair. When multiple excluding table names like commands are provided, the script still runs successfully, but no data is parsed at all.
So, the following example works:
excluding table names like 'some_table' in schema 'dbo'
The following example doesn't:
excluding table names like 'some_table' in schema 'dbo'
excluding table names like 'some_table' in schema 'dbo2'
Worth mentioning is that the including only table names like command is repeatable, so both of the following examples work correctly:
including only table names like 'some_table' in schema 'dbo'
as well as
including only table names like 'some_table' in schema 'dbo'
including only table names like 'some_table' in schema 'dbo2'
I would expect there to be a way to exclude multiple table-schema pairs for MS SQL, but as far as I know (and have tried), this doesn't seem to be possible.
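Since the including only form is repeatable, a possible stopgap would be to invert the filter and enumerate the tables to keep rather than the ones to drop, along these lines (table names are placeholders; with many tables this gets tedious, and I have only verified the simple including cases shown above):
including only table names like 'table_to_keep' in schema 'dbo'
including only table names like 'another_table_to_keep' in schema 'schema1'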
pgloader --version
pgloader version "3.6.3047c9a"
compiled with SBCL 2.0.1.debian
did you test a fresh compile from the source tree?
did you search for other similar issues?
how can I reproduce the bug?
I don't have a remotely hosted MS SQL server to make this reproducible, sorry. I might add the create/insert setup queries later.
LOAD database
FROM mssql://exampleuser:examplepass@localhost:1433/database_oct_2021
INTO postgresql://postgres:password@localhost:5432/database_oct_2021
WITH data only
excluding table names like 'some_table' in schema 'dbo'
excluding table names like 'some_table' in schema 'schema1'
excluding table names like 'some_table' in schema 'schema2'
alter schema 'dbo' rename to 'public'
SET work_mem to '16MB', maintenance_work_mem to '512MB'
BEFORE LOAD DO
$$ SET TIME ZONE UTC; $$;
pgloader output you obtain
pgloader version 3.6.3047c9a
compiled with SBCL 2.0.1.debian
sb-impl::*default-external-format* :UTF-8
tmpdir: #P"/tmp/pgloader/"
2021-10-15T08:36:14.012000+02:00 NOTICE Starting pgloader, log system is ready.
2021-10-15T08:36:14.026000+02:00 INFO Starting monitor
2021-10-15T08:36:14.030000+02:00 LOG pgloader version "3.6.3047c9a"
2021-10-15T08:36:14.080000+02:00 INFO Parsed command:
LOAD database
FROM mssql://exampleuser:examplepass@localhost:1433/database_oct_2021
INTO postgresql://postgres:password@localhost:5432/database_oct_2021
WITH data only
excluding table names like 'some_table' in schema 'dbo'
excluding table names like 'some_table' in schema 'schema1'
excluding table names like 'some_table' in schema 'schema2'
alter schema 'dbo' rename to 'public'
SET work_mem to '16MB', maintenance_work_mem to '512MB'
BEFORE LOAD DO
$$ SET TIME ZONE UTC; $$;
2021-10-15T08:36:14.229000+02:00 DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {1007B6D703}>
2021-10-15T08:36:14.230000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:14.231000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:14.231000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:14.232000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:14.612000+02:00 NOTICE Executing SQL block for before load
2021-10-15T08:36:14.702000+02:00 DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {1007B6D703}>
2021-10-15T08:36:14.704000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:14.704000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:14.705000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:14.705000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:14.710000+02:00 DEBUG BEGIN
2021-10-15T08:36:14.711000+02:00 SQL SET TIME ZONE UTC;
2021-10-15T08:36:14.720000+02:00 LOG Migrating from #<MSSQL-CONNECTION mssql://exampleuser@localhost:1433/database_oct_2021 {1007B6BA03}>
2021-10-15T08:36:14.721000+02:00 LOG Migrating into #<PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {1007B6D703}>
Max connections reached, increase value of TDS_MAX_CONN
2021-10-15T08:36:15.007000+02:00 SQL MSSQL: sending query: -- params: dbname
-- table-type-name
-- including
-- filter-list-to-where-clause including
-- excluding
-- filter-list-to-where-clause excluding
select c.TABLE_SCHEMA,
c.TABLE_NAME,
c.COLUMN_NAME,
c.DATA_TYPE,
CASE
WHEN c.COLUMN_DEFAULT LIKE '((%' AND c.COLUMN_DEFAULT LIKE '%))' THEN
CASE
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) = 'newid()' THEN 'GENERATE_UUID'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) LIKE 'convert(%varchar%,getdate(),%)' THEN 'today'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) = 'getdate()' THEN 'CURRENT_TIMESTAMP'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) LIKE '''%''' THEN SUBSTRING(c.COLUMN_DEFAULT,4,len(c.COLUMN_DEFAULT)-6)
ELSE SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4)
END
WHEN c.COLUMN_DEFAULT LIKE '(%' AND c.COLUMN_DEFAULT LIKE '%)' THEN
CASE
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) = 'newid()' THEN 'GENERATE_UUID'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) LIKE 'convert(%varchar%,getdate(),%)' THEN 'today'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) = 'getdate()' THEN 'CURRENT_TIMESTAMP'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) LIKE '''%''' THEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4)
ELSE SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2)
END
ELSE c.COLUMN_DEFAULT
END,
c.IS_NULLABLE,
COLUMNPROPERTY(object_id(c.TABLE_NAME), c.COLUMN_NAME, 'IsIdentity'),
c.CHARACTER_MAXIMUM_LENGTH,
c.NUMERIC_PRECISION,
c.NUMERIC_PRECISION_RADIX,
c.NUMERIC_SCALE,
c.DATETIME_PRECISION,
c.CHARACTER_SET_NAME,
c.COLLATION_NAME
from INFORMATION_SCHEMA.COLUMNS c
join INFORMATION_SCHEMA.TABLES t
on c.TABLE_SCHEMA = t.TABLE_SCHEMA
and c.TABLE_NAME = t.TABLE_NAME
where c.TABLE_CATALOG = 'database_oct_2021'
and t.TABLE_TYPE = 'BASE TABLE'
and ((c.table_schema = 'dbo' and c.table_name NOT LIKE 'some_table')
and (c.table_schema = 'schema1' and c.table_name NOT LIKE 'some_table')
and (c.table_schema = 'schema2' and c.table_name NOT LIKE 'some_table'))
order by c.table_schema, c.table_name, c.ordinal_position;
2021-10-15T08:36:15.029000+02:00 SQL MSSQL: sending query: -- params: dbname
-- including
-- filter-list-to-where-clause including
-- excluding
-- filter-list-to-where-clause excluding
SELECT
REPLACE(KCU1.CONSTRAINT_NAME, '.', '_') AS 'CONSTRAINT_NAME'
, KCU1.TABLE_SCHEMA AS 'TABLE_SCHEMA'
, KCU1.TABLE_NAME AS 'TABLE_NAME'
, KCU1.COLUMN_NAME AS 'COLUMN_NAME'
, KCU2.TABLE_SCHEMA AS 'UNIQUE_TABLE_SCHEMA'
, KCU2.TABLE_NAME AS 'UNIQUE_TABLE_NAME'
, KCU2.COLUMN_NAME AS 'UNIQUE_COLUMN_NAME'
, RC.UPDATE_RULE AS 'UPDATE_RULE'
, RC.DELETE_RULE AS 'DELETE_RULE'
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG
AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA
AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2
ON KCU2.CONSTRAINT_CATALOG = RC.UNIQUE_CONSTRAINT_CATALOG
AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA
AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME
WHERE KCU1.ORDINAL_POSITION = KCU2.ORDINAL_POSITION
AND KCU1.TABLE_CATALOG = 'database_oct_2021'
AND KCU1.CONSTRAINT_CATALOG = 'database_oct_2021'
AND KCU1.CONSTRAINT_SCHEMA NOT IN ('dto', 'sys')
AND KCU1.TABLE_SCHEMA NOT IN ('dto', 'sys')
AND KCU2.TABLE_SCHEMA NOT IN ('dto', 'sys')
and ((kcu1.table_schema = 'dbo' and kcu1.table_name NOT LIKE 'some_table') and (kcu1.table_schema = 'schema1' and kcu1.table_name NOT LIKE 'some_table') and (kcu1.table_schema = 'schema2' and kcu1.table_name NOT LIKE 'some_table'))
ORDER BY KCU1.CONSTRAINT_NAME, KCU1.ORDINAL_POSITION;
2021-10-15T08:36:15.070000+02:00 INFO Processing source catalogs
2021-10-15T08:36:15.094000+02:00 NOTICE Prepare PostgreSQL database.
2021-10-15T08:36:15.191000+02:00 DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {1007B6D703}>
2021-10-15T08:36:15.192000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:15.193000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:15.193000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:15.193000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:15.194000+02:00 DEBUG BEGIN
2021-10-15T08:36:15.232000+02:00 SQL --
-- get user defined SQL types
--
select nt.nspname,
extname,
typname,
case when enum.enumtypid is not null
then array_agg(enum.enumlabel order by enumsortorder)
end as enumvalues
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
left join pg_attribute a on c.oid = a.attrelid and a.attnum > 0
join pg_type t on t.oid = a.atttypid
left join pg_namespace nt on nt.oid = t.typnamespace
left join pg_depend d on d.classid = 'pg_type'::regclass
and d.refclassid = 'pg_extension'::regclass
and d.objid = t.oid
left join pg_extension e on refobjid = e.oid
left join pg_enum enum on enum.enumtypid = t.oid
where nt.nspname !~ '^pg_' and nt.nspname <> 'information_schema'
and n.nspname !~ '^pg_' and n.nspname <> 'information_schema'
and c.relkind in ('r', 'f', 'p')
and
( t.typrelid = 0
or
(select c.relkind = 'c'
from pg_class c
where c.oid = t.typrelid)
)
and not exists
(
select 1
from pg_type el
where el.oid = t.typelem
and el.typarray = t.oid
)
group by nt.nspname, extname, typname, enumtypid
order by nt.nspname, extname, typname, enumtypid;
2021-10-15T08:36:15.288000+02:00 SQL -- params: table-type-name
-- including
-- filter-list-to-where-clause for including
-- excluding
-- filter-list-to-where-clause for excluding
with seqattr as
(
select adrelid,
adnum,
pg_get_expr(d.adbin, d.adrelid) as adsrc,
case when pg_get_expr(d.adbin, d.adrelid) ~ 'nextval'
then substring(pg_get_expr(d.adbin, d.adrelid)
from '''([^'']+)'''
)
else null
end as seqname
from pg_attrdef d
)
select nspname, relname, c.oid, attname,
t.oid::regtype as type,
case when atttypmod > 0
then substring(format_type(t.oid, atttypmod) from '\d+(?:,\d+)?')
else null
end as typmod,
attnotnull,
case when atthasdef
then pg_get_expr(def.adbin, def.adrelid)
end as default ,
case when s.seqname is not null then 'auto_increment' end as extra
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
left join pg_attribute a on c.oid = a.attrelid
join pg_type t on t.oid = a.atttypid and attnum > 0
left join pg_attrdef def on a.attrelid = def.adrelid
and a.attnum = def.adnum
and a.atthasdef
left join seqattr s on def.adrelid = s.adrelid
and def.adnum = s.adnum
where nspname !~ '^pg_' and n.nspname <> 'information_schema'
and relkind in ('r', 'f', 'p')
order by nspname, relname, attnum;
2021-10-15T08:36:15.322000+02:00 SQL -- params: including
-- filter-list-to-where-clause for including
-- excluding
-- filter-list-to-where-clause for excluding
select n.nspname,
i.relname,
i.oid,
rn.nspname,
r.relname,
indisprimary,
indisunique,
(select string_agg(attname, ',')
from pg_attribute
where attrelid = r.oid
and array[attnum::integer] <@ indkey::integer[]
) as cols,
pg_get_indexdef(indexrelid),
c.conname,
pg_get_constraintdef(c.oid)
from pg_index x
join pg_class i ON i.oid = x.indexrelid
join pg_class r ON r.oid = x.indrelid
join pg_namespace n ON n.oid = i.relnamespace
join pg_namespace rn ON rn.oid = r.relnamespace
left join pg_depend d on d.classid = 'pg_class'::regclass
and d.objid = i.oid
and d.refclassid = 'pg_constraint'::regclass
and d.deptype = 'i'
left join pg_constraint c ON c.oid = d.refobjid
where n.nspname !~ '^pg_' and n.nspname <> 'information_schema'
order by n.nspname, r.relname
2021-10-15T08:36:15.360000+02:00 SQL -- params: including (table)
-- filter-list-to-where-clause for including
-- excluding (table)
-- filter-list-to-where-clause for excluding
-- including (ftable)
-- filter-list-to-where-clause for including
-- excluding (ftable)
-- filter-list-to-where-clause for excluding
select n.nspname, c.relname, nf.nspname, cf.relname as frelname,
r.oid,
d.refobjid as pkeyoid,
conname,
pg_catalog.pg_get_constraintdef(r.oid, true) as condef,
(select string_agg(attname, ',')
from pg_attribute
where attrelid = r.conrelid
and array[attnum::integer] <@ conkey::integer[]
) as conkey,
(select string_agg(attname, ',')
from pg_attribute
where attrelid = r.confrelid
and array[attnum::integer] <@ confkey::integer[]
) as confkey,
confupdtype, confdeltype, confmatchtype,
condeferrable, condeferred
from pg_catalog.pg_constraint r
JOIN pg_class c on r.conrelid = c.oid
JOIN pg_namespace n on c.relnamespace = n.oid
JOIN pg_class cf on r.confrelid = cf.oid
JOIN pg_namespace nf on cf.relnamespace = nf.oid
JOIN pg_depend d on d.classid = 'pg_constraint'::regclass
and d.refclassid = 'pg_class'::regclass
and d.objid = r.oid
and d.refobjsubid = 0
where r.contype = 'f'
AND c.relkind in ('r', 'f', 'p')
AND cf.relkind in ('r', 'f', 'p')
AND n.nspname !~ '^pg_' and n.nspname <> 'information_schema'
AND nf.nspname !~ '^pg_' and nf.nspname <> 'information_schema'
2021-10-15T08:36:15.387000+02:00 SQL -- params pkey-oid-list
-- fkey-oild-list
with pkeys(oid) as (
values(17438),(17436),(17449),(17426),(17419),(17457),(17468),(17518),(17520),(17507),(17515),(17479),(17492),(17490),(17383),(17381),(17402),(17517),(17409),(17391),(17535),(17537),(17549),(17547),(17321),(17323),(17333),(17335),(17191),(17193),(17310),(17308),(17344),(17342),(17367),(17365)
),
knownfkeys(oid) as (
values(17610),(17615),(17620),(17630),(17625),(17635)
),
pkdeps as (
select pkeys.oid, pg_depend.objid
from pg_depend
join pkeys on pg_depend.refobjid = pkeys.oid
where classid = 'pg_catalog.pg_constraint'::regclass
and refclassid = 'pg_catalog.pg_class'::regclass
)
select n.nspname, c.relname, nf.nspname, cf.relname as frelname,
r.oid as conoid, conname,
pg_catalog.pg_get_constraintdef(r.oid, true) as condef,
pkdeps.oid as index_oid
from pg_catalog.pg_constraint r
JOIN pkdeps on r.oid = pkdeps.objid
JOIN pg_class c on r.conrelid = c.oid
JOIN pg_namespace n on c.relnamespace = n.oid
JOIN pg_class cf on r.confrelid = cf.oid
JOIN pg_namespace nf on cf.relnamespace = nf.oid
where NOT EXISTS (select 1 from knownfkeys where oid = r.oid)
2021-10-15T08:36:15.391000+02:00 DEBUG fetch-pgsql-catalog: 21 tables, 36 indexes, 6+6 fkeys
2021-10-15T08:36:15.391000+02:00 INFO Done with COPYing data, waiting for indexes
2021-10-15T08:36:15.405000+02:00 NOTICE Completing PostgreSQL database.
2021-10-15T08:36:15.421000+02:00 NOTICE Reset sequences
2021-10-15T08:36:15.507000+02:00 DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {100C450043}>
2021-10-15T08:36:15.508000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:15.509000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:15.510000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:15.510000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:15.510000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:15.511000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:15.511000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:15.512000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:15.652000+02:00 DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@localhost:5432/database_oct_2021 {100CB25503}>
2021-10-15T08:36:15.653000+02:00 DEBUG SET client_encoding TO 'utf8'
2021-10-15T08:36:15.654000+02:00 DEBUG SET work_mem TO '16MB'
2021-10-15T08:36:15.654000+02:00 DEBUG SET maintenance_work_mem TO '512MB'
2021-10-15T08:36:15.654000+02:00 DEBUG SET application_name TO 'pgloader'
2021-10-15T08:36:15.656000+02:00 DEBUG BEGIN
2021-10-15T08:36:15.667000+02:00 LOG report summary reset
             table name     errors       read   imported      bytes      total time       read      write
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
            before load          0          1          1                      0.102s
        fetch meta data          0          0          0                      0.343s
      Drop Foreign Keys          0          0          0                      0.000s
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
COPY Threads Completion          0          4          4                      0.000s
        Reset Sequences          0         13         13                      0.151s
    Create Foreign Keys          0          0          0                      0.000s
       Install Comments          0          0          0                      0.000s
-----------------------  ---------  ---------  ---------  ---------  --------------  ---------  ---------
      Total import time          ✓          0          0                      0.151s
2021-10-15T08:36:15.731000+02:00 INFO Stopping monitor
data that is being loaded, if relevant
No data, that is the issue. I would expect data to be parsed from a total of 18 tables in my case; 3 tables, each from a different schema, should be excluded.
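My reading of the first generated MS SQL query above is that the three exclusion groups are joined with and, yet each group pins c.table_schema to a different value, so the combined predicate can never be true and the catalog query returns zero rows. A sketch of the shape I would expect instead, assuming the groups should be combined with or the way the repeatable including only case behaves (same placeholder names as above; not necessarily the exact fix):
and (   (c.table_schema = 'dbo'     and c.table_name NOT LIKE 'some_table')
     or (c.table_schema = 'schema1' and c.table_name NOT LIKE 'some_table')
     or (c.table_schema = 'schema2' and c.table_name NOT LIKE 'some_table'))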
How the data is different from what you expected, if relevant
troley added a commit to semestry/pgloader that referenced this issue on Oct 15, 2021:
Allows a pgloader load file to parse data correctly when multiple
`excluding table names like 'TABLE' in schema 'SCHEMA'`
filter rules have been provided.
Fixes dimitri#1328