some duckdb sql cannot run in jupysql #172

wonb168 · 2023-02-28T07:07:48Z

multi lines, must leave a blank after $$?

%sql create table test_table(sqlstr text);
%%sql
insert into test_table values ($$create table t(id int)
,code text$$)

Parser Error: syntax error at or near "create"
LINE 1: insert into test_table values ($create table t(id int
BUT, leave a blank between $$ and create, will OK

can not run regexp_split_to_array?

%%sql 
select regexp_split_to_array(a.parts,','),regexp_split_to_array(b.parts,',')
from v_ddl a
inner join v_ddl2 b on a.word3=b.word3 and a.sqlstr<>b.sqlstr

raise ERROR:

Cell In[49], line 4
    select regexp_split_to_array(a.parts,','),regexp_split_to_array(b.parts,',')
           ^
SyntaxError: invalid syntax

BUT, this sql have no prolem in duckdb cli or in python.

The text was updated successfully, but these errors were encountered:

edublancas · 2023-02-28T08:42:21Z

Ok, so the problem with the first example is that $something is interpreted as variable substitution, that's why adding a space fixes the problem.

We're fixing this in #93 and $variable will no longer trigger variable substitution. We'll use the more common {{something}} format.

for the second example: I couldn't reproduce it (see my output below). I got a "Table with name v_ddl does not exist!" error, which makes sense. But you're getting "invalid syntax", which means that your Python session is interpreting the SQL code as Python; maybe you have some extra line breaks?

full session below:

first example

%load_ext sql

%sql duckdb://

%sql create table test_table(sqlstr text);

Console output (1/2):

*  duckdb://
Done.

Console output (2/2):

%%sql
insert into test_table values ($$create table t(id int)
,code text$$)

Console output (1/1):

*  duckdb://
(duckdb.ParserException) Parser Error: syntax error at or near "create"
LINE 1: insert into test_table values ($create table t(id int)
                                        ^
[SQL: insert into test_table values ($create table t(id int)
,code text$$)]
(Background on this error at: https://sqlalche.me/e/14/f405)

second example

%%sql 
select regexp_split_to_array(a.parts,','),regexp_split_to_array(b.parts,',')
from v_ddl a
inner join v_ddl2 b on a.word3=b.word3 and a.sqlstr<>b.sqlstr

Console output (1/1):

*  duckdb://
(duckdb.CatalogException) Catalog Error: Table with name v_ddl does not exist!
Did you mean "pg_am"?
[SQL: select regexp_split_to_array(a.parts,','),regexp_split_to_array(b.parts,',')
from v_ddl a
inner join v_ddl2 b on a.word3=b.word3 and a.sqlstr<>b.sqlstr]
(Background on this error at: https://sqlalche.me/e/14/f405)

wonb168 · 2023-03-01T06:42:53Z

Second example：

with p1 as(select word3,unnest(regexp_split_to_array(a.parts,', \n')) p from v_ddl a) 
,p2 as(select word3,unnest(regexp_split_to_array(a.parts,', \n')) p from v_ddl2 a) 
select --concat('alter table ',p1.word3,' add ',p1.p,';'), 
p1.word3,p1.p 
from  p1 
left join p2 on p1.p=p2.p 
where p2.p is null

this WORKS, but if I add the concat line, raise error :
SyntaxError: invalid syntax. Perhaps you forgot a comma?
but this works in the cli also.

BUT, if I copy this sql into a new cell, also works, see the flowing:

WHY? maybe some proplem in jupysql cell?

wonb168 · 2023-03-01T07:25:52Z

here is the whole test data:

%%sql
create table ddl (sqlstr text);
create table ddl2 (sqlstr text);
insert into ddl values ($$ CREATE TABLE public.test_partition_list (
    id integer,
    name character varying(64),
    fdate character varying(10)
)
 DISTRIBUTED BY (id) PARTITION BY LIST(fdate) 
          (
          PARTITION p1 VALUES('2017-01-01') WITH (tablename='test_partition_list_1_prt_p1', appendonly='false'), 
          PARTITION p2 VALUES('2017-01-02') WITH (tablename='test_partition_list_1_prt_p2', appendonly='false')
         );$$)
;
insert into ddl2 values ($$ CREATE TABLE public.test_partition_list (
    id integer,
    name character varying(64),
    fdate character varying(10)
)
 DISTRIBUTED BY (name) PARTITION BY LIST(fdate) 
          (
          PARTITION p2 VALUES('2017-01-02') WITH (tablename='test_partition_list_1_prt_p2', appendonly='false')
         );$$)
;
create view v_ddl as
select regexp_extract(sqlstr, '(\S+)', 1) left1
,regexp_extract(sqlstr, '(\S+\s\S+)', 1) left2
,regexp_extract(sqlstr, '\S+\s\S+\s(\S+)', 1) word3
-- 动作，除set、select、comment取首单词，其他取前2个单词
,case when left1 in ('SET','SELECT','COMMENT') then left1 else left2 end as action
-- 对象，只有create取，取动作后的第一个单词
,case when left1='CREATE' then word3 else sqlstr end as obj
-- 表的字段列表，函数的参数列表
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '(\(.*\))\s+DISTRIBUTED', 1,'s') --s跨行匹配，默认.不支持\n
--  when left2='CREATE FUNCTION' then regexp_extract(sqlstr, '(\(.*) DISTIRBUTED', 1) 函数也是动作后第一段非空
  end as cols
-- 表的分布键，DISTIRBUTED BY开始，括号结束'(DISTRIBUTED BY .*?\))'
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '(DISTRIBUTED BY .*?\))', 1) end as dkey 
-- 表的分区转array
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '\n(\s+PARTITION \S+ VALUES.*\))\s+\)', 1,'s') end as parts
,* 
from ddl;
create view v_ddl2 as
select regexp_extract(sqlstr, '(\S+)', 1) left1
,regexp_extract(sqlstr, '(\S+\s\S+)', 1) left2
,regexp_extract(sqlstr, '\S+\s\S+\s(\S+)', 1) word3
-- 动作，除set、select、comment取首单词，其他取前2个单词
,case when left1 in ('SET','SELECT','COMMENT') then left1 else left2 end as action
-- 对象，只有create取，取动作后的第一个单词
,case when left1='CREATE' then word3 else sqlstr end as obj
-- 表的字段列表，函数的参数列表
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '(\(.*\))\s+DISTRIBUTED', 1,'s') --s跨行匹配，默认.不支持\n
--  when left2='CREATE FUNCTION' then regexp_extract(sqlstr, '(\(.*) DISTIRBUTED', 1) 函数也是动作后第一段非空
  end as cols
-- 表的分布键，DISTIRBUTED BY开始，括号结束'(DISTRIBUTED BY .*?\))'
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '(DISTRIBUTED BY .*?\))', 1) end as dkey 
-- 表的分区转array
,case when left2='CREATE TABLE' then regexp_extract(sqlstr, '\n(\s+PARTITION \S+ VALUES.*\))\s+\)', 1,'s') end as parts
,* 
from ddl2;

then:

with p1 as(select word3,unnest(regexp_split_to_array(a.parts,', \n')) p from v_ddl a) 
,p2 as(select word3,unnest(regexp_split_to_array(a.parts,', \n')) p from v_ddl2 a) 
select concat('alter table ', p1.word3,' drop ', regexp_extract(p1.p,'(\S+\s+\S+)',1),';'), p1.word3, p1.p 
from  p1 
left join p2 on p1.word3=p2.word3 and p1.p=p2.p 
where p2.p is null

In jupysql error, but in cli it's OK!

edublancas · 2023-03-01T18:50:17Z

can you share the .py script (the one that you have in the screenshot)? you can remove stuff if there's sensitive info there.

wonb168 · 2023-03-02T04:15:34Z

can you share the .py script (the one that you have in the screenshot)? you can remove stuff if there's sensitive info there.

issue.py.txt
because github not support py, I rename the file name as .txt.

edublancas · 2023-03-02T13:18:18Z

I see what the problem is! You cannot add comments before the cell magic.

This will break:

# %%
# stuff
%%sql
create table ddl (sqlstr text);
create table ddl2 (sqlstr text);

because Python will interpret the line after the # stuff comment as Python, instead of SQL.

You can create it in a separate cell:

# %% [markdown]
# some comment

# %%
# stuff
%%sql
create table ddl (sqlstr text);
create table ddl2 (sqlstr text);

Or use a SQL-style comment after the cell magic:

# %%
%%sql
-- stuff
create table ddl (sqlstr text);
create table ddl2 (sqlstr text);

wonb168 changed the title ~~some sql cannot run in jupysql~~ some duckdb sql cannot run in jupysql Feb 28, 2023

edublancas closed this as completed Mar 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

some duckdb sql cannot run in jupysql #172

some duckdb sql cannot run in jupysql #172

wonb168 commented Feb 28, 2023

edublancas commented Feb 28, 2023

wonb168 commented Mar 1, 2023 •

edited

Loading

wonb168 commented Mar 1, 2023

edublancas commented Mar 1, 2023

wonb168 commented Mar 2, 2023

edublancas commented Mar 2, 2023

some duckdb sql cannot run in jupysql #172

some duckdb sql cannot run in jupysql #172

Comments

wonb168 commented Feb 28, 2023

edublancas commented Feb 28, 2023

first example

second example

wonb168 commented Mar 1, 2023 • edited Loading

wonb168 commented Mar 1, 2023

edublancas commented Mar 1, 2023

wonb168 commented Mar 2, 2023

edublancas commented Mar 2, 2023

wonb168 commented Mar 1, 2023 •

edited

Loading