Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in get_general_index_definition function #17

Closed
wants to merge 4 commits into from
Closed

Fix bug in get_general_index_definition function #17

wants to merge 4 commits into from

Conversation

bricklen
Copy link

  • Fixed bug in "table_oid::TEXT" section of the "TABLE_NAME" code. Executing SELECT create_hypertable('my_table', 'time'); failed with ERROR: Cannot process index with definition (no index name match)....
  • Replaced "=" with the preferred ":=" assignment operator. Both are allowed, but the latter is less likely to cause subtle bugs due to equality vs assignment (I've run into it before).
  • Qualified system relations with pg_catalog schema so there is no ambiguity.
  • Dropped the "CREATE" keyword from the "replace()" function because it was failing to cover the case where an index had a definition like "CREATE UNIQUE INDEX".
  • Updated some comments.
  • Revised variable names to have a "v_" prefix to reduce the chance of a name collision between objects in a function (I've hit this in production before).

bricklen and others added 4 commits April 10, 2017 14:40
* Fixed bug in "table_oid::TEXT" section of the "TABLE_NAME" code. Executing `SELECT create_hypertable('my_table', 'time');` failed with `ERROR:  Cannot process index with definition (no index name match)...`.
* Replaced "=" with the preferred ":=" assignment operator. Both are allowed, but the latter is less likely to cause subtle bugs due to equality vs assignment (I've run into it before).
* Qualified system relations with pg_catalog schema so there is no ambiguity.
* Dropped the "CREATE" keyword from the "replace()" function because it was failing to cover the case where an index had a definition like "CREATE UNIQUE INDEX".
* Updated some comments.
* Revised variable names to have a "v_" prefix to reduce the chance of a name collision between objects in a function (I've hit this in production before).
@bricklen bricklen mentioned this pull request Apr 11, 2017
@akulkarni
Copy link
Member

Thanks for the PR. We're in the process of getting ourselves ready to accept external contributions (e.g., putting together a style guide). Should have that ready soon.

Also, noticed that the tests are failing. Think you need to change the expected test result to work with your changes?

@bricklen
Copy link
Author

I did see the failed tests but I can't see how they relate to my changes to a plpgsql function. You'll notice the failing command is 'The command "make -f docker.mk test" exited with 2.'
Do your internal tests build successfully?

@cevian
Copy link
Contributor

cevian commented Apr 12, 2017

Our internal tests are passing. If you notice


test alternate_users          ... FAILED (test process exited with exit code 3)
test create_chunks            ... ok
test create_hypertable        ... ok
test ddl                      ... FAILED (test process exited with exit code 3)
test ddl_errors               ... ok
test ddl_single               ... FAILED (test process exited with exit code 3)
test delete                   ... FAILED (test process exited with exit code 3)
test drop_chunks              ... ok
test drop_extension           ... ok
test drop_rename_hypertable   ... FAILED (test process exited with exit code 3)
test insert                   ... FAILED (test process exited with exit code 3)
test insert_single            ... FAILED (test process exited with exit code 3)
test pg_dump                  ... FAILED (test process exited with exit code 3)
test sql_query                ... FAILED (test process exited with exit code 3)
test sql_query_results_optimized ... ok
test sql_query_results_unoptimized ... ok
test sql_query_results_x_diff ... ok
test tablespace               ... ok
test timestamp                ... FAILED (test process exited with exit code 3)
test update                   ... FAILED (test process exited with exit code 3)

That means the regression (golden file) tests are no longer passing (probably because the output changed in some small way). We'll work on making this more clear. But in the meantime, you can follow the instruction in the "Testing" section of the README to reproduce the same results locally. Then, your new regression test output will be in test/results and you can copy over the changes to test/expected and commit the changes to the files in test/expected.

Note, that this sort of regression file testing is standard and recommended in Postgres.

@bricklen
Copy link
Author

Hi, I must be doing something wrong. A fresh git clone and "make installcheck" results in all the tests failing. Can you see if I'm missing a step here?

make installcheck
/usr/pgsql-9.6/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/usr/pgsql-9.6/bin'    --inputdir=test --outputdir=test --launcher=test/runner.sh --host=localhost --port=5432 --user=postgres --load-language=plpgsql --load-extension=dblink --load-extension=postgres_fdw --load-extension=timescaledb --dbname=contrib_regression_timescaledb alternate_users create_chunks create_hypertable ddl ddl_errors ddl_single delete drop_chunks drop_extension drop_rename_hypertable insert insert_single pg_dump sql_query sql_query_results_optimized sql_query_results_unoptimized sql_query_results_x_diff tablespace timestamp update
(using postmaster on localhost, port 5432)
============== dropping database "contrib_regression_timescaledb" ==============
NOTICE:  database "contrib_regression_timescaledb" does not exist, skipping
DROP DATABASE
============== creating database "contrib_regression_timescaledb" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql                     ==============
CREATE LANGUAGE
============== installing dblink                      ==============
CREATE EXTENSION
============== installing postgres_fdw                ==============
CREATE EXTENSION
============== installing timescaledb                 ==============
CREATE EXTENSION
============== running regression test queries        ==============
test alternate_users          ... FAILED
test create_chunks            ... FAILED
test create_hypertable        ... FAILED
test ddl                      ... FAILED
test ddl_errors               ... FAILED
test ddl_single               ... FAILED
test delete                   ... FAILED
test drop_chunks              ... FAILED
test drop_extension           ... FAILED
test drop_rename_hypertable   ... FAILED
test insert                   ... FAILED
test insert_single            ... FAILED
test pg_dump                  ... FAILED
test sql_query                ... FAILED
test sql_query_results_optimized ... FAILED
test sql_query_results_unoptimized ... FAILED
test sql_query_results_x_diff ... FAILED
test tablespace               ... FAILED
test timestamp                ... FAILED
test update                   ... FAILED

========================
 20 of 20 tests failed.
========================

@bricklen
Copy link
Author

So on a freshly spun up VM, with a clean git pull + make + make install + make installcheck (after making pg_hba.conf and postgresql.conf changes), still results in half the tests failing.

make installcheck
/usr/pgsql-9.6/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/usr/pgsql-9.6/bin'    --inputdir=test --outputdir=test --launcher=test/runner.sh --host=localhost --port=5432 --user=postgres --load-language=plpgsql --load-extension=dblink --load-extension=postgres_fdw --load-extension=timescaledb --dbname=contrib_regression_timescaledb alternate_users create_chunks create_hypertable ddl ddl_errors ddl_single delete drop_chunks drop_extension drop_rename_hypertable insert insert_single pg_dump sql_query sql_query_results_optimized sql_query_results_unoptimized sql_query_results_x_diff tablespace timestamp update
(using postmaster on localhost, port 5432)
============== dropping database "contrib_regression_timescaledb" ==============
DROP DATABASE
============== creating database "contrib_regression_timescaledb" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql                     ==============
CREATE LANGUAGE
============== installing dblink                      ==============
CREATE EXTENSION
============== installing postgres_fdw                ==============
CREATE EXTENSION
============== installing timescaledb                 ==============
CREATE EXTENSION
============== running regression test queries        ==============
test alternate_users          ... FAILED
test create_chunks            ... ok
test create_hypertable        ... ok
test ddl                      ... ok
test ddl_errors               ... ok
test ddl_single               ... FAILED
test delete                   ... ok
test drop_chunks              ... ok
test drop_extension           ... ok
test drop_rename_hypertable   ... FAILED
test insert                   ... FAILED
test insert_single            ... FAILED
test pg_dump                  ... ok
test sql_query                ... FAILED
test sql_query_results_optimized ... ok
test sql_query_results_unoptimized ... FAILED
test sql_query_results_x_diff ... FAILED
test tablespace               ... FAILED
test timestamp                ... ok
test update                   ... ok

=======================
 9 of 20 tests failed.
=======================

@cevian
Copy link
Contributor

cevian commented Apr 12, 2017

That's strange, I just tested and everything passes for me on a clean clone. Can you send the test/regression.diffs file after the run?

@bricklen
Copy link
Author

regression_diffs.zip

@cevian
Copy link
Contributor

cevian commented Apr 13, 2017

I am sorry this has been so much trouble. I really appreciate you working with us. Apparently, we are missing some instructions/are unfamiliar with the way you are running this.

I did find one problem:

ERROR:  could not set permissions on directory "/tmp/tspace1": Operation not permitted

But I don't think that solves all the problems. I am having some trouble reproducing this. Can you run

SELECT version();

Also it would be very helpful if you could let us know more about your environment/ways to reproduce. How are you running the VM? How are you installing Postgres?

Just to make sure: this is running the master branch without your modifications, correct?

Again, really appreciate this.

Mat

@bricklen
Copy link
Author

bricklen commented Apr 13, 2017

The make install and make installcheck were run as root so I'm not sure why a permission problem is happening, unless there is a su occurring within the test itself?
Yes, these were clean git pulls, none of my changes were in the database. I just checked my history, the git clone came from your repo (not mine).

ls -l /tmp/
drwxr-xr-x. 2 root root  6 Apr 12 18:34 tspace1
select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

Installed PostgreSQL from package from https://yum.postgresql.org using Puppet (we have about 2000 VM's running Postgres so there should be nothing unusual about these few instances).

CentOS version:

cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)

@cevian
Copy link
Contributor

cevian commented Apr 13, 2017

I just ran centos:7 in docker.

In one terminal I issued the following commands:

yum install -y git
yum install -y https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
yum install -y postgresql96 postgresql96-server postgresql96-contrib postgresql96-devel
yum -y groupinstall "Development Tools"

git clone https://github.com/timescale/timescaledb
cd timescaledb/
export PATH=$PATH:/usr/pgsql-9.6/bin/
make && make install

su postgres
mkdir /tmp/tspace1
/usr/pgsql-9.6/bin/initdb -D /tmp/data
/usr/pgsql-9.6/bin/postgres -D /tmp/data -clogging_collector=off -cshared_preload_libraries="dblink,timescaledb"

In another terminal I ran:

cd timescaledb/
export PATH=$PATH:/usr/pgsql-9.6/bin/
make installcheck

And got the following output:

/usr/pgsql-9.6/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/usr/pgsql-9.6/bin'    --inputdir=test --outputdir=test --launcher=test/runner.sh --host=localhost --port=5432 --user=postgres --load-language=plpgsql --load-extension=dblink --load-extension=postgres_fdw --load-extension=timescaledb --dbname=contrib_regression_timescaledb alternate_users create_chunks create_hypertable ddl ddl_errors ddl_single delete drop_chunks drop_extension drop_rename_hypertable insert insert_single pg_dump sql_query sql_query_results_optimized sql_query_results_unoptimized sql_query_results_x_diff tablespace timestamp update
(using postmaster on localhost, port 5432)
============== dropping database "contrib_regression_timescaledb" ==============
NOTICE:  database "contrib_regression_timescaledb" does not exist, skipping
DROP DATABASE
============== creating database "contrib_regression_timescaledb" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql                     ==============
CREATE LANGUAGE
============== installing dblink                      ==============
CREATE EXTENSION
============== installing postgres_fdw                ==============
CREATE EXTENSION
============== installing timescaledb                 ==============
CREATE EXTENSION
============== running regression test queries        ==============
test alternate_users          ... ok
test create_chunks            ... ok
test create_hypertable        ... ok
test ddl                      ... ok
test ddl_errors               ... ok
test ddl_single               ... ok
test delete                   ... ok
test drop_chunks              ... ok
test drop_extension           ... ok
test drop_rename_hypertable   ... ok
test insert                   ... ok
test insert_single            ... ok
test pg_dump                  ... ok
test sql_query                ... ok
test sql_query_results_optimized ... ok
test sql_query_results_unoptimized ... ok
test sql_query_results_x_diff ... ok
test tablespace               ... ok
test timestamp                ... ok
test update                   ... ok

======================
 All 20 tests passed.
======================

Can you spot any differences with your setup?

Thanks again,
Mat

@bricklen
Copy link
Author

(Sorry for the delay, I was away camping). I ran the commands you listed and got the following failures:

============== running regression test queries        ==============
test alternate_users          ... FAILED
test create_chunks            ... ok
test create_hypertable        ... ok
test ddl                      ... ok
test ddl_errors               ... ok
test ddl_single               ... FAILED
test delete                   ... ok
test drop_chunks              ... ok
test drop_extension           ... ok
test drop_rename_hypertable   ... FAILED
test insert                   ... FAILED
test insert_single            ... FAILED
test pg_dump                  ... ok
test sql_query                ... FAILED
test sql_query_results_optimized ... ok
test sql_query_results_unoptimized ... FAILED
test sql_query_results_x_diff ... FAILED
test tablespace               ... ok
test timestamp                ... ok
test update                   ... ok

=======================
 8 of 20 tests failed.
=======================

In our rig, Puppet has already installed Postgres and the associated config files, so there could be some differences in that area, but none that I can think of that would cause failures. The only changes I've made to my conf files are the shared_preload_libraries additions, and a couple changes to the pg_hba.conf:

host    all    all    ::1/128    trust
host    single    alt_user    ::1/128    trust

I am using a log_line_prefix of %m [%p] (user=%u) (db=%d) (rhost=%h) [vxid:%v txid:%x] [%i] . I'll attach the Postgres log output from the make installcheck steps, and you can see the various permission-denied errors. You don't see those in your tests?
timescaledb_apr17_installcheck_postgres.zip

@cevian
Copy link
Contributor

cevian commented Apr 17, 2017

So I am confused in your log you have

2017-04-17 15:02:00.198 UTC [26064] (user=alt_usr) (db=single) (rhost=::1) [vxid:4/20 txid:0] [SELECT] ERROR:  permission denied for relation _hyper_1_0_replica
2017-04-17 15:02:00.198 UTC [26064] (user=alt_usr) (db=single) (rhost=::1) [vxid:4/20 txid:0] [SELECT] STATEMENT:  SELECT * FROM "one_Partition";

Which is the correct output. However your diff had:

*** /pgsql/cluster/timescale_src/timescaledb/test/expected/alternate_users.out	2017-04-12 18:28:29.023712588 +0000
--- /pgsql/cluster/timescale_src/timescaledb/test/results/alternate_users.out	2017-04-12 18:39:04.636830527 +0000
***************
*** 70,76 ****
  \set ON_ERROR_STOP 0
  --todo fix error message here:
  SELECT * FROM "one_Partition";
! ERROR:  permission denied for relation _hyper_1_0_replica
  \set ON_ERROR_STOP 1
  CREATE TABLE "1dim"(time timestamp, temp float);
  SELECT create_hypertable('"1dim"', 'time');
--- 70,89 ----
  \set ON_ERROR_STOP 0
  --todo fix error message here:
  SELECT * FROM "one_Partition";
!      timeCustom      | device_id | series_0 | series_1 | series_2 | series_bool 
! ---------------------+-----------+----------+----------+----------+-------------
!  1257894000000000000 | dev1      |      1.5 |        1 |        2 | t
!  1257894000000000000 | dev1      |      1.5 |        2 |          | 
!  1257894000000001000 | dev1      |      2.5 |        3 |          | 
!  1257894001000000000 | dev1      |      3.5 |        4 |          | 
!  1257897600000000000 | dev1      |      4.5 |        5 |          | f
!  1257894000000000000 | dev20     |      1.5 |        1 |          | 
!  1257894002000000000 | dev1      |      2.5 |        3 |          | 
!  1257894000000000000 | dev20     |      1.5 |        2 |          | 
!  1257987600000000000 | dev1      |      1.5 |        1 |          | 
!  1257987600000000000 | dev1      |      1.5 |        2 |          | 
! (10 rows)
! 

Which seemed to imply that SELECT * FROM "one_Partition"; succeeded and did not give ERROR: permission denied for relation _hyper_1_0_replica ?
Is your diff still the same?

Is there any way you can send me your postgresql.conf or is this proprietary?

Also the output of psql -v ON_ERROR_STOP=1 -v VERBOSITY=terse -v ECHO=all -U postgres -f alternate_users.sql run from within timescaledb/test/sql could be useful.

Thanks and sorry for the hassle.
Mat

@bricklen
Copy link
Author

HI Mat, absolutely no need to apologize. I'm happy to help if I can, and I realize this debugging is taking time away from your other work. It is entirely possible that the problem is on my end, or the steps I'm running (and which user they run as).

alternate_users.out:
timescaledb_apr17_alternate_users.zip

Non-standard postgresql.conf settings:
timescaledb_postgresql_conf.zip

alternate_users.sql:
output_of_alternate_users_sql.zip

@cevian
Copy link
Contributor

cevian commented Apr 17, 2017

Ok I isolated the problem. It is the cpu_tuple_cost=0.6 setting causing rows to be returned in a slightly different order. We will work on making our regression tests more robust to such settings (just filed an issue #21 ), but in the mean time if you just change this setting back to default it should work.

Thanks again for working with us,
Mat

@bricklen
Copy link
Author

Yep, all tests pass now that that settings is back to default.
I also reran the make installcheck after replacing the _timescaledb_internal.get_general_index_definition function with my changes in the couple of places that it is defined and the tests all passed.

@cevian
Copy link
Contributor

cevian commented Apr 17, 2017

Great. So would it be easy to update this PR with the changes needed to make the tests pass?

@bricklen
Copy link
Author

Is there a way to force the existing batch of changes to get retested without having to make other changes? I can't understand why the tests failed in the first place, since I didn't touch any configs (including the cpu_tuple_cost setting).
I hope you can pardon my ignorance of these processes in git, I'm a DBA who doesn't use git much except the basics at work.

@cevian
Copy link
Contributor

cevian commented Apr 17, 2017

I just restarted the build with the same results. I'll take a closer look tonight/tomorrow morning. The best way to test a PR is to check it out locally (https://help.github.com/articles/checking-out-pull-requests-locally/) and run make installcheck on the pr exactly as it is.

@bricklen
Copy link
Author

Thanks, I'll try to test again today if I can scrape up a small window of time.

@bricklen
Copy link
Author

Okay, I have a revised function that works, passes tests, and handles mixed-case identifiers properly (table, index, and column names) and indexes that have more than then bare "INDEX" keyword (eg. UNIQUE INDEX).
Your current version of _timescaledb_internal.get_general_index_definition fails that last check.
I'll cancel these two PRs (17 and 19) and create a new one with just this changed function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants