Skip to content

Commit

Permalink
[#4415] YSQL: Use default value for existing rows after ALTER TABLE A…
Browse files Browse the repository at this point in the history
…DD COLUMN ... DEFAULT

Summary:
Presently, in YB, we disallow this operation for volatile default values:
```
yugabyte=# ALTER TABLE test ADD COLUMN y float DEFAULT random();
ERROR:  Rewriting of YB table is not yet implemented
```

Meanwhile, for non-volatile values, we do not backfill existing rows. The new column’s value for the existing rows is null. This departs from PG’s semantics and can also lead to a constraint violation if the new column has a NOT NULL constraint.
Note: For new rows, the default value is inserted correctly.
```
yugabyte=# CREATE TABLE test (t int);
CREATE TABLE
yugabyte=# INSERT INTO test VALUES (1);
INSERT 0 1
yugabyte=# ALTER TABLE test ADD COLUMN y int DEFAULT 2 NOT NULL;
ALTER TABLE
yugabyte=# SELECT * FROM test;
 t | y
---+---
 1 |     -- in PG, y would have value of 2 here. This is also a constraint violation.
(1 row)
yugabyte=# INSERT INTO test(t) VALUES (2);
INSERT 0 1
yugabyte=# SELECT * FROM test;
 t | y
---+---
 1 |      -- in PG, y would have value of 2 here
 2 | 2    -- Newly inserted rows correctly use the default value.
(2 rows)
```

This diff fixes this issue by introducing a new field (missing_val) which will be stored in the column's metadata (ColumnSchema and for packed rows in ColumnPackingData as well). We do **not** backfill the existing rows. The default value will be filled in on-the-fly if the column value is missing for a row.

Implementation details:

High-level design:

  - Evaluate the non-volatile default expression in the PG layer.
  - Augment the existing Alter Table Add Column flow to take an optional missing value, and pass it down to DocDB.
  - Store the missing value in the newly added column’s schema (`ColumnSchema`).
  - When doc_reader reads a row (using the latest schema) and encounters a missing column for the row, it will look at the column schema to see if there is a missing default value for the column. If there is, it will fill in the missing default value (instead of null as it currently does).

Regular storage:

  - For this feature, we need a way to distinguish whether the column entry is missing, or if a null was explicitly inserted by the user.
  - Presently, null entries are stored as kTombstone records. These tombstone records are compacted away after the retention history period has exceeded.
  - Therefore, for columns that have a missing value, we will store null entries as kNullLow.
  - Note: It does not matter what we store the column nulls as. All of kNullLow, kNullHigh, kTombstone are interpreted as null by docdb.
  - Although we can fill in missing values on a full compaction, it is not required for correctness, and can be handled later as an optimization. Therefore, we will NOT backfill the rows on compaction in this case. We will rely on the stored missing value in ColumnSchema instead.

Packed storage:

  - Explicitly inserted nulls are stored as columns with 0 length in packed storage format. This behavior will not change.
  - On compaction, we write the packed rows in accordance with the latest schema packing.
  - Presently, for missing values we write an explicit null. This behavior will be altered to write the missing value instead (if any).
  - Therefore, packed rows will be backfilled with the missing value on compaction.

Note: This feature is under the `Persisted` autoflag, as we are adding a new field to the `ColumnSchema` which is used in backups. The feature will only be turned on after upgrade is complete. Once enabled, it is not safe to downgrade/rollback.
Jira: DB-1463

Test Plan:
  - Imported PG test `yb_pg_fast_default`
  - Tests in `yb_alter_table`: test different default value types and expressions (UDTs, functions etc.), test explicit null values for the new column, test `SET DEFAULT`, test to verify that we don't set `attmissingval` and `atthasmissing` in YB, test default values for partitioned tables
  - `YBAddColumnDefaultBackupTest.TestYSQLDefaultMissingValues` -- verify that missing default values are backed up and restored
  - `YbAdminSnapshotScheduleTestWithYsqlParam.PgsqlDeleteColumnWithMissingDefault` -- verify that missing default values are restored after PITR
  - `PgDdlAtomicitySanityTest.DropColumnWithMissingDefaultTest` -- verify that we are able to roll back `DROP COLUMN` on a column with a missing default value
  - `PgAddColumnDefaultTest.AddColumnDefaultConcurrency ` -- verify that concurrently inserted rows use the missing default value, and that the missing default values are read even after compaction.
  - `PgAddColumnDefaultTest.AddColumnDefaultCompactionAfterUpdate` -- verify that compaction after updates on columns with missing default values works correctly.
  - `PgAddColumnDefaultTest.AddColumnDefaultCopy` -- verify that `COPY FROM` command works correctly on a table with columns that have missing default values.
  - `CDCYsqlAddColumnBeforeImageTest.TestAddColumnBeforeImage/0` -- verify that the `BeforeImage` in CDC uses missing default value

Reviewers: sergei, timur, dsrinivasan, abharadwaj

Reviewed By: sergei, dsrinivasan

Subscribers: hsunder, ycdcxcluster, ybase, yql, sergei, rthallam, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D25297
  • Loading branch information
fizaaluthra committed Aug 15, 2023
1 parent 5c74e2e commit 0aca19f
Show file tree
Hide file tree
Showing 60 changed files with 2,582 additions and 194 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -166,8 +166,8 @@ public void testDocDBMetadataFailuresForTable() throws Exception {
statement.execute("INSERT INTO ddl_ft_table VALUES (7, '8', 9.0, 10)");

expectedRows.clear();
expectedRows.add(new Row(1, "2", 3.0, null));
expectedRows.add(new Row(4, "5", 6.0, null));
expectedRows.add(new Row(1, "2", 3.0, 99));
expectedRows.add(new Row(4, "5", 6.0, 99));
expectedRows.add(new Row(7, "8", 9.0, 10));
assertRowSet(statement, selectStmt, expectedRows);

Expand Down
15 changes: 13 additions & 2 deletions src/postgres/src/backend/catalog/heap.c
Original file line number Diff line number Diff line change
Expand Up @@ -1830,8 +1830,9 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)

CatalogTupleUpdate(attr_rel, &tuple->t_self, tuple);
}
/* YB note: attmissingval is unused in YB relations. */
/* clear the missing value if any */
if (attStruct->atthasmissing)
if (!IsYBRelationById(relid) && attStruct->atthasmissing)
{
Datum valuesAtt[Natts_pg_attribute];
bool nullsAtt[Natts_pg_attribute];
Expand Down Expand Up @@ -2187,6 +2188,9 @@ heap_drop_with_catalog(Oid relid)
void
RelationClearMissing(Relation rel)
{
/* YB note: attmissingval is unused in YB relations. */
if (IsYBRelation(rel))
return;
Relation attr_rel;
Oid relid = RelationGetRelid(rel);
int natts = RelationGetNumberOfAttributes(rel);
Expand Down Expand Up @@ -2255,6 +2259,9 @@ RelationClearMissing(Relation rel)
void
SetAttrMissing(Oid relid, char *attname, char *value)
{
/* YB note: attmissingval is unused in YB relations. */
if (IsYBRelationById(relid))
return;
Datum valuesAtt[Natts_pg_attribute];
bool nullsAtt[Natts_pg_attribute];
bool replacesAtt[Natts_pg_attribute];
Expand Down Expand Up @@ -2401,7 +2408,11 @@ StoreAttrDefault(Relation rel, AttrNumber attnum,
valuesAtt[Anum_pg_attribute_atthasdef - 1] = true;
replacesAtt[Anum_pg_attribute_atthasdef - 1] = true;

if (add_column_mode)
/*
* YB note: attmissingval is unused in YB relations - the missing value
* is stored in the DocDB schema instead.
*/
if (!IsYBRelation(rel) && add_column_mode)
{
expr2 = expression_planner(expr2);
estate = CreateExecutorState();
Expand Down
32 changes: 31 additions & 1 deletion src/postgres/src/backend/commands/ybccmds.c
Original file line number Diff line number Diff line change
Expand Up @@ -59,10 +59,13 @@
#include "pg_yb_utils.h"

#include "access/nbtree.h"
#include "catalog/heap.h"
#include "commands/defrem.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/planner.h"
#include "parser/parser.h"
#include "parser/parse_coerce.h"
#include "parser/parse_relation.h"
#include "parser/parse_type.h"
#include "parser/parse_utilcmd.h"

Expand Down Expand Up @@ -1089,10 +1092,37 @@ YBCPrepareAlterTableCmd(AlterTableCmd* cmd, Relation rel, List *handles,
Assert(list_length(handles) == 1);
YBCPgStatement add_col_handle =
(YBCPgStatement) lfirst(list_head(handles));

YBCPgExpr res = NULL;
if (colDef->raw_default && yb_enable_add_column_missing_default)
{
ParseState *pstate = make_parsestate(NULL);
pstate->p_sourcetext = NULL;
RangeTblEntry *rte = addRangeTableEntryForRelation(pstate,
rel,
NULL,
false,
true);
addRTEtoQuery(pstate, rte, true, true, true);
Expr *expr = (Expr *) cookDefault(pstate, colDef->raw_default,
typeOid, typmod,
colDef->colname);
expr = expression_planner(expr);
EState *estate = CreateExecutorState();
ExprState *exprState = ExecPrepareExpr(expr, estate);
ExprContext *econtext = GetPerTupleExprContext(estate);
bool missingIsNull;
Datum missingval = ExecEvalExpr(exprState, econtext,
&missingIsNull);
res = YBCNewConstant(add_col_handle, typeOid, colDef->collOid,
missingval, missingIsNull);
FreeExecutorState(estate);
}
HandleYBStatus(YBCPgAlterTableAddColumn(add_col_handle,
colDef->colname,
order,
col_type));
col_type,
res));
++(*col);
*needsYBAlter = true;

Expand Down
6 changes: 4 additions & 2 deletions src/postgres/src/backend/utils/cache/relcache.c
Original file line number Diff line number Diff line change
Expand Up @@ -715,8 +715,9 @@ RelationBuildTupleDesc(Relation relation)
ndef++;
}

/* YB note: attmissingval is unused in YB relations. */
/* Likewise for a missing value */
if (attp->atthasmissing)
if (!IsYBRelation(relation) && attp->atthasmissing)
{
Datum missingval;
bool missingNull;
Expand Down Expand Up @@ -1778,8 +1779,9 @@ YbApplyAttr(YbAttrProcessorState *state, Relation attrel, HeapTuple htup)
++processing->ndef;
}

/* YB note: attmissingval is unused in YB relations. */
/* Likewise for a missing value */
if (attp->atthasmissing)
if (!IsYBRelation(relation) && attp->atthasmissing)
{
bool missingNull;

Expand Down
11 changes: 11 additions & 0 deletions src/postgres/src/backend/utils/misc/guc.c
Original file line number Diff line number Diff line change
Expand Up @@ -2310,6 +2310,17 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},

{
{"yb_enable_add_column_missing_default", PGC_USERSET, CUSTOM_OPTIONS,
gettext_noop("Enable using the default value for existing rows"
" after an ADD COLUMN ... DEFAULT operation."),
NULL
},
&yb_enable_add_column_missing_default,
true,
NULL, NULL, NULL
},

/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL
Expand Down
Loading

0 comments on commit 0aca19f

Please sign in to comment.