
Merge branch 'master' into support_change_charset_utf8
winkyao authored Oct 24, 2018
2 parents e882210 + d65ce86 commit c0ec35d
Showing 66 changed files with 1,716 additions and 836 deletions.
33 changes: 32 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,37 @@
# TiDB Changelog
All notable changes to this project will be documented in this file. See also [Release Notes](https://github.com/pingcap/docs/blob/master/releases/rn.md), [TiKV Changelog](https://github.com/tikv/tikv/blob/master/CHANGELOG.md) and [PD Changelog](https://github.com/pingcap/pd/blob/master/CHANGELOG.md).

## [2.1.0-rc.4] - 2018-10-23
### SQL Optimizer
* Fix the issue that column pruning of `UnionAll` is incorrect in some cases [#7941](https://github.com/pingcap/tidb/pull/7941)
* Fix the issue that the result of the `UnionAll` operator is incorrect in some cases [#8007](https://github.com/pingcap/tidb/pull/8007)
### SQL Execution Engine
* Fix the precision issue of the `AVG` function [#7874](https://github.com/pingcap/tidb/pull/7874)
* Support using the `EXPLAIN ANALYZE` statement to check runtime statistics, including the execution time and the number of returned rows of each operator during query execution [#7925](https://github.com/pingcap/tidb/pull/7925)
* Fix the panic issue of the `PointGet` operator when a column of a table appears multiple times in the result set [#7943](https://github.com/pingcap/tidb/pull/7943)
* Fix the panic issue caused by too large values in the `Limit` subclause [#8002](https://github.com/pingcap/tidb/pull/8002)
* Fix the panic issue during the execution process of the `AddDate`/`SubDate` statement in some cases [#8009](https://github.com/pingcap/tidb/pull/8009)
### Statistics
* Fix the issue that the prefix of the histogram lower bound of a combined index is wrongly judged as out of range [#7856](https://github.com/pingcap/tidb/pull/7856)
* Fix the memory leak issue caused by statistics collecting [#7873](https://github.com/pingcap/tidb/pull/7873)
* Fix the panic issue when the histogram is empty [#7928](https://github.com/pingcap/tidb/pull/7928)
* Fix the issue that the histogram bound is out of range when the statistics are being uploaded [#7944](https://github.com/pingcap/tidb/pull/7944)
* Limit the maximum length of values in the statistics sampling process [#7982](https://github.com/pingcap/tidb/pull/7982)
### Server
* Refactor Latch to avoid misjudgment of transaction conflicts and improve the execution performance of concurrent transactions [#7711](https://github.com/pingcap/tidb/pull/7711)
* Fix the panic issue caused by collecting slow queries in some cases [#7847](https://github.com/pingcap/tidb/pull/7847)
* Fix the panic issue when `ESCAPED BY` is an empty string in the `LOAD DATA` statement [#8005](https://github.com/pingcap/tidb/pull/8005)
* Complete the “coprocessor error” log information [#8006](https://github.com/pingcap/tidb/pull/8006)
### Compatibility
* Set the `Command` field of the `SHOW PROCESSLIST` result to `Sleep` when the query is empty [#7839](https://github.com/pingcap/tidb/pull/7839)
### Expressions
* Fix the constant folding issue of the `SYSDATE` function [#7895](https://github.com/pingcap/tidb/pull/7895)
* Fix the issue that `SUBSTRING_INDEX` panics in some cases [#7897](https://github.com/pingcap/tidb/pull/7897)
### DDL
* Fix the stack overflow issue caused by throwing the `invalid ddl job type` error [#7958](https://github.com/pingcap/tidb/pull/7958)
* Fix the issue that the result of `ADMIN CHECK TABLE` is incorrect in some cases [#7975](https://github.com/pingcap/tidb/pull/7975)


## [2.1.0-rc.2] - 2018-09-14
### SQL Optimizer
* Put forward a proposal of the next generation Planner [#7543](https://github.com/pingcap/tidb/pull/7543)
@@ -15,7 +46,7 @@ All notable changes to this project will be documented in this file. See also [R
* Optimize the performance of Hash aggregate operators [#7541](https://github.com/pingcap/tidb/pull/7541)
* Optimize the performance of Join operators [#7493](https://github.com/pingcap/tidb/pull/7493), [#7433](https://github.com/pingcap/tidb/pull/7433)
* Fix the issue that the result of `UPDATE JOIN` is incorrect when the Join order is changed [#7571](https://github.com/pingcap/tidb/pull/7571)
* Improve the performance of Chunk’s iterator [#7585](https://github.com/pingcap/tidb/pull/7585)
### Statistics
* Fix the issue that the auto Analyze worker repeatedly analyzes the statistics [#7550](https://github.com/pingcap/tidb/pull/7550)
* Fix the statistics update error that occurs when there is no statistics change [#7530](https://github.com/pingcap/tidb/pull/7530)
2 changes: 1 addition & 1 deletion ast/ast.go
@@ -19,7 +19,7 @@ import (
"io"

"github.com/pingcap/tidb/model"
"github.com/pingcap/tidb/types"
"github.com/pingcap/tidb/parser/types"
)

// Node is the basic element of the AST.
2 changes: 1 addition & 1 deletion ast/base.go
@@ -13,7 +13,7 @@

package ast

import "github.com/pingcap/tidb/types"
import "github.com/pingcap/tidb/parser/types"

// node is the struct implements node interface except for Accept method.
// Node implementations should embed it in.
2 changes: 1 addition & 1 deletion ast/ddl.go
@@ -15,7 +15,7 @@ package ast

import (
"github.com/pingcap/tidb/model"
"github.com/pingcap/tidb/types"
"github.com/pingcap/tidb/parser/types"
)

var (
2 changes: 1 addition & 1 deletion ast/functions.go
@@ -18,7 +18,7 @@ import (
"io"

"github.com/pingcap/tidb/model"
"github.com/pingcap/tidb/types"
"github.com/pingcap/tidb/parser/types"
)

var (
9 changes: 9 additions & 0 deletions ast/misc.go
@@ -868,3 +868,12 @@ func (n *TableOptimizerHint) Accept(v Visitor) (Node, bool) {
	n = newNode.(*TableOptimizerHint)
	return v.Leave(n)
}

// NewDecimal creates a types.Decimal value, it's provided by parser driver.
var NewDecimal func(string) (interface{}, error)

// NewHexLiteral creates a types.HexLiteral value, it's provided by parser driver.
var NewHexLiteral func(string) (interface{}, error)

// NewBitLiteral creates a types.BitLiteral value, it's provided by parser driver.
var NewBitLiteral func(string) (interface{}, error)
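
These three hooks let the parser construct literal values without importing the driver's concrete types; a parser driver assigns them at initialization. The following is a hedged sketch of how a hypothetical driver package might register `NewDecimal`; the `myDecimal` type and the float64-based parsing are placeholders, not the real TiDB parser driver:

```go
package driver

import (
	"strconv"

	"github.com/pingcap/tidb/ast"
)

// myDecimal is a placeholder for the driver's own decimal type.
type myDecimal struct{ val float64 }

func init() {
	// Install the constructor so the parser can build decimal literals
	// through ast.NewDecimal without depending on this package's types.
	ast.NewDecimal = func(s string) (interface{}, error) {
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		return &myDecimal{val: f}, nil
	}
}
```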
10 changes: 5 additions & 5 deletions cmd/explaintest/r/explain_complex_stats.result
@@ -158,11 +158,11 @@ Projection_5 39.28 root test.st.cm, test.st.p1, test.st.p2, test.st.p3, test.st.
└─TableScan_14 160.23 cop table:st, keep order:false
explain select dt.id as id, dt.aid as aid, dt.pt as pt, dt.dic as dic, dt.cm as cm, rr.gid as gid, rr.acd as acd, rr.t as t,dt.p1 as p1, dt.p2 as p2, dt.p3 as p3, dt.p4 as p4, dt.p5 as p5, dt.p6_md5 as p6, dt.p7_md5 as p7 from dt dt join rr rr on (rr.pt = 'ios' and rr.t > 1478185592 and dt.aid = rr.aid and dt.dic = rr.dic) where dt.pt = 'ios' and dt.t > 1478185592 and dt.bm = 0 limit 2000;
id count task operator info
Projection_9 428.55 root dt.id, dt.aid, dt.pt, dt.dic, dt.cm, rr.gid, rr.acd, rr.t, dt.p1, dt.p2, dt.p3, dt.p4, dt.p5, dt.p6_md5, dt.p7_md5
└─Limit_12 428.55 root offset:0, count:2000
└─IndexJoin_18 428.55 root inner join, inner:IndexLookUp_17, outer key:dt.aid, dt.dic, inner key:rr.aid, rr.dic
├─TableReader_42 428.55 root data:Selection_41
│ └─Selection_41 428.55 cop eq(dt.bm, 0), eq(dt.pt, "ios"), gt(dt.t, 1478185592)
Projection_9 428.32 root dt.id, dt.aid, dt.pt, dt.dic, dt.cm, rr.gid, rr.acd, rr.t, dt.p1, dt.p2, dt.p3, dt.p4, dt.p5, dt.p6_md5, dt.p7_md5
└─Limit_12 428.32 root offset:0, count:2000
└─IndexJoin_18 428.32 root inner join, inner:IndexLookUp_17, outer key:dt.aid, dt.dic, inner key:rr.aid, rr.dic
├─TableReader_42 428.32 root data:Selection_41
│ └─Selection_41 428.32 cop eq(dt.bm, 0), eq(dt.pt, "ios"), gt(dt.t, 1478185592)
│ └─TableScan_40 2000.00 cop table:dt, range:[0,+inf], keep order:false
└─IndexLookUp_17 970.00 root
├─IndexScan_14 1.00 cop table:rr, index:aid, dic, range: decided by [dt.aid dt.dic], keep order:false
6 changes: 3 additions & 3 deletions cmd/explaintest/r/explain_easy_stats.result
@@ -47,10 +47,10 @@ explain select * from t1 left join t2 on t1.c2 = t2.c1 where t1.c1 > 1;
id count task operator info
Projection_6 2481.25 root test.t1.c1, test.t1.c2, test.t1.c3, test.t2.c1, test.t2.c2
└─MergeJoin_7 2481.25 root left outer join, left key:test.t1.c2, right key:test.t2.c1
├─IndexLookUp_17 1999.00 root
│ ├─Selection_16 1999.00 cop gt(test.t1.c1, 1)
├─IndexLookUp_17 1998.00 root
│ ├─Selection_16 1998.00 cop gt(test.t1.c1, 1)
│ │ └─IndexScan_14 1999.00 cop table:t1, index:c2, range:[NULL,+inf], keep order:true
│ └─TableScan_15 1999.00 cop table:t1, keep order:false
│ └─TableScan_15 1998.00 cop table:t1, keep order:false
└─IndexLookUp_21 1985.00 root
├─IndexScan_19 1985.00 cop table:t2, index:c1, range:[NULL,+inf], keep order:true
└─TableScan_20 1985.00 cop table:t2, keep order:false
20 changes: 10 additions & 10 deletions cmd/explaintest/r/tpch.result
@@ -251,7 +251,7 @@ limit 10;
id count task operator info
Projection_14 10.00 root tpch.lineitem.l_orderkey, 7_col_0, tpch.orders.o_orderdate, tpch.orders.o_shippriority
└─TopN_17 10.00 root 7_col_0:desc, tpch.orders.o_orderdate:asc, offset:0, count:10
└─HashAgg_20 40256361.71 root group by:tpch.lineitem.l_orderkey, tpch.orders.o_orderdate, tpch.orders.o_shippriority, funcs:sum(mul(tpch.lineitem.l_extendedprice, minus(1, tpch.lineitem.l_discount))), firstrow(tpch.orders.o_orderdate), firstrow(tpch.orders.o_shippriority), firstrow(tpch.lineitem.l_orderkey)
└─HashAgg_20 40227041.09 root group by:tpch.lineitem.l_orderkey, tpch.orders.o_orderdate, tpch.orders.o_shippriority, funcs:sum(mul(tpch.lineitem.l_extendedprice, minus(1, tpch.lineitem.l_discount))), firstrow(tpch.orders.o_orderdate), firstrow(tpch.orders.o_shippriority), firstrow(tpch.lineitem.l_orderkey)
└─IndexJoin_26 91515927.49 root inner join, inner:IndexLookUp_25, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey
├─HashRightJoin_46 22592975.51 root inner join, inner:TableReader_52, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)]
│ ├─TableReader_52 1498236.00 root data:Selection_51
@@ -260,9 +260,9 @@ Projection_14 10.00 root tpch.lineitem.l_orderkey, 7_col_0, tpch.orders.o_orderd
│ └─TableReader_49 36870000.00 root data:Selection_48
│ └─Selection_48 36870000.00 cop lt(tpch.orders.o_orderdate, 1995-03-13 00:00:00.000000)
│ └─TableScan_47 75000000.00 cop table:orders, range:[-inf,+inf], keep order:false
└─IndexLookUp_25 163063881.42 root
└─IndexLookUp_25 162945114.27 root
├─IndexScan_22 1.00 cop table:lineitem, index:L_ORDERKEY, L_LINENUMBER, range: decided by [tpch.orders.o_orderkey], keep order:false
└─Selection_24 163063881.42 cop gt(tpch.lineitem.l_shipdate, 1995-03-13 00:00:00.000000)
└─Selection_24 162945114.27 cop gt(tpch.lineitem.l_shipdate, 1995-03-13 00:00:00.000000)
└─TableScan_23 1.00 cop table:lineitem, keep order:false
/*
Q4 Order Priority Checking Query
@@ -922,13 +922,13 @@ p_brand,
p_type,
p_size;
id count task operator info
Sort_13 15.00 root supplier_cnt:desc, tpch.part.p_brand:asc, tpch.part.p_type:asc, tpch.part.p_size:asc
└─Projection_14 15.00 root tpch.part.p_brand, tpch.part.p_type, tpch.part.p_size, 9_col_0
└─HashAgg_17 15.00 root group by:tpch.part.p_brand, tpch.part.p_size, tpch.part.p_type, funcs:count(distinct tpch.partsupp.ps_suppkey), firstrow(tpch.part.p_brand), firstrow(tpch.part.p_type), firstrow(tpch.part.p_size)
└─HashLeftJoin_22 4022816.68 root anti semi join, inner:TableReader_46, equal:[eq(tpch.partsupp.ps_suppkey, tpch.supplier.s_suppkey)]
├─IndexJoin_26 5028520.85 root inner join, inner:IndexReader_25, outer key:tpch.part.p_partkey, inner key:tpch.partsupp.ps_partkey
│ ├─TableReader_41 1249969.60 root data:Selection_40
│ │ └─Selection_40 1249969.60 cop in(tpch.part.p_size, 48, 19, 12, 4, 41, 7, 21, 39), ne(tpch.part.p_brand, "Brand#34"), not(like(tpch.part.p_type, "LARGE BRUSHED%", 92))
Sort_13 14.41 root supplier_cnt:desc, tpch.part.p_brand:asc, tpch.part.p_type:asc, tpch.part.p_size:asc
└─Projection_14 14.41 root tpch.part.p_brand, tpch.part.p_type, tpch.part.p_size, 9_col_0
└─HashAgg_17 14.41 root group by:tpch.part.p_brand, tpch.part.p_size, tpch.part.p_type, funcs:count(distinct tpch.partsupp.ps_suppkey), firstrow(tpch.part.p_brand), firstrow(tpch.part.p_type), firstrow(tpch.part.p_size)
└─HashLeftJoin_22 3863988.24 root anti semi join, inner:TableReader_46, equal:[eq(tpch.partsupp.ps_suppkey, tpch.supplier.s_suppkey)]
├─IndexJoin_26 4829985.30 root inner join, inner:IndexReader_25, outer key:tpch.part.p_partkey, inner key:tpch.partsupp.ps_partkey
│ ├─TableReader_41 1200618.43 root data:Selection_40
│ │ └─Selection_40 1200618.43 cop in(tpch.part.p_size, 48, 19, 12, 4, 41, 7, 21, 39), ne(tpch.part.p_brand, "Brand#34"), not(like(tpch.part.p_type, "LARGE BRUSHED%", 92))
│ │ └─TableScan_39 10000000.00 cop table:part, range:[-inf,+inf], keep order:false
│ └─IndexReader_25 1.00 root index:IndexScan_24
│ └─IndexScan_24 1.00 cop table:partsupp, index:PS_PARTKEY, PS_SUPPKEY, range: decided by [tpch.part.p_partkey], keep order:false
27 changes: 27 additions & 0 deletions ddl/db_test.go
@@ -3022,6 +3022,33 @@ func (s *testDBSuite) TestTruncatePartitionAndDropTable(c *C) {
	hasOldPartitionData = checkPartitionDelRangeDone(c, s, partitionPrefix)
	c.Assert(hasOldPartitionData, IsFalse)
	s.testErrorCode(c, "select * from t4;", tmysql.ErrNoSuchTable)

	// Test that truncating a partitioned table reassigns new partition IDs.
	s.tk.MustExec("drop table if exists t5;")
	s.tk.MustExec("set @@session.tidb_enable_table_partition=1;")
	s.tk.MustExec(`create table t5(
		id int, name varchar(50),
		purchased date
	)
	partition by range( year(purchased) ) (
		partition p0 values less than (1990),
		partition p1 values less than (1995),
		partition p2 values less than (2000),
		partition p3 values less than (2005),
		partition p4 values less than (2010),
		partition p5 values less than (2015)
	);`)
	is = domain.GetDomain(ctx).InfoSchema()
	oldTblInfo, err = is.TableByName(model.NewCIStr("test"), model.NewCIStr("t5"))
	c.Assert(err, IsNil)
	oldPID = oldTblInfo.Meta().Partition.Definitions[0].ID

	s.tk.MustExec("truncate table t5;")
	is = domain.GetDomain(ctx).InfoSchema()
	newTblInfo, err := is.TableByName(model.NewCIStr("test"), model.NewCIStr("t5"))
	c.Assert(err, IsNil)
	newPID := newTblInfo.Meta().Partition.Definitions[0].ID
	c.Assert(oldPID != newPID, IsTrue)
}

func (s *testDBSuite) TestPartitionUniqueKeyNeedAllFieldsInPf(c *C) {
21 changes: 21 additions & 0 deletions ddl/partition.go
@@ -417,3 +417,24 @@ func isRangePartitionColUnsignedBigint(cols []*table.Column, pi *model.Partition
	}
	return false
}

// truncateTableByReassignPartitionIDs reassigns new partition IDs for all partitions of the table.
func truncateTableByReassignPartitionIDs(job *model.Job, t *meta.Meta, tblInfo *model.TableInfo) error {
	newDefs := make([]model.PartitionDefinition, 0, len(tblInfo.Partition.Definitions))
	for _, def := range tblInfo.Partition.Definitions {
		pid, err := t.GenGlobalID()
		if err != nil {
			job.State = model.JobStateCancelled
			return errors.Trace(err)
		}
		newDef := model.PartitionDefinition{
			ID:       pid,
			Name:     def.Name,
			LessThan: def.LessThan,
			Comment:  def.Comment,
		}
		newDefs = append(newDefs, newDef)
	}
	tblInfo.Partition.Definitions = newDefs
	return nil
}
13 changes: 4 additions & 9 deletions ddl/table.go
@@ -209,18 +209,13 @@ func onTruncateTable(d *ddlCtx, t *meta.Meta, job *model.Job) (ver int64, _ erro
		return ver, errors.Trace(err)
	}

	// We use the new partition ID because all the old data is encoded with the old partition ID, it can not be accessed anymore.
	var oldPartitionIDs []int64
	if tblInfo.GetPartitionInfo() != nil {
		oldPartitionIDs = getPartitionIDs(tblInfo)
		for _, def := range tblInfo.Partition.Definitions {
			var pid int64
			pid, err = t.GenGlobalID()
			if err != nil {
				job.State = model.JobStateCancelled
				return ver, errors.Trace(err)
			}
			def.ID = pid
		// We use the new partition ID because all the old data is encoded with the old partition ID, it can not be accessed anymore.
		err = truncateTableByReassignPartitionIDs(job, t, tblInfo)
		if err != nil {
			return ver, errors.Trace(err)
		}
	}

74 changes: 74 additions & 0 deletions docs/design/2018-10-22-the-column-pool.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
# Proposal: Support a Global Column Pool

- Author(s): [zz-jason](https://github.com/zz-jason)
- Last updated: 2018-10-22
- Discussion at:

## Abstract

This proposal aims to enable a fine-grained buffer reuse strategy. With the
help of the proposed column pool, we can:

1. Reduce the total memory consumption during the execution phase.
2. Support column-oriented expression evaluation and leverage the power of
vectorized expression execution.

## Background

At present, buffer reuse happens at the granularity of a Chunk. The
disadvantages of this **Chunk**-oriented reuse strategy are:

1. This buffer reuse strategy isn't effective in some scenarios. Memory held
   by inactive operators could also be reused to reduce the total memory
   consumption when executing a query.

2. In order to reuse a Chunk, we have to design a resource recycling strategy
   for every operator that uses multiple goroutines to exploit thread-level
   parallelism, for example, hash join. This makes the code more complicated
   and harder to maintain.

3. The memory used by the current query cannot be reused by the next query
   within the same session. Thus the Go GC pressure increases and the OLTP
   performance of a TiDB server is impacted.

## Proposal

The main idea of this proposal is to change the **Chunk**-oriented buffer reuse
strategy to **Column**-oriented and use a session-level column pool to achieve
that.

### The column pool

Considering that the current TiDB only supports a limited number of types, we
only need to consider 5 kinds of columns:

- 4 fixed-length columns, with element sizes of 4/8/16/40 bytes
- 1 variable-length column

The column pool can be accessed by multiple goroutines concurrently. To reduce
lock contention, we can split the pool into shards and randomize the target
shard of each Put/Get operation. Each shard of the column pool is implemented
as a last-in-first-out stack, which helps release unused columns from the
pool; in most cases, the required length of a new column equals that of the
previous column put into the pool. A minimal sketch of this layout follows the
figure below.

![the column pool](./the-column-pool.png)
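
Below is a hedged Go sketch of the pool described above. The `column` and
`columnPool` types, the shard count, and the size-matching policy are
illustrative assumptions, not the actual TiDB implementation:

```go
package pool

import (
	"math/rand"
	"sync"
)

// column is a stand-in for a chunk column; typeSize is the element
// size in bytes (4/8/16/40), or -1 for a variable-length column.
type column struct {
	typeSize int
	data     []byte
}

const numShards = 8

// shard is a last-in-first-out stack of reusable columns.
type shard struct {
	mu      sync.Mutex
	columns []*column
}

// columnPool keeps one stack per shard to reduce lock contention.
type columnPool struct {
	shards [numShards]shard
}

// Put returns a column to a randomly chosen shard.
func (p *columnPool) Put(col *column) {
	s := &p.shards[rand.Intn(numShards)]
	s.mu.Lock()
	s.columns = append(s.columns, col)
	s.mu.Unlock()
}

// Get pops the most recently released column of the wanted size from a
// random shard, falling back to a fresh allocation when none matches.
func (p *columnPool) Get(typeSize int) *column {
	s := &p.shards[rand.Intn(numShards)]
	s.mu.Lock()
	defer s.mu.Unlock()
	for i := len(s.columns) - 1; i >= 0; i-- {
		if s.columns[i].typeSize == typeSize {
			col := s.columns[i]
			s.columns = append(s.columns[:i], s.columns[i+1:]...)
			return col
		}
	}
	return &column{typeSize: typeSize}
}
```

Because each shard is a stack, `Get` inspects the most recently released
columns first, which is where a matching capacity is most likely to be found.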

## Rationale

No

## Compatibility

No

## Implementation

1. Implement the column pool first.
2. Remove the complicated, error-prone resource recycling strategy from each
   operator.
3. Replace each **NewChunkWithCapacity**() call with **pool.GetChunk**(), as
   sketched below.
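
As a hedged illustration of step 3, the call-site change might look like the
following, reusing the hypothetical pool above; `newResultChunk` and
`elemSizes` are illustrative stand-ins, not the real `pool.GetChunk` API:

```go
// newResultChunk sketches how an executor would obtain its result
// columns from the session-level pool instead of allocating fresh
// ones on every query (previously done via NewChunkWithCapacity).
func newResultChunk(sessPool *columnPool, elemSizes []int) []*column {
	cols := make([]*column, 0, len(elemSizes))
	for _, size := range elemSizes {
		cols = append(cols, sessPool.Get(size)) // reuse instead of allocate
	}
	return cols
}
```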

## Open issues (if applicable)

No
1 change: 1 addition & 0 deletions docs/design/README.md
@@ -12,6 +12,7 @@ A proposal template: [TEMPLATE.md](./TEMPLATE.md)
- [Proposal: Enhance constraint propagation in TiDB logical plan](./2018-07-22-enhance-propagations.md)
- [Proposal: A SQL Planner based on the Volcano/Cascades model](./2018-08-29-new-planner.md)
- [Proposal: Implement Radix Hash Join](./2018-09-21-radix-hashjoin.md)
- [Proposal: Support a Global Column Pool](./2018-10-22-the-column-pool.md)

## Completed

Binary file added docs/design/the-column-pool.png
