-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: make explain
support explain anaylze
#7827
Conversation
/run-all-tests |
@shenli I think it's a different perspective of same question.
It's in Proposal, and need more disscuss, maybe finally we can let command's stats datasource be tracing-data too. 😆 |
@lysu we don't need to bind different |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, LGTM up till now.
Will there be that? |
Do we need to record the |
Deeper call stack I guess. |
yes~it will get deeper call stack, and current modification way is change and |
@lysu both ways need to make modification point to every operator. |
@zz-jason yes, it need change operator to support cutting point, but "track logic" isn't distribute to every operator just in one place at
and add new operator that follow current idiom will no need take care about "track logic" any more. metric is classical crosscutting logic just like https://en.wikipedia.org/wiki/Aspect-oriented_programming or golang's but performance is question, I will change it to if-way later 😆 |
Will there be one test that test its explain result? |
@winoros |
/run-all-tests |
util/execdetails/execdetails.go
Outdated
type ExecStats map[string]*ExecStat | ||
|
||
// ExecStat collects one executor's execution info. | ||
type ExecStat struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- How about:
s/ExecStat/RuntimeStats/
- Please add comments for each struct field.
util/execdetails/execdetails.go
Outdated
} | ||
|
||
// NewExecutorStats creates new executor collector. | ||
func NewExecutorStats() ExecStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/NewExecutorStats/NewRuntimeStats/
util/execdetails/execdetails.go
Outdated
@@ -52,3 +54,55 @@ func (d ExecDetails) String() string { | |||
} | |||
return strings.Join(parts, " ") | |||
} | |||
|
|||
// ExecStats collects executors's execution info. | |||
type ExecStats map[string]*ExecStat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a map to store the runtime statistics of all the executors? Maybe its simpler to make each executor to hold a ExecStat
object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I want to get RuntimeStat
from a PhysicalPlan
, so this map will give ExplainID to RuntimeStat
mapping. or do we have better way to get Executor
from an PhysicalPlan
~? 🐱
And it seem some physical is "1 to N" relationship with executor. (e.g. IndexLookUpExecutor)
Except that map lookup also easy to combine coprocessor's RuntimeStats in furture~?
executor/update.go
Outdated
@@ -112,6 +114,12 @@ func (e *UpdateExec) canNotUpdate(handle types.Datum) bool { | |||
|
|||
// Next implements the Executor Next interface. | |||
func (e *UpdateExec) Next(ctx context.Context, chk *chunk.Chunk) error { | |||
if e.execStat != nil { | |||
start := time.Now() | |||
defer func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be simplified to:
defer e.execStat.Record(time.Now().Sub(start), chk.NumRows())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does update
statement support the explain
statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update
support it and will execute selectExec
part and not modify records...but this line is brainless added code that can be removed too... 😹
executor/simple.go
Outdated
@@ -49,6 +50,12 @@ type SimpleExec struct { | |||
|
|||
// Next implements the Executor Next interface. | |||
func (e *SimpleExec) Next(ctx context.Context, chk *chunk.Chunk) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of the statements executed by SimpleExec
can not be explained.
executor/show.go
Outdated
@@ -64,6 +64,12 @@ type ShowExec struct { | |||
|
|||
// Next implements the Executor Next interface. | |||
func (e *ShowExec) Next(ctx context.Context, chk *chunk.Chunk) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the show
statement can not be explained.
executor/set.go
Outdated
@@ -43,6 +43,12 @@ type SetExecutor struct { | |||
|
|||
// Next implements the Executor Next interface. | |||
func (e *SetExecutor) Next(ctx context.Context, chk *chunk.Chunk) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
executor/aggregate.go
Outdated
@@ -501,6 +502,10 @@ func (w *HashAggFinalWorker) run(ctx sessionctx.Context, waitGroup *sync.WaitGro | |||
|
|||
// Next implements the Executor Next interface. | |||
func (e *HashAggExec) Next(ctx context.Context, chk *chunk.Chunk) error { | |||
if e.runtimeStat != nil { | |||
start := time.Now() | |||
defer e.runtimeStat.Record(time.Now().Sub(start), chk.NumRows()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defer function input parameter will be calculated in defer define.
func main() {
foo()
foo1()
}
func foo() {
start := time.Now()
defer func() {
fmt.Println(time.Since(start))
}()
time.Sleep(1 * time.Second)
}
func foo1() {
start := time.Now()
defer fmt.Println(time.Since(start))
time.Sleep(1 * time.Second)
}
output:
1.001687191s
319ns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emmm.....that's right..@zz-jason u mislead me.:rofl:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂I'm sorry
executor/revoke.go
Outdated
@@ -15,7 +15,6 @@ package executor | |||
|
|||
import ( | |||
"fmt" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this modification caused by gofmt?
executor/prepared_test.go
Outdated
@@ -178,11 +178,6 @@ func (s *testSuite) TestPrepared(c *C) { | |||
_, _, fields, err = tk.Se.PrepareStmt("update prepare3 set a = ?") | |||
c.Assert(err, IsNil) | |||
c.Assert(len(fields), Equals, 0) | |||
|
|||
// Coverage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this test?
executor/load_stats.go
Outdated
@@ -15,7 +15,6 @@ package executor | |||
|
|||
import ( | |||
"encoding/json" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this modification caused by gofmt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
it seems that there is only one LGTM? |
Seems @eurekaka send another LGTM. |
case ast.ExplainFormatROW: | ||
retFields := []string{"id", "count", "task", "operator info"} | ||
if explain.Analyze { | ||
retFields = append(retFields, "execution_info") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/execution_info/execution info/
row := []string{e.prettyIdentifier(p.ExplainID(), indent, isLastChild), count, TaskType, operatorInfo} | ||
row := []string{e.prettyIdentifier(p.ExplainID(), indent, isLastChild), count, taskType, operatorInfo} | ||
if e.Analyze { | ||
runtimeStat := e.ctx.GetSessionVars().StmtCtx.RuntimeStats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/runtimeStat/runtimeStats/
@@ -130,6 +134,7 @@ func newBaseExecutor(ctx sessionctx.Context, schema *expression.Schema, id strin | |||
schema: schema, | |||
initCap: ctx.GetSessionVars().MaxChunkSize, | |||
maxChunkSize: ctx.GetSessionVars().MaxChunkSize, | |||
runtimeStat: ctx.GetSessionVars().StmtCtx.RuntimeStats.GetRuntimeStat(id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.runtimeStat
will always be set, no matter whether it is in the explain analyze
statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx.GetSessionVars().StmtCtx.RuntimeStats == nil
so GetRuntimeStat(id)
will quick return nil, so I only +1 a set nil
, but look more uniform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's doesn't +1 memory set, because it's in struct initializer
@@ -75,6 +78,7 @@ type baseExecutor struct { | |||
maxChunkSize int | |||
children []Executor | |||
retFieldTypes []*types.FieldType | |||
runtimeStat *execdetails.RuntimeStat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Stat/Stats/
stats is short for statistics in our codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but runtimeStats is collection of runtimeStat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about s/RuntimeStat/ExecutorStats/
and keep RuntimeStats
a collection of the ExecutorStats
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels stranger 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then how about s/RuntimeStats/RuntimeStatsColl/
and s/RuntimeStat/RuntimeStats/
?
@@ -659,6 +659,33 @@ func (b *executorBuilder) buildTrace(v *plannercore.Trace) Executor { | |||
|
|||
// buildExplain builds a explain executor. `e.rows` collects final result to `ExplainExec`. | |||
func (b *executorBuilder) buildExplain(v *plannercore.Explain) Executor { | |||
if v.Analyze { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move this to the function body of ExplainExec.Next()
.
@zz-jason now issue another PR to fix this? |
@lysu OK |
if e == nil { | ||
return "" | ||
} | ||
return fmt.Sprintf("time:%f, loops:%d, rows:%d", time.Duration(e.consume).Seconds()*1e3, e.loop, e.rows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to specify the time unit in the string message. for example: "time:%fms, loops:%d, rows:%d"
return runtimecStat | ||
} | ||
|
||
func (e RuntimeStats) String() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method can be removed? BTW, the result is not stable, because the it is composed by a range
operation on a map
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's current no use and can be remove
@@ -453,6 +458,7 @@ func (e *IndexLookUpExecutor) startIndexWorker(ctx context.Context, kvRanges []k | |||
func (e *IndexLookUpExecutor) startTableWorker(ctx context.Context, workCh <-chan *lookupTableTask) { | |||
lookupConcurrencyLimit := e.ctx.GetSessionVars().IndexLookupConcurrency | |||
e.tblWorkerWg.Add(lookupConcurrencyLimit) | |||
e.baseExecutor.ctx.GetSessionVars().StmtCtx.RuntimeStats.GetRuntimeStat(e.id + "_tableReader") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this collected stats is never used by the ExplainExec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.id + "_tableReader"
question~~?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you only retrieve the runtime statistics of an operator by the operator id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this is by design, GetRuntimeStat
will pre-init e.id + "_tableReader"
slot in RuntimeStatsColl, after this call will fork goroutine create executors, this pre-init prevent race condition for modify RuntimeStatsColl map, so RuntimeStatsColl
live without mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to collect the runtime statistics of the table worker of the IndexLookupExecutor, because it is never used and presented to the user in the ExplainExec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, _tableReader
can be removed~
another thing is, for index join, its inner child will be built again and again according to the outer join keys. we need to collect all the runtime statistics of the built inner child and aggregate them to only one result to present to the user. |
yes~ if inner child is unqiue plan id (e.g. |
} | ||
|
||
// GetRuntimeStat gets execStat for a executor. | ||
func (e RuntimeStats) GetRuntimeStat(planID string) *RuntimeStat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we need a mutex to prevent data race on the map
, considered the parallel executed inner children of the index join.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have no way to pre-init inner plainID's entry to make it lock free?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, GetRuntimeStat is low frequence operation...let's lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lock is much more easier.
What problem does this PR solve?
Currently, our
explain
command only display the physical plan result without execution.but sometime we also need execution info to help us find reality plan question or executor performance bottleneck.
this PR add
anaylze
option toexplain
command(it just like postgresql'sexplain anaylze
does https://www.postgresql.org/docs/9.1/static/sql-explain.html), and it will cause the statement to be actually executed, not only planned.it works like this:
the difference is with addition
execution_info
columns, and has be actual executed.What is changed and how it works?
Executor
'sNext
from method to a func field to make it bindableSessionCtx.StmtCtx
level to control Normal or Explain executionExplain
to open switch then actual execute plan and output addition column.Remain question
Check List
Tests
Code changes
Side effects
Related changes
This change is