Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

*: add Reader, Transformer for relay log #108

Merged
merged 30 commits into from
Apr 22, 2019

Conversation

csuzhangxc
Copy link
Member

@csuzhangxc csuzhangxc commented Apr 10, 2019

What problem does this PR solve?

this PR is one part of #91

What is changed and how it works?

  1. add a MockReader used for testing
  2. add a FileReader used for:
    1. read relay log by syncer unit later
    2. read relay log for recovering/rollbacking an incomplete relay log file
  3. add a GenRotateEvent to generate RoateEvent for testing
  4. add a Reader (wrap TCPReader) used to read binlog events from master
    1. handle some errors
    2. do retry if possible
  5. add a Transformer used to transform binlog events
    1. extract binlog position/GTID
    2. decide the event whether needed to be written to relay log

Check List

Tests

  • Unit test

for relay log, at least following things needed to do:

  1. add a Writer used to write binlog events to relay log file
  2. compose Reader, Transformer and Writer together to pull binlog events from a master server and write them to files
  3. wrap them to support master-slave switch and clean old code

I'll do them in some PRs later.

the flow of relay log (without master-slave switch) will be:

  1. TCPReaderread binlog events from the master
  2. Reader handle possible errors
  3. Transfomer get events from Reader and transform them
  4. Writer get events from Transformer and write them to disk (with a FileWriter)

@csuzhangxc csuzhangxc added priority/important Major change, requires approval from ≥2 primary reviewers status/WIP This PR is still work in progress type/enhancement Performance improvement or refactoring labels Apr 10, 2019
@csuzhangxc
Copy link
Member Author

@GregoryIan @amyangfei PTAL

@csuzhangxc csuzhangxc added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/WIP This PR is still work in progress labels Apr 12, 2019
@csuzhangxc csuzhangxc changed the title *: refactor relay writer framework *: add Reader, Transformer for relay log Apr 12, 2019
@@ -159,6 +159,33 @@ func GenFormatDescriptionEvent(header *replication.EventHeader, latestPos uint32
return ev, errors.Trace(err)
}

// GenRotateEvent generates a RotateEvent.
// ref: https://dev.mysql.com/doc/internals/en/rotate-event.html
func GenRotateEvent(header *replication.EventHeader, latestPos uint32, nextLogName []byte, position uint64) (*replication.BinlogEvent, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's difference between latestPos and position?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

latestPos is the previous end position of the latest events or the start position of this event.
position is the rotate target position for this event or the start position of the next event.

err := r.parser.ParseFile(pos.Name, int64(pos.Pos), r.onEvent)
if err != nil {
select {
case r.ech <- err:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least output error information

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed. log the error before the select.

// GetEvent implements Reader.GetEvent.
func (r *FileReader) GetEvent(ctx context.Context) (*replication.BinlogEvent, error) {
if r.stage.Get() != int32(stagePrepared) {
return nil, errors.Errorf("stage %s, expect %s", readerStage(r.stage.Get()), stagePrepared)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user/debuger need to know these stage very well, it's better to add a little annotations, like "please start sync"

case <-ctx.Done():
return nil, ctx.Err()
case <-r.ctx.Done(): // Reader closed
return nil, r.ctx.Err()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add more annotations to distinguish errorr.ctx.Err() with errorctx.Err() at L141

Copy link
Member Author

@csuzhangxc csuzhangxc Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added removed r.ctx.Done(). changed the definition. Close will be blocked if GetEvent has not returned (use RWMutex rather than Mutex and AtomicInt32)

}
}

c.Assert(ctx.Err(), IsNil)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's contradict with || ctx.Err() != nil at L107, can we allow it's timeout?

Copy link
Member Author

@csuzhangxc csuzhangxc Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

|| ctx.Err() != nil is used to break the for loop if there are bugs.
if timemout, then the test should be failed in this c.Assert(ctx.Err(), IsNil).
we can put c.Assert(ctx.Err(), IsNil) into the for loop.

}

// Close implements Reader.Close.
func (r *reader) Close() error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lack of lock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ added. and change Mutex to RWMutex, and remove the atomic variable check

ev, err := r.in.GetEvent(ctx)
if err == nil {
return ev, nil
} else if isRetryableError(err) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ctx is timeout, it would be infinite loop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed timeout error handing in this reader, that should be handled in the outer caller.

if err != nil {
log.Errorf("[relay] start sync for master %s from GTID set %s error %v",
r.cfg.MasterID, gs, errors.ErrorStack(err))
return r.setUpReaderByPos()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can't try setUpReaderByPos on some errors, like stage is not right. And if we need to try setUpReaderByPos, we should check position to avoid it read binlogs from oldest position 💀

Copy link
Member Author

@csuzhangxc csuzhangxc Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's put this error to be handled by the outer or even the user.

Ignore bool // whether the event should be ignored
LogPos uint32 // binlog event's End_log_pos or Position in RotateEvent
NextLogName string // next binlog filename, only valid for RotateEvent
GTIDSet mysql.GTIDSet // GTIDSet got from QueryEvent and XIDEvent when RawModeEnabled not true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's gtid or gtidset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 It's gtid set. It's GTID in one binlog event, but go-mysql returns a GTID set (including the previous/passed in GTID set)

relay/transformer/transformer.go Outdated Show resolved Hide resolved
)

// Result represents a transform result.
type Result struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this design very well, like

  • if some binlog is ignored, should writer write it?
  • what's its transform rules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

binlog event may be ignored in multiple steps in the processing chain, at least:

  1. some fake/artificial events, which will be marked as ignored in this transformer
  2. some obsolete events (like FormatDescriptionEvent), which will be ignored by the writer

maybe the name of transformer is not accurate, and in fact, more features may be added into this later.

@csuzhangxc
Copy link
Member Author

/run-all-tests

@csuzhangxc
Copy link
Member Author

@GregoryIan PTAL again

pkg/binlog/reader/file.go Outdated Show resolved Hide resolved
pkg/binlog/reader/mock_test.go Outdated Show resolved Hide resolved
relay/reader/reader.go Outdated Show resolved Hide resolved
relay/reader/reader.go Outdated Show resolved Hide resolved
relay/reader/reader_test.go Outdated Show resolved Hide resolved
amyangfei and others added 5 commits April 18, 2019 11:01
Co-Authored-By: csuzhangxc <csuzhangxc@gmail.com>
Co-Authored-By: csuzhangxc <csuzhangxc@gmail.com>
@csuzhangxc
Copy link
Member Author

@GregoryIan @amyangfei PTAL again

Copy link
Contributor

@amyangfei amyangfei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amyangfei amyangfei added status/LGT1 One reviewer already commented LGTM and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels Apr 22, 2019
return errors.New("already closed")
}

r.parser.Stop()
Copy link
Collaborator

@IANTHEREAL IANTHEREAL Apr 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to put it after L135, even I know it's ok after I review its implementation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

@IANTHEREAL
Copy link
Collaborator

LGTM

@IANTHEREAL IANTHEREAL added status/LGT2 Two reviewers already commented LGTM, ready for merge and removed status/LGT1 One reviewer already commented LGTM labels Apr 22, 2019
@csuzhangxc csuzhangxc merged commit c452c25 into pingcap:master Apr 22, 2019
@csuzhangxc csuzhangxc deleted the relay-writer branch April 22, 2019 06:05
lichunzhu pushed a commit to lichunzhu/dm that referenced this pull request Apr 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority/important Major change, requires approval from ≥2 primary reviewers status/LGT2 Two reviewers already commented LGTM, ready for merge type/enhancement Performance improvement or refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants