-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Development feat structured logging #881
Development feat structured logging #881
Conversation
c2a2538
to
c9080e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over time this gonna be hard to maintain
- First read the comments
- Second i think it's gonna be really hard to maintain context of operation and error like this.
- event_action, event_kind, category are not well defined, they don't have known types and rely on whatever the developer type in. which makes it hard to track and prone to typoes
bridge/tfchain_bridge/main.go
Outdated
zerolog.SetGlobalLevel(zerolog.DebugLevel) | ||
log.Debug().Msg("debug mode enabled") | ||
} | ||
logger.Init_logger(debug) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Init_logger should be InitLogger
bridge/tfchain_bridge/main.go
Outdated
} | ||
source_log_entry := logger.SourceCommonLogEntry{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why using snake case naming in go
? go always uses camelCase
Str("event_action", "bridge_unexpectedly_exited"). | ||
Str("event_kind", "error"). | ||
Str("category", "availability"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be part of the error context somehow? not printed as strings? May be return some sort of a ErrorContext that implements error
then once received an ErrorContext can be downcasted, then the data extracted from it and printed.
The values of the category, event-kind, and action can then be pre-defined constants
@@ -23,18 +26,30 @@ func (bridge *Bridge) mint(ctx context.Context, senders map[string]*big.Int, tx | |||
} | |||
|
|||
if minted { | |||
log.Info().Str("tx_id", tx.Hash).Msg("transaction is already minted") | |||
logger.Info(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we should log before return every error, or just log when we receive an error. If the caller to mint
function now decide to log the returned error we will have double logs.
What I do is that I always log an error if i can't bubble it up more, the error then can have all the information we need to build a detailed log
Otherwise we can endup with the same error printed multiple times
@@ -46,37 +61,35 @@ func (bridge *Bridge) mint(ctx context.Context, senders map[string]*big.Int, tx | |||
} | |||
|
|||
if tx.Memo == "" { | |||
log.Info().Str("tx_id", tx.Hash).Msg("transaction has empty memo, refunding now") | |||
return bridge.refund(context.Background(), receiver, depositedAmount.Int64(), tx) | |||
ctx = context.WithValue(ctx, "refund_reason", "no memo in transaction") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a nice way to set (and get) context. typoes and naming conflicts are a problem instead it's ususally done like this
type refundReasonKey struct {}
func WithRefundReason(ctx, reason string) Context {
return context.WithValue(ctx, refundReasonKey{}, reason)
}
func GetRefundReason(ctx) string {
return context.GetValue(ctx, refundReasonKey{}).(string)
}
This makes sure
- no key conflict because a key is a type, not a value (actually it's the zero value of a type but 2 zero values of 2 different types don't equal each other)
- Clear interface to set, get a reason and reason type is clear
- no possible typoes or mistakes
@muhamadazmy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job!!
Very complete and very well documented.
I basically read the documentation and made some formatting remarks.
Implementation seems ok, I guess @muhamadazmy is better placed to evaluate the Go code and I see he already made its comments.
I would also add ADR file since it is an important feature from bridge side.
- `event_burn_tx_ready_received`: The bridge instance has received TFChain `BurnTransactionReady` event which means all bridge validators signed a withdraw transaction. | ||
- `event_burn_tx_expired_received`: The bridge instance has received TFChain `BurnTransactionExpired` event. | ||
- `withdraw_skipped`: a refund request skipped by the bridge instance as it has already been refunded. | ||
- `withdraw_proposed`: a withdraw has proposed or signed by the bridge instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has been proposed
- `event_burn_tx_created_received`: The bridge instance has received TFChain `BurnTransactionCreated` event. | ||
- `event_burn_tx_ready_received`: The bridge instance has received TFChain `BurnTransactionReady` event which means all bridge validators signed a withdraw transaction. | ||
- `event_burn_tx_expired_received`: The bridge instance has received TFChain `BurnTransactionExpired` event. | ||
- `withdraw_skipped`: a refund request skipped by the bridge instance as it has already been refunded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a refund request was skipped
- `event_burn_tx_expired_received`: The bridge instance has received TFChain `BurnTransactionExpired` event. | ||
- `withdraw_skipped`: a refund request skipped by the bridge instance as it has already been refunded. | ||
- `withdraw_proposed`: a withdraw has proposed or signed by the bridge instance. | ||
- `withdraw_completed`: a withdraw has completed and received on the target stellar account. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has been completed
#### Business Events: | ||
These events are for business intelligence and always tied to a user transfer, and include a `trace_id` field. | ||
|
||
you can use `trace_id` to correlate multiple log entries and track an transfer from one chain to the other. it is simply a stellar tx ID or TFChain burn ID depend on the side the transfer initiated from. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You ... It ...
|
||
### Metrics: | ||
|
||
This events describes a numeric measurement taken at given point in time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These events
| version | number | yes |Log schema version | | ||
| source | object | yes | Information about the bridge instance | | ||
| event_action | string | yes | The action captured by the event | | ||
| event_kind | string | yes | Can gives high-level information about what type of information the event contains, without being specific to the contents of the event. One of `Event`, `Alert`, `Error`, `metric` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
event
, alert
, error
, metric
- alert: This value indicates an event such as an alert or notable event. | ||
|
||
- event: This value is the most general and most common value for this field. It is used to represent events that indicate that something happened. | ||
|
||
- error: This value indicates that an error occurred during the operation of the bridge. | ||
|
||
- metric: This events describes a numeric measurement taken at given point in time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
event
: ...alert
: ...error
: ...metric
: ...
- availability | ||
- persistency | ||
- stellar_monitor | ||
- tfchain_monitor | ||
- transfer | ||
- mint | ||
- refund | ||
- withdraw | ||
- vault |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
availability
persistency
stellar_monitor
tfchain_monitor
transfer
mint
refund
withdraw
vault
- vault | ||
|
||
#### source object | ||
the source field set contains all fields which are included in the source object, it is common across all types of events except of `bridge_init_aborted` error event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source field set contains all fields which are included in the source object. It is common across all types of events except for bridge_init_aborted
error event.
##### bridge_init_aborted | ||
|
||
- kind: error | ||
|
||
- category: availability |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use this formatting for all event sections below to improve md file readability:
bridge_init_aborted
event
- kind:
error
- category:
availability
…into development-feat-structured-logging
Description
Note: Require tfchain Runtime to include the update from another PR
The goal of this PR it to instrument the bridge source code for the sake of improving observability and perform tracing on workflows and data.
For full description and event schema see the the included doc here
Related Issues:
Checklist:
Please delete options that are not relevant.