[Discussion] Metrics API design. #2
Here are some ideas for the functions that most business scenarios need.
1. Events
An event log marks the occurrence of a specified situation, which is often tied to alarms.
service Runtime {
// log event.
rpc OnEvent(OnEventRequest) returns (google.protobuf.Empty) {}
}
message Event {
required string event_name = 1;
optional string description = 2;
required int64 timestamp = 3;
}
message OnEventRequest {
required string app_id = 1;
required Event event = 2;
}
Alarms can be set for specified events. Email reminders are the most common way to handle an alarm. Users can also define their own handlers as an enhanced function if necessary.
service Runtime {
// create an alarm for a specified event.
rpc CreateEventAlarm(CreateEventAlarmRequest) returns (CreateEventAlarmResponse) {}
// delete an alarm from a specified event.
rpc DeleteEventAlarm(DeleteEventAlarmRequest) returns (google.protobuf.Empty) {}
}
message Alarm {
optional string alarm_name = 1;
repeated string handlers = 2;
}
message EventAlarm {
optional Alarm alarm = 1;
optional string event_name = 2;
}
message CreateEventAlarmRequest {
required string app_id = 1;
required string event_name = 2;
required string alarm_name = 3;
repeated string handlers = 4;
}
message CreateEventAlarmResponse {
required string app_id = 1;
optional EventAlarm event_alarm = 2;
}
message DeleteEventAlarmRequest {
required string app_id = 1;
required string event_name = 2;
required string alarm_name = 3;
}
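To make the event and alarm flow above concrete, here is a minimal in-memory sketch in Java (the thread asks for Java implementations). It is only an illustration under assumptions: the class `EventAlarmSketch`, its method names, and the use of `Consumer<Event>` as a handler are hypothetical, handlers are plain callbacks rather than the string identifiers in the proto, and `app_id` is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for the OnEvent / CreateEventAlarm /
// DeleteEventAlarm RPCs. Not thread-safe; app_id is omitted for brevity.
final class EventAlarmSketch {

    // Mirrors the Event message: name, optional description, timestamp.
    static final class Event {
        final String eventName;
        final String description;
        final long timestamp;

        Event(String eventName, String description, long timestamp) {
            this.eventName = eventName;
            this.description = description;
            this.timestamp = timestamp;
        }
    }

    // event_name -> (alarm_name -> handlers registered for that alarm)
    private final Map<String, Map<String, List<Consumer<Event>>>> alarms = new HashMap<>();

    // Registers (or replaces) an alarm on the given event.
    void createEventAlarm(String eventName, String alarmName, List<Consumer<Event>> handlers) {
        alarms.computeIfAbsent(eventName, k -> new HashMap<>())
              .put(alarmName, new ArrayList<>(handlers));
    }

    // Returns true if an alarm with that name existed and was removed.
    boolean deleteEventAlarm(String eventName, String alarmName) {
        Map<String, List<Consumer<Event>>> byAlarm = alarms.get(eventName);
        return byAlarm != null && byAlarm.remove(alarmName) != null;
    }

    // Runs every handler of every alarm set on the event; returns how many ran.
    int onEvent(Event event) {
        Map<String, List<Consumer<Event>>> byAlarm =
                alarms.getOrDefault(event.eventName, Collections.emptyMap());
        int invoked = 0;
        for (List<Consumer<Event>> handlers : byAlarm.values()) {
            for (Consumer<Event> handler : handlers) {
                handler.accept(event);
                invoked++;
            }
        }
        return invoked;
    }

    public static void main(String[] args) {
        EventAlarmSketch runtime = new EventAlarmSketch();
        List<String> sent = new ArrayList<>();
        // Stand-in for an email handler: it only records what would be sent.
        Consumer<Event> email = e -> sent.add("email: " + e.eventName);
        runtime.createEventAlarm("order_failed", "notify_ops", Collections.singletonList(email));
        runtime.onEvent(new Event("order_failed", "payment timeout", System.currentTimeMillis()));
        System.out.println(sent);
    }
}
```

Keying alarms by event name first keeps `onEvent` a single lookup, which matters if events are logged far more often than alarms are changed.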
2. Digital Index
A digital index describes the performance changes of an application over a period of time. Index data can be processed in different ways and serves well for further data analysis.
service Runtime {
// create an index.
rpc CreateIndex(CreateIndexRequest) returns (CreateIndexResponse) {}
// publish a data point for an index.
rpc publishIndexData(PublishIndexDataRequest) returns (google.protobuf.Empty) {}
}
message Index {
optional string index_name = 1;
optional string data_type = 2;
repeated string processors = 3;
}
message CreateIndexRequest {
required string app_id = 1;
required string index_name = 2;
required string data_type = 3;
repeated string processors = 4;
}
message CreateIndexResponse {
required string app_id = 1;
optional Index index = 2;
}
message PublishIndexDataRequest {
required string app_id = 1;
required string index_name = 2;
required string value = 3;
required int64 timestamp = 4;
}
Alarms can also be set on an index and are triggered when the index reaches a specified value.
service Runtime {
rpc CreateIndexAlarm(CreateIndexAlarmRequest) returns (CreateIndexAlarmResponse) {}
// publish a data point for an index.
rpc publishIndexData(PublishIndexDataRequest) returns (google.protobuf.Empty) {}
rpc DeleteIndexAlarm(DeleteIndexAlarmRequest) returns (google.protobuf.Empty) {}
}
message IndexAlarm {
optional string index_name = 1;
optional Alarm alarm = 2;
// perhaps a regular expression? Or use structures with some pre-defined enums.
optional string rule = 3;
}
message CreateIndexAlarmRequest {
required string app_id = 1;
required string index_name = 2;
required string alarm_name = 3;
repeated string handlers = 4;
required string rule = 5;
}
message CreateIndexAlarmResponse {
required string app_id = 1;
optional IndexAlarm index_alarm = 2;
}
message DeleteIndexAlarmRequest {
required string app_id = 1;
required string index_name = 2;
required string alarm_name = 3;
}
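The `rule` field is left open above (a regular expression, or a structure with pre-defined enums). As one possible reading of the second option, the sketch below models a rule as an operator enum plus a numeric threshold. Everything here is an assumption for discussion: the names `IndexAlarmRuleSketch`, `Op`, and `Rule`, the textual form such as `">= 0.95"`, and the choice to treat published values as doubles.

```java
// Hypothetical structured form of the IndexAlarm `rule` field:
// a comparison operator plus a numeric threshold.
final class IndexAlarmRuleSketch {

    enum Op { GREATER_THAN, GREATER_OR_EQUAL, LESS_THAN, LESS_OR_EQUAL, EQUAL }

    static final class Rule {
        final Op op;
        final double threshold;

        Rule(Op op, double threshold) {
            this.op = op;
            this.threshold = threshold;
        }

        // True when the published value satisfies the rule, i.e. the alarm should fire.
        boolean matches(double value) {
            switch (op) {
                case GREATER_THAN:     return value > threshold;
                case GREATER_OR_EQUAL: return value >= threshold;
                case LESS_THAN:        return value < threshold;
                case LESS_OR_EQUAL:    return value <= threshold;
                case EQUAL:            return Double.compare(value, threshold) == 0;
                default: throw new IllegalStateException("unknown op: " + op);
            }
        }
    }

    // Parses an assumed textual form such as ">= 0.95" or "< 10".
    static Rule parse(String text) {
        String s = text.trim();
        // Two-character operators must be checked before their one-character prefixes.
        if (s.startsWith(">=")) return new Rule(Op.GREATER_OR_EQUAL, number(s, 2));
        if (s.startsWith("<=")) return new Rule(Op.LESS_OR_EQUAL, number(s, 2));
        if (s.startsWith("==")) return new Rule(Op.EQUAL, number(s, 2));
        if (s.startsWith(">"))  return new Rule(Op.GREATER_THAN, number(s, 1));
        if (s.startsWith("<"))  return new Rule(Op.LESS_THAN, number(s, 1));
        throw new IllegalArgumentException("unsupported rule: " + text);
    }

    private static double number(String s, int opLength) {
        return Double.parseDouble(s.substring(opLength).trim());
    }

    // PublishIndexDataRequest carries the value as a string, so it is parsed here.
    static boolean shouldTrigger(String rule, String publishedValue) {
        return parse(rule).matches(Double.parseDouble(publishedValue));
    }

    public static void main(String[] args) {
        System.out.println(shouldTrigger("> 100", "120"));
        System.out.println(shouldTrigger("> 100", "80"));
    }
}
```

Compared with a regular expression, a structured rule can be validated when the alarm is created and does not need string matching on every published data point; the cost is that non-numeric indexes would need a separate rule type.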
Although all the functions should be customizable, we can also provide some
3. Action Execution Sequence
An action execution sequence records in detail how a function was performed, which is useful for troubleshooting. It might be the most difficult form of metric logging.
service Runtime {
// start transaction, get the unique transaction id.
rpc StartTransaction(StartTransactionRequest) returns (StartTransactionAlarmResponse) {}
// record action in transaction.
rpc RecordAction(RecordActionRequest) returns (google.protobuf.Empty) {}
}
message StartTransactionRequest {
required string app_id = 1;
required string transaction_name = 2;
}
message StartTransactionAlarmResponse {
required string app_id = 1;
optional string transaction_id = 2;
optional string transaction_name = 3;
}
message RecordActionRequest {
required string app_id = 1;
required string transaction_id = 2;
required string action_name = 3;
map<string, string> action_details = 4;
optional int64 timestamp = 5;
}
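The transaction flow above (start a transaction, receive a unique id, then record actions against it) can be sketched in Java as follows. This is an in-memory illustration only: `TransactionRecorderSketch`, `executionSequence`, and the use of a random UUID as the transaction id are assumptions, not part of the proposal, and `app_id` is omitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for StartTransaction / RecordAction.
// Not thread-safe; app_id is omitted for brevity.
final class TransactionRecorderSketch {

    // Mirrors RecordActionRequest without the ids.
    static final class Action {
        final String name;
        final Map<String, String> details;
        final long timestamp;

        Action(String name, Map<String, String> details, long timestamp) {
            this.name = name;
            this.details = Collections.unmodifiableMap(new HashMap<>(details));
            this.timestamp = timestamp;
        }
    }

    private final Map<String, String> namesById = new HashMap<>();
    private final Map<String, List<Action>> actionsById = new HashMap<>();

    // Starts a transaction and returns its unique id (a random UUID here, by assumption).
    String startTransaction(String transactionName) {
        String transactionId = UUID.randomUUID().toString();
        namesById.put(transactionId, transactionName);
        actionsById.put(transactionId, new ArrayList<>());
        return transactionId;
    }

    // Appends one action to a transaction that was started earlier.
    void recordAction(String transactionId, String actionName,
                      Map<String, String> details, long timestamp) {
        actionsFor(transactionId).add(new Action(actionName, details, timestamp));
    }

    // Action names ordered by timestamp; ties keep their recording order.
    List<String> executionSequence(String transactionId) {
        List<Action> ordered = new ArrayList<>(actionsFor(transactionId));
        ordered.sort(Comparator.comparingLong((Action a) -> a.timestamp));
        List<String> names = new ArrayList<>();
        for (Action action : ordered) {
            names.add(action.name);
        }
        return names;
    }

    private List<Action> actionsFor(String transactionId) {
        List<Action> actions = actionsById.get(transactionId);
        if (actions == null) {
            throw new IllegalArgumentException("unknown transaction: " + transactionId);
        }
        return actions;
    }

    public static void main(String[] args) {
        TransactionRecorderSketch recorder = new TransactionRecorderSketch();
        String id = recorder.startTransaction("checkout");
        recorder.recordAction(id, "charge", Collections.singletonMap("amount", "42"), 20L);
        recorder.recordAction(id, "validate", Collections.emptyMap(), 10L);
        System.out.println(recorder.executionSequence(id));
    }
}
```

Sorting by timestamp at read time lets actions arrive out of order, which seems likely if several components report into the same transaction.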
Let's discuss the API design further. Looking forward to your reply~
Cool! Perfect. Please give me a moment to understand your design.
@JasmineJ1230 Can you pack these definitions into one proto file? If you have time, can you provide Java implementations of these interfaces?
OK~ I will make a more complete API design during the National Day holiday (perhaps 10/1~3?). Let's discuss the API definition in more detail after that. I will contact you when there is any progress~ Also, once we have completed the API design, I think providing the Java implementations will not be difficult.
I put your definition above here: https://github.com/reactivegroup/cloud-runtimes-jvm/blob/feature/metrics/spec/proto.runtime.v1/Metrics.proto You can also submit a proposal directly to Layotto.
We'd better refer to OpenTelemetry, which is already the publicly accepted standard for monitoring, tracing, and metrics.
I have studied OpenTelemetry and finished some demos and experiments. I put my learning report in #5, which can serve as a reference.
Goal
Design a Metrics API for application-level indicator monitoring.
Progress
We can first consult the references below and define a first version of the API.
Reference
dapr/dapr#2817
mosn/layotto#90
dapr/dapr#2988
dapr/dapr#100
dapr/dapr#3449
dapr/dapr#3455
dapr/dapr#3549
mosn/layotto#214