Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit source and sink types on app start #4105

Merged
merged 4 commits into from
Feb 21, 2024
Merged

Conversation

esevastyanov
Copy link
Contributor

No description provided.

@esevastyanov esevastyanov marked this pull request as ready for review February 20, 2024 15:49
Comment on lines 285 to 289
// Collect and emit information about registered source types
err = emitSourceTelemetry(ctx, opts, rt, inst)
if err != nil {
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently a blocking operation with a network call that will hold back starting of the application. I think we should run it in a goroutine.

@@ -624,6 +632,49 @@ func initLogger(isVerbose bool, logFormat LogFormat) (logger *zap.Logger, cleanu
}
}

func emitSourceTelemetry(ctx context.Context, opts *AppOptions, rt *runtime.Runtime, inst *drivers.Instance) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest renaming it to reflect that it emits a start event (the choice of start attributes may increase in the future). Also, these are pretty stateful objects to pass into a util function. Would suggest making it a member function:

func (a *App) emitStartEvent(ctx context.Context) error

Comment on lines 636 to 660
controller, err := rt.Controller(ctx, inst.ID)
if err != nil {
return err
}
catalog, _, err := rt.Catalog(ctx, inst.ID)
if err != nil {
return err
}
resources, err := catalog.FindResources(ctx)
if err != nil {
return err
}
var sourceDrivers []string
var olapDrivers []string
for _, res := range resources {
if res.Kind == "rill.runtime.v1.Source" {
source, err := controller.Get(ctx, &runtimev1.ResourceName{Kind: res.Kind, Name: res.Name}, false)
if err != nil {
return err
}
spec := source.GetSource().Spec
sourceDrivers = append(sourceDrivers, spec.SourceConnector)
olapDrivers = append(olapDrivers, spec.SinkConnector)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a problem, which is that the catalog is often not correctly populated right when the application starts (needs to wait for reconcile).

Instead, it would be better to use the AnalyzeConnectors function on the project parser, which is designed to explicitly extract info about the connectors used in the project. You can see the code for the rill env configure command for an example.

Note that the AnalyzeConnectors function covers everything, not just sources. I think it might be better just to emit the names of all connectors and then separately emit the name of the main OLAPConnector.

Comment on lines 671 to 673
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_ = tel.Flush(ctx) // Ignore error and start the app
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log the error at debug level instead of ignoring

Comment on lines 109 to 110
SourceDriver []string `json:"source_driver"`
OLAPDriver []string `json:"olap_driver"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a generic Payload map[string]any or something like that instead? Unlike the other fields, these properties seem specific to a single event

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on how druid parses nested multi-value dimensions
We can also mix in additional dimensions

    val additionalDims map[string]any

    // Convert struct to map and merge with additional data
    var resultMap map[string]interface{}
    temp, _ := json.Marshal(fields)
    json.Unmarshal(temp, &resultMap)
    for key, value := range additionalDims {
        resultMap[key] = value
    }

    // Marshal the combined data to JSON
    event, _ := json.Marshal(resultMap)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isolated this into a separate commit

Comment on lines 581 to 582
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's emitted in the background, maybe we don't need the timeout?

}

func (t *Telemetry) EmitStartEvent(sourceDrivers []string, olapDriver string) {
payload := map[string]any{"source_driver": sourceDrivers, "olap_driver": olapDriver}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The analyzed connectors actually not necessarily just for sources, e.g. if we add support for exporting to a connector, it would also show up here. So would recommend renaming sourceDrivers to just connectors.

And maybe just to align terminology, also consider changing olap_driver to olap_connector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

@begelundmuller begelundmuller merged commit 44ea485 into main Feb 21, 2024
4 checks passed
@begelundmuller begelundmuller deleted the 173-emit-driver-type branch February 21, 2024 09:20
nishantmonu51 pushed a commit that referenced this pull request Feb 21, 2024
* Emit source and sink types on app start

* Parser vs catalog

* Payload

* Renamed payload props
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants