Data loader elasticsearch #1490

Merged 17 commits on Jun 28, 2024

@walterra walterra commented Jun 27, 2024

As discussed on your Community Slack, I'd like to contribute a data loader example for Elasticsearch.

The original repo I created this in can be found here:
https://github.com/walterra/observable-framework-data-loader-elasticsearch

A live view can be found here:
https://walterra.observablehq.cloud/framework-example-loader-elasticsearch/

This Observable Framework example demonstrates how to write a TypeScript data loader that runs a query on Elasticsearch using the Elasticsearch Node.js client. The data loader lives in src/data/kibana_sample_data_logs.csv.ts and uses the helper src/data/es_client.ts.
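The final step of such a loader, turning query results into CSV text, could be sketched as follows. This is only an illustration: the loader's actual query and code are not shown in this thread, `DateBucket`, `csvField`, and `bucketsToCsv` are hypothetical names, and the bucket fields are modeled on what an Elasticsearch `date_histogram` aggregation returns.

```typescript
// Hypothetical bucket shape, modeled on a date_histogram aggregation response.
// The real loader's query and field names are not shown in this thread.
interface DateBucket {
  key_as_string: string;
  doc_count: number;
}

// Quote a CSV field only when it contains a comma, a double quote, or a line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn the buckets into CSV text with a header row ("date,count" is an assumed header).
function bucketsToCsv(buckets: DateBucket[]): string {
  const lines = ["date,count"];
  for (const bucket of buckets) {
    lines.push(`${csvField(bucket.key_as_string)},${csvField(bucket.doc_count)}`);
  }
  return lines.join("\n") + "\n";
}
```

A Framework data loader would write the returned string to standard output, which Framework captures as the content of the generated file.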

Similar to #1477, I added the output of the data loader as a static file too.

To fully reproduce the example, you need to have a setup with both Elasticsearch and Kibana running to create the sample data. Here's how to set up both on macOS:

# Download and run Elasticsearch
curl -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.14.1-darwin-x86_64.tar.gz
gunzip -c elasticsearch-8.14.1-darwin-x86_64.tar.gz | tar xopf -
cd elasticsearch-8.14.1
./bin/elasticsearch

# Next, in another terminal tab, download and run Kibana
curl -O https://artifacts.elastic.co/downloads/kibana/kibana-8.14.1-darwin-x86_64.tar.gz
gunzip -c kibana-8.14.1-darwin-x86_64.tar.gz | tar xopf -
cd kibana-8.14.1
./bin/kibana

Both commands will output instructions on how to finish the setup with security enabled. Once you have both running, you can create the sample data in Kibana via this URL: http://localhost:5601/app/home#/tutorial_directory/sampleData

Finally, create the .env file with the credentials for the user elastic that were logged when starting Elasticsearch. To get the CA fingerprint for the config, run the following command from the directory where you downloaded and extracted Elasticsearch:

openssl x509 -fingerprint -sha256 -noout -in ./elasticsearch-8.14.1/config/certs/http_ca.crt

The .env file should then look like this:

ES_NODE="https://elastic:<PASSWORD>@localhost:9200"
ES_CA_FINGERPRINT="<CA_FINGERPRINT>"
ES_UNSAFE_TLS_REJECT_UNAUTHORIZED="FALSE"
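As a rough sketch of how a helper like src/data/es_client.ts might map these variables to client options (the actual helper is not shown in this thread): `buildClientOptions` and `EsClientOptions` are hypothetical names, and treating the string "FALSE" as "disable certificate verification" is an assumption. The `node`, `caFingerprint`, and `tls.rejectUnauthorized` fields are existing options of the Elasticsearch Node.js client.

```typescript
// Only the fields used in this sketch; the real client accepts many more options.
interface EsClientOptions {
  node: string;
  caFingerprint?: string;
  tls?: { rejectUnauthorized: boolean };
}

// Hypothetical helper: build client options from environment-style variables.
function buildClientOptions(env: Record<string, string | undefined>): EsClientOptions {
  const node = env["ES_NODE"];
  if (!node) throw new Error("ES_NODE is not set");

  const options: EsClientOptions = { node };

  const fingerprint = env["ES_CA_FINGERPRINT"];
  if (fingerprint) options.caFingerprint = fingerprint;

  // Assumption: the value "FALSE" (in any letter case) turns off certificate
  // verification, which is only acceptable for local experiments.
  const rejectUnauthorized = env["ES_UNSAFE_TLS_REJECT_UNAUTHORIZED"] ?? "";
  if (rejectUnauthorized.toLowerCase() === "false") {
    options.tls = { rejectUnauthorized: false };
  }
  return options;
}
```

In a loader, the result of `buildClientOptions(process.env)` could be passed to `new Client(...)` from `@elastic/elasticsearch`, once the .env file has been loaded into the environment.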

walterra and others added 5 commits June 27, 2024 15:53
tweak text

Co-authored-by: Philippe Rivière <fil@rezo.net>
adding empty lines should allow framework to convert the markdown (in this case, the backticks around .env)

Co-authored-by: Philippe Rivière <fil@rezo.net>
adding empty lines should allow framework to convert the markdown (in this case, the backticks around .env)

Co-authored-by: Philippe Rivière <fil@rezo.net>
@walterra

Thanks for the initial feedback, I updated index.md with the suggestions. Also added a link to examples/README.md.

Fil commented Jun 27, 2024

Thanks for the very detailed procedure — I was able to test it locally.

I now have just a few more suggestions:

  • We should use Framework version 1.9.0, or latest, in package.json.
  • Regarding the sample data sets, indicate that we need dataset "Sample web logs"; they're called kibana_sample_data_logs in the loader.
  • I personally added double quotes around the values in .env, but it's not mandatory.
  • ES_CA_FINGERPRINT was already shown in the logs of elasticsearch, so the manual ssl step seemed almost redundant. The only difference is that in one case it is written in lowercase and without colons, when the correct format has uppercase and colons (9191e5abbecb472806b453… vs. 91:91:E5:AB:BE:CB:47:28:06:B4:53:…).
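The difference between the two formats is purely mechanical, so the conversion can be sketched in a few lines. This is an illustrative helper, not part of the PR or of the Elasticsearch client; `toColonFingerprint` is a hypothetical name.

```typescript
// Convert a hex fingerprint such as "9191e5ab..." into the uppercase,
// colon-separated form "91:91:E5:AB:...". Input that already contains
// colons or whitespace is normalized the same way.
function toColonFingerprint(hex: string): string {
  const cleaned = hex.replace(/[:\s]/g, "").toUpperCase();
  if (!/^[0-9A-F]+$/.test(cleaned) || cleaned.length % 2 !== 0) {
    throw new Error("expected an even number of hexadecimal digits");
  }
  // Split into two-character groups (one byte each) and join with colons.
  return (cleaned.match(/.{2}/g) ?? []).join(":");
}
```

With such a helper, the value printed in the Elasticsearch logs could be reformatted before being placed in ES_CA_FINGERPRINT, instead of running openssl by hand.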

Another note in passing (but probably not actionable): as I was testing the setup, I was blocked as the enrollment string linked to IP address 10.100.0.2. But this address would not connect, and this prevented the installation from working at all. I realized I had to disable NordVPN, delete the whole es installation, and start again from scratch… and it worked. (I guess it's more an issue with NordVPN than with ES, but figured maybe you'd want to investigate this.)

walterra commented Jun 27, 2024

Great you got it working with the instructions! Getting the SSL setup right can be a bit painful.

I updated the Observable Framework package to the latest version, added quotes to the .env part, and made sure to mention the name of the dataset in the Kibana UI in b2b4d06.

About ES_CA_FINGERPRINT: Did the lowercase version without colons work for you? When I used it, I got this error message: ConnectionError: Server certificate CA fingerprint does not match the value configured in caFingerprint. That's why I added the openssl command then. I will check with our ES+clients team if we can improve that. I'll also forward your problem with NordVPN!

Update: We now have an issue to track the problem with the different fingerprint formats: elastic/elastic-transport-js#107

Fil commented Jun 27, 2024

> Did the lowercase version without colons work for you?

Nope :). To clarify, it would have been great to have the correct format in that place and not have to do the call to openssl, as this is always stressful—but it's not a blocker for this PR!

@Fil Fil left a comment

Add curly quotes before we merge (I don't think I am allowed to push to this branch?).

I've deployed the project to https://observablehq.observablehq.cloud/framework-example-loader-elasticsearch/

walterra and others added 2 commits June 28, 2024 11:20
fix quote

Co-authored-by: Philippe Rivière <fil@rezo.net>
fix quotes

Co-authored-by: Philippe Rivière <fil@rezo.net>
walterra commented Jun 28, 2024

I guess you cannot push to the branch because it is in my fork. I committed the quote updates!

Thanks for the deployment! Is there something I need to add so it gets the header with "Observable Framework" title and "View source" link? I can see it in the hello world example but not on the notebook you deployed for this data loader.

Fil commented Jun 28, 2024

hmm no, that was my oversight; thanks for reminding me!

To deploy the examples we have to include the common configuration file which adds the specific headers:

yarn deploy -c ../observablehq.config.js 

@Fil Fil left a comment

thank you!

@Fil Fil enabled auto-merge (squash) June 28, 2024 09:37
@Fil Fil merged commit d3c069b into observablehq:main Jun 28, 2024
4 checks passed
@walterra walterra deleted the data-loader-elasticsearch branch June 28, 2024 10:18