Elasticsearch is a popular full-text search engine for documents and data. In this lab we will use Elasticsearch to build an index over JSON data stored in MySQL.
First, download and install Elasticsearch from https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-install.html. Download the Linux binaries.
cd /opt
tar zxvf /tmp/elasticsearch-7.6.1-linux-x86_64.tar.gz
sudo ln -s /opt/elasticsearch-7.6.1 /usr/local/elastic
export PATH=$PATH:/usr/local/elastic/bin
Start Elasticsearch
elasticsearch
Elasticsearch will take some time to start, depending on your VM. Once it has started, you can check the cluster status:
curl -X GET "localhost:9200/_cat/health?v&pretty"
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1584347399 08:29:59 elasticsearch yellow 1 1 1 1 0 0 1 0 - 50.0%
Download the accounts.json document from https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json?raw=true and bulk-index the JSON documents (1,000 docs):
cd /opt/download/lab/elastic
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
curl "localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open bank QtwKIJOCSp2HVb4M0iMNiw 1 1 1000 0 426.9kb 426.9kb
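For reference, the body sent to the _bulk endpoint is NDJSON: each document is preceded by an action line such as {"index": {...}}. A minimal Python sketch of how such a payload is assembled (the helper name and sample docs are illustrative, not part of the lab files):

```python
import json

def to_bulk_payload(docs, index="bank"):
    """Build an NDJSON bulk body: one action line, then one document line, per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document line
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline

docs = [{"account_number": 1, "firstname": "Amber"},
        {"account_number": 6, "firstname": "Hattie"}]
print(to_bulk_payload(docs))
```

This is the same structure accounts.json already has, which is why it can be posted to _bulk directly with --data-binary.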
curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" }
]
}
'
curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" }
],
"from": 10,
"size": 10
}
'
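The from/size pair implements offset-based pagination: from is the offset of the first hit (page number times page size). A small sketch of building these query bodies (the helper is hypothetical, for illustration only):

```python
import json

def paged_query(page, size=10, sort_field="account_number"):
    """Build a match_all query body for the given zero-based page."""
    return {
        "query": {"match_all": {}},
        "sort": [{sort_field: "asc"}],
        "from": page * size,  # offset of the first hit to return
        "size": size,         # number of hits per page
    }

print(json.dumps(paged_query(1), indent=2))  # second page: the body used above
```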
Search on the address field:
curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": { "match": { "address": "mill lane" } }
}
'
Search by age while excluding a state:
curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{ "match": { "age": "40" } }
],
"must_not": [
{ "match": { "state": "ID" } }
]
}
}
}
'
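In a bool query, every must clause has to match and every must_not clause excludes hits. A sketch of assembling such a body programmatically (the helper name is illustrative, not part of the lab):

```python
import json

def bool_query(must=None, must_not=None):
    """Assemble a bool query from lists of (field, value) match clauses."""
    def clauses(pairs):
        return [{"match": {field: value}} for field, value in (pairs or [])]
    return {"query": {"bool": {"must": clauses(must),
                               "must_not": clauses(must_not)}}}

q = bool_query(must=[("age", "40")], must_not=[("state", "ID")])
print(json.dumps(q, indent=2))  # same body as the curl command above
```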
Now we want to import the same accounts.json into MySQL
mysql>
create database elastic;
\q
We need to massage the accounts.json file before loading it into a MySQL table. Each document in accounts.json is preceded by a corresponding {"index":{...}} action line, which we need to remove from the input file:
vi accounts.json
:g/"index"/d
:w accounts_noindex.json
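The same cleanup can be done outside vi. A Python sketch equivalent to the :g/"index"/d command (the function name and sample lines are illustrative):

```python
def strip_action_lines(lines):
    """Keep only document lines, dropping the bulk {"index": ...} action lines."""
    return [line for line in lines if '"index"' not in line]

sample = ['{"index":{"_id":"1"}}\n',
          '{"account_number":1,"firstname":"Amber"}\n']
print(strip_action_lines(sample))
```

To apply it to the real file, read accounts.json and write the filtered lines to accounts_noindex.json.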
Copy the file accounts_noindex.json to /home/mysql/data/lab02/elastic. Add the secure_file_priv setting in my1.cnf:
secure_file_priv=''
Next, create the table in MySQL and load the data:
mysql>
use elastic;
create table accounts (id int not null auto_increment, doc json not null, primary key (id));
load data infile 'accounts_noindex.json' into table accounts(doc);
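LOAD DATA expects one valid JSON document per line here, since the doc column is declared as json not null. A quick sketch to sanity-check the file before loading (validation logic only; apply it to accounts_noindex.json):

```python
import json

def invalid_json_lines(lines):
    """Return (line_number, line) pairs that fail to parse as JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append((number, line))
    return bad

sample = ['{"account_number": 1}\n', 'not json\n']
print(invalid_json_lines(sample))
```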
Download and install Logstash from https://www.elastic.co/downloads/logstash. Select the TAR format
cd /opt
tar zxvf /tmp/logstash-7.6.1.tar.gz
sudo ln -s /opt/logstash-7.6.1 /usr/local/logstash
Create a logstash-mysql.conf file (under /usr/local/logstash/config) to connect to MySQL and load the JSON data into Elasticsearch:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/elastic"
    jdbc_user => "demo"
    jdbc_password => "password"
    jdbc_driver_library => "/opt/mysql-connector-java-8.0.19/mysql-connector-java-8.0.19.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    statement => "SELECT doc FROM accounts"
  }
}
filter {
  json {
    source => "doc"
  }
}
output {
  stdout {
    codec => json_lines
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mysql_accounts_index"
  }
}
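The json filter parses the doc column of each row into top-level event fields, which is what makes the accounts searchable by firstname, age, and so on. Conceptually it does something like this (a simplified Python model of the filter, not Logstash code):

```python
import json

def apply_json_filter(event, source="doc"):
    """Merge the JSON parsed from the source field into the event,
    modeling what Logstash's json filter does to each row."""
    parsed = json.loads(event[source])
    return {**event, **parsed}

# One event as produced by the jdbc input: a single "doc" column per row.
event = {"doc": '{"account_number": 1, "firstname": "Amber"}'}
print(apply_json_filter(event))
```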
cd /usr/local/logstash
bin/logstash -f config/logstash-mysql.conf &
You should see a lot of log messages on your terminal as Logstash loads the data and creates the index. Next, you can use Elasticsearch to search the newly created index on the data loaded from MySQL:
curl -X GET "localhost:9200/mysql_accounts_index/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": { "match": { "firstname": "Barnette" } }
}
'
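The matching documents come back under hits.hits, each wrapped with metadata around a _source field. A small sketch of pulling the source documents out of a search response (the helper and the sample response dict are illustrative):

```python
def hit_sources(response):
    """Extract the _source documents from a search response body."""
    return [hit["_source"] for hit in response["hits"]["hits"]]

sample_response = {
    "hits": {
        "total": {"value": 1},
        "hits": [{"_source": {"firstname": "Barnette"}}],
    }
}
print(hit_sources(sample_response))
```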
Awesome!