atlas-bi/Solr-Search-ETL

Solr Search ETL

Atlas BI Library ETL | Solr Search Engine

Load Atlas metadata into the Solr search engine that powers Atlas' search.

🏃 Getting Started

To use these scripts, you should already have Atlas published and Solr started. See the Atlas BI Library docs.

Dependencies

Java

For development purposes, Solr search can be started directly from the Atlas source code.

  1. Install Java JRE
  2. Add a system environment variable called JAVA_HOME with the path to Java, for example C:\Program Files\Java\jdk-17.0.1.
  3. In your terminal, navigate to /web/solr/ in the Atlas source code. Run ./bin/solr start to start Solr.
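
Once Solr has been started, it can help to confirm that it is answering before running any ETL script. The sketch below is one possible check, not something shipped in this repository. It assumes the default port 8983 and the core name atlas used in the example .env values in this README, and it assumes the core exposes Solr's standard /admin/ping handler. The function names ping_url and solr_is_up are hypothetical.

```python
import urllib.request


def ping_url(solr_url):
    """Build the ping URL for a core URL such as http://localhost:8983/solr/atlas."""
    # Assumption: the core exposes the standard /admin/ping request handler.
    return solr_url.rstrip("/") + "/admin/ping?wt=json"


def solr_is_up(solr_url, timeout=5):
    """Return True if the core answers its ping handler with HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url(solr_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
```

Calling solr_is_up("http://localhost:8983/solr/atlas") returns False rather than raising when nothing is listening, so it can be used as a simple guard at the top of a wrapper script.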

Python

This ETL uses Python > 3.8. Python can be installed from https://www.python.org/downloads/

On Windows, C++ build tools are also needed.

ODBC Driver for SQL Server is required for connecting to the database.

Install Packages

This ETL uses Poetry as the package manager. Alternatively, you can use pip to install the dependencies listed under dependencies in pyproject.toml.

poetry install

Create a .env file

Variables can either be set in the environment, or added to a .env file.

SOLRURL=http://localhost:8983/solr/atlas
SOLRLOOKUPURL=http://localhost:8983/solr/atlas_lookups
ATLASDATABASE="DRIVER={ODBC Driver 18 for SQL Server};SERVER=server_name;DATABASE=atlas;UID=user_name;PWD=password;TrustServerCertificate=Yes;"

# Optional, for the BookStack ETL
BOOKSTACKURL=https://docs.example.com
BOOKSTACKTOKENID=123456
BOOKSTACKTOKENSECRET=78910111213
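
To illustrate the "environment or .env file" behaviour described above, here is a minimal standard-library sketch of how such settings could be resolved. It is not the loader used by the scripts in this repository. The names parse_env_text and load_settings are hypothetical, and two choices are assumptions: real environment variables take precedence over the file, and the three Solr and database variables are treated as required while the BookStack ones are optional.

```python
import os
from pathlib import Path

# Assumption: these three are required; the BookStack variables are optional.
REQUIRED = ("SOLRURL", "SOLRLOOKUPURL", "ATLASDATABASE")
OPTIONAL = ("BOOKSTACKURL", "BOOKSTACKTOKENID", "BOOKSTACKTOKENSECRET")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only: connection strings contain more of them.
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_settings(env_path=".env", environ=None):
    """Merge .env values with the environment and check required names."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    settings = parse_env_text(path.read_text(encoding="utf-8")) if path.is_file() else {}
    # Assumption: real environment variables win over values from the file.
    for name in REQUIRED + OPTIONAL:
        if name in environ:
            settings[name] = environ[name]
    missing = [name for name in REQUIRED if not settings.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return settings
```

Splitting on the first "=" matters for ATLASDATABASE, whose value is itself a list of KEY=VALUE pairs separated by semicolons.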

Running

The delete.py script should be run once daily to empty Solr.

The remaining atlas_*.py scripts can be run periodically throughout the day to keep search results current.

poetry run python delete.py
poetry run python atlas_collections.py
poetry run python atlas_initiatives.py
poetry run python atlas_groups.py
poetry run python atlas_terms.py
poetry run python atlas_lookups.py
poetry run python atlas_users.py
poetry run python atlas_reports.py

# Optional ETL to load documents from BookStack. Use this as an example ETL for loading external content into search!
poetry run python atlas_bookstack.py
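
The commands above can also be driven from a small wrapper, which is convenient when a scheduler should trigger a single entry point. The following is a sketch under stated assumptions, not a script from this repository: build_commands and run_etl are hypothetical names, the script order simply mirrors the list above, and stopping at the first failing script is a design choice made here for illustration.

```python
import subprocess

# Script names and order taken from the command list in this README.
ATLAS_SCRIPTS = [
    "atlas_collections.py",
    "atlas_initiatives.py",
    "atlas_groups.py",
    "atlas_terms.py",
    "atlas_lookups.py",
    "atlas_users.py",
    "atlas_reports.py",
]


def build_commands(full_reload=False, include_bookstack=False):
    """Return the poetry commands for one ETL pass, in order."""
    scripts = list(ATLAS_SCRIPTS)
    if full_reload:
        # The once-daily pass empties Solr before reloading everything.
        scripts.insert(0, "delete.py")
    if include_bookstack:
        scripts.append("atlas_bookstack.py")
    return [["poetry", "run", "python", script] for script in scripts]


def run_etl(full_reload=False, include_bookstack=False):
    """Run each command in turn, stopping at the first failure."""
    for command in build_commands(full_reload, include_bookstack):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

A daily job would call run_etl(full_reload=True), while the periodic job would call run_etl() so that Solr is refreshed without being emptied first.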

🎁 Contributing

This repository uses pre-commit and commitizen. Please commit using npm run commit && git push.