CrateDB connector #3

Open
wants to merge 3 commits into
base: cratedb
57 changes: 57 additions & 0 deletions docs/integrations/databases/CrateDB.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
---
title: "CrateDB"
sidebarTitle: "CrateDB"
---

## Credentials

Open the `io_config.yaml` file at the root of your Mage project and enter the required CrateDB fields:

```yaml
version: 0.1.1
default:
  CRATEDB_COLLECTION: collection_name
  CRATEDB_PATH: path of the cratedb persistent storage
```
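
Note that the two keys in this docs snippet do not appear among the `CRATEDB_*` keys that the same PR adds to the `io_config.yaml` template and to `ConfigKey`. As a minimal sketch, assuming the template keys are the ones the connector actually reads, a profile could be checked as follows. The helper name `missing_cratedb_keys` and the choice of which keys count as required are assumptions made for illustration, not part of the PR.

```python
# Key names copied from the CrateDB section this PR adds to the io_config.yaml template.
# Treating these five as required is an assumption made for this sketch.
REQUIRED_KEYS = (
    'CRATEDB_DBNAME',
    'CRATEDB_USER',
    'CRATEDB_PASSWORD',
    'CRATEDB_HOST',
    'CRATEDB_PORT',
)


def missing_cratedb_keys(profile):
    """Return the required keys that are absent or empty in a profile dict."""
    return [key for key in REQUIRED_KEYS if profile.get(key) in (None, '')]


# The profile as documented in the snippet above, loaded into a plain dict.
documented_profile = {
    'CRATEDB_COLLECTION': 'collection_name',
    'CRATEDB_PATH': 'path of the cratedb persistent storage',
}

# Lists every key the template expects but the documented profile lacks.
print(missing_cratedb_keys(documented_profile))
```

If the template keys are indeed the ones in use, the documented profile would be reported as missing all five of them.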

## Dependencies

The dependencies are not installed in the Docker image by default. Add them to your
project's `requirements.txt` file and install them manually.

```
cratedb-client==1.6.9
sentence-transformers==2.2.2
```
Comment on lines +22 to +25
Maybe those are not needed at all?

Suggested change
```
cratedb-client==1.6.9
sentence-transformers==2.2.2
```


## Using Python block

1. Create a new pipeline or open an existing pipeline.
2. Add a data loader, transformer, or data exporter block (the code snippet
below is for a data loader).
3. Select `Generic (no template)`.
4. Enter this code snippet (note: change the `config_profile` from `default` if
you have a different profile):

```python
from mage_ai.settings.repo import get_repo_path
from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.cratedb import CrateDB
from os import path
from pandas import DataFrame

if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_data_from_cratedb(**kwargs) -> DataFrame:
    query = 'SELECT 1'
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    with CrateDB.with_config(ConfigFileLoader(config_path, config_profile)) as loader:
        return loader.load(query)
```

5. Run the block.
1 change: 1 addition & 0 deletions docs/mint.json
@@ -359,6 +359,7 @@
"integrations/databases/BigQuery",
"integrations/databases/ClickHouse",
"integrations/databases/Chroma",
"integrations/databases/CrateDB",
"integrations/databases/Druid",
"integrations/databases/DuckDB",
"integrations/databases/GoogleSheets",
14 changes: 14 additions & 0 deletions mage_ai/data_preparation/templates/constants.py
@@ -206,6 +206,13 @@
        name='Chroma',
        path='data_loaders/chroma.py',
    ),
    dict(
        block_type=BlockType.DATA_LOADER,
        groups=[GROUP_DATABASES],
        language=BlockLanguage.PYTHON,
        name='CrateDB',
        path='data_loaders/cratedb.py',
    ),
Comment on lines +209 to +215
Ah wow. The framework allows database adapters to be written in other languages than Python?

    dict(
        block_type=BlockType.DATA_LOADER,
        groups=[GROUP_DATABASES],
@@ -594,6 +601,13 @@
        name='Chroma',
        path='data_exporters/chroma.py',
    ),
    dict(
        block_type=BlockType.DATA_EXPORTER,
        groups=[GROUP_DATABASES],
        language=BlockLanguage.PYTHON,
        name='CrateDB',
        path='data_exporters/cratedb.py',
    ),
    dict(
        block_type=BlockType.DATA_EXPORTER,
        groups=[GROUP_DATABASES],
31 changes: 31 additions & 0 deletions mage_ai/data_preparation/templates/data_exporters/cratedb.py
@@ -0,0 +1,31 @@
from mage_ai.settings.repo import get_repo_path
from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.cratedb import CrateDB
from pandas import DataFrame
from os import path

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data_to_cratedb(df: DataFrame, **kwargs) -> None:
    """
    Template for exporting data to a PostgreSQL database.
    Specify your configuration settings in 'io_config.yaml'.

    Docs: https://docs.mage.ai/design/data-loading#postgresql
    """
Comment on lines +13 to +18
@amotl amotl Feb 8, 2024

Docstrings also need a few more adjustments, but sure all that can happen after merging to cratedb, just any time before submitting to upstream. It will probably take another while. I can also support on such details.

    schema_name = 'your_schema_name'  # Specify the name of the schema to export data to
    table_name = 'your_table_name'  # Specify the name of the table to export data to
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    with CrateDB.with_config(ConfigFileLoader(config_path, config_profile)) as loader:
        loader.export(
            df,
            schema_name,
            table_name,
            index=False,  # Specifies whether to include index in exported table
            if_exists='replace',  # Specify resolution policy if table name already exists
        )
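
The `if_exists` argument in the exporter template decides what happens when the target table already exists. As a rough illustration only, assuming the accepted values mirror the `'fail'`, `'replace'` and `'append'` options of pandas' `DataFrame.to_sql`, the three policies behave like this on an in-memory stand-in for a database. The `export_rows` helper and the dict-based `tables` store are hypothetical and do not reflect how the connector is implemented.

```python
POLICIES = ('fail', 'replace', 'append')


def export_rows(tables, table_name, rows, if_exists='replace'):
    """Write rows into an in-memory 'database' (a dict of lists) under a given policy."""
    if if_exists not in POLICIES:
        raise ValueError(f'Unknown if_exists policy: {if_exists!r}')

    exists = table_name in tables
    if exists and if_exists == 'fail':
        # Refuse to touch a table that is already there.
        raise ValueError(f"Table '{table_name}' already exists")
    if exists and if_exists == 'append':
        # Keep the existing rows and add the new ones at the end.
        tables[table_name].extend(rows)
    else:
        # New table, or 'replace': the table holds exactly the new rows.
        tables[table_name] = list(rows)
    return tables


tables = {}
export_rows(tables, 'events', [{'id': 1}])
export_rows(tables, 'events', [{'id': 2}], if_exists='append')
rows_after_append = list(tables['events'])
export_rows(tables, 'events', [{'id': 3}], if_exists='replace')
rows_after_replace = list(tables['events'])
```

With `'replace'`, as used in the template, a rerun of the block would leave only the rows from the latest run in the table.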
26 changes: 26 additions & 0 deletions mage_ai/data_preparation/templates/data_loaders/cratedb.py
@@ -0,0 +1,26 @@
{% extends "data_loaders/default.jinja" %}
{% block imports %}
from mage_ai.settings.repo import get_repo_path
from mage_ai.io.config import ConfigFileLoader
from mage_ai.io.cratedb import CrateDB
from os import path
{{ super() -}}
{% endblock %}


{% block content %}
@data_loader
def load_data_from_cratedb(*args, **kwargs):
    """
    Template for loading data from a PostgreSQL database.
    Specify your configuration settings in 'io_config.yaml'.

    Docs: https://docs.mage.ai/design/data-loading#postgresql
    """
    query = 'your CrateDB query'  # Specify your SQL query here
    config_path = path.join(get_repo_path(), 'io_config.yaml')
    config_profile = 'default'

    with CrateDB.with_config(ConfigFileLoader(config_path, config_profile)) as loader:
        return loader.load(query)
{% endblock %}
8 changes: 8 additions & 0 deletions mage_ai/data_preparation/templates/repo/io_config.yaml
@@ -26,6 +26,14 @@ default:
  CLICKHOUSE_PASSWORD: null
  CLICKHOUSE_PORT: 8123
  CLICKHOUSE_USERNAME: null
  # CrateDB
  CRATEDB_CONNECT_TIMEOUT: 10
  CRATEDB_DBNAME: crate
  CRATEDB_SCHEMA: doc # Optional
  CRATEDB_USER: username
  CRATEDB_PASSWORD: password
  CRATEDB_HOST: hostname
  CRATEDB_PORT: 5432
  # Druid
  DRUID_HOST: hostname
  DRUID_PASSWORD: password
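
Port 5432 in the CrateDB section is the PostgreSQL wire protocol port, which CrateDB also speaks. As a sketch of how these settings fit together, the snippet below assembles a PostgreSQL-style connection URI from a profile dict. The `build_connection_uri` helper is hypothetical, and whether the `CrateDB` class in this PR builds its connection this way is an assumption.

```python
from urllib.parse import quote, urlencode


def build_connection_uri(profile):
    """Assemble a PostgreSQL-style URI from CRATEDB_* settings (illustrative only)."""
    # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
    user = quote(str(profile['CRATEDB_USER']), safe='')
    password = quote(str(profile['CRATEDB_PASSWORD']), safe='')
    host = profile['CRATEDB_HOST']
    port = profile.get('CRATEDB_PORT', 5432)
    dbname = quote(str(profile.get('CRATEDB_DBNAME', 'crate')), safe='')

    uri = f'postgresql://{user}:{password}@{host}:{port}/{dbname}'

    # Only append the timeout when the profile sets one.
    params = {}
    if profile.get('CRATEDB_CONNECT_TIMEOUT') is not None:
        params['connect_timeout'] = profile['CRATEDB_CONNECT_TIMEOUT']
    return f'{uri}?{urlencode(params)}' if params else uri


# Placeholder values copied from the template above.
template_profile = {
    'CRATEDB_CONNECT_TIMEOUT': 10,
    'CRATEDB_DBNAME': 'crate',
    'CRATEDB_SCHEMA': 'doc',
    'CRATEDB_USER': 'username',
    'CRATEDB_PASSWORD': 'password',
    'CRATEDB_HOST': 'hostname',
    'CRATEDB_PORT': 5432,
}
print(build_connection_uri(template_profile))
```

`CRATEDB_SCHEMA` is left out of the URI on purpose, since a schema is not part of a PostgreSQL connection URI; how the connector applies it is not shown in this diff.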
1 change: 1 addition & 0 deletions mage_ai/io/base.py
@@ -19,6 +19,7 @@ class DataSource(str, Enum):
    BIGQUERY = 'bigquery'
    CHROMA = 'chroma'
    CLICKHOUSE = 'clickhouse'
    CRATEDB = 'cratedb'
    DRUID = 'druid'
    DUCKDB = 'duckdb'
    FILE = 'file'
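
Because `DataSource` mixes in `str`, the new member compares equal to the plain string `'cratedb'` and can be looked up by value. A trimmed-down copy of the enum, reproducing only three of its members, shows the behaviour:

```python
from enum import Enum


class DataSource(str, Enum):
    # Only a few members are reproduced here; the real enum has many more.
    CHROMA = 'chroma'
    CLICKHOUSE = 'clickhouse'
    CRATEDB = 'cratedb'


# Lookup by value returns the enum member itself.
source = DataSource('cratedb')

# The str mixin makes the member compare equal to the raw string.
is_member = source is DataSource.CRATEDB
equals_string = DataSource.CRATEDB == 'cratedb'
```

Both `is_member` and `equals_string` evaluate to `True`, which is why a value read from a config file or request payload can be compared against the enum directly.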
21 changes: 21 additions & 0 deletions mage_ai/io/config.py
@@ -41,6 +41,20 @@ class ConfigKey(str, Enum):
    CLICKHOUSE_PORT = 'CLICKHOUSE_PORT'
    CLICKHOUSE_USERNAME = 'CLICKHOUSE_USERNAME'

    CRATEDB_CONNECTION_METHOD = 'CRATEDB_CONNECTION_METHOD'
    CRATEDB_CONNECT_TIMEOUT = 'CRATEDB_CONNECT_TIMEOUT'
    CRATEDB_DBNAME = 'CRATEDB_DBNAME'
    CRATEDB_HOST = 'CRATEDB_HOST'
    CRATEDB_PASSWORD = 'CRATEDB_PASSWORD'
    CRATEDB_PORT = 'CRATEDB_PORT'
    CRATEDB_SCHEMA = 'CRATEDB_SCHEMA'
    CRATEDB_SSH_HOST = 'CRATEDB_SSH_HOST'
    CRATEDB_SSH_PASSWORD = 'CRATEDB_SSH_PASSWORD'
    CRATEDB_SSH_PKEY = 'CRATEDB_SSH_PKEY'
    CRATEDB_SSH_PORT = 'CRATEDB_SSH_PORT'
    CRATEDB_SSH_USERNAME = 'CRATEDB_SSH_USERNAME'
    CRATEDB_USER = 'CRATEDB_USER'

    DRUID_HOST = 'DRUID_HOST'
    DRUID_PASSWORD = 'DRUID_PASSWORD'
    DRUID_PATH = 'DRUID_PATH'
@@ -343,6 +357,7 @@ class VerboseConfigKey(str, Enum):
    BIGQUERY = 'BigQuery'
    CHROMA = 'Chroma'
    CLICKHOUSE = 'ClickHouse'
    CRATEDB = "CrateDB"
    DRUID = 'Druid'
    DUCKDB = 'Duck DB'
    PINOT = 'Pinot'
@@ -413,6 +428,12 @@ class ConfigFileLoader(BaseConfigLoader):
            VerboseConfigKey.CLICKHOUSE, 'port'),
        ConfigKey.CLICKHOUSE_USERNAME: (
            VerboseConfigKey.CLICKHOUSE, 'username'),
        ConfigKey.CRATEDB_DBNAME: (VerboseConfigKey.CRATEDB, 'database'),
        ConfigKey.CRATEDB_HOST: (VerboseConfigKey.CRATEDB, 'host'),
        ConfigKey.CRATEDB_PASSWORD: (VerboseConfigKey.CRATEDB, 'password'),
        ConfigKey.CRATEDB_PORT: (VerboseConfigKey.CRATEDB, 'port'),
        ConfigKey.CRATEDB_SCHEMA: (VerboseConfigKey.CRATEDB, 'schema'),
        ConfigKey.CRATEDB_USER: (VerboseConfigKey.CRATEDB, 'user'),
        ConfigKey.DRUID_HOST: (VerboseConfigKey.DRUID, 'host'),
        ConfigKey.DRUID_PASSWORD: (VerboseConfigKey.DRUID, 'password'),
        ConfigKey.DRUID_PATH: (VerboseConfigKey.DRUID, 'path'),
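
Each entry added in this hunk pairs a flat `CRATEDB_*` key with a section name and a field name. As a hedged illustration of what such a map makes possible, the sketch below resolves a setting from either a flat profile or a nested one. The `resolve` helper and the nested layout are assumptions for illustration; the real lookup logic lives in `ConfigFileLoader` and is not shown in this diff.

```python
# Plain-string copy of the six CrateDB entries added in the hunk above.
KEY_MAP = {
    'CRATEDB_DBNAME': ('CrateDB', 'database'),
    'CRATEDB_HOST': ('CrateDB', 'host'),
    'CRATEDB_PASSWORD': ('CrateDB', 'password'),
    'CRATEDB_PORT': ('CrateDB', 'port'),
    'CRATEDB_SCHEMA': ('CrateDB', 'schema'),
    'CRATEDB_USER': ('CrateDB', 'user'),
}


def resolve(profile, key):
    """Look a setting up by its flat key first, then by (section, field)."""
    if key in profile:
        return profile[key]
    section, field = KEY_MAP[key]
    return profile.get(section, {}).get(field)


flat_profile = {'CRATEDB_HOST': 'hostname', 'CRATEDB_PORT': 5432}
nested_profile = {'CrateDB': {'host': 'hostname', 'port': 5432}}

flat_host = resolve(flat_profile, 'CRATEDB_HOST')
nested_host = resolve(nested_profile, 'CRATEDB_HOST')
```

The map in this hunk has no entry for `CRATEDB_CONNECT_TIMEOUT`, `CRATEDB_CONNECTION_METHOD` or the `CRATEDB_SSH_*` keys that the same PR adds to `ConfigKey`, so under this sketch those settings could only be read from their flat keys.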