This project demonstrates how to connect to a database service called Pipeline Cloud using Python. It includes two approaches: a vanilla connection using `pyodbc` and a more advanced connection using SQLAlchemy.
To utilise this connection method, you will require a 'headless user' (service account) that is linked to your primary Azure MFA user account. For more details on this process, speak with your Bonterra account manager.
- Docker
- Docker Compose
- Poetry
- ODBC Driver 18 for SQL Server
Poetry is a dependency management tool for Python that simplifies managing project dependencies and virtual environments. Unlike `pip`, Poetry automatically creates and manages a virtual environment for your project, ensuring that dependencies are isolated and consistent across different environments. This makes it easier to maintain and share your project.
- Install Poetry: Follow the official installation guide to install Poetry on your system.
- Install Dependencies: Run `poetry install` to install all dependencies specified in `pyproject.toml`.
- Activate Virtual Environment: Use `poetry shell` to activate the virtual environment.
- Run a Script: Use `poetry run python <script_name>.py` to run a script within the virtual environment.
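Commands like `poetry run precheck` (used later in this README) work because Poetry maps script names to Python functions in `pyproject.toml`. A hypothetical fragment showing that mapping might look like this (the module paths and function names are illustrative, not the repository's actual file):

```toml
# Illustrative [tool.poetry.scripts] section - actual entries may differ
[tool.poetry.scripts]
precheck = "pipelinecloud.precheck:main"
connect-basic = "pipelinecloud.dbconnection_basic:main"
connect-sqlalchemy = "pipelinecloud.dbconnection_sqlalchemy:main"
```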
To connect to the Pipeline Cloud database, you'll need to create a `.env` file in the root directory (the same level as `.env.example`). This file will store the necessary environment variables for your database connection. If you're using a Bonterra-hosted Azure SQL instance, you will receive emails from NoReply_PipelineCloud@everyaction.com containing the server name, database name, and a secret link to fetch your application ID, private/public key, and certificate thumbprint. Store the private key in a file named `pipelinecloud_privatenopass.pem` in the root directory.
| Key | Sample Value | Description |
|---|---|---|
| `DATABASE_SERVER` | `esv30ddbms001.database.windows.net` or `esv30pdbms002.database.windows.net` | The server address of the database. Verify the correct server in your Azure MFA user connection string (e.g., via DBeaver, Azure Data Studio). |
| `DATABASE_NAME` | `Pipeline_*****` | The name of the database. This value is typically provided in an email from NoReply_PipelineCloud@everyaction.com during the provisioning stage of your Pipeline Cloud instance. |
| `TENANT_ID` | `798d7834-694a-41b4-b6cb-e5448f079f6b` | The tenant ID for Azure authentication. This is usually a fixed value; use the sample value provided. |
| `APPLICATION_ID` | | The application ID for Azure authentication. Obtain this from the secret link provided in the email. This value is unique to your EveryAction Pipeline Cloud instance. |
| `CLIENT_CERT_PATH` | `pipelinecloud_privatenopass.pem` | Path to the client certificate PEM file. Obtain your private key from the secret link provided by the EveryAction team. This value is unique to your EveryAction Azure MFA account. |
| `CLIENT_CERT_THUMBPRINT` | | Thumbprint of the client certificate. This is provided in the secret link email and is used to verify the certificate. This value is unique to your EveryAction Azure MFA user account. |
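Putting the variables together, a populated `.env` file might look like this (the application ID and thumbprint are placeholders you must replace with your own values):

```env
DATABASE_SERVER=esv30ddbms001.database.windows.net
DATABASE_NAME=Pipeline_*****
TENANT_ID=798d7834-694a-41b4-b6cb-e5448f079f6b
APPLICATION_ID=<your-application-id>
CLIENT_CERT_PATH=pipelinecloud_privatenopass.pem
CLIENT_CERT_THUMBPRINT=<your-certificate-thumbprint>
```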
- Private Key File: Ensure that `pipelinecloud_privatenopass.pem` is stored securely in the root directory and not within the `/pipelinecloud/` app directory.
- Environment File Location: The `.env` file should be located in the same directory as `.env.example` and the PEM file.
```
.
├── .env.example
├── pipelinecloud_privatenopass.pem
├── pyproject.toml
├── README.md
└── pipelinecloud
    ├── dbconnection_basic.py
    └── dbconnection_sqlalchemy.py
```
- `.env.example`: Template for environment variables.
- `pipelinecloud_privatenopass.pem`: Your private key file (store in the root directory).
- `pipelinecloud/`: Contains the Python scripts for database connection.
- Clone the Repository:

  ```shell
  git clone <repository-url>
  cd <repository-directory>
  ```

- Set Up Environment Variables:
  - Copy `.env.example` to `.env` and fill in your credentials.
  - Ensure `pipelinecloud_privatenopass.pem` is in the root directory.
- Install Dependencies:

  ```shell
  poetry install
  ```
- Run the Scripts Using Poetry:
  - Precheck: Verify that your environment variables and PEM file are set up correctly.

    ```shell
    poetry run precheck
    ```

  - Basic Database Connection: Run the basic connection script using `pyodbc`.

    ```shell
    poetry run connect-basic
    ```

  - SQLAlchemy Database Connection: Run the advanced connection script using SQLAlchemy.

    ```shell
    poetry run connect-sqlalchemy
    ```

  - Run All: Execute the precheck and both connection methods sequentially.

    ```shell
    poetry run full-check
    ```
- Run the Scripts Directly (Without Poetry):
  - Precheck: `python pipelinecloud/precheck.py`
  - Basic Database Connection: `python pipelinecloud/dbconnection_basic.py`
  - SQLAlchemy Database Connection: `python pipelinecloud/dbconnection_sqlalchemy.py`
Note: When running scripts directly, ensure that your Python environment is set up correctly and all dependencies are installed. You may need to activate the virtual environment created by Poetry using `poetry shell`.
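As an illustration of the kind of validation a precheck performs, here is a minimal sketch. This is hypothetical code, not the repository's actual `precheck.py`; the variable names match the `.env` keys documented above:

```python
from pathlib import Path

# The environment variables this project's .env is expected to define
REQUIRED_VARS = [
    "DATABASE_SERVER", "DATABASE_NAME", "TENANT_ID",
    "APPLICATION_ID", "CLIENT_CERT_PATH", "CLIENT_CERT_THUMBPRINT",
]

def precheck(env: dict) -> list:
    """Return a list of problems; an empty list means the setup looks complete."""
    problems = [f"missing variable: {v}" for v in REQUIRED_VARS if not env.get(v)]
    cert = env.get("CLIENT_CERT_PATH")
    if cert and not Path(cert).is_file():
        problems.append(f"missing PEM file: {cert}")
    return problems
```

In practice you would call `precheck(dict(os.environ))` after loading the `.env` file and print any problems it reports.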
This project uses SQLAlchemy in conjunction with `pyodbc` to connect to an Azure SQL Database using Azure Active Directory (AD) tokens. This approach leverages the strengths of both libraries to provide a secure and efficient way to interact with the database.
- Authentication: `pyodbc` is used to handle the authentication process with Azure AD tokens. This is necessary because the ODBC driver for SQL Server supports Azure AD authentication but requires specific handling of the access token.
- Database Interactions: SQLAlchemy provides a high-level ORM (Object-Relational Mapping) interface, allowing you to interact with the database using Python objects and methods. This makes it easier to manage database operations and transactions.
- Token Preparation: The Azure AD access token is prepared using Python's `struct` module. This involves converting the token into a format that `pyodbc` can use to authenticate with the SQL Server.
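The token-packing step can be sketched as follows. The function name is illustrative; the length-prefixed UTF-16-LE layout is the format the SQL Server ODBC driver expects for access tokens:

```python
import struct

def build_token_struct(access_token: str) -> bytes:
    # The ODBC driver expects the raw token as UTF-16-LE bytes,
    # prefixed with its byte length as a 4-byte little-endian integer.
    token_bytes = access_token.encode("utf-16-le")
    return struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)
```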
- Connection Setup: A connection string is constructed without the `Authentication` attribute. Instead, the access token is passed directly to `pyodbc` using the `attrs_before` parameter, which bypasses the need for the `Authentication` attribute in the connection string.
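A minimal sketch of how the connection arguments might be assembled (the helper name and the `Encrypt` option are illustrative; `1256` is the documented `SQL_COPT_SS_ACCESS_TOKEN` pre-connect attribute ID used for Azure AD access tokens):

```python
# Pre-connect attribute ID under which the packed access token is passed
SQL_COPT_SS_ACCESS_TOKEN = 1256

def build_connect_args(server: str, database: str, token_struct: bytes):
    # Note: no Authentication attribute - the token in attrs_before replaces it
    conn_str = (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server={server};Database={database};Encrypt=yes;"
    )
    return conn_str, {SQL_COPT_SS_ACCESS_TOKEN: token_struct}
```

You would then connect with `pyodbc.connect(conn_str, attrs_before=attrs)`, where `attrs` is the returned mapping.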
- SQLAlchemy Engine Creation: The SQLAlchemy engine is created using the connection established by `pyodbc`. This engine is then used to perform database operations, such as executing SQL queries and managing transactions.
- Reflection and Raw SQL: SQLAlchemy's reflection feature is used to dynamically load table definitions from the Azure SQL database, allowing you to interact with tables without manually defining models. Raw SQL queries can be executed using SQLAlchemy's `text()` construct, providing flexibility for complex queries.
- Security: By using Azure AD tokens, this implementation ensures secure authentication without the need for storing passwords.
- Flexibility: The combination of `pyodbc` and SQLAlchemy allows for flexible and powerful database interactions, leveraging the best features of both libraries.
- Compatibility: This approach is compatible with Azure SQL Database and can be adapted for use with other Azure services that support AD authentication.
Below are examples of how to query the database using different methods:
```python
from sqlalchemy import MetaData, Table
from sqlalchemy.orm import sessionmaker

def reflect_table(engine, table_name):
    # Load the table definition from the live database schema
    metadata = MetaData()
    return Table(table_name, metadata, autoload_with=engine)

def query_table(engine, table_name):
    Session = sessionmaker(bind=engine)
    session = Session()
    table = reflect_table(engine, table_name)
    query = session.query(table).limit(10)
    results = query.all()
    for row in results:
        print(row)
    session.close()
```
```python
from sqlalchemy import text
from sqlalchemy.orm import sessionmaker

def execute_raw_query(engine):
    Session = sessionmaker(bind=engine)
    session = Session()
    # List all base tables in the database
    raw_query = "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE';"
    result = session.execute(text(raw_query))
    tables = result.fetchall()
    for table in tables:
        print(table[0])
    session.close()
```
This setup ensures that your application can securely connect to Azure SQL Database using modern authentication methods, while still benefiting from the powerful features of SQLAlchemy. The hybrid approach with reflection and raw SQL execution provides flexibility and ease of use, especially when dealing with a large number of tables or complex queries.
Docker is a platform that allows you to package applications and their dependencies into a container, ensuring consistency across different environments. This setup helps verify a successful connection without being affected by your local environment. You can install Docker from the official Docker website.
- Build and Run the Docker Container:
  - First, ensure Docker is installed and running on your system.
  - Build and run the Docker container using Docker Compose:

    ```shell
    docker-compose up --build
    ```

    This command will build the Docker image and start the container, running the application with the default script.
- Switch Scripts: By default, the application runs `dbconnection_basic.py`. To run `dbconnection_sqlalchemy.py`, modify the `CMD` in the `Dockerfile`:

  ```dockerfile
  # Change the CMD line to run the SQLAlchemy script
  CMD ["poetry", "run", "connect-sqlalchemy"]
  ```
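For orientation, the overall shape of such a Dockerfile might be (this is an illustrative sketch, not the repository's actual `Dockerfile`; the base image and install steps are assumptions):

```dockerfile
# Illustrative sketch only - the repository's actual Dockerfile may differ
FROM python:3.11-slim
# ... install ODBC Driver 18 for SQL Server and Poetry here ...
WORKDIR /app
COPY . .
RUN poetry install
CMD ["poetry", "run", "connect-basic"]
```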
- Naming the Docker Container: Note that `docker-compose up` does not accept a `--name` flag (that flag belongs to `docker run`). To give the container a fixed name, set the `container_name` key for the service in `docker-compose.yml`:

  ```yaml
  services:
    app:  # replace with your actual service name
      container_name: pipelinecloud-app
  ```
- Mac: Install via Homebrew:

  ```shell
  brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
  brew update
  ACCEPT_EULA=Y brew install msodbcsql18
  ```

- Windows: Download and install from the Microsoft ODBC Driver for SQL Server page.
- Docker: The ODBC driver is included in the Docker image setup.
Both scripts include a placeholder function for querying the database. Replace the example query with your specific SQL or SQLAlchemy query to fetch data and log it to the console.
- Ensure all required environment variables are set in your `.env` file.
- Activate the Poetry environment using `poetry shell` and ensure dependencies are installed with `poetry install`.
- Ensure ODBC Driver 18 is installed on your system. Refer to the installation instructions above.
- Check your network settings and ensure the database server is accessible from your environment.
This repository provides a skeleton for connecting to the Pipeline Cloud service from EveryAction using Python. It is designed to be simple to run locally with your credentials to verify a connection can be made.
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. For more details, see https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.