
Remove pugsql #48

Open · wants to merge 10 commits into base: main
9 changes: 0 additions & 9 deletions README.md
@@ -37,12 +37,6 @@ Both the OIT Legacy Data Warehouse and the Experts Data Warehouse are Oracle
 databases. See [experts\_dw on GitHub](https://github.com/UMNLibraries/experts_dw)
 for supported versions of the required Oracle Instant Client library.
 
-#### LDAP
-
-Experts ETL uses LDAP to search for some student researcher information. See the
-[python-ldap build prerequisites](https://www.python-ldap.org/en/python-ldap-3.3.0/installing.html#build-prerequisites)
-for the required system libraries to install in your local environment.
-
 ### pyenv, venv, and poetry
 
 To install and manage Python versions we use
@@ -90,9 +84,6 @@ via environment variables:
   * `EXPERTS_DB_PASS`
   * `EXPERTS_DB_HOSTNAME`
   * `EXPERTS_DB_SERVICE_NAME`
-* UMN LDAP
-  * `UMN_LDAP_DOMAIN`
-  * `UMN_LDAP_PORT`
 
 Some tests are integration tests that connect to these external services, so
 these variables must be set for testing. One option is to set these
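The integration tests mentioned above depend on the remaining `EXPERTS_DB_*` variables. As a hypothetical illustration (the project's actual test wiring is not shown in this diff), a `pytest.mark.skipif` guard is one way to make such tests skip cleanly when the variables are unset:

    import os

    import pytest

    # Hypothetical guard: skip integration tests unless every Oracle
    # connection variable listed in the README is set in the environment.
    requires_oracle = pytest.mark.skipif(
        not all(
            os.environ.get(var)
            for var in (
                'EXPERTS_DB_USER',
                'EXPERTS_DB_PASS',
                'EXPERTS_DB_HOSTNAME',
                'EXPERTS_DB_SERVICE_NAME',
            )
        ),
        reason='Oracle connection variables not set',
    )

    @requires_oracle
    def test_oracle_connection():
        ...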
3 changes: 0 additions & 3 deletions env.dist
@@ -17,6 +17,3 @@ EXPERTS_ETL_FROM_EMAIL_ADDRESS="expert@some.host"
 
 EXPERTS_ETL_TICKET_EMAIL_ADDRESS="some.ticketing.system.address@umn.edu"
 EXPERTS_ETL_ERROR_EMAIL_ADDRESS="error.notification.address@umn.edu"
-
-UMN_LDAP_DOMAIN="umn.ldap.domain"
-UMN_LDAP_PORT="umn_ldap_port_number"
2 changes: 1 addition & 1 deletion experts_etl/demographics.py
@@ -7,7 +7,7 @@ def latest_demographics_for_emplid(session, emplid):
         session.query(PureEligibleDemogChngHst)
         .filter(
             PureEligibleDemogChngHst.emplid == emplid,
-            PureEligibleDemogChngHst.timestamp == subqry
+            PureEligibleDemogChngHst.timestamp == subqry.scalar_subquery()
         )
         .one_or_none()
     )
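For context on this change: SQLAlchemy 1.4 deprecated implicitly coercing a `Query` used in a comparison into a subquery, and SQLAlchemy 2.0 removes that coercion entirely, so the explicit `.scalar_subquery()` call is required. A sketch of the full pattern; the construction of `subqry` is assumed, since the diff shows only the comparison:

    from sqlalchemy import func
    from experts_dw.models import PureEligibleDemogChngHst

    def latest_demographics_for_emplid(session, emplid):
        # Assumed construction: the most recent change-history timestamp
        # for this emplid. The diff shows only the comparison below.
        subqry = (
            session.query(func.max(PureEligibleDemogChngHst.timestamp))
            .filter(PureEligibleDemogChngHst.emplid == emplid)
        )
        return (
            session.query(PureEligibleDemogChngHst)
            .filter(
                PureEligibleDemogChngHst.emplid == emplid,
                # Explicit conversion: a bare `subqry` here warns on
                # SQLAlchemy 1.4 and fails on 2.0.
                PureEligibleDemogChngHst.timestamp == subqry.scalar_subquery(),
            )
            .one_or_none()
        )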
30 changes: 20 additions & 10 deletions experts_etl/oit_to_edw/person.py
@@ -6,6 +6,7 @@
 from sqlalchemy import and_, func, text
 
 from experts_dw import db
+from experts_dw.rawsql import update_pure_sync_person_data, insert_pure_sync_person_data, update_pure_sync_user_data, insert_pure_sync_user_data, update_pure_sync_staff_org_association, insert_pure_sync_staff_org_association, delete_obsolete_primary_jobs
 from experts_dw.models import PureEligiblePersonNew, PureEligiblePersonChngHst, PureEligibleDemogNew, PureEligibleDemogChngHst, Person, PureSyncPersonDataScratch, PureSyncStaffOrgAssociationScratch, PureSyncUserDataScratch
 from experts_dw.sqlapi import sqlapi
 from experts_etl import loggers
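The new `experts_dw.rawsql` module is not part of this diff, but since each imported name is handed straight to `cursor.execute(...)` in the hunk below, they are presumably module-level SQL strings. A purely hypothetical sketch of one entry; the table and column names are illustrative, not the real schema:

    # experts_dw/rawsql.py (hypothetical sketch)
    # Each name is a plain SQL string, which is why person.py can pass it
    # straight to a cx_Oracle cursor with no pugsql layer in between.
    update_pure_sync_person_data = """
    UPDATE pure_sync_person_data dest
    SET (first_name, last_name) = (
        SELECT src.first_name, src.last_name
        FROM pure_sync_person_data_scratch src
        WHERE src.person_id = dest.person_id
    )
    WHERE EXISTS (
        SELECT 1
        FROM pure_sync_person_data_scratch src
        WHERE src.person_id = dest.person_id
    )
    """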
@@ -58,24 +59,33 @@ def run(
         session.commit()
         load_count = 0
 
-    update_targets_from_scratch()
+    # We now use a cx_oracle connection to prepare our target tables
+    with db.cx_oracle_connection() as connection:
+        update_targets_from_scratch(connection)
 
     session.commit()
 
     experts_etl_logger.info('ending: oit -> edw', extra={'pure_sync_job': 'person'})
 
-def update_targets_from_scratch():
-    with sqlapi.transaction():
-        sqlapi.update_pure_sync_person_data()
-        sqlapi.insert_pure_sync_person_data()
+def update_targets_from_scratch(connection):
+    try:
+        cur = connection.cursor()
+
+        cur.execute(update_pure_sync_person_data)
+        cur.execute(insert_pure_sync_person_data)
 
-        sqlapi.update_pure_sync_user_data()
-        sqlapi.insert_pure_sync_user_data()
+        cur.execute(update_pure_sync_user_data)
+        cur.execute(insert_pure_sync_user_data)
 
-        sqlapi.update_pure_sync_staff_org_association()
-        sqlapi.insert_pure_sync_staff_org_association()
+        cur.execute(update_pure_sync_staff_org_association)
+        cur.execute(insert_pure_sync_staff_org_association)
 
-        sqlapi.delete_obsolete_primary_jobs()
+        cur.execute(delete_obsolete_primary_jobs)
+
+        connection.commit()
+    except:
+        connection.rollback()
+        raise

Member commented on the except/raise block above:

Since we're able to roll back this transaction, we can recover from this exception. We don't need to raise. As we do elsewhere in this project, let's log an error instead, something like this:

        except Exception as e:
            connection.rollback()
            formatted_exception = loggers.format_exception(e)
            experts_etl_logger.error(
                f'exception encountered during updating pure_sync_data tables: {formatted_exception}'
            )
Author replied:

I removed the raise and implemented exception error logging as noted above.


 def load_into_scratch(session, person_dict):
     pure_sync_person_data = PureSyncPersonDataScratch(
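Folding the reviewer's suggestion into the diff, the merged `update_targets_from_scratch` would presumably read as follows. This is a sketch assembled from the hunk above plus the suggested handler; only those pieces are confirmed by the PR:

    def update_targets_from_scratch(connection):
        try:
            cur = connection.cursor()

            # Each rawsql name is a SQL string run against the target tables.
            cur.execute(update_pure_sync_person_data)
            cur.execute(insert_pure_sync_person_data)

            cur.execute(update_pure_sync_user_data)
            cur.execute(insert_pure_sync_user_data)

            cur.execute(update_pure_sync_staff_org_association)
            cur.execute(insert_pure_sync_staff_org_association)

            cur.execute(delete_obsolete_primary_jobs)

            # cx_Oracle connections are transactional by default: nothing is
            # visible to other sessions until this explicit commit.
            connection.commit()
        except Exception as e:
            # Recoverable: roll back and log instead of re-raising.
            connection.rollback()
            formatted_exception = loggers.format_exception(e)
            experts_etl_logger.error(
                f'exception encountered during updating pure_sync_data tables: {formatted_exception}'
            )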