Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an alembic migration to sync with production indexes #238

Closed
jwhitlock opened this issue Jun 1, 2021 · 2 comments · Fixed by #979
Closed

Add an alembic migration to sync with production indexes #238

jwhitlock opened this issue Jun 1, 2021 · 2 comments · Fixed by #979
Assignees

Comments

@jwhitlock
Copy link
Contributor

jwhitlock commented Jun 1, 2021

There were indexes manually added to the staging and production deployments that also need to be added to the models and via an alembic migration. The migration should also remove any temporary indexes.

For example, this index was added:

CREATE UNIQUE INDEX idx_email_primary_unique_email_lower ON public.emails USING btree (lower((primary_email)::text));

There should be liberal use of IF EXISTS clauses, to avoid errors if an index already exists.

┆Issue is synchronized with this Jira Task

@leplatrem leplatrem self-assigned this Sep 30, 2024
@leplatrem
Copy link
Contributor

Other findings:

PL/SQL functions:

  • gdpr_lookup(email_to_find text)
  • gdpr_lookup2(email_to_find text)
  • gdpr_delete(email_to_find text)
  • gdpr_delete2(email_to_find text)

And CREATE UNIQUE INDEX emails_basket_token_key ON public.emails USING btree (basket_token); which is outdated now.

@leplatrem
Copy link
Contributor

For reference, this is the source of the stored procedures that are about to be deleted from stage/prod DBs:

--
-- Name: gdpr_delete(text); Type: FUNCTION; Schema: public; Owner: ctms
--

CREATE FUNCTION public.gdpr_delete(email_to_find text) RETURNS boolean
    LANGUAGE plpgsql
    AS $_$
  DECLARE
    email_ids record;
    table_name text;
    --TODO: make these table names an external_table? Like `gdpr_relevant_tables` or something?
    table_names text[] := ARRAY['amo', 'fxa', 'mofo', 'newsletters', 'vpn_waitlist', 'emails'];
    results record;
    result boolean;
  BEGIN
    result := false;
    FOR email_ids IN
      SELECT email_id,'emails' AS lookup_table_name FROM emails WHERE LOWER(primary_email)=LOWER(email_to_find)
      UNION ALL
      SELECT email_id,'fxa' AS lookup_table_name FROM fxa WHERE LOWER(primary_email)=LOWER(email_to_find)
    LOOP
      CONTINUE WHEN email_ids.email_id IS NULL;
      RAISE NOTICE 'Email % (%) found in primary_email field of table %', email_to_find, email_ids.email_id, email_ids.lookup_table_name;
      result := true;
      FOREACH table_name IN ARRAY table_names
      LOOP
        RAISE NOTICE 'Deleting % from %', email_to_find, table_name;
        EXECUTE format('DELETE FROM %I WHERE email_id=$1 RETURNING *', table_name)
          INTO results
          USING email_ids.email_id;
        RAISE NOTICE '%', results;
      END LOOP;
    END LOOP;
    RETURN result;
  END;
$_$;


ALTER FUNCTION public.gdpr_delete(email_to_find text) OWNER TO ctms;

--
-- Name: gdpr_delete2(text); Type: FUNCTION; Schema: public; Owner: ctms
--

CREATE FUNCTION public.gdpr_delete2(email_to_find text) RETURNS text
    LANGUAGE plpgsql
    AS $_$
  DECLARE
    email_ids record;
    table_name text;
    --TODO: make these table names an external_table? Like `gdpr_relevant_tables` or something?
    table_names text[] := ARRAY['amo', 'fxa', 'mofo', 'newsletters', 'vpn_waitlist', 'emails'];
    results record;
    result text;
    result_bool boolean;
  BEGIN
    result_bool := false;
    result      := format('DELETING %s', email_to_find);
    FOR email_ids IN
      SELECT email_id,'emails' AS lookup_table_name FROM emails WHERE LOWER(primary_email)=LOWER(email_to_find)
      UNION ALL
      SELECT email_id,'fxa' AS lookup_table_name FROM fxa WHERE LOWER(primary_email)=LOWER(email_to_find)
    LOOP
      CONTINUE WHEN email_ids.email_id IS NULL;
      RAISE NOTICE 'Email % (%) found in primary_email field of table %', email_to_find, email_ids.email_id, email_ids.lookup_table_name;
      result_bool := true;
      result := format('%s (ctms id: %s).', result, email_ids.email_id);
      FOREACH table_name IN ARRAY table_names
      LOOP
        RAISE NOTICE 'Deleting % from %', email_to_find, table_name;
        EXECUTE format('DELETE FROM %I WHERE email_id=$1 RETURNING *', table_name)
          INTO results
          USING email_ids.email_id;
        IF NOT results IS NULL THEN
          RAISE NOTICE '%', results;
          IF table_name = 'mofo' OR table_name = 'fxa' THEN
            result := format('%s %s: YES.', result, table_name);
          END IF;
        ELSE
          RAISE NOTICE '%', results;
        END IF;
      END LOOP;
    END LOOP;
    IF NOT result_bool THEN
        RETURN format('%s not found in CTMS', email_to_find);
    ELSE
        RETURN result;
    END IF;
  END;
$_$;


ALTER FUNCTION public.gdpr_delete2(email_to_find text) OWNER TO ctms;

--
-- Name: gdpr_lookup(text); Type: FUNCTION; Schema: public; Owner: ctms
--

CREATE FUNCTION public.gdpr_lookup(email_to_find text) RETURNS boolean
    LANGUAGE plpgsql
    AS $_$
  DECLARE
    email_ids record;
    table_name text;
    --TODO: make these table names an external_table? Like `gdpr_relevant_tables` or something?
    table_names text[] := ARRAY['amo', 'fxa', 'mofo', 'newsletters', 'vpn_waitlist', 'emails'];
    results record;
    result boolean;
  BEGIN
    result := false;
    FOR email_ids IN
      SELECT email_id,'emails' AS lookup_table_name FROM emails WHERE LOWER(primary_email)=LOWER(email_to_find)
      UNION ALL
      SELECT email_id,'fxa' AS lookup_table_name FROM fxa WHERE LOWER(primary_email)=LOWER(email_to_find)
    LOOP
      CONTINUE WHEN email_ids.email_id IS NULL;
      RAISE NOTICE 'Email % (%) found in primary_email field of table %', email_to_find, email_ids.email_id, email_ids.lookup_table_name;
      result := true;
      FOREACH table_name IN ARRAY table_names
      LOOP
        RAISE NOTICE 'Searching % for %', table_name, email_ids.email_id;
        EXECUTE format('SELECT * FROM %I WHERE email_id=$1', table_name)
          INTO results
          USING email_ids.email_id;
        RAISE NOTICE '%', results;
      END LOOP;
    END LOOP;
    RETURN result;
  END;
$_$;


ALTER FUNCTION public.gdpr_lookup(email_to_find text) OWNER TO ctms;

--
-- Name: gdpr_lookup2(text); Type: FUNCTION; Schema: public; Owner: ctms
--

CREATE FUNCTION public.gdpr_lookup2(email_to_find text) RETURNS text
    LANGUAGE plpgsql
    AS $_$
  DECLARE
    email_ids record;
    table_name text;
    --TODO: make these table names an external_table? Like `gdpr_relevant_tables` or something?
    table_names text[] := ARRAY['amo', 'fxa', 'mofo', 'newsletters', 'vpn_waitlist', 'emails'];
    results record;
    result text;
    result_bool boolean;
  BEGIN
    result_bool := false;
    result      := email_to_find;
    FOR email_ids IN
      SELECT email_id,'emails' AS lookup_table_name FROM emails WHERE LOWER(primary_email)=LOWER(email_to_find)
      UNION ALL
      SELECT email_id,'fxa' AS lookup_table_name FROM fxa WHERE LOWER(primary_email)=LOWER(email_to_find)
    LOOP
      CONTINUE WHEN email_ids.email_id IS NULL;
      result_bool := true;
      RAISE NOTICE 'Email % (%) found in primary_email field of table %', email_to_find, email_ids.email_id, email_ids.lookup_table_name;
      result := format('%s (ctms id: %s).', result, email_ids.email_id);
      FOREACH table_name IN ARRAY table_names
      LOOP
        RAISE NOTICE 'Searching % for %', table_name, email_ids.email_id;
        EXECUTE format('SELECT * FROM %I WHERE email_id=$1', table_name)
          INTO results
          USING email_ids.email_id;
        IF NOT results IS NULL THEN
          RAISE NOTICE '%', results;
          IF table_name = 'mofo' OR table_name = 'fxa' THEN
            result := format('%s %s: YES.', result, table_name);
          ELSIF table_name = 'newsletters' THEN
            EXECUTE 'SELECT ARRAY_AGG(name) FROM newsletters WHERE email_id=$1 GROUP BY email_id'
              INTO results
              USING email_ids.email_id;
            result := format('%s newsletters: %s.', result, results);
          ELSIF table_name = 'emails' THEN
            EXECUTE 'SELECT has_opted_out_of_email FROM emails WHERE email_id=$1'
              INTO results
              USING email_ids.email_id;
            result := format('%s globally opted-out: %s.', result, results);
          END IF;
        ELSE
          RAISE NOTICE '%', results;
        END IF;
      END LOOP;
    END LOOP;
    IF NOT result_bool THEN
        RETURN format('%s not found in CTMS', email_to_find);
    ELSE
        RETURN result;
    END IF;
  END;
$_$;


ALTER FUNCTION public.gdpr_lookup2(email_to_find text) OWNER TO ctms;

The commands used to delete them:

ctms=> DROP FUNCTION IF EXISTS gdpr_lookup(email_to_find text);
DROP FUNCTION IF EXISTS gdpr_lookup2(email_to_find text);
DROP FUNCTION IF EXISTS gdpr_delete(email_to_find text);
DROP FUNCTION IF EXISTS gdpr_delete2(email_to_find text);
DROP FUNCTION
DROP FUNCTION
DROP FUNCTION
DROP FUNCTION
ctms=>

leplatrem added a commit that referenced this issue Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants