Skip to content

Creating your own Rucio Oracle DB

Eric Vaandering edited this page Nov 7, 2019 · 20 revisions

If you want to experiment with your own instance of Rucio Oracle database, you can use general purpose database service provided by CERN IT-DB:

https://cern.service-now.com/service-portal/service-element.do?name=general-purpose-db

Here is a direct link to the instructions how to request the account (need to be approved by DBA):

https://cern.service-now.com/service-portal/article.do?n=KB0000829

You will likely want either DEVDB11 or DEVDB12 database.

Once you get the account and reset the password, you can put them in the corresponding sections of your rucio.cfg ( [database] ) and alembic.ini ([bootstrap]) on your rucio server and then run

/usr/rucio/tools/bootstrap.py to create both root account and schema,

/usr/rucio/tools/reset_database.py to initialize or reset the schema.

To see your current usage:

connect to @DEVDB12 and issue the SQL command:

"select * from user_ts_quotas;"

If you do need more space, contact oracle.support@cern.ch to request an increase of your quota specifying the account, database (DEVDB12) and tablespace concerned with an explanation and estimate of your usage for the next one to three years.

To see Oracle sessions running in your instance

login to https://session-manager.web.cern.ch/ with your account/password and database

ATLAS database monitoring provides similar information in one page : https://atlas-service-dbmonitor.web.cern.ch/atlas-service-dbmonitor/dashboard/show_sessions.php?user=ATLAS_RUCIO_W&db=ADCR

In addition to web pages we also run a client-side sessions monitoring via the following SQL query:

set pagesize 0;
SET LINESIZE 140;
SET TRIMSPOOL ON;
set echo on;
select g.*, sum("count(*)") over ( order by
    "count(*)" desc,username,osuser,status,machine,client_info
   ) "Cumul" from (
   select count(*) "count(*)",
   username,osuser,status,machine,client_info
   from gv$session where username='PUT_HERE_THE_SCHEMA_OWNER_ACCOUNT'
   group by
   username,osuser,status,machine,client_info
   ) g order by "count(*)" desc,username,osuser,status,machine,client_info;
quit;

Note, that access to the system tables and views is required. This check can run no pod inside the kubernetes cluster and send the results directly to CMS monit system.

Hint to avoid running full on the devdb* instances at CERN

from Martin:

Devdb keeps automatic backups of the operations, thus when you initialize/delete the DB a lot (such in tests) the space gets full very quickly.

I added a script https://github.com/rucio/rucio/blob/master/tools/purge_bin.py which purges the entire database including the backups. If you do these re-installs a lot I can recommend using this script :-)

Oracle extensions and CMS Schema

In ericvaandering's fork of rucio, there is a branch cms_schema for keeping the schema for CMS (which differs a bit from ATLAS) as well as keeping the CMS needed jobs and procedures.

On top of the Oracles database schema, Rucio defines triggers and stored procedures for automated execution of certain tasks. The code was originally written by Gancho Dimitrov for Atlas. See his talk at 2nd Rucio Community WS.

Triggers

In CMS Rucio instance we define a subset of generic triggers, some of which are temporarily disabled to allow the reuse of the dids during PhEDEx-to-Rucio catalog synchronization.

List the triggers and their status in cms_rucio_dev_admin.int2r:

SQL> select trigger_name, status from user_triggers;

TRIGGER_NAME		       STATUS
------------------------------ --------
ACCOUNT_AVOID_UPDATE_DELETE    ENABLED
CHECK_DID_UNIQUENESS	       DISABLED
MIGRATE_DELETED_DID	       DISABLED
SCOPE_AVOID_UPDATE_DELETE      ENABLED

To change trigger status use sql commands:

ALTER TRIGGER <trigger_name> DISABLE;
ALTER TRIGGER <trigger_name> ENABLE;

Stored Procedures (found in Rucio/etc/sql/oracle/procedures.sql)

SQL> set pages 0
SQL> set linesize 2200
SQL> select OBJECT_NAME, OBJECT_TYPE from user_procedures where OBJECT_TYPE='PROCEDURE';
COLLECTION_REPLICAS_UPDATES

SQL> select text from user_source where name='COLLECTION_REPLICAS_UPDATES' order by line;
PROCEDURE		   "COLLECTION_REPLICAS_UPDATES" AS
    type array_raw is table of RAW(16) index by binary_integer;
    type array_scope is table of VARCHAR2(30) index by binary_integer;
    type array_name  is table of VARCHAR2(255) index by binary_integer;

    ids     array_raw;
    rse_ids array_raw;
    scopes  array_scope;
    names   array_name;

    ds_length		      NUMBER(19);
    ds_bytes		      NUMBER(19);
    available_replicas	      NUMBER(19);
    old_available_replicas    NUMBER(19);
    ds_available_bytes	      NUMBER(19);
    ds_replica_state	      VARCHAR2(1);
    row_exists		      NUMBER;

    CURSOR get_upd_col_rep IS SELECT id, scope, name, rse_id FROM CMS_RUCIO_PROD.updated_col_rep;
BEGIN
    -- Delete duplicates
    DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rowid > ANY (SELECT B.rowid FROM CMS_RUCIO_PROD.UPDATED_COL_REP B WHERE A.scope = B.scope AND A.name=B.name AND A.did_type=B.did_type AND (A.rse_id=B.rse_id OR (A.rse_id IS NULL and B.rse_id IS NULL)));
    -- Delete Update requests which do not have Collection_replicas
    DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rse_id IS NOT NULL AND NOT EXISTS(SELECT * FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS B WHERE B.scope = A.scope AND B.name = A.name  AND B.rse_id = A.rse_id);
    DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rse_id IS NULL AND NOT EXISTS(SELECT * FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS B WHERE B.scope = A.scope AND B.name = A.name);
    COMMIT;

    OPEN get_upd_col_rep;
    LOOP
	FETCH get_upd_col_rep BULK COLLECT INTO ids, scopes, names, rse_ids LIMIT 5000;
	FOR i IN 1 .. rse_ids.count
	LOOP
	    DELETE FROM CMS_RUCIO_PROD.updated_col_rep WHERE id = ids(i);
	    IF rse_ids(i) IS NOT NULL THEN
		-- Check one specific DATASET_REPLICA
		BEGIN
		    SELECT length, bytes, available_replicas_cnt INTO ds_length, ds_bytes, old_available_replicas FROM CMS_RUCIO_PROD.collection_replicas WHERE scope=scopes(i) and name=names(i) and rse_id=rse_ids(i);
		EXCEPTION
		    WHEN NO_DATA_FOUND THEN CONTINUE;
		END;

		SELECT count(*), sum(r.bytes) INTO available_replicas, ds_available_bytes FROM CMS_RUCIO_PROD.replicas r, CMS_RUCIO_PROD.contents c WHERE r.scope = c.child_scope and r.name = c.child_name and c.scope = scopes(i) and c.name = names(i) and r.state='A' and r.rse_id=rse_ids(i);
		IF available_replicas >= ds_length THEN
		    ds_replica_state := 'A';
		ELSE
		    ds_replica_state := 'U';
		END IF;

		IF old_available_replicas > 0 AND available_replicas = 0 THEN
		    DELETE FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS WHERE scope = scopes(i) and name = names(i) and rse_id = rse_ids(i);
		ELSE
		    UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS
		    SET state=ds_replica_state, available_replicas_cnt=available_replicas, length=ds_length, bytes=ds_bytes, available_bytes=ds_available_bytes, updated_at=sys_extract_utc(systimestamp)
		    WHERE scope = scopes(i) and name = names(i) and rse_id = rse_ids(i);
		END IF;
	    ELSE
		-- Check all DATASET_REPLICAS of this DS
		SELECT count(*), SUM(bytes) INTO ds_length, ds_bytes FROM CMS_RUCIO_PROD.contents WHERE scope=scopes(i) and name=names(i);
		UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS SET length=nvl(ds_length,0), bytes=nvl(ds_bytes,0) WHERE scope = scopes(i) and name = names(i);
		FOR rse IN (SELECT rse_id, count(*) as available_replicas, sum(r.bytes) as ds_available_bytes FROM CMS_RUCIO_PROD.replicas r, CMS_RUCIO_PROD.contents c WHERE r.scope = c.child_scope and r.name = c.child_name and c.scope = scopes(i) and c.name = names(i) and r.state='A' GROUP BY rse_id)
		LOOP
		    IF rse.available_replicas >= ds_length THEN
			ds_replica_state := 'A';
		    ELSE
			ds_replica_state := 'U';
		    END IF;
		    UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS
		    SET state=ds_replica_state, available_replicas_cnt=rse.available_replicas, available_bytes=rse.ds_available_bytes, updated_at=sys_extract_utc(systimestamp)
		    WHERE scope = scopes(i) and name = names(i) and rse_id = rse.rse_id;
		END LOOP;
	    END IF;
	    COMMIT;
	END LOOP;
	EXIT WHEN get_upd_col_rep%NOTFOUND;
    END LOOP;
    CLOSE get_upd_col_rep;
    COMMIT;
END;

78 rows selected.

SQL> 

Execute stored procedures manually and measure the execution time:

SQL> set time on 
23:05:31 SQL> set timing on 
23:05:36 SQL> exec COLLECTION_REPLICAS_UPDATES;
  
PL/SQL procedure successfully completed.

Elapsed: 00:20:30.03
23:26:24 SQL> 

Stop/start jobs and check details:

Example below drops and creates a scheduler job for a stored procedure.

SQL> exec dbms_scheduler.drop_job('COLLECTION_REPLICAS_UPDATES_JB');

BEGIN
dbms_scheduler.create_job
(
'COLLECTION_REPLICAS_UPDATES_JB',
job_type=>'STORED_PROCEDURE',
job_action=> 'COLLECTION_REPLICAS_UPDATES',
number_of_arguments=>0,
start_date=>TO_TIMESTAMP_TZ('14-MAY-2019 23.00.00 EUROPE/ZURICH','DD-MON-YYYY HH24:MI:SS TZR'),
repeat_interval=> 'FREQ=Minutely; INTERVAL=2',
job_class=>'RUCIO_JOB_CLASS',
enabled=> TRUE,
auto_drop=> FALSE,
comments=>'Every two minutes remove the duplicates from the UPDATED_COL_REP table for all scopes and update the COLLECTION_REPLICAS data'
);
END;
/BEGIN dbms_scheduler.drop_job('COLLECTION_REPLICAS_UPDATES_JB'); END;

*
ERROR at line 1:
ORA-27475: "CMS_RUCIO_DEV_ADMIN.COLLECTION_REPLICAS_UPDATES_JB" must be a job
ORA-06512: at "SYS.DBMS_ISCHED", line 224
ORA-06512: at "SYS.DBMS_SCHEDULER", line 657
ORA-06512: at line 1


SQL> SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16  

PL/SQL procedure successfully completed.

SQL> 

We get errors when trying to drop a non-existing job.

USER_SCHEDULER_JOBS and USER_SCHEDULER_JOB_RUN_DETAILS are useful to list jobs and check their details, e.g.:

23:03:39 SQL> select JOB_NAME, JOB_ACTION, LAST_RUN_DURATION from user_scheduler_jobs;

JOB_NAME
------------------------------
JOB_ACTION
--------------------------------------------------------------------------------
LAST_RUN_DURATION
---------------------------------------------------------------------------
COLLECTION_REPLICAS_UPDATES_JB
COLLECTION_REPLICAS_UPDATES



23:03:59 SQL> 

Oracle Jobs

These are created with rucio/etc/sql/oracle/jobs.sql

SQL> select JOB_NAME, LAST_RUN_DURATION,LAST_START_DATE,NEXT_RUN_DATE, STATE   from user_scheduler_jobs;
JOB_NAME                       LAST_RUN_DURATION   LAST_START_DATE                               NEXT_RUN_DATE                                 STATE          
------------------------------ ------------------- --------------------------------------------- --------------------------------------------- ---------------
COLLECTION_REPLICAS_UPDATES_JB +00 00:00:00.094285 07-NOV-19 09.00.00.200351000 PM EUROPE/ZURICH 07-NOV-19 09.30.00.000000000 PM EUROPE/ZURICH SCHEDULED      
COLL_REPL_UPDATED_JOB_CMS                                                                        07-NOV-19 09.02.00.000000000 PM EUROPE/ZURICH SCHEDULED      
RUCIO_ACCOUNT_USAGE_HIST_JOB                                                                     08-NOV-19 08.00.00.700000000 AM EUROPE/ZURICH SCHEDULED      
RUCIO_DATA_SLIDING_WINDOWS                                                                       11-NOV-19 10.00.00.200000000 AM EUROPE/ZURICH SCHEDULED      
RULES_HIST_SL_WINDOW                                                                             11-NOV-19 07.00.00.700000000 AM EUROPE/ZURICH SCHEDULED      
UPDATE_RSE_USAGE_HISTORY       +00 00:00:05.627834 07-NOV-19 09.00.00.225689000 PM EUROPE/ZURICH 07-NOV-19 09.30.00.000000000 PM EUROPE/ZURICH SCHEDULED      

References