Skip to main content

Row By Row DML Vs Single DML Statement


Recently, I was working on an Archival process for one of my client, where this process suppose to move data from Live OLTP system to Archived database. This process is going little slow, so one of my DBA has advice me following approach,
In delete part instead of using single delete statement with Join, open cursor and loop it through. Then fir delete statement for each row by row in a loop.

I was wondering how oracle would go about it ? and will it improve the things?
So, here is my small investigation on this,
     1.  If you have db_file_multiblock_read_count = 8 and Tablespace Block size = 8KB. So, in single IO 
          read request oracle would read 64KB of data.
     2. If you fir a delete statements in a loop, oracle will have to instantiate IO request for each delete statement 
         and out of that it will pick up just one row or few rows if there is one to many relation.
     3. For subsequent delete statement if the data is in the cache, then it will be an additional logical IO (gets) else 
         it will have to request for new Physical IO.
    4. If the same blocks is getting modified in a loop, oracle must be creating  new version every time instead of 
        using same version, as its using more undo blocks. So, creating new version here is an overhead.
    5. This approach will help in selecting index access path. But adding overhead of
a.       Managing multiple versions of same block
b.      Uses more undo blocks. That will result in more archive logs.
c.       Takes more elapse time.
d.      Additional get request (logical IO)
    6. In Single Delete statement case, I have check the plans and found that its using index access path with SEMI HASH OR SEMI NESTED LOOP join, So it must be fetching all the blocks and modifying them in one go with less undo and elapse time.

As per my analysis processing data Row by Row is an overhead as compare to performing it in one single DML statement.
My results on Locally managed table space with block size = 8K,  (After 5 sample runs):
                1. Single Delete statement :
a.       74263 rows deleted.
b.      Elapsed: 00:00:49.43
c.       Undo Blocks: 16008K
                2.  Deletes in Loop Row by Row:
a.       74263 rows deleted.
b.      Elapsed: 00:01:00.57
c.       Undo Blocks:16512K

Here is code snippet that you can try out, 
CREATE TABLE TESTDELETE
AS
   WITH GENERATOR
       AS (SELECT --+ materialize
                   ROWNUM ID
              FROM ALL_OBJECTS
             WHERE ROWNUM <= 5000
          )
   SELECT TRUNC (SQRT (ROWNUM - 1)) SROOT,
          ROWNUM - 1 IDS,
          TO_CHAR (TRUNC (DBMS_RANDOM.VALUE (1, 1000000))) RANDOMID,
          LPAD (ROWNUM - 1, 10) PADDING,
          RPAD ('x', 50) CTXT
     FROM GENERATOR V1, GENERATOR V2
    WHERE ROWNUM <= 5000000;
    
CREATE INDEX TESTDELETE_RANDOMID ON TESTDELETE (RANDOMID); 

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
USER,
'TestDelete',
CASCADE => TRUE,
ESTIMATE_PERCENT => NULL,
METHOD_OPT => 'for all columns size 1'
              );
END;
/

CREATE GLOBAL TEMPORARY TABLE STAGING (RANDOMID VARCHAR2 (40))
ON COMMIT PRESERVE ROWS;

INSERT INTO STAGING 
SELECT TO_CHAR(TRUNC(DBMS_RANDOM.VALUE(1,1000000))) RANDOMID
  FROM ALL_OBJECTS 
 WHERE ROWNUM <= 15000;

COMMIT;

SELECT COUNT(ROWID) FROM STAGING;
SELECT COUNT(ROWID) FROM TESTDELETE;

-- Ignore the result of first pass as this will take time to get the blocks in memory. or Disable cache.
-- Compare the result of subsequent passes.


-- Delete with Single Statm.
DELETE FROM TESTDELETE WHERE EXISTS ( SELECT ROWID FROM STAGING WHERE STAGING.RANDOMID = TESTDELETE.RANDOMID);
--74263 rows deleted.
--Elapsed: 00:00:49.43
--Undo Blocks: 16008K

-- Use Following SQL in SysDba to get the undo size
SELECT TO_CHAR (S.SID) || ',' || TO_CHAR (S.SERIAL#) SID_SERIAL,
       NVL (S.USERNAME, 'None') ORAUSER,
       S.PROGRAM,
       R.NAME UNDOSEG,
       T.USED_UBLK * TO_NUMBER (X.VALUE) / 1024 || 'K' "Undo"
  FROM SYS.V_$ROLLNAME R,
       SYS.V_$SESSION S,
       SYS.V_$TRANSACTION T,
       SYS.V_$PARAMETER X
 WHERE S.TADDR = T.ADDR 
   AND R.USN = T.XIDUSN(+) 
   AND X.NAME = 'db_block_size';

ROLLBACK;

-- Use For Loop to delete the same rows and see the diff.
DECLARE
 RCOUNT NUMBER := 0;
BEGIN
    FOR REC IN ( SELECT RANDOMID FROM STAGING )
    LOOP
    
        DELETE FROM TESTDELETE WHERE RANDOMID = REC.RANDOMID;    
        RCOUNT := RCOUNT + SQL%ROWCOUNT;
    END LOOP;        

    DBMS_OUTPUT.PUT_LINE( RCOUNT || ' rows deleted.'  );    
END;

-- 74263 rows deleted. 
-- Elapsed: 00:01:00.57
-- Undo Blocks:16512K



Comments

Popular posts from this blog

Drop all Objects from Schema In Postgres

To Drop all objects from Postgres Schema there could be following two approaches: Drop Schema with cascade all and re-create it again.  In some cases where you dont want to/not allowed to drop and recreate schema, its easy to look for objects on current schema and drop them. following script would help to do so, Create function which would do the task and then drop that function too. --- CREATE OR REPLACE FUNCTION drop_DB_objects() RETURNS VOID AS $$ DECLARE  rd_object RECORD; v_idx_statement VARCHAR(500);   BEGIN ---1. Dropping all stored functions RAISE NOTICE '%', 'Dropping all stored functions...'; FOR rd_object IN ( SELECT format('%I.%I(%s)', ns.nspname, p.proname, oidvectortypes(p.proargtypes)) as functionDef     FROM pg_proc p     INNER JOIN pg_namespace ns ON (p.pronamespace = ns.oid)    WHERE ns.nspname = current_schema      AND p.proname <...

Distributed transaction in Oracle ( Over Oracle DBLink)

To fetch the data from one server to and other Oracle server over DBLink, I experienced the following facts of Oracle Distributed transactions: Security issue: -           We cannot create Public synonym for the remote object accessed over Private DBLink of other Database.   -           It’s allowed to create private synonym for remote object, but you cannot grant the access over this synonym to any other schema. If you try to provide the grants to other schema Oracle raises an error:              [ORA-02021: DDL operations are not allowed on a remote database] “In an all you can access remote objects over private DBLink in the same schema where DBLink is created”. Fetching the Ref Cursor at Remote site:                   Let’s say we have two site...

Vacuum Analyze Full Schema in PostgreSQL Database

To analyze full database schema, you can use following shell script : -------------------------------------------------------------------------------------------------------------------------- #!/bin/sh rm -rf /tmp/vacuum_analyze_MyData.sql dbname="mydata" username="mydatadatamart" namespace="mydatadatamart" # Vacuum only those tables which are fragmented over 30% # Analyze tables only if they are not already # Process CMF and Groups Master table every time as they are master Tables=`psql $dbname $username << EOF SELECT 'VACUUM ANALYZE VERBOSE ' || tablename || ' ;' AS TableList   FROM ( SELECT *,       n_dead_tup > av_threshold AS "av_needed",       CASE WHEN reltuples > 0      THEN round(100.0 * n_dead_tup / (reltuples))       ELSE 0       END AS pct_dead FROM (SELECT C.relname AS TableName, pg_stat_get_tuples_inserted(C.oid) AS n_tup_ins, pg_s...