Tuesday, May 18, 2010

Collection_Array

PL/SQL has three collection types, Tom often demos with Nested Table and Associative Array.
Associative Array is most flexible on element indexing. Nested Table is good enough for Bulk Fetching.

But Bryn Llewellyn, Oracle PL/SQL Product Manager likes to use VArray in his demo.
"the collection is best declared as a varray with a maximum size equal to Batchsize."
See http://www.oracle.com/technology/tech/pl_sql/pdf/doing_sql_from_plsql.pdf
Probably he knows that VARRAY implemented with the most efficient internal storage, and delivery the best performance, when correctly used; it may meet most of the requirements for loop batch bulk fetching.
To find out the details, you may do a simple RunStats benchmark.

  • Associative Array type(or index-by table)
TYPE population IS TABLE OF NUMBER INDEX BY VARCHAR2(64);

  • Nested Tables
TYPE nested_type IS TABLE OF VARCHAR2(30);

    • invoke EXTEND method to add elements later

  • Collection of ADT = UDT, Abstract datatype, User defined datatype:
CREATE OR REPLACE TYPE INVDB.NUMBER_TAB_TYPE is table of number;
/

select ... from TABLE(ADT_Table_Instance);

Comments, Bulk fetch into ADT is not efficient, you may see the workaround in paper doing_sql_from_plsql.pdf

  • Variable-size array (varray)
-- Code_30 Many_Row_Select.sql
Batchsize constant pls_integer := 1000;

type Result_t is record(PK t.PK%type, v1 t.v1%type);
type Results_t is varray(1000) of Result_t;
Results Results_t;

Concept

  • Associative Array: sparse array.
  • Nested Table or ADT/UDT : dense array
REM Associative Array
declare
   type date_aat is table of date index by binary_integer;
   l_data date_aat;
begin
   l_data(-200) := sysdate;
   l_data(+200) := sysdate+1;
end;
/

collection(elements), e.g. collection_instance(element_unique_subscript_index_number)
record_name.field_name

Monday, May 03, 2010

Benchmark with RunStats

There are many approaches and tools to benchmark Oracle application.
E.g.
* mystat.sql and mystat2.sql
* SQL session trace and tkprof
* SET AUTOT[RACE] {OFF | ON | TRACE[ONLY]} [EXP[LAIN]] [STAT[ISTICS]]
* ASH/AWR
* SQL hint /*+ gather_plan_statistics */ and dbms_xplan.display_cursor(NULL,NULL, 'iostats memstats last partition');
* ALTER SESSION SET STATISTICS_LEVEL=ALL;
* Real-Time SQL Monitoring

My favorite one is Tom's RunStats. Here is the one I enhanced from Tom's original version.

/*
Goal
----
Persistent the benchmark stats, and then developer can query the report later.

Solution
--------
Add an IP column to utility.run_stats_save table, get computer IP by SYS_CONTEXT function.


Reference
---------
file://a:/Tuning/Trace/RunStatsSave.sql

http://tkyte.blogspot.com/2009/10/httpasktomoraclecomtkyte.html

How to build a simple test harness (RUNSTATS) (HOWTO) to test two different approaches from a performance perspective.

Runstats.sql
This is the test harness I use to try out different ideas. It shows two vital sets of statistics for me
The elapsed time difference between two approaches. It very simply shows me which approach is faster by the wall clock
How many resources each approach takes. This can be more meaningful then even the wall clock timings. For example, if one approach is faster then the other but it takes thousands of latches (locks), I might avoid it simply because it will not scale as well.
The way this test harness works is by saving the system statistics and latch information into a temporary table. We then run a test and take another snapshot. We run the second test and take yet another snapshot. Now we can show the amount of resources used by approach 1 and approach 2.
Requirements

In order to run this test harness you must at a minimum have:

Access to V$STATNAME, V$MYSTAT, v$TIMER and V$LATCH
You must be granted select DIRECTLY on SYS.V_$STATNAME, SYS.V_$MYSTAT, SYS.V_$TIMER and SYS.V_$LATCH. It will not work to have select on these via a ROLE.
The ability to create a table -- run_stats -- to hold the before, during and after information.
The ability to create a package -- rs_pkg -- the statistics collection/reporting piece
You should note also that the LATCH information is collected on a SYSTEM WIDE basis. If you run this on a multi-user system, the latch information may be technically "incorrect" as you will count the latching information for other sessions - not just your session. This test harness works best in a simple, controlled test environment.

*/

CREATE USER utility
  IDENTIFIED BY ?
  DEFAULT TABLESPACE USERS quota unlimited on users
ACCOUNT UNLOCK;

grant unlimited tablespace to utility;

create role schema_admin;

grant create session, create table, create view, 
 create procedure, create trigger, create any directory,
 CREATE SEQUENCE, CREATE TYPE, CREATE SYNONYM,
 create materialized view, create dimension,
 SELECT_CATALOG_ROLE,
 create database link,create public database link, drop public database link,
 create job
to schema_admin;

grant schema_admin to utility;
-- grant create procedure, create table to utility;

grant select on SYS.V_$STATNAME to utility;
grant select on SYS.V_$MYSTAT to utility;
grant select on SYS.V_$TIMER to utility;
grant select on SYS.V_$LATCH to utility;

drop table utility.run_stats;
drop table utility.run_stats_save;

create global temporary table utility.run_stats
( runid varchar2(15),
  name varchar2(80),
  value int )
on commit preserve rows;

-- Store the stats for later reporting
create table utility.run_stats_save
( 
runid varchar2(15),
name varchar2(80),
value int,
IP varchar2(30),
hostname varchar2(30)
) tablespace users;

create or replace view utility.stats
as select 'STAT...' || a.name name, b.value
      from v$statname a, v$mystat b
     where a.statistic# = b.statistic#
    union all
    select 'LATCH.' || name,  gets
      from v$latch
 union all
 select 'STAT...Elapsed Time', hsecs from v$timer;
/*
SYS_CONTEXT

The SYS_CONTEXT function is able to return the following host and IP address information for the current session:

TERMINAL - An operating system identifier for the current session. This is often the client machine name. 
HOST - The host name of the client machine. 
IP_ADDRESS - The IP address of the client machine. 
SERVER_HOST - The host name of the server running the database instance. 

SELECT SYS_CONTEXT('USERENV','HOST') FROM dual;
----------
GATES2\SKY

SELECT terminal, machine FROM v$session 
where sid = (select sid from v$mystat where rownum <= 1);
*/

CREATE OR REPLACE package utility.runstats_pkg
as
  TYPE print_tab     IS TABLE OF varchar2(200);
  --l_print dbms_sql.VARCHAR2_TABLE; 
    procedure rs_start;
    procedure rs_middle;
    procedure rs_stop( p_difference_threshold in number default 0 );
    procedure rs_report( p_difference_threshold in number default 0, p_host in varchar2 default Null );
  function rs_report( p_difference_threshold in number default 0, p_host in varchar2 default Null )
  return print_tab PIPELINED DETERMINISTIC;
end;
/

CREATE OR REPLACE package body utility.runstats_pkg
as

g_start number;
g_run1  number;
g_run2  number;
g_host varchar2(30);
g_ip varchar2(30);

procedure rs_start
is
begin

  g_host := substr(SYS_CONTEXT('USERENV','TERMINAL'),1,30);
  g_ip := substr(SYS_CONTEXT('USERENV','IP_ADDRESS'),1,30);
  delete from run_stats_save where hostname = g_host;

    execute immediate 'truncate table run_stats';
    delete from run_stats;

    insert into run_stats
    select 'before', stats.* from stats;

    g_start := dbms_utility.get_time;
end;

procedure rs_middle
is
begin
    g_run1 := (dbms_utility.get_time-g_start);

    insert into run_stats
    select 'after 1', stats.* from stats;
    g_start := dbms_utility.get_time;

end;

procedure rs_stop(p_difference_threshold in number default 0)
is
begin
    g_run2 := (dbms_utility.get_time-g_start);

    dbms_output.put_line
    ( 'Run1 ran in ' || g_run1 || ' hsecs' );
    dbms_output.put_line
    ( 'Run2 ran in ' || g_run2 || ' hsecs' );
    dbms_output.put_line
    ( 'run 1 ran in ' || round(g_run1/g_run2*100,2) ||
      '% of the time' );
    dbms_output.put_line( chr(9) );

    insert into run_stats
    select 'after 2', stats.* from stats;

    insert into run_stats_save (RUNID,NAME,VALUE,IP,HOSTNAME)
    select RUNID,NAME,VALUE, g_ip, g_host from run_stats;

    commit;

    dbms_output.put_line
    ( rpad( 'Name', 40 ) || lpad( 'Run1', 14 ) ||
      lpad( 'Run2', 14 ) || lpad( 'Diff', 14 ) );

    for x in
    ( select rpad( a.name, 40 ) ||
             to_char( b.value-a.value, '99999,999,999' ) ||
             to_char( c.value-b.value, '99999,999,999' ) ||
             to_char( ( (c.value-b.value)-(b.value-a.value)), '99999,999,999' ) data
        from run_stats a, run_stats b, run_stats c
       where a.name = b.name
         and b.name = c.name
         and a.runid = 'before'
         and b.runid = 'after 1'
         and c.runid = 'after 2'
         -- and (c.value-a.value) > 0
         and abs( (c.value-b.value) - (b.value-a.value) )
               > p_difference_threshold
       order by abs( (c.value-b.value)-(b.value-a.value)), data
    ) loop
        dbms_output.put_line( x.data );
    end loop;

    dbms_output.put_line( chr(9) );
    dbms_output.put_line
    ( 'Run1 latches total versus run2 -- difference and pct' );
    dbms_output.put_line
    ('.'|| lpad( 'Run1', 14 ) || lpad( 'Run2', 14 ) ||
      lpad( 'Diff', 14 ) || lpad( 'Pct', 11 ) );

    for x in
    ( select '.'||
             to_char( run1, '99999,999,999' ) ||
             to_char( run2, '99999,999,999' ) ||
             to_char( diff, '99999,999,999' ) ||
             to_char( round( run1/run2*100,2 ), '99,999.99' ) || '%' data
        from ( select sum(b.value-a.value) run1, sum(c.value-b.value) run2,
                      sum( (c.value-b.value)-(b.value-a.value)) diff
                 from run_stats a, run_stats b, run_stats c
                where a.name = b.name
                  and b.name = c.name
                  and a.runid = 'before'
                  and b.runid = 'after 1'
                  and c.runid = 'after 2'
                  and a.name like 'LATCH%'
                )
    ) loop
        dbms_output.put_line( x.data );
    end loop;
end;

procedure rs_report( p_difference_threshold in number default 0, p_host in varchar2 default Null )
is
begin
  g_host := substr(SYS_CONTEXT('USERENV','TERMINAL'),1,30);
  g_ip := substr(SYS_CONTEXT('USERENV','IP_ADDRESS'),1,30);

    g_run2 := (dbms_utility.get_time-g_start);

    dbms_output.put_line
    ( 'Run1 ran in ' || g_run1 || ' hsecs' );
    dbms_output.put_line
    ( 'Run2 ran in ' || g_run2 || ' hsecs' );
    dbms_output.put_line
    ( 'run 1 ran in ' || round(g_run1/g_run2*100,2) ||
      '% of the time' );
    dbms_output.put_line( chr(9) );

    dbms_output.put_line
    ( rpad( 'Name', 40 ) || lpad( 'Run1', 14 ) ||
      lpad( 'Run2', 14 ) || lpad( 'Diff', 14 ) );

    for x in
    ( select rpad( a.name, 40 ) ||
             to_char( b.value-a.value, '99999,999,999' ) ||
             to_char( c.value-b.value, '99999,999,999' ) ||
             to_char( ( (c.value-b.value)-(b.value-a.value)), '99999,999,999' ) data
        from run_stats_save a, run_stats_save b, run_stats_save c
       where a.name = b.name
         and b.name = c.name
         and a.runid = 'before'
         and b.runid = 'after 1'
         and c.runid = 'after 2'
         -- and (c.value-a.value) > 0
         and abs( (c.value-b.value) - (b.value-a.value) )
               > p_difference_threshold
         and a.hostname = g_host
         and a.hostname = b.hostname
         and a.hostname = c.hostname
       order by abs( (c.value-b.value)-(b.value-a.value))
    ) loop
        dbms_output.put_line( x.data );
    end loop;

    dbms_output.put_line( chr(9) );
    dbms_output.put_line
    ( 'Run1 latches total versus run2 -- difference and pct' );
    dbms_output.put_line
    ( lpad( 'Run1', 14 ) || lpad( 'Run2', 14 ) ||
      lpad( 'Diff', 14 ) || lpad( 'Pct', 11 ) );

    for x in
    ( select to_char( run1, '99999,999,999' ) ||
             to_char( run2, '99999,999,999' ) ||
             to_char( diff, '99999,999,999' ) ||
             to_char( round( run1/run2*100,2 ), '99,999.99' ) || '%' data
        from ( select sum(b.value-a.value) run1, sum(c.value-b.value) run2,
                      sum( (c.value-b.value)-(b.value-a.value)) diff
                 from run_stats_save a, run_stats_save b, run_stats_save c
                where a.name = b.name
                  and b.name = c.name
                  and a.runid = 'before'
                  and b.runid = 'after 1'
                  and c.runid = 'after 2'
                  and a.name like 'LATCH%'
         and a.hostname = g_host
         and a.hostname = b.hostname
         and a.hostname = c.hostname
               )
    ) loop
        dbms_output.put_line( x.data );
    end loop;
end;

-- select * from TABLE(runstats_pkg.rs_report);

function rs_report( p_difference_threshold in number default 0, p_host in varchar2 default Null )
return print_tab PIPELINED DETERMINISTIC
as
begin
  g_host := substr(SYS_CONTEXT('USERENV','TERMINAL'),1,30);
  g_ip := substr(SYS_CONTEXT('USERENV','IP_ADDRESS'),1,30);

    g_run2 := (dbms_utility.get_time-g_start);

    PIPE ROW
    ( 'Run1 ran in ' || g_run1 || ' hsecs' );
    PIPE ROW
    ( 'Run2 ran in ' || g_run2 || ' hsecs' );
    PIPE ROW
    ( 'run 1 ran in ' || round(g_run1/g_run2*100,2) ||
      '% of the time' );
    PIPE ROW( chr(9) );

    PIPE ROW
    ( rpad( 'Name', 40 ) || lpad( 'Run1', 14 ) ||
      lpad( 'Run2', 14 ) || lpad( 'Diff', 14 ) );

    for x in
    ( select rpad( a.name, 40 ) ||
             to_char( b.value-a.value, '99999,999,999' ) ||
             to_char( c.value-b.value, '99999,999,999' ) ||
             to_char( ( (c.value-b.value)-(b.value-a.value)), '99999,999,999' ) data
        from run_stats_save a, run_stats_save b, run_stats_save c
       where a.name = b.name
         and b.name = c.name
         and a.runid = 'before'
         and b.runid = 'after 1'
         and c.runid = 'after 2'
         -- and (c.value-a.value) > 0
         and abs( (c.value-b.value) - (b.value-a.value) )
               > p_difference_threshold
         and a.hostname = g_host
         and a.hostname = b.hostname
         and a.hostname = c.hostname
       order by abs( (c.value-b.value)-(b.value-a.value)), abs(c.value-b.value)
    ) loop
        PIPE ROW( x.data );
    end loop;

    PIPE ROW( chr(9) );
    PIPE ROW
    ( 'Run1 latches total versus run2 -- difference and pct' );
    PIPE ROW
    ( lpad( 'Run1', 14 ) || lpad( 'Run2', 14 ) ||
      lpad( 'Diff', 14 ) || lpad( 'Pct', 11 ) );

    for x in
    ( select to_char( run1, '99999,999,999' ) ||
             to_char( run2, '99999,999,999' ) ||
             to_char( diff, '99999,999,999' ) ||
             to_char( round( run1/run2*100,2 ), '99,999.99' ) || '%' data
        from ( select sum(b.value-a.value) run1, sum(c.value-b.value) run2,
                      sum( (c.value-b.value)-(b.value-a.value)) diff
                 from run_stats_save a, run_stats_save b, run_stats_save c
                where a.name = b.name
                  and b.name = c.name
                  and a.runid = 'before'
                  and b.runid = 'after 1'
                  and c.runid = 'after 2'
                  and a.name like 'LATCH%'
         and a.hostname = g_host
         and a.hostname = b.hostname
         and a.hostname = c.hostname
               )
    ) loop
        PIPE ROW( x.data );
    end loop;
end;

end;
/

grant execute on utility.runstats_pkg to public;
create or replace public synonym runstats_pkg for utility.runstats_pkg; 

/*
--Usage: to benchmark two approaches
--you may just leave approach 2 code part empty, to get resources of code 1 take.

set serveroutput on

execute runStats_pkg.rs_start;
 
execute runStats_pkg.rs_middle;
 
execute runStats_pkg.rs_stop;

begin
runStats_pkg.rs_start;
 for c in ()
 loop
   Null;
 end loop;
runStats_pkg.rs_middle;
 for c in ()
 loop
   Null;
 end loop;
runStats_pkg.rs_stop;

end;

--To get the report after benchmark:
select * from TABLE(runstats_pkg.rs_report);
OR
exec runStats_pkg.rs_report(10);

*/