Tuesday, March 18, 2014

RMAN cold backup gotcha

This is something new I learned today from a DBA at Boeing.

Empty pot is easy to fill. "Stay hungry".

Those days many site only do online hot backup.
But cold backup still have it's suited scenario, e.g. backup dev db and clone.

Here is the script many people used to do.
The problem is when there are some long running active transactions, database instance will wait and not be closed.

shutdown immediate;
startup mount;

Here is the workaround, force disconnect all active sessions, startup in a consistent state, and then close.

ALTER SYSTEM CHECKPOINT; -- I think this is optional.  :) 
shutdown abort;
startup restrict;
shutdown immediate;
startup mount;

Notes: This method is not recommended for production database.


Wednesday, March 05, 2014

Simulate tail -f logger_logs table - instrument PL/SQL

debugf used to be the instrumentation tool on my kit before.

At November 7, 2013, Tom Kyte said: "debug.f is really really really old :) . I recommend Logger to do PL/SQL instrumentation."

To install and use logger 2.0, please see https://github.com/tmuth/Logger---A-PL-SQL-Logging-Utility

There are 2 good debugf features that Logger 2.0 doesn't have.

1. tail -f debug_trace_file.
2. debug.f("%s %s", l_a, l_b); -- similar to C printf format string.

Today I will show you how to simulate "tail -f" on logger_logs table.

Call example


set linesize 120

select * from table(u$log.log_tail);

--Test, open another session.


 for i in 1..15
   logger.log(i||' : '||rpad('.',i,'.'));
 end loop;

To exit, press Ctrl + C.

Setup and installation.


connect sys

grant execute on dbms_lock to utility;

create or replace view utility.logger_tail_v
  EXTRA log_text, id
from logger_logs;


alter package utility.u$log compile;
grant execute on utility.u$log to public;
create or replace public synonym u$log for utility.u$log;

-- Create Package u$log.  file u$log.sql


tail -f logger_logs;

  cursor gc_log_src is
  select log_text,id  from logger_tail_v;

  TYPE log_src_tab IS TABLE OF gc_log_src%ROWTYPE;
  FUNCTION log_tail return log_src_tab PIPELINED DETERMINISTIC;


CREATE OR REPLACE PACKAGE BODY                 u$log
Goal: 1. tail -f logger_logs table,

Who| Date         | What
Charlie(Yi) | 12-June-2008 | Create the package.

alter package utility.u$log compile;
grant execute on utility.u$log to public;
create or replace public synonym u$log for utility.u$log;

| Copyright Information Here
| File name:
| Overview:
| Author(s):
| Modification History:
|  Date        Who          What
| 23-Dec-2013: Charlie(Yi): Instrument with Logger, to replace debugf,
| 08-Nov-2010: Charlie(Yi): Create the function.

It meets user requirements
It is maintainable
It runs fast enough to minimize user frustration

!!! Pl/SQL development workflow, Four steps of preparing an application !!!

 - Validate program requirements
  # ask lots of questions
  # what users ask for is not always the easiest way to solve a problem
  # consider other approaches, include business processes and programming algorithms
 - Implement header of the program
  # good name for the program, accurately represent the purpose of the program
  # inputs and outputs
  # overload sub-procedure ?
 - Define the test cases
  # Verify it works
  # how will I know when I am done with this program
 - Build test code

 gs_module varchar2(30);

 PROCEDURE initialize_pkg
   gs_module := 'package_name';
   --gc_scope_prefix constant VARCHAR2(31) := lower($$PLSQL_UNIT) || '.';
   gs_module := lower($$PLSQL_UNIT);
   DBMS_APPLICATION_INFO.SET_MODULE( gs_module,'init' );
 END initialize_pkg;

 PROCEDURE commit_tx
    logger.log('soft commit');
 END commit_tx;

 FUNCTION log_tail
   return log_src_tab PIPELINED DETERMINISTIC
 | Copyright Information Here
 | File name:
 | Overview: simulate tail -f on logger_log table.

Call example:

select * from table(u$log.log_tail);

select text,id from table(u$log.log_tail);

 | Author(s): Charlie Yi Zhu 木匠.
 | Modification History:
 Date         Who          What
 04-Mar-2014: Charlie(Yi): Create the table function.


  l_scope logger_logs.scope%type;
  l_params logger.tab_param;
  ls_action varchar2(30);
  l_cnt simple_integer := 0;
  l_row_cnt pls_integer;
  l_piece simple_integer := 0;
  l_chunk simple_integer := 10000;
  l_max_id number; l_int_id number;

  PROCEDURE initialize
    ls_action := 'log_tail';
    l_scope := gs_module ||'.'|| ls_action;
    DBMS_APPLICATION_INFO.SET_module(gs_module, ls_action);
    DBMS_APPLICATION_INFO.set_client_info( gs_module||'.'||ls_action );
    l_row_cnt := 0;
    logger.log('START', l_scope);
  END initialize;

  PROCEDURE cleanup
     DBMS_APPLICATION_INFO.set_client_info( ' ' );
     logger.log('END', l_scope);
  END cleanup;

  Main body of program

  logger.append_param(l_params, 'l_cnt', l_cnt);
  logger.log('xxxxx', l_scope, null, l_params);

  select Nvl(Max(id),0) into l_max_id from logger_logs;
  logger.log('Start id: '||l_max_id, l_scope, null, l_params);

   FOR lc_stage IN (
    select --+ index_asc(a logger_logs_pk)
     log_text, id
    from logger_tail_v a
    where id > l_max_id
    order by id ASC
    l_int_id := lc_stage.id;
    pipe ROW(lc_stage);
   --logger.log('Last max id: '||l_int_id, l_scope, null, l_params);
   l_max_id := l_int_id;
  End Loop;

  logger.log(l_row_cnt||' rows returned', l_scope, null, l_params);


  when NO_DATA_NEEDED then
     /* Don't forget to clean up here, too! */

     /* Standard error logging mechanism */
     logger.log_error('Unhandled Exception', l_scope, null, l_params);
 END log_tail;


Monday, February 10, 2014

Design to plan database capacity


The database resources is limited on CPU, Memory and IO. We have to do capacity planning well ahead. That includes transaction arrival rate, Transaction's SQL resource usage: LIO / CPU / Memory / Redo / Undo, Lock, Latch,...etc. We have to design and build a synthetic transaction workload benchmark to get these matrices.
Also "We have really got to think about the data and how it will be used, how it will be accessed so that we can utilize the optimal structure for that.
Make sure that the database organizes the data in a way that’s efficient for us to retrieve that data."
As we talked, here is the database criteria need to be filled before build a new project involving database process, to get started.
(Actually, we'd better consider it from the day one design the system for the new project)

  • I. Function priority, asynchronous batch process OK?
  • II. work load ( Planning the performance )
    • II.i. Data Usage
      • a1. UI query and OLTP transaction response time
      • a2. Batch throughput
  • III. Growth and Archive - Data lifecycle management
  • Target
    • Capacity plan
    • Data life cycle management
    • SLA

I. Function priority, asynchronous batch process OK?

 #) Downtime allowed
   - -) HA implementation (physical standby or Stream replication or RMAN backup and recovery or logical data dump expdp/impdp)
 #) Process delay/latency allowed or data consistency delay/latency allowed
   - -) to select an ACID or BASE solution

II. work load ( Planning the performance )

  #) OLTP: average and peak executions per hour per day and average/peak response time requirement
  #) Batch: throughput requirement, see System Architecture Model
See Define+database+application+SLA

II.i. Data Usage

a1. UI query and OLTP transaction response time

select ename
from emp
where job='CLERK';
Arriaval rate
Peak season
SLA 90%
SLA 99.9%
3 / minute

< 2 seconds
< 4 seconds
20 / minute
11am ~ 2pm
< 3.5 seconds
< 5 seconds

a2. Batch throughput

INSERT INTO emp_stats(job, num_emp)
select job, count(*)
from emp
where HIREDATE between to_date('2011/01/01','yyyy/mm/dd') and to_date('2011/02/01','yyyy/mm/dd');
Arriaval rate
Rows per Exec(Volume)
Peak season
Throughput SLA
2 times / day

1pm, 10pm
50,000 rows
2 times / day
1pm, 10pm
2,000,000 rows

III. Growth and Archive - Data lifecycle management

 #) Growth rate, E.g. 30MB/Day
 #) Data retention and archive plan, E.g. 1 year working data, 3 years reporting data in DW, 10 years archive data in Tape/DVD.


Based on the requirements, the architects are able to design a solution.
We will select one data process pattern(good practice) or design a new one to meet the SLA and business objective.

Capacity plan

We will plan our hardware resource to meet the above SLA.

Data life cycle management

It is not necessary to put everything into our expensive OLTP system, it tightly bind to SLA.
Usage Profile
Data Retention
Write/Read hot
Last 2 months
priority one
99.999% HA, 7 x 24
Most Expensive
Read hot
Last 2 ~ 6 months
priority 2
99.99% HA, 7 x 24
Expensive, copy to DW
Read only cooler
Last 6 ~ 36 months
priority 3
nearline storage 99% HA, 5 x 8
Read only cold
Older than 3 years
priority 4
move to DVD or tape, only mount online when needed.
Older than 9 years
priority 5
Keep aggregate data and Drop it
There are many requirements to effectively manage the data lifecycle. Effective and efficient management of the lifecycle will reduce both the hardware and management costs associated with the database facility, and improve overall database performance. To begin the process, the lifecycle for each events/fact table data stream must be understood. This includes defining who will ultimately use the data, and how they will use it. Based upon the usage profile, the characteristics of the life of the data can be quantified.


Data Usage Over Time
Description of Figure 5-2 follows

Data Lifecycle
Description of Figure 5-3 follows

Tuesday, January 14, 2014

How bad is the network round trip when accessing database.

It is a common good practice to save the network round trips when design and develop database applications.
But how bad it is? we need to setup a synthetic benchmark to find out.

My configuration is Oracle database 12.1, Linux 6.4, 64bit, on my laptop virtual box.
There are 2 database servers in one data center. To query remote table through database link.
5000 rows table t2, copy data from view all_objects.

Here is the benchmark results:

  • 1 rows, query 1000 times.

6.39 seconds v.s. 0.02 seconds. 10M network.
1.09 seconds v.s. 0.02 seconds. 100M network.

  • 1000 rows, row by row fetch v.s. bulk fetch. 100M network.

0.44 seconds : 0 seconds.

So the network latency is between 0.00044 and 0.0011 seconds per round trip.

For example, we have a system process 300 orders per second, run 300 SQL SELECT, 33% time are used for network.
It is considerable amount of time.

Here is the benchmark test setup, so you can reproduce it on your servers.




drop public database link dev1_link;
create public database link dev1_link
connect to scott
identified by abc
using 'mac';

drop public database link dev1_link;
create public database link dev1_link
connect to scott
identified by abc
using 'local';

drop table t2 purge;
create table t2 as select * from all_objects where 1=0;

-- Test 1, remove v.s. local database,

set serveroutput on
 l_cnt number;
 li number;
 g_run1 number;
 l_cnt := 1000;

g_run1 := dbms_utility.get_time;  --This function determines the current time in 100th's of a second.
for i in 1..l_cnt
  select count(*) into li from t2@dev1_link where rownum < 1;
end loop;
g_run1 := (dbms_utility.get_time-g_run1);
dbms_output.put_line( 'remote query ran in '||g_run1);

g_run1 := dbms_utility.get_time;
for i in 1..l_cnt
  select count(*) into li from t2 where rownum < 1;
end loop;
g_run1 := (dbms_utility.get_time-g_run1);
dbms_output.put_line( 'local query ran in '||g_run1);


remote query ran in 105
local query ran in 1

set autot on

          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        543  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

-- Test 2, Bulk Fetch, both from remote db,

drop table t2 purge;
create table t2 as select * from all_objects;

set serveroutput on
 l_cnt number;
 li number;
 g_run1 number;
 cursor c1 is select * from t2@dev1_link where rownum <= 1000;
 lr1 c1%rowtype;
 type tab1 is table of c1%rowtype;
 lt1 tab1;
 l_cnt := 1000;

g_run1 := dbms_utility.get_time;  --This function determines the current time in 100th's of a second.

open c1;
  fetch c1 into lr1;
  exit when c1%notfound;
end loop;
g_run1 := (dbms_utility.get_time-g_run1);
dbms_output.put_line( 'remote query ran in '||g_run1);

g_run1 := dbms_utility.get_time;
fetch c1 bulk collect into lt1;
g_run1 := (dbms_utility.get_time-g_run1);
dbms_output.put_line( 'local query ran in '||g_run1);


remote query ran in 44

local query ran in 0


Let me know your benchmark results, the ball park.

Charlie | Database Architect Developer

Tuesday, January 07, 2014

Understanding business goal and process logic

Every time when there is a problem, and people come to me to ask solutions, or to give solutions, I must ask, as always, could someone describe the business goal?
Plus current data flow and data process logic would be better.
To help me understanding the problem we're trying to solve, to figure out what the problem really is.

A solid understanding of the goal of ABC.

“And most importantly of all - a solid understanding of the goal, of what the question is.  Tuning a query or process is really hard (impossible I would say) - unless we understand the question in the first place.  I cannot tell you how many times I've not been able to tune a query until I had the question in hand.  Certainly we can derive a question from a query - however, many times that derived question is much more confining than the real question being asked.  For example, many people use outer joins in all queries - they are "afraid" of losing a row (perhaps they got "burned" in some past experience and now use outer joins everywhere).  If the objects are related in a one to one mandatory fashion - we don't need an outer join at all.  The question derived from the query is much more confining than reality.” - AskTom

To help us better understanding an application business goal and application logic, could you please help us to find out below details?

Some follow up questions will be asked, or one or more teleconference meetings will be helpful.
And then some design options can be given from there.

/******* Question Start *******/

Business goal.

  • What is the goal of ABC system? Please describe the problem that ABC is try to solve. The big picture.
    • For example: Goal: Replicate data from MySQL to Oracle.
  • What is the goal for each module?
    • For example: Goal of module A: Capture incremental changes from MySQL.

Application logic and data flow for each transaction module.

  • What are the input, process, output for each step?
  • Which step could go wrong?
  • What would be the data look like after fail?
  • How often it goes wrong? e.g. 3 times a day.
  • What are the limitations? e.g. cannot process cash orders.
  • How to handle failure/exception?  e.g. rollback.
  • Single process single connection, or multiple concurrent processes and multi-connections ?
    • publisher => subscription
    • master/coordinator => slave workers
  • What is performance SLA?
    • Response time. E.g. 90% < 2 seconds, 99.9% < 5 seconds per transaction.
    • Transaction volume. E.g. process 5000 orders/hours.
  • What is availability SLA? E.g. Max to 8 hours downtime a year.

Capture all SQL called in applicate by SQL Session Trace tool.

For all transactions that access table user_abc.table_a,

(Another email will be sent to show you how to capture all SQL)

For example, the shopping cart checkout transaction would be:

Business goal: Ensure the online shopping process clear, fast, simple ...

The goal for each module:


Application logic and data flow.

/*  create_order */
SELECT product_id, qty, ... FROM shopping_cart_item WHERE shopping_cart_id = 123
INSERT INTO order ...VALUES (...);
INSERT INTO order_item ...VALUES (...);
INSERT INTO order_audit ...VALUES (...);
DELETE shopping_cart ... WHERE shopping_cart_id = 123;
DELETE shopping_cart_item ... WHERE shopping_cart_id = 123;
UPDATE inventory SET qty = qty - 1 WHERE item_id = 7788123;

 /* create_finance_journal */
  INSERT into po_journal...VALUES (...);
  INSERT into sub_ledger ...VALUES (...);

 /* notify_buyer, call 3rd party API */

/******* End *******/

I really appreciate your effort.

Design is to finding the problem, not solution.

We’ll be doing this for performance, scalability, security and so on.
We'll meet heavy resistance, big time - I am sure.
So we always be able to talk "numerously" - in numbers.

For example:

"if we do this ETL process slow by slow (row by row) and take just 1ms per row (pretty fast don't you think?) and you have to process 5,000,000 rows - it will take almost 1.5 hours. When we get to 50 million rows - it'll be over a half a day. When we get to 500 million rows - it will take a WEEK to process. if I am allowed to do it in the database in a single DDL statement to transform the data, I can be done with 500 million rows in 2 minutes."

See my blog …, for the proof of why row by row process is slow by slow process, and why set based process is much more better than row by row process.


Charlie | Database Architect Developer