Dominic Brooks

Subscribe to Dominic Brooks feed Dominic Brooks
Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)
Updated: 11 hours 23 min ago

Row Cache Object Clones – ORA-00600 [kkdccui2]

Wed, 2022-06-29 11:45

We hit a serious problem about 5 months ago when we upgraded one of the development databases from 19.12 to 19.13. Going back to 19.12 solved the problem so definitely for us something introduced in 19.13

Nothing out there at the time on Oracle Support or from other reliable sources.

Our problem happened under row cache load which in particular effected our commit builds when there was heavy dropping and creating of objects across a number of schemas.

Best indication so far is that this is related to the cloning feature of row cache objects, present in 19.13, new feature with aims to automatically create clones of row cache objects if it detects that the object is accessed frequently.

Apparently many bugs related to this feature have already been addressed but suggestion is that disabling the feature might prevent the issue, via the snappily named parameter _bug33046179_kqr_hot_copy_sleep_limit = 0.

Will follow-up with comment on effectiveness.

Partition List Iterator occasionally reading way too many partitions still

Fri, 2022-06-24 06:06

I’m triple posting this on oracle-l, my blog and on Oracle community to get some more help.
Apologies if you’ve tried to help before and it didn’t come to anything.

This is a redux of an old issue which I looked at 18 months ago but with a new example of the problem.
https://orastory.wordpress.com/2021/02/22/oracle-19c-upgrade-query-reading-way-more-partitions-than-it-should/
https://community.oracle.com/tech/developers/discussion/4480886/partition-list-iterator-reading-too-many-partitions/

When the problem originally surfaced it was after an upgrade from 11.2.0.4 to 19.6. Now version is 19.12.

Now I have one example which gets executed four or five times a day
The example has two sql ids involved in the process but both are almost identical and have identical sql plans.

Queries in question get executed after specific data loads.
For each data load where it gets executed, it will be doing pretty much the same workload day-on-day.
The query gets supplied a collection which tends to have one or a couple of rows row and each row in the collection provides the values to prune to a specific subpartition.

The issue is that about once a week, an execution will go slow.
Exactly the same sql plan each execution.
When it works, it runs in a couple of seconds max.
When it goes wrong, the partition list iterator decides to read every partition/subpartition.

This is the snippet of the plan in question:

|   4 |     NESTED LOOPS                          |                             |    39 |  4017 |   334   (2)| 00:00:01 |       |       |
|   5 |      COLLECTION ITERATOR PICKLER FETCH    |                             |     1 |     2 |    29   (0)| 00:00:01 |       |       |
|   6 |      PARTITION LIST ITERATOR              |                             |  3853 |   380K|   305   (2)| 00:00:01 |   KEY |   KEY |
|   7 |       PARTITION LIST ITERATOR             |                             |  3853 |   380K|   305   (2)| 00:00:01 |   KEY |   KEY |
|   8 |        TABLE ACCESS STORAGE FULL          | POSITION_BALANCE            |  3853 |   380K|   305   (2)| 00:00:01 |   KEY |   KEY |

There was some speculation about possible causes in the OTN thread.
The best guess was that there is some timeout or invalidation or event or contention which means Oracle decides that it cannot reliably prune.
There might be other data loads ongoing.
There could be a partition operation (add new or exchange) which coincides with the degraded iterator performance.
There are > 70k partitions / subpartitions.


select count(*) from dba_tab_partitions where table_name = 'POSITION_BALANCE';
COUNT(*) ---------- 60941

select count(*), sum(case when segment_created = 'YES' then 1 else 0 end) from dba_tab_subpartitions where table_name = 'POSITION_BALANCE';
COUNT(*) SUM(CASEWHENSEGMENT_CREATED='YES'THEN1ELSE0END) ---------- ----------------------------------------------- 78154 33705

I’m reasonably comfortable that what I’ve described is roughly what’s happening.
ASH confirms that it’s reading lots of subpartitions
Just got no idea why and how to figure that out.
The average-ish once-a-week occurence makes it tricker.

Even though when it runs in runs in a couple of seconds, I do have some of the ASH samples that show this.

select /*+ parallel(4) */ instance_number, sql_id, sql_plan_hash_value, sql_full_plan_hash_value, sql_exec_id, sql_exec_start, min(sample_time), max(sample_time), max(sample_time) - min(sample_time) duration, count(*) cnt_samples
, sum(case when in_sql_execution = 'Y' then 1 else 0 end) cnt_in_sql_execution
, sum(case when in_parse = 'Y' then 1 else 0 end) cnt_in_parse
, sum(case when in_hard_parse = 'Y' then 1 else 0 end) cnt_in_hard_parse
from dba_hist_active_sess_history h
where  1=1 
and h.user_id IN (select user_id from dba_users where username like '%TODS%')
and sql_id in ('fw985yayy1mfz','0gk5hj9wnp2vs')
--and sql_id = 'ft2prc1xx2hmd'
and sql_exec_id is not null
group by instance_number, sql_id, sql_plan_hash_value, sql_full_plan_hash_value, sql_exec_id, sql_exec_start
order by h.sql_exec_start;


INSTANCE_NUMBER SQL_ID        SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_EXEC_ID SQL_EXEC_START       MIN(SAMPLE_TIME)                MAX(SAMPLE_TIME)                DURATION            CNT_SAMPLES CNT_IN_SQL_EXECUTION CNT_IN_PARSE CNT_IN_HARD_PARSE
--------------- ------------- ------------------- ------------------------ ----------- -------------------- ------------------------------- ------------------------------- ------------------- ----------- -------------------- ------------ -----------------
              4 0gk5hj9wnp2vs          2221895109               2867134094    67108864 23-MAY-2022 09:18:24 23-MAY-22 09.18.27.378000000 AM 23-MAY-22 09.18.27.378000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331648 24-MAY-2022 07:59:49 24-MAY-22 07.59.50.659000000 AM 24-MAY-22 07.59.50.659000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554432 25-MAY-2022 07:09:07 25-MAY-22 07.09.08.709000000 AM 25-MAY-22 07.09.08.709000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554434 25-MAY-2022 09:41:53 25-MAY-22 09.41.59.718000000 AM 25-MAY-22 12.41.49.605000000 PM +00 02:59:49.887000        1054                 1054            0                 0
              2 fw985yayy1mfz          2221895109               2867134094    33554434 25-MAY-2022 10:57:41 25-MAY-22 10.57.49.221000000 AM 25-MAY-22 01.56.37.861000000 PM +00 02:58:48.640000        1048                 1048            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331648 25-MAY-2022 11:41:04 25-MAY-22 11.41.05.539000000 AM 25-MAY-22 11.41.05.539000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331648 27-MAY-2022 08:01:28 27-MAY-22 08.01.30.371000000 AM 27-MAY-22 08.01.30.371000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331648 27-MAY-2022 08:01:38 27-MAY-22 08.01.40.611000000 AM 27-MAY-22 08.01.40.611000000 AM +00 00:00:00.000000           1                    1            0                 0
              1 0gk5hj9wnp2vs          2221895109               2867134094    16777218 01-JUN-2022 06:48:24 01-JUN-22 06.48.27.979000000 AM 01-JUN-22 09.48.17.547000000 AM +00 02:59:49.568000        1054                 1054            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331649 01-JUN-2022 09:27:03 01-JUN-22 09.27.10.915000000 AM 01-JUN-22 09.27.10.915000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331648 02-JUN-2022 07:11:55 02-JUN-22 07.11.57.315000000 AM 02-JUN-22 07.11.57.315000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331648 07-JUN-2022 07:04:13 07-JUN-22 07.04.17.155000000 AM 07-JUN-22 07.04.17.155000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554434 07-JUN-2022 08:17:56 07-JUN-22 08.17.58.693000000 AM 07-JUN-22 11.17.49.285000000 AM +00 02:59:50.592000        1054                 1054            0                 0
              1 0gk5hj9wnp2vs          2221895109               2867134094    16777216 08-JUN-2022 07:09:22 08-JUN-22 07.09.24.427000000 AM 08-JUN-22 07.09.24.427000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554432 08-JUN-2022 08:14:25 08-JUN-22 08.14.26.278000000 AM 08-JUN-22 08.14.26.278000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331649 09-JUN-2022 06:51:27 09-JUN-22 06.51.29.219000000 AM 09-JUN-22 06.51.29.219000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331648 10-JUN-2022 11:10:13 10-JUN-22 11.10.14.595000000 AM 10-JUN-22 11.10.14.595000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554433 13-JUN-2022 08:45:43 13-JUN-22 08.45.45.509000000 AM 13-JUN-22 08.45.45.509000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 fw985yayy1mfz          2221895109               2867134094    33554433 13-JUN-2022 08:45:55 13-JUN-22 08.45.55.749000000 AM 13-JUN-22 08.45.55.749000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331648 15-JUN-2022 07:08:19 15-JUN-22 07.08.22.275000000 AM 15-JUN-22 07.08.22.275000000 AM +00 00:00:00.000000           1                    1            0                 0
              3 0gk5hj9wnp2vs          2221895109               2867134094    50331649 15-JUN-2022 09:08:54 15-JUN-22 09.08.56.387000000 AM 15-JUN-22 12.08.46.083000000 PM +00 02:59:49.696000        1054                 1054            0                 0
              1 fw985yayy1mfz          2221895109               2867134094    16777216 16-JUN-2022 06:54:04 16-JUN-22 06.54.05.259000000 AM 16-JUN-22 06.54.05.259000000 AM +00 00:00:00.000000           1                    1            0                 0
              1 0gk5hj9wnp2vs          2221895109               2867134094    16777217 16-JUN-2022 08:54:13 16-JUN-22 08.54.18.891000000 AM 16-JUN-22 11.54.08.844000000 AM +00 02:59:49.953000        1054                 1054            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331649 16-JUN-2022 11:16:29 16-JUN-22 11.16.31.491000000 AM 16-JUN-22 11.16.31.491000000 AM +00 00:00:00.000000           1                    1            0                 0
              4 0gk5hj9wnp2vs          2221895109               2867134094    67108865 17-JUN-2022 07:33:10 17-JUN-22 07.33.11.282000000 AM 17-JUN-22 07.33.11.282000000 AM +00 00:00:00.000000           1                    1            0                 0
              4 fw985yayy1mfz          2221895109               2867134094    67108865 17-JUN-2022 07:33:20 17-JUN-22 07.33.21.522000000 AM 17-JUN-22 07.33.21.522000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554432 21-JUN-2022 06:59:24 21-JUN-22 06.59.25.728000000 AM 21-JUN-22 06.59.25.728000000 AM +00 00:00:00.000000           1                    1            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554433 21-JUN-2022 09:14:05 21-JUN-22 09.14.10.464000000 AM 21-JUN-22 12.14.00.096000000 PM +00 02:59:49.632000        1054                 1054            0                 0
              2 0gk5hj9wnp2vs          2221895109               2867134094    33554434 21-JUN-2022 11:07:36 21-JUN-22 11.07.44.480000000 AM 21-JUN-22 02.07.23.680000000 PM +00 02:59:39.200000        1053                 1053            0                 0
              4 fw985yayy1mfz          2221895109               2867134094    67108865 21-JUN-2022 11:43:04 21-JUN-22 11.43.13.010000000 AM 21-JUN-22 02.42.33.651000000 PM +00 02:59:20.641000        1051                 1051            0                 0
              4 fw985yayy1mfz          2221895109               2867134094    67108866 21-JUN-2022 12:06:49 21-JUN-22 12.06.57.586000000 PM 21-JUN-22 03.06.38.514000000 PM +00 02:59:40.928000        1053                 1053            0                 0
              3 fw985yayy1mfz          2221895109               2867134094    50331649 23-JUN-2022 10:05:02 23-JUN-22 10.05.05.667000000 AM 23-JUN-22 01.04.45.251000000 PM +00 02:59:39.584000        1053                 1053            0                 0
              1 0gk5hj9wnp2vs          2221895109               2867134094    16777216 23-JUN-2022 10:57:04 23-JUN-22 10.57.04.857000000 AM 23-JUN-22 10.57.04.857000000 AM +00 00:00:00.000000           1                    1            0                 0

This is a section of a RTSM report when it was ok:

Status                                 :  DONE (ALL ROWS)                                    
 SQL ID                                 :  0gk5hj9wnp2vs                                      
 SQL Execution ID                       :  16777216                                           
 Execution Started                      :  06/15/2022 10:45:41                                
 First Refresh Time                     :  06/15/2022 10:45:44                                
 Last Refresh Time                      :  06/15/2022 10:45:44   

SQL Plan Monitoring Details (Plan Hash Value=2221895109)
===============================================================================================================================================================================================
| Id |                  Operation                  |            Name             |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity | Activity Detail |
|    |                                             |                             | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |   (# samples)   |
===============================================================================================================================================================================================
|  0 | SELECT STATEMENT                            |                             |         |      |         1 |     +3 |     1 |     2255 |      |       |     . |          |                 |
|  1 |   TEMP TABLE TRANSFORMATION                 |                             |         |      |         1 |     +3 |     1 |     2255 |      |       |     . |          |                 |
|  2 |    LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_5FD9DBC0F_97EDDA7B |         |      |         1 |     +3 |     1 |        1 |      |       |   1MB |          |                 |
|  3 |     FILTER                                  |                             |         |      |         1 |     +3 |     1 |     2255 |      |       |     . |          |                 |
|  4 |      NESTED LOOPS                           |                             |      39 |  713 |         1 |     +3 |     1 |     243K |      |       |     . |          |                 |
|  5 |       COLLECTION ITERATOR PICKLER FETCH     |                             |       1 |   29 |         1 |     +3 |     1 |        4 |      |       |     . |          |                 |
|  6 |       PARTITION LIST ITERATOR               |                             |    3853 |  684 |         1 |     +3 |     4 |     243K |      |       |     . |          |                 |
|  7 |        PARTITION LIST ITERATOR              |                             |    3853 |  684 |         1 |     +3 |     4 |     243K |      |       |     . |          |                 |
|  8 |         TABLE ACCESS STORAGE FULL           | POSITION_BALANCE            |    3853 |  684 |         1 |     +3 |     4 |     243K |   71 |  33MB |     . |          |                 |
|  9 |      NESTED LOOPS                           |                             |       1 |    9 |         1 |     +3 | 67382 |      140 |      |       |     . |          |                 |
| 10 |       NESTED LOOPS                          |                             |       1 |    7 |         1 |     +3 | 67382 |      140 |      |       |     . |          |                 |
| 11 |        NESTED LOOPS                         |                             |       1 |    3 |         1 |     +3 | 67382 |    67382 |      |       |     . |          |                 |
| 12 |         TABLE ACCESS BY INDEX ROWID         | CLASS_SCHEME                |       1 |    1 |         2 |     +2 | 67382 |    67382 |      |       |     . |          |                 |
| 13 |          INDEX UNIQUE SCAN                  | UK_CLASS_SCHEME_CODE        |       1 |      |         1 |     +3 | 67382 |    67382 |      |       |     . |          |                 |
| 14 |         TABLE ACCESS BY INDEX ROWID BATCHED | CLASS                       |       1 |    2 |         1 |     +3 | 67382 |    67382 |      |       |     . |          |                 |
| 15 |          INDEX RANGE SCAN                   | UK_CLASS_STATUS2            |       1 |    1 |         1 |     +3 | 67382 |    67382 |      |       |     . |          |                 |
| 16 |        TABLE ACCESS BY INDEX ROWID BATCHED  | BOOK_CLASS                  |       1 |    4 |         1 |     +3 | 67382 |      140 |      |       |     . |          |                 |
| 17 |         INDEX RANGE SCAN                    | UK_BOOK_CLASS_STATUS2       |       2 |    2 |         1 |     +3 | 67382 |     336K |  230 |   2MB |     . |          |                 |
| 18 |       INDEX RANGE SCAN                      | IDX_BOOK_STATUS_01          |       1 |    2 |         1 |     +3 |   140 |      140 |      |       |     . |          |                 |
| 19 |    VIEW                                     |                             |       1 |    2 |         1 |     +3 |     1 |     2255 |      |       |     . |          |                 |
| 20 |     TABLE ACCESS STORAGE FULL               | SYS_TEMP_5FD9DBC0F_97EDDA7B |       1 |    2 |           |        |     1 |          |      |       |     . |          |                 |
===============================================================================================================================================================================================

This is a section taken whilst running in the middle of the problem:

Status                                 :  EXECUTING                                          
 SQL ID                                 :  0gk5hj9wnp2vs                                      
 SQL Execution ID                       :  33554434                                           
 Execution Started                      :  06/21/2022 11:07:36                                
 First Refresh Time                     :  06/21/2022 11:07:41                                
 Last Refresh Time                      :  06/21/2022 13:55:15   

SQL Plan Monitoring Details (Plan Hash Value=2221895109)
================================================================================================================================================================================================================================
| Id   |                  Operation                  |            Name             |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  | Mem | Activity |            Activity Detail            | Progress |
|      |                                             |                             | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes |     |   (%)    |              (# samples)              |          |
================================================================================================================================================================================================================================
|    0 | SELECT STATEMENT                            |                             |         |      |           |        |       |          |      |       |   . |          |                                       |          |
|    1 |   TEMP TABLE TRANSFORMATION                 |                             |         |      |           |        |       |          |      |       |   . |          |                                       |          |
|    2 |    LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9E59B6_25BEB636 |         |      |           |        |     1 |          |      |       |   . |          |                                       |          |
|    3 |     FILTER                                  |                             |         |      |           |        |     1 |          |      |       |   . |          |                                       |          |
|    4 |      NESTED LOOPS                           |                             |      39 |  710 |         1 |     +5 |     1 |        0 |      |       |   . |          |                                       |          |
| -> 5 |       COLLECTION ITERATOR PICKLER FETCH     |                             |       1 |   29 |     10061 |     +5 |     1 |        1 |      |       |   . |          |                                       |          |
|    6 |       PARTITION LIST ITERATOR               |                             |    3853 |  681 |           |        |     1 |          |      |       |   . |          |                                       |          |
|    7 |        PARTITION LIST ITERATOR              |                             |    3853 |  681 |           |        | 38175 |          |      |       |   . |          |                                       |          |
| -> 8 |         TABLE ACCESS STORAGE FULL           | POSITION_BALANCE            |    3853 |  681 |     10065 |     +1 | 44021 |        0 | 625K | 162GB |   . |   100.00 | gc cr block remote read (1)           |     100% |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | gc cr grant 2-way (26)                |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | gc cr multi block request (85)        |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | gc current grant 2-way (1)            |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | Cpu (7530)                            |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | cell multiblock physical read (1946)  |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | cell single block physical read (215) |          |
|      |                                             |                             |         |      |           |        |       |          |      |       |     |          | cell smart table scan (15)            |          |
|    9 |      NESTED LOOPS                           |                             |       1 |    9 |           |        |       |          |      |       |   . |          |                                       |          |
|   10 |       NESTED LOOPS                          |                             |       1 |    7 |           |        |       |          |      |       |   . |          |                                       |          |
|   11 |        NESTED LOOPS                         |                             |       1 |    3 |           |        |       |          |      |       |   . |          |                                       |          |
|   12 |         TABLE ACCESS BY INDEX ROWID         | CLASS_SCHEME                |       1 |    1 |           |        |       |          |      |       |   . |          |                                       |          |
|   13 |          INDEX UNIQUE SCAN                  | UK_CLASS_SCHEME_CODE        |       1 |      |           |        |       |          |      |       |   . |          |                                       |          |
|   14 |         TABLE ACCESS BY INDEX ROWID BATCHED | CLASS                       |       1 |    2 |           |        |       |          |      |       |   . |          |                                       |          |
|   15 |          INDEX RANGE SCAN                   | UK_CLASS_STATUS2            |       1 |    1 |           |        |       |          |      |       |   . |          |                                       |          |
|   16 |        TABLE ACCESS BY INDEX ROWID BATCHED  | BOOK_CLASS                  |       1 |    4 |           |        |       |          |      |       |   . |          |                                       |          |
|   17 |         INDEX RANGE SCAN                    | UK_BOOK_CLASS_STATUS2       |       2 |    2 |           |        |       |          |      |       |   . |          |                                       |          |
|   18 |       INDEX RANGE SCAN                      | IDX_BOOK_STATUS_01          |       1 |    2 |           |        |       |          |      |       |   . |          |                                       |          |
|   19 |    VIEW                                     |                             |       1 |    2 |           |        |       |          |      |       |   . |          |                                       |          |
|   20 |     TABLE ACCESS STORAGE FULL               | SYS_TEMP_0FD9E59B6_25BEB636 |       1 |    2 |           |        |       |          |      |       |   . |          |                                       |          |
================================================================================================================================================================================================================================

Potentially I could set system level trace for this sql id and wait for it to reoccur but what events would I be setting to try to capture what causes this runtime decision?
I presume it wouldn’t be an optimizer trace?

Any thoughts or ideas?

Reluctant Local Index

Thu, 2022-04-21 06:45

I have having a couple of not-completely-related performance issues at the moment but which link back to a couple of queries against large tables choosing to do a FULL segment scan rather than using a non-prefixed local index.

The problems right now are not currently reproducible in or outside of production.

But when I look at a specific point in time at the reproduction of a problem, the system state is not necessarily as it was when the problem was occurring.

My main theory is that somewhere on these large partitioned tables there is intermittently an unusable partition of the index perhaps due some aspect of a data load, data movement or other maintenance operation on a different subpartition (where each partition represents a different day of a specific feed and each subpartition represents a version of that daily load).

However I have no evidence currently to support that theory.

If this theory was in the right ballpark, what would it mean?

That a query using bind variables (tick) would not use the index by default as the optimizer would not be able to guarantee that that shareable plan would not be accessing the unusable.

A query using literals would not be subject to the same restrictions as the optimizer would know, if an unusable subpartition was relevant, that the unusable subpartition was not relevant to this specific query (and any DDL which might make a relevant partition unusable would invalidate the shared cursor).

Whilst I was musing over this yesterday, I had a vague sense of familiarity and I found these previous posts to job my memory:

But just to illustrate my theory…

Let’s start with a roughly representative table:

create table  t1
(pkey         varchar2(24) not null
,spkey        number       not null
,id           varchar2(24) not null
,version      number       not null
,status       varchar2(24) not null
,filler       varchar2(255))
partition by list (pkey)
subpartition by list (spkey)
subpartition template
(subpartition sp_1 values (1))
(partition p_0 values ('X'));

alter table t1 add partition p_abc values ('ABC');
alter table t1 add partition p_def values ('DEF');
alter table t1 add partition p_ghi values ('GHI');

create index i1 on t1 (id) local;

create sequence s1 start with 10000000;
create type o1 is object
(pkey  varchar2(24)
,spkey        number
,id           varchar2(24));
/
create type c1 is table of o1;
/

select object_name, subobject_name, object_type from user_objects where object_name = 'T1' order by object_name, subobject_name nulls first;

OBJECT_NAM SUBOBJECT_ OBJECT_TYPE            
---------- ---------- -----------------------
T1                    TABLE                  
T1         P_0        TABLE PARTITION        
T1         P_0_SP_1   TABLE SUBPARTITION     
T1         P_ABC      TABLE PARTITION        
T1         P_ABC_SP_1 TABLE SUBPARTITION     
T1         P_DEF      TABLE PARTITION        
T1         P_DEF_SP_1 TABLE SUBPARTITION     
T1         P_GHI      TABLE PARTITION        
T1         P_GHI_SP_1 TABLE SUBPARTITION    

select object_name, subobject_name, object_type from user_objects where object_name = 'I1' order by object_name, subobject_name nulls first;

OBJECT_NAM SUBOBJECT_ OBJECT_TYPE            
---------- ---------- -----------------------
I1                    INDEX                  
I1         P_0        INDEX PARTITION        
I1         P_0_SP_1   INDEX SUBPARTITION     
I1         P_ABC      INDEX PARTITION        
I1         P_ABC_SP_1 INDEX SUBPARTITION     
I1         P_DEF      INDEX PARTITION        
I1         P_DEF_SP_1 INDEX SUBPARTITION     
I1         P_GHI      INDEX PARTITION        
I1         P_GHI_SP_1 INDEX SUBPARTITION     

And I’m going to seed some data simply:

declare
 p sys.odcivarchar2list := sys.odcivarchar2list('ABC','DEF','GHI');
begin
 for i in 1 .. p.count
 loop
     for j in 1 .. 10000
     loop
         insert into t1
         values (p(i), 1, 'PB:'||s1.nextval, 1, 'LATEST', rpad('X',255,'X'));
     end loop;
 end loop;
commit;
end;
/

select pkey, spkey, count(*), min(id), max(id) from t1 group by pkey, spkey;

PKEY      SPKEY   COUNT(*) MIN(ID)                  MAX(ID)                 
---- ---------- ---------- ------------------------ ------------------------
ABC           1      10000 PB:10000000              PB:10009999             
DEF           1      10000 PB:10010000              PB:10019999             
GHI           1      10000 PB:10020000              PB:10029999    

And then just to keep it representative to my real world problem, I’m going to run a bulk update to set the status of some of the rows to SUPERSEDED (and which is then in the real world followed by the INSERT of some LATEST versions of those rows supplied by the client but no need to do that here):

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom */
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/

And I’m going to lookup my sql id from v$sql and plug it into DBMS_XPLAN to see how my UPDATE performed:

select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',0));

-------------------------------------------------------------
| Id  | Operation                                    | Name |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT                             |      |
|   1 |  UPDATE                                      | T1   |
|   2 |   PARTITION LIST SINGLE                      |      |
|   3 |    PARTITION LIST SINGLE                     |      |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1   |
|   5 |      INDEX RANGE SCAN                        | I1   |
-------------------------------------------------------------

Range Scan of I1 as desired

Now I will mark one subpartition of the index unusable (different subpartition from one affected by the update) and repeat the update, the getting of the SQL and the fetching of the plan (different child number):

alter index i1 modify subpartition P_ABC_SP_1 unusable;
--Repeat dml
--lookup sql

select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',1));
-- ^ different child cursor number because marking the index subpartition as unusable
-- has invalidated the previous child cursor 0

---------------------------------------------
| Id  | Operation                    | Name |
---------------------------------------------
|   0 | UPDATE STATEMENT             |      |
|   1 |  UPDATE                      | T1   |
|   2 |   PARTITION LIST SINGLE      |      |
|   3 |    PARTITION LIST SINGLE     |      |
|   4 |     TABLE ACCESS STORAGE FULL| T1   |
---------------------------------------------

The access method is no longer an index range scan.

As I covered initially, as part of the parse process the execution plan has to be good for all possible ranges of supplied values in the event that that SQL child cursor is shared for executions with different binds than those initially parsed with. Because an update (or select etc, etc) against rows for table subpartition (ABC,1) cannot use the index because that index subpartition is unusable, the optimizer has to discount it.

We can force the index:

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom index(t1 (id))*/
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/

select * from table(dbms_xplan.display_cursor('3m2xyxgruxkpr',0));

-------------------------------------------------------------
| Id  | Operation                                    | Name |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT                             |      |
|   1 |  UPDATE                                      | T1   |
|   2 |   PARTITION LIST SINGLE                      |      |
|   3 |    PARTITION LIST SINGLE                     |      |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1   |
|   5 |      INDEX RANGE SCAN                        | I1   |
-------------------------------------------------------------

But note, as covered in my earlier link, if we force the index in a situation where it can’t be used, it will error in current versions:

alter index i1 modify subpartition P_DEF_SP_1 unusable;
declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom index(t1 (id))*/
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/
ORA-01502: index 'DOM.I1' or partition of such index is in unusable state
ORA-06512: at line 8
01502. 00000 -  "index '%s.%s' or partition of such index is in unusable state"
*Cause:    An attempt has been made to access an index or index partition
           that has been marked unusable by a direct load or by a DDL
           operation
*Action:   DROP the specified index, or REBUILD the specified index, or
           REBUILD the unusable index partition

If I rebuild that second partition again, then I can look at my second option – using literals not binds, at least for the partition keys, but really we then lose the option of FORALL.

alter index i1 rebuild subpartition P_DEF_SP_1;

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   for i in 1 .. v1.count
   loop
     execute immediate 
     'update /*+ find_me_dom */
             t1
      set    status = ''SUPERSEDED''
      where  pkey   = '''||v1(i).pkey||''''||'
      and    spkey  = '||v1(i).spkey||'
      and    id     = :1' using v1(i).id;
   end loop;
end;
/


select * from table(dbms_xplan.display_cursor('5a3vcac58x32q',0));

----------------------------------------
| Id  | Operation               | Name |
----------------------------------------
|   0 | UPDATE STATEMENT        |      |
|   1 |  UPDATE                 | T1   |
|   2 |   PARTITION LIST SINGLE |      |
|   3 |    PARTITION LIST SINGLE|      |
|   4 |     INDEX RANGE SCAN    | I1   |
----------------------------------------

At least we have options whilst playing a waiting game to see if we can observe a problem state which might cause such an issue…. or wait for other possibilities to make themselves known…

Need an Ace Up Your Sleeve?

Thu, 2022-04-14 04:46

I’m open to new opportunities.

There’s a bit about me here https://orastory.wordpress.com/about-2/ and there are links to my linkedin profile both there and in the side nav

I’m currently an Oracle Ace (for a few more weeks at least!), I love writing code, digging out significant performance gains in existing processes and troubleshooting problems. Simples.

If you’re interested, please drop me a mail, add a comment, send me a linkedin message, anything.

Performance Issue with inline view, UNION ALL and Virtual Column

Wed, 2021-05-12 07:05

This is the distillation of a performance problem in some regulatory reporting where a mistake in the optimizer cardinality estimates causes all sorts of knock-on performance issues. This post does not look at those knock-on effects but examines what is happening with the optimizer in this case. Most of the digging in the issue belongs to Valerii Satybaldyev and thanks to Sayan Malakshinov, notably for the contributions on the oracle-l thread.

Script to reproduce (currently on 19.6):

drop table t1;


create table t1
(id            number not null
,version       number not null
,create_ts     timestamp not null
,modify_ts     timestamp
,status        varchar2(24) generated always as (NVL2("MODIFY_TS",'SUPERSEDED','LATEST'))
,id2           number not null
,yn            varchar2(1) not null
,business_date date not null);

insert into t1
(id, version, create_ts, id2, yn, business_date)
select rownum
,      1
,      systimestamp
,      rownum
,      case when mod(rownum,2) = 1 then 'Y' else 'N' end
,      trunc(sysdate,'MON') + mod(rownum,10)
from   dual
connect by level <= 1000;

exec dbms_stats.gather_table_stats(USER,'T1');
 

explain plan for
with x as
(select * from t1
 union all
 select * from t1)
select *
from x
where yn = 'Y';

select * from table(dbms_xplan.display);

The problem comes with the estimate of 1 row in the outermost SELECT, particularly for subsequent joins this cardinality estimate causes significant performance problems.

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    42 |   547   (0)| 00:00:01 |
|   1 |  VIEW                       |      |  1000 | 40000 |   548   (0)| 00:00:01 |
|   2 |   UNION-ALL                 |      |       |       |            |          |
|*  3 |    TABLE ACCESS STORAGE FULL| T1   |   500 | 20000 |   274   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS STORAGE FULL| T1   |   500 | 20000 |   274   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - storage("T1"."YN"='Y')
       filter("T1"."YN"='Y')
   4 - storage("T1"."YN"='Y')
       filter("T1"."YN"='Y')

If we look at the 10053 trace, we can see the root cause of the problem.
Compare the BASE STATISTICAL INFORMATION snippets for T1 and for the inline view X.
Note in particular the numbers for column YN which is the predicate as well as for ID2 and BUSINESS_DATE.
Note that these are columns which appear after the virtual column STATUS.
(in the original trace file, the entry under T1 for column #8 comes before column #7, I have reordered below just to make the correlation between T1 and X more obvious)

Table Stats::
  Table: T1  Alias: T1
  #Rows: 1000  SSZ: 0  LGR: 0  #Blks:  1006  AvgRowLen:  40.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#8): BUSINESS_DATE(DATE)
    AvgLen: 8 NDV: 10 Nulls: 0 Density: 0.100000 Min: 2459336.000000 Max: 2459345.000000
  Column (#7): YN(VARCHAR2)
    AvgLen: 2 NDV: 2 Nulls: 0 Density: 0.500000
  Column (#6): ID2(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
  Column (#4): MODIFY_TS(TIMESTAMP)
    AvgLen: 1 NDV: 0 Nulls: 1000 Density: 0.000000
  Column (#3): CREATE_TS(TIMESTAMP)
    AvgLen: 11 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2459347.407141 Max: 2459347.407141
  Column (#2): VERSION(NUMBER)
    AvgLen: 3 NDV: 1 Nulls: 0 Density: 1.000000 Min: 1.000000 Max: 1.000000
  Column (#1): ID(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000

Table Stats::
  Table:  X  Alias:  X  (NOT ANALYZED)
  #Rows: 2000  SSZ: 0  LGR: 0  #Blks:  2012  AvgRowLen:  40.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#8): BUSINESS_DATE(DATE)  NO STATISTICS (using defaults)
    AvgLen: 7 NDV: 0 Nulls: 0 Density: 0.000000
  Column (#7): YN(VARCHAR2)
    AvgLen: 8 NDV: 10 Nulls: 0 Density: 0.100000 Min: 2459336.000000 Max: 2459345.000000
  Column (#6): ID2(NUMBER)
    AvgLen: 2 NDV: 2 Nulls: 0 Density: 0.500000
  Column (#5): STATUS(VARCHAR2)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
  Column (#4): MODIFY_TS(TIMESTAMP)
    AvgLen: 1 NDV: 0 Nulls: 2000 Density: 0.000000 Min: 0.000000 Max: 0.000000
  Column (#3): CREATE_TS(TIMESTAMP)
    AvgLen: 11 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2459347.407141 Max: 2459347.407141
  Column (#2): VERSION(NUMBER)
    AvgLen: 3 NDV: 1 Nulls: 0 Density: 1.000000 Min: 1.000000 Max: 1.000000
  Column (#1): ID(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000

For inline view X, column STATUS inherits the stats for ID2, ID2 gets those of YN, YN those of BUSINESS_DATE and BUSINESS_DATE gets NO STATISTICS.
So for our predicate on YN we get completely the wrong base statistics.
Then in addition, because the supplied value is out of range from the wrong statistics, estimates get further pro-rated.

Potential workaround is not risk-free as it moves the implicit order of the selected columns which could affect certain code.

alter table t1 modify status invisible;
alter table t1 modify status visible;

explain plan for
with x as
(select * from t1
 union all
 select * from t1)
select *
from x
where yn = 'Y';

select * from table(dbms_xplan.display);

Plan hash value: 3505968351
------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |  1000 | 90000 |   548   (0)| 00:00:01 |
|   1 |  VIEW                       |      |  1000 | 90000 |   548   (0)| 00:00:01 |
|   2 |   UNION-ALL                 |      |       |       |            |          |
|*  3 |    TABLE ACCESS STORAGE FULL| T1   |   500 | 20000 |   274   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS STORAGE FULL| T1   |   500 | 20000 |   274   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Reason it works is that the virtual column is now logically at the end of the column definitions.

If we compare the trace once “fixed”, then we see the following in the X section:

 Table:  X  Alias:  X  (NOT ANALYZED)
...
  Column (#8): STATUS(VARCHAR2)  NO STATISTICS (using defaults)

Online Statistics Gathering

Thu, 2021-04-08 10:07

Online Statistics Gathering, https://sqlmaria.com/2017/01/03/online-statistics-gathering/, is a pretty cool feature for the ages-old version that is Oracle 12.1, especially data warehouses.

As a recap, when you do direct path load into an empty segment, Oracle will take advantage of gathering statistics on the data whilst it is loading it, i.e no need to go back and re-read in an explicit stats gathering.

Note that, per Maria’s note, only base table and column statistics are gathered, not histograms but those can be gathered without re-gathering base column statistics via GATHER AUTO.

Old news.

We have only recently upgraded a complex application from 11.2.0.4 and hit an issue which is really obvious in hindsight but just worth pointing out, particular in combination with another feature?

I’m sure the behaviour below will come as no surprise in 11.2.0.4

drop table t1;
create global temporary table t1
on commit preserve rows
as
select * from user_objects where rownum <= 100;

explain plan for
select * from t1;

select * from table(dbms_xplan.display);
select num_rows from user_tables where table_name = 'T1';
select column_name, num_nulls, num_distinct from user_tab_columns where table_name = 'T1';

The output of note:

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation                      | Name | Rows | Bytes | Cost (%CPU)| Time  |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |      |  100 | 19000 |   2  (0)| 00:00:01 |
|  1 | TABLE ACCESS STORAGE FULL      | T1   |  100 | 19000 |   2  (0)| 00:00:01 |
----------------------------------------------------------------------------------

Note
-----
  - dynamic sampling used for this statement (level=2)


 NUM_ROWS
----------
      

COLUMN_NAME           NUM_NULLS NUM_DISTINCT
-------------------- ---------- ------------
GENERATED                       
SECONDARY                       
NAMESPACE                       
EDITION_NAME                      
OBJECT_NAME                      
SUBOBJECT_NAME                     
OBJECT_ID                       
DATA_OBJECT_ID                     
OBJECT_TYPE                      
CREATED                        
LAST_DDL_TIME                     
TIMESTAMP                       
STATUS                         
TEMPORARY                       
                 

We load 100 rows, there are no stats, dynamic sampling kicks in, bingo the optimizer expects 100 rows.

Then for > 11.2.0.4 ( in this case 19.6), with the addition of SCOPE column in the script for 19c, we see that we now have table and column statistics after the load despite no explicit stats gathering call. Which, given the feature we’re discussing, should be completely obvious and expected.

I’ve used a temporary table here deliberately purely to facilitate a comment at the end. Clearly this could be a normal heap table – but the SCOPE in the output is specific to the new feature of temporary table private statistics.

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation                   | Name | Rows | Bytes | Cost (%CPU)|     Time |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |      |  100 | 11300 |      2  (0)| 00:00:01 |
|  1 | TABLE ACCESS STORAGE FULL   | T1   |  100 | 11300 |      2  (0)| 00:00:01 |
----------------------------------------------------------------------------------

Note
-----
  - Global temporary table session private statistics used


  NUM_ROWS SCOPE  
---------- -------
           SHARED 
       100 SESSION

COLUMN_NAME            NUM_NULLS NUM_DISTINCT SCOPE  
--------------------- ---------- ------------ -------
OBJECT_NAME                    0            7 SESSION
SUBOBJECT_NAME                 7           93 SESSION
OBJECT_ID                      0          100 SESSION
DATA_OBJECT_ID                 3           97 SESSION
OBJECT_TYPE                    0            3 SESSION
CREATED                        0           95 SESSION
LAST_DDL_TIME                  0           97 SESSION
TIMESTAMP                      0           96 SESSION
STATUS                         0            1 SESSION
TEMPORARY                      0            1 SESSION
GENERATED                      0            2 SESSION
SECONDARY                      0            1 SESSION
NAMESPACE                      0            1 SESSION
EDITION_NAME                 100            0 SESSION
SHARING                        0            1 SESSION
EDITIONABLE                  100            0 SESSION
ORACLE_MAINTAINED              0            1 SESSION
APPLICATION                    0            1 SESSION
DEFAULT_COLLATION             94            1 SESSION
DUPLICATED                     0            1 SESSION
SHARDED                        0            1 SESSION
CREATED_APPID                100            0 SESSION
CREATED_VSNID                100            0 SESSION
MODIFIED_APPID               100            0 SESSION
MODIFIED_VSNID               100            0 SESSION

Again, we’ve loaded 100 rows, as part of the direct path load, we have some base & column statstics, the optimizer expects 100 rows. No dynamic sampling necessary

But what if our data is loaded in multiple steps, might this new online statistics gathering cause us a bit of a problem?

It only gathers the stats when the segment is empty.

I’m not going to repeat the output above for the table & column statistics, just the execution plan detail.

drop table t1;
create global temporary table t1
on commit preserve rows
as
select * from user_objects where rownum <= 100;
insert /*+ append */ into t1
select * from user_objects
commit;

insert /*+ append */ into t1
select * from user_objects
commit;

explain plan for
select * from t1;

Which in 11.2.0.4 gives:

Plan hash value: 3617692013
---------------------------------------------------------------------------------------
| Id | Operation                      | Name |   Rows | Bytes | Cost (%CPU)|    Time  |
---------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |      |  32376 | 6007K |    93   (0)| 00:00:01 |
|  1 | TABLE ACCESS STORAGE FULL      | T1   |  32376 | 6007K |    93   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Note
-----
  - dynamic sampling used for this statement (level=2)

select count(*) from t1;

  COUNT(*)
----------
     34392

And in 19.6

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation                   | Name | Rows | Bytes | Cost (%CPU)|     Time |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |      |  100 | 11300 |      2  (0)| 00:00:01 |
|  1 | TABLE ACCESS STORAGE FULL   | T1   |  100 | 11300 |      2  (0)| 00:00:01 |
----------------------------------------------------------------------------------

Note
-----
  - Global temporary table session private statistics used

select count(*) from t1;

  COUNT(*)
----------
     59712*

*With an obvious difference in actual rows, because these are different environments on different versions.

The problem now is that our subsequent processing code which drives off a table populated with different volumes of data in multiple stages used to work fine in 11.2.0.4 but is now at risk of some serious consequences from having cardinality mis-estimates thanks to inaccurate statistics from our new feature which is behaving as it should.

In our real-world example, it was not always the case that there was a big skew between the first load and the subsequent loads and actually this consequence, obvious in hindsight, took a few weeks to reveal itself until that first load into an empty segment happened to be very small followed by some other loads which were much larger.

Furthermore, this also involved the private temporary table aspect and so when you’re looking at solutions, you might consider whether some explicit stats gathering at the end of the load is the solution – in which case should it be private stats or shared stats or whether it even matters when quite possibly the easiest solution is to just use dynamic sampling in the subsequent select/s.

I was reminded of this issue when writing this other article earlier on a more serious problem that online statistics gathering was causing:

Real-time Statistics ORA-00600s / Integer Overflow

Thu, 2021-04-08 07:32

Unfortunately this is just an anecdotal long after-the-fact note on issues observed on upgrade from 11.2.0.4 to 19.6 with new feature real-time statistics. I thought I had joted this down previously, but doesn’t appear to be another article published or in draft. Issue was not reproducible outside of the application unfortunately and was also intermittent.

Feature was causing several issues in one particular area of the application.

First off was ORA-00600s in the processing with immediate disconnect on the processing session (apart from trace file generation)

Second was sometimes leaving evidence of integer overflow issues in column statistics.

Third, “corrupt” statistics then leading to subsequent bad execution plans for complicated queries evidenced by MERGE JOIN / MERGE JOIN CARTESIAN and row estimates of max 18E.

Examples of the latter complex queries with cardinality issues leading to bad plans and severe performance problems, when distilled to simplest form, could be seen as follows:

explain plan for
SELECT *
FROM   XXXXX
WHERE  business_date     = :busines_date
AND    id                = :id
AND    processed_flag    = 'Y';

select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |    18E|    15E|   647K  (1)| 00:00:26 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| XXXXXXXXXXXX |    18E|    15E|   647K  (1)| 00:00:26 |
|*  2 |   INDEX RANGE SCAN                  | XXXXXXXXXXXX |   733K|       |  1538   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Note
-----
   - dynamic statistics used: statistics for conventional DML
explain plan for
SELECT /*+ opt_param('optimizer_features_enable','18.1.0') */ *
FROM   XXXXX
WHERE  business_date     = :busines_date
AND    id                = :id
AND    processed_flag    = 'Y';

select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |     5 |  1405 |    13   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| XXXXXXXXXXXX |     5 |  1405 |    13   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | XXXXXXXXXXXX |    10 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

The root cause corruption evidenced as follows:

select num_nulls, notes from dba_tab_col_statistics where....

NUM_NULLS                SAMPLE_SIZE NOTES
----------------------   ----------- ------------------
-917567866115141580119           267 STATS_ON_CONVENTIONAL_DML
                   500           267 STATS_ON_CONVENTIONAL_DML
                   500           267 STATS_ON_CONVENTIONAL_DML
    -10558214104677900           267 STATS_ON_CONVENTIONAL_DML
-917567866114408000000           267 STATS_ON_CONVENTIONAL_DML
-917567866114408000000           267 STATS_ON_CONVENTIONAL_DML
         -115153305100           267 STATS_ON_CONVENTIONAL_DML

It should be obvious that NUM_NULLs cannot be < 1 and also some strange, unsuitable values in some parts on LOW_VALUE which I’ve omitted from output above.

Traction with Oracle Support was disappointing and, unfortunately, once they knew we didn’t have a problem if we just turned the feature off, that was good enough for them and to be honest we just needed to move on in our testing – this was back in August 2020.

So, feature real-time statistics is turned off across the board via the parameters below which being underscore parameters we had Oracle Support’s blessing to do.

_optimizer_use_stats_on_conventionaml_dml

_optimizer_gather_stats_on conventional_dml

Oracle 19c Upgrade: Query hanging on PX Deq: Join Ack Part 2

Thu, 2021-04-08 04:34

Previously I had blogged about an intermittent problem which was affecting one of our feeds where a parallel query would just hang as the QC would wait forever on a parallel slave which never joined the party.

The theory was the parallel slave was affected by another issue – possibly the slave process died and until PMON cleans it up, the QC assumes it is still alive (ref doc id 250960.1)

Intermittent problems can be a right pain but this issue mainly affected one particular process most days kicking off between 1 and 2 am – same process runs 30-40 times per day but only tended to have the issue once so it seemed somewhat data dependent – but also would affect some other processes in different schemas very occasionally.

Over time, evidence built from trace files suggest that the issue was mostly related to:

ORA-00600: internal error code, arguments: [17182]

or

ORA-00600: internal error code, arguments: [17114]

or both in the same error stack

If the QC received the error, connection would terminate instantly (apart from trace file generation), if the slave received the error then QC would hang.

One patch – 31045929 – suggested by Oracle Support was applied but was not effective.

Second patch – 28907129 – similar.

Ultimately, a workaround, which was tested to validate that the second patch was in the right ballpark, was effective and that was to disable _optimizer_gather_stats_on_load (https://sqlmaria.com/2017/01/03/online-statistics-gathering/)

It’s disappointing to have to turn off yet another “new” feature – this one related to stats gathering for direct path into an empty segment – when we have already had to turn off real-time statistics for conventional dml previously ( link to future post – I thought I had blogged about this but seems not) but hopefully we will still be able to pursue patch fix to success.

Count Rows in Partition Where…

Mon, 2021-02-22 09:12

Just an illustration of a SQL / XML utility to count rows in a partition.

In my case, my requirement was to count rows in all subpartitions where there is data which might not belong in that partition (due to NOVALIDATE partition exchange). My thoughts were that I wanted the equivalent of a NESTED LOOP – for each partition… do XYZ and was happy for this to run for a while in the background without consuming significant parallel resources.

Partition keys here are not nullable.

with s as
(select /*+ */ 
        sp.table_owner, sp.table_name, sp.subpartition_name
 ,      sn.snapshot_id, sn.version
 from   dba_tab_subpartitions sp 
 join   my_snaphot_table sn on (sn.subpartition_name = sp.subpartition_name)
 where  sp.table_owner = :owner 
 and    sp.table_name  = :table_name)
select s.*, x.cnt
from   s
 ,      xmltable('for $i in /ROWSET/ROW/CNT
                  return $i'
                 passing xmltype(
                          dbms_xmlgen.getxml
                          ('select /*+ parallel(2) */ count(*) cnt '
                         ||'from '||s.table_owner||'.'||s.table_name||' SUBPARTITION ('||s.subpartition_name||') '
                         ||'where (snapshot_id, snapshot_version) != (('''||s.snapshot_id||''', '||s.version||'))'
                           ))
                 columns cnt number path '/') x
where x.cnt > 0; 

Oracle 19c Upgrade: Bug 31602782 – Same Instance Slave Parse Failure Flood Control

Mon, 2021-02-22 08:26

It’s almost a universal truth that for any non-trivial application you can expect to hit one or two serious incidents in production post-upgrade, most often due to some combination of workload and/or timing which just didn’t happen in testing beforehand.

In this instance (/these instances), it was the bug 31602782, the patch for which has the above title, the doc id description being “Contention on CURSOR: Pin S wait on X when PQ slave’s execution plan does not match with QC.

This has been around since 12.2 and is first included in 20.1. The doc describes it as

When a parallel query is executed, QC sends parse message to all slaves in all instances…. if the slave could not build the cursor with the same plan as QC and could not join, another slave tries the same and so on until all slaves get exhausted

In practice, what this meant for us was avery large and significant contention on library cache lock and cursor: pin s wait for x affecting all data loads, large parsing backlogs, issues with anything then wanting library cache lock on the same underlying objects – e.g.partition maintenance operations, etc

There were two specific processes involved and had we not got a bug diagnosis when we did, I would have started to rewrite those processes. It is not uncommon that the flows which end up hitting bugs and performance issues have mileage left in them in terms of optimisation. In hindsight, I believe we did hit this problem in UAT but not in this area and in UAT I had rewritten the process in question as it was problematic anyway,

Certainly once this patch was applied, our most urgent post-upgrade issues were largely resolved.

As an aside, I would urge anyone looking at an upgrade to review beforehand the list of Important Recommended One-off Patches which for 19c is Doc Id: 2720807.1

Oracle 19c Upgrade: Query hanging on PX Deq: Join Ack

Mon, 2021-02-22 08:05

Next installment of issues observed post Oracle Upgrade (11.2.0.4 to 19.6)

Intermittent query affecting a number of SQL statement, but one particular SQL more than any other.

Event PX Deq: Join Ack is usually seen briefly when the QC (Query Co-ordinator) has to build the slave sets in order to execute a SQL statement in parallel. A message is sent to each slave and then the QC waits for acknowledgement (reference Doc Id: 250960.1)

On my application, when this problem occurs, the QC will hang forever waiting for the acknowledgement from one or a slow sequence of slaves. Longest I saw before intervention was 3 days.

Per that note, possible reason for prolonged wait times is that something happened to the PQ slave process. Unless/until PMON cleans up the internal structures, any new QC will assume the process is alive. Another possible cause is depletion of SGA memory such that the slave cannot establish its response channel.

With us, this issue is currently under investigation with Oracle Support but is believed to correlate to memory corruption errors:

ORA-00600: internal error code, arguments: [17182]

which ties back to bug 31045929: PGA Memory Corruption Caused By Cursor Frame Overrun

Oracle 19c Upgrade: Query reading way more partitions than it should

Mon, 2021-02-22 07:51

Still a work in progress… but despite a query plan which details that partition pruning should be happening, evidence (both physical reads and obvious timing) suggest that an OLTP-style query is reading all table partitions.

There is no evidence to suggest that this was happening before the upgrade (11.2.0.4 to 19.6) and plenty of evidence of it afterwards. Not consistent suggesting some significant triggering event.

Oracle: Groundbreakers Developer Community https://community.oracle.com/tech/developers/discussion/4480886/partition-list-iterator-reading-too-many-partitions/ currently has all the latest information. If I ever get to a conclusion, then I might collate & summarise here.

Exchange Partition Validation

Tue, 2020-12-22 10:06

Testing an upgrade from 11.2.0.4 to 19.6, we observed a very significant degradation in partition exchange performance.

Is it a case of expected behaviour or not?

The documentation is quite clear in both 11.2 and 19c:

https://docs.oracle.com/cd/E11882_01/server.112/e25523/part_admin002.htm#i1107555

https://docs.oracle.com/en/database/oracle/oracle-database/19/vldbg/maintenance-partition-tables-indexes.html#GUID-E08650B4-06B1-43F9-91B0-FBF685A3B848

When you specify WITHOUT VALIDATION for the exchange partition operation, this is normally a fast operation because it involves only data dictionary updates. However, if the table or partitioned table involved in the exchange operation has a primary key or unique constraint enabled, then the exchange operation is performed as if WITH VALIDATION were specified to maintain the integrity of the constraints.

Personally, in a data warehouse with exchange partition I would tend to have defined such a constraint as ENABLE NOVALIDATE if it was even required.

The CONSTRAINT is a PK constraint backed up a local index, partition keys as leading columns.

The trace file points to a clear culprit:

 select /*+ first_rows(1) ordered */ 1 
 from "X"."FB_P_161261_SP_9" A? ,"X"."FACILITY_BALANCE" B 
 where "A"."SNAPSHOT_ID" = "B"."SNAPSHOT_ID" and "A"."SNAPSHOT_VERSION" = "B"."SNAPSHOT_VERSION" and "A"."FACILITY_BALANCE_ID" = "B"."FACILITY_BALANCE_ID" and "A"."VERSION" = "B"."VERSION" 
 and ( tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,B? .rowid )  <  62358 or  tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,B? .rowid )  >  62358 ) 
 and tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,A? ."SNAPSHOT_ID" ,A? ."SNAPSHOT_VERSION" )  <>  62358 and rownum < 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      8.40       8.46          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1   4229.27    5278.73   27688674   27783294          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3   4237.67    5287.20   27688674   27783294          0           0 

We know from Jonathan Lewis’s work on http://www.nocoug.org/download/2008-11/partitioning_NOCOUG.pdf that this is expected internal SQL from a ENABLE VALIDATE constraint in an EXCHANGE WITHOUT VALIDATION scenario

So why the big change in performance?

Probably this is an indication of just a simple execution plan issue. I would need to trace the same in 11g to confirm which is easier said than done under the circumstances.

This is the info from the 19c tkprof at least:

Rows (1st) Rows (avg) Rows (max) Row Source Operation

---------- ---------- ---------- ---------------------------------------------------
         0          0          0 COUNT STOPKEY (cr=27783294 pr=27688674 pw=0 time=287636041 us starts=1)
         0          0          0  NESTED LOOPS (cr=27783294 pr=27688674 pw=0 time=287636035 us starts=1 cost=4 size=1061 card=1)
 238651024  238651024  238651024   VIEW VW_JF_SET$91EDBF25 (cr=27783288 pr=27688673 pw=0 time=3245932610 us starts=1 cost=4 size=62 card=2)
 238651024  238651024  238651024    UNION-ALL (cr=27783288 pr=27688673 pw=0 time=2708356160 us starts=1)
  11351302   11351302   11351302     PARTITION COMBINED ITERATOR PARTITION: KEY KEY (cr=69570 pr=69469 pw=0 time=5491712 us starts=1 cost=2 size=43 card=1)
  11351302   11351302   11351302      INDEX STORAGE FAST FULL SCAN FIRST ROWS PK_FACILITY_BALANCE PARTITION: KEY KEY (cr=69570 pr=69469 pw=0 time=4380693 us starts=20 cost=2 size=43 card=1)(object id 42709299)
 227299722  227299722  227299722     PARTITION COMBINED ITERATOR PARTITION: KEY KEY (cr=27713718 pr=27619204 pw=0 time=2376271490 us starts=1 cost=2 size=43 card=1)
 227299722  227299722  227299722      INDEX STORAGE FAST FULL SCAN FIRST ROWS PK_FACILITY_BALANCE PARTITION: KEY KEY (cr=27713718 pr=27619204 pw=0 time=1916674628 us starts=62357 cost=2 size=43 card=1)(object id 42709299)
         0          0          0   INDEX UNIQUE SCAN FB_P_161261_SP_9$C3 (cr=6 pr=1 pw=0 time=1734545604 us starts=238651024 cost=0 size=1030 card=1)(object id 72993779)  
		

What is clear is that if the constraint definition is changed to ENABLE NOVALIDATE then the exchange partition takes about 1 minute. Which still feels a bit slow but a heck of a lot better than 90 minutes.

If you see Mohamed Houri’s comment below, this experience is not isolated and in his case, they went for EXCHANGE PARTITION (WITH VALIDATION). After the Christmas break, I will re-evaluate the choice I made.

Parallel Window Consolidator Calls The Stops

Mon, 2020-11-23 04:59

Sharing observations of performance issue arising out of testing 19.6 upgrade from 11.2.0.4 (bug possibly from 12+ looking at the fixes which work).

Sharing this one in particular as it doesn’t seem that common from anecdotal evidence (forums, blogs, support), it doesn’t appear fixed yet, progress with support has been disappointing, however circumstances don’t seem particularly niche.

We’ve had a few issues with a number of parallel queries just “hanging” forever.

Issue seems to be related to WINDOW CONSOLIDATOR and the parallel distribution method.

Haven’t been able to find a good matching bug via Oracle Support (nor happy with the progress of SR with them) but found a good match on a Lothar Flatz blog post which led me to the PQ_DISTRIBUTE_WINDOW as cause.

SQL contains parallel + join to view, view contains analytic
SQL Executes in under a second “normally”.

Without a fix, the SQL will just hang for a long time. The longest I’ve left it is 3 days but given the wait states of the sessions involved, there’s no reason to think it would stop of its own accord.

Jumping ahead to the workarounds which work, any of the following:

  • Turn off fix_control 13345888 via alter session or opt_param
  • Turn off “_adaptive_window_consolidator_enabled” via alter session/system or opt_param
  • Hint a PQ distribution method other than 2. 3 doesn’t work here so 1 is the other option – PQ_DISTRIBUTE_WINDOW(@query_block 1)

Here we see some evidence from just now, running for nearly 30 minutes:

select s.inst_id i,s.sid,s.serial#,s.module
 ,      s.blocking_session blks,s.blocking_instance bi,s.final_blocking_session f_blk_s, s.final_blocking_instance fbi
 ,      CASE WHEN state != 'WAITING' THEN 'WORKING' ELSE 'WAITING' END AS state
 ,      CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue' ELSE event END AS sw_event, s.seconds_in_wait wait
 ,      s.sql_id,s.sql_exec_id,to_char(s.sql_exec_start,'DD-MON-YYYY HH24:MI:SS') exec_start
 from   gv$session    s
 where  sql_id = '5xyrp7v4djagv';

         I        SID    SERIAL# MODULE                     BLKS         BI    F_BLK_S        FBI STATE   SW_EVENT                        WAIT SQL_ID        SQL_EXEC_ID EXEC_START                   
---------- ---------- ---------- -------------------- ---------- ---------- ---------- ---------- ------- ------------------------- ---------- ------------- ----------- -----------------------------
         2       2805      15947 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       3095       3608 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       3367      28066 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       5610       8452 SQL Developer                                                    WAITING PX Deq: Table Q Normal          1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       5885      54481 SQL Developer              9672          2       9672          2 WAITING PX Deq: Execute Reply           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       8828      57832 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       9111      37143 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       9383      50792 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         
         2       9672      29993 SQL Developer                                                    WAITING PX Deq: Execution Msg           1492 5xyrp7v4djagv    33554434 26-OCT-2020 12:18:59         

Real-time SQL Monitoring report looks per below (note Duration compared to any other progress/metrics)… note that in later versions of Oracle, RTSM has a habit of timing out and reporting DONE(ERROR) even though the SQL is still going:



 Global Information
 ------------------------------
 Status              :  EXECUTING                  
 Instance ID         :  2                          
 SQL ID              :  5xyrp7v4djagv              
 SQL Execution ID    :  33554434                   
 Execution Started   :  10/26/2020 12:18:59        
 First Refresh Time  :  10/26/2020 12:18:59        
 Last Refresh Time   :  10/26/2020 12:18:59        
 Duration            :  1592s                      
 Module/Action       :  SQL Developer/-            
 Service             :  LNTDH8U.UK.DB.COM          
 Program             :  SQL Developer              


Global Stats
=========================================
| Elapsed |   Cpu   |  Other   | Buffer |
| Time(s) | Time(s) | Waits(s) |  Gets  |
=========================================
|    0.00 |    0.00 |     0.00 |      3 |
=========================================

Parallel Execution Details (DOP=4 , Servers Allocated=8)
==========================================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |  Other   | Buffer | Wait Events |
|                |       |         | Time(s) | Time(s) | Waits(s) |  Gets  | (sample #)  |
==========================================================================================
| PX Coordinator | QC    |         |    0.00 |    0.00 |     0.00 |      3 |             |
| p000           | Set 1 |       1 |         |         |          |        |             |
| p001           | Set 1 |       2 |         |         |          |        |             |
| p002           | Set 1 |       3 |         |         |          |        |             |
| p003           | Set 1 |       4 |         |         |          |        |             |
| p004           | Set 2 |       1 |         |         |          |        |             |
| p005           | Set 2 |       2 |         |         |          |        |             |
| p006           | Set 2 |       3 |         |         |          |        |             |
| p007           | Set 2 |       4 |         |         |          |        |             |
==========================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3844894891)
===============================================================================================================================================================================================
| Id   |                           Operation                           |              Name              |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|      |                                                               |                                | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================================================================================
|    0 | SELECT STATEMENT                                              |                                |         |      |           |        |     1 |          |          |                 |
| -> 1 |   PX COORDINATOR                                              |                                |         |      |         1 |     +0 |     1 |        0 |   100.00 | Cpu (1)         |
|    2 |    PX SEND QC (RANDOM)                                        | :TQ20006                       |      2M |  445 |           |        |       |          |          |                 |
|    3 |     HASH JOIN                                                 |                                |      2M |  445 |           |        |       |          |          |                 |
|    4 |      PART JOIN FILTER CREATE                                  | :BF0000                        |     232 |  441 |           |        |       |          |          |                 |
|    5 |       PX RECEIVE                                              |                                |     232 |  441 |           |        |       |          |          |                 |
|    6 |        PX SEND BROADCAST                                      | :TQ20005                       |     232 |  441 |           |        |       |          |          |                 |
|    7 |         VIEW                                                  |                                |     232 |  441 |           |        |       |          |          |                 |
|    8 |          WINDOW CONSOLIDATOR BUFFER                           |                                |     232 |  441 |           |        |       |          |          |                 |
|    9 |           PX RECEIVE                                          |                                |     232 |  441 |           |        |       |          |          |                 |
|   10 |            PX SEND HASH                                       | :TQ20004                       |     232 |  441 |           |        |       |          |          |                 |
|   11 |             WINDOW BUFFER                                     |                                |     232 |  441 |           |        |       |          |          |                 |
|   12 |              HASH JOIN                                        |                                |     232 |  441 |           |        |       |          |          |                 |
|   13 |               NESTED LOOPS                                    |                                |     505 |    4 |           |        |       |          |          |                 |
|   14 |                TABLE ACCESS BY GLOBAL INDEX ROWID             | D_B                            |       1 |    3 |           |        |       |          |          |                 |
|   15 |                 PX RECEIVE                                    |                                |       1 |    2 |           |        |       |          |          |                 |
|   16 |                  PX SEND HASH (BLOCK ADDRESS)                 | :TQ20002                       |       1 |    2 |           |        |       |          |          |                 |
|   17 |                   PX SELECTOR                                 |                                |         |      |           |        |       |          |          |                 |
|   18 |                    INDEX UNIQUE SCAN                          | D_B_PK                         |       1 |    2 |           |        |       |          |          |                 |
|   19 |                INDEX FULL SCAN                                | D_R_UK01                       |     505 |    1 |           |        |       |          |          |                 |
|   20 |               PX RECEIVE                                      |                                |     108 |  437 |           |        |       |          |          |                 |
|   21 |                PX SEND BROADCAST                              | :TQ20003                       |     108 |  437 |           |        |       |          |          |                 |
|   22 |                 VIEW                                          | V_V2                           |     108 |  437 |           |        |       |          |          |                 |
|   23 |                  WINDOW CONSOLIDATOR BUFFER                   |                                |     108 |  437 |           |        |       |          |          |                 |
|   24 |                   PX RECEIVE                                  |                                |     108 |  437 |           |        |       |          |          |                 |
|   25 |                    PX SEND HASH                               | :TQ20001                       |     108 |  437 |           |        |       |          |          |                 |
|   26 |                     WINDOW SORT                               |                                |     108 |  437 |           |        |       |          |          |                 |
|   27 |                      NESTED LOOPS                             |                                |     108 |  436 |           |        |       |          |          |                 |
|   28 |                       NESTED LOOPS                            |                                |     108 |    4 |           |        |       |          |          |                 |
|   29 |                        TABLE ACCESS BY GLOBAL INDEX ROWID     | D_B                            |       1 |    3 |           |        |       |          |          |                 |
|   30 |                         PX RECEIVE                            |                                |       1 |    2 |           |        |       |          |          |                 |
|   31 |                          PX SEND HASH (BLOCK ADDRESS)         | :TQ20000                       |       1 |    2 |           |        |       |          |          |                 |
|   32 |                           PX SELECTOR                         |                                |         |      |           |        |       |          |          |                 |
|   33 |                            INDEX UNIQUE SCAN                  | D_B_PK                         |       1 |    2 |           |        |       |          |          |                 |
|   34 |                        INDEX RANGE SCAN                       | D_BS_UK01                      |     108 |    1 |           |        |       |          |          |                 |
|   35 |                       PX COORDINATOR                          |                                |         |      |           |        |       |          |          |                 |
|   36 |                        PX SEND QC (RANDOM)                    | :TQ10002                       |       1 |    4 |           |        |       |          |          |                 |
|   37 |                         BUFFER SORT                           |                                |      2M |      |           |        |       |          |          |                 |
|   38 |                          VIEW                                 | V_V1                           |       1 |    4 |           |        |       |          |          |                 |
|   39 |                           UNION ALL PUSHED PREDICATE          |                                |         |      |           |        |       |          |          |                 |
|   40 |                            TABLE ACCESS BY GLOBAL INDEX ROWID | D_S                            |       1 |    2 |           |        |       |          |          |                 |
|   41 |                             BUFFER SORT                       |                                |         |      |           |        |       |          |          |                 |
|   42 |                              PX RECEIVE                       |                                |       1 |    1 |           |        |       |          |          |                 |
|   43 |                               PX SEND HASH (BLOCK ADDRESS)    | :TQ10000                       |       1 |    1 |           |        |       |          |          |                 |
|   44 |                                PX SELECTOR                    |                                |         |      |           |        |       |          |          |                 |
|   45 |                                 INDEX UNIQUE SCAN             | D_S_PK                         |       1 |    1 |           |        |       |          |          |                 |
|   46 |                            TABLE ACCESS BY INDEX ROWID        | D_S                            |       1 |    2 |           |        |       |          |          |                 |
|   47 |                             BUFFER SORT                       |                                |         |      |           |        |       |          |          |                 |
|   48 |                              PX RECEIVE                       |                                |       1 |    1 |           |        |       |          |          |                 |
|   49 |                               PX SEND HASH (BLOCK ADDRESS)    | :TQ10001                       |       1 |    1 |           |        |       |          |          |                 |
|   50 |                                PX SELECTOR                    |                                |         |      |           |        |       |          |          |                 |
|   51 |                                 INDEX UNIQUE SCAN             | D_S                            |       1 |    1 |           |        |       |          |          |                 |
|   52 |      PX BLOCK ITERATOR ADAPTIVE                               |                                |    8255 |    3 |           |        |       |          |          |                 |
|   53 |       TABLE ACCESS STORAGE FULL                               | F_FACT                         |    8255 |    3 |           |        |       |          |          |                 |
===============================================================================================================================================================================================

ASH says it’s not doing anything (as we’d expect it to, being consistent with RTSM report above).


select count(*), min(sample_time), max(sample_time), sysdate from gv$active_session_history where sql_id = '5xyrp7v4djagv';

  COUNT(*) MIN(SAMPLE_TIME)             MAX(SAMPLE_TIME)             SYSDATE             
---------- ---------------------------- ---------------------------- --------------------
         1 26-OCT-20 12.18.59.828000000 26-OCT-20 12.18.59.828000000 26-OCT-2020 12:50:29

Miscellaneous Slave Info


select
decode(px.qcinst_id,NULL,'QC',
' - '||lower(substr(pp.SERVER_NAME,
length(pp.SERVER_NAME)-4,4) ) )"Username",
decode(px.qcinst_id,NULL, 'QC', '(Slave)') "QC/Slave" ,
to_char( px.server_set) "SlaveSet",
to_char(s.sid) "SID",
to_char(px.inst_id) "Slave INST",
decode(px.qcinst_id, NULL ,to_char(s.sid) ,px.qcsid) "QC SID",
to_char(px.qcinst_id) "QC INST",
px.req_degree "Req. DOP",
px.degree "Actual DOP"
from gv$px_session px,
gv$session s ,
gv$px_process pp
where px.sid=s.sid (+)
and px.serial#=s.serial#(+)
and px.inst_id = s.inst_id(+)
and px.sid = pp.sid (+)
and px.serial#=pp.serial#(+)
and s.sql_id = '5xyrp7v4djagv'
order by 6 , 1 desc
/

Usernam QC/Slav SlaveSet   SID       Slave INST QC SID     QC INST      Req. DOP  Actual DOP
------- ------- ---------- --------- ---------- ---------- -------- ------------- -----------
QC      QC                 5885       2          5885         
 - p007 (Slave) 2          9672       2          5885       2                   4          4
 - p006 (Slave) 2          3367       2          5885       2                   4          4
 - p005 (Slave) 2          9395       2          5885       2                   4          4
 - p004 (Slave) 2          3095       2          5885       2                   4          4
 - p003 (Slave) 1          9111       2          5885       2                   4          4
 - p002 (Slave) 1          2805       2          5885       2                   4          4
 - p001 (Slave) 1          8828       2          5885       2                   4          4
 - p000 (Slave) 1          5610       2          5885       2                   4          4

Detail! And don’t forget the notes…

Mon, 2020-10-26 05:25

…and how sparse tables can mess up your dynamic sampling.

A little lesson to self about paying attention to the detail, the obvious frickin’ detail. Getting older and slower.

We had one of those problems where a multitude of things were slow and yet there were no significant system-level problems (e.g. no general slow IO, no setting of the fast=false parameter, etc)

In such situations, if you check the high level system level metrics – AWR etc – and nothing is obvious then it makes sense to narrow the scope & drill into a specific problem, and distill that problem/statement as small as you can. If you can find the issue there then you might hope to find that that is the same as the wider problem and if not well you can just take that approach to every individual area,

I had a problem across a number of reporting modules, I picked on one module and from there I picked on the statement which was currently running and which was running for much longer than expected, I picked out one of the obvious problem joins within the problem statement and started looking at a simple two table join where the cardinality estimates were going wrong.

Most of the specific details of this plan and SQL can be ignored. It’s not important.

What is important is the cardinality difference.

And in fact, in this particular cut down example, the cardinality difference does not have the plan impact it does in the original setting.

 
set pages 999
explain plan for
SELECT /*+ parallel(4)*/
	   count(*)
FROM   psbal
LEFT OUTER JOIN
       prdct
ON    (prdct.d_product_id                        = psbal.d_product_id
AND    prdct.create_ts                           TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6'));

select * from table(dbms_xplan.display);

Plan hash value: 3885851338
--------------------------------------------------------------------------------------------------------------------------------------------------
|  Id | Operation                             | Name          | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  TQ   |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |               |    1 |    72 |   3430K (3)| 01:54:22 |       |       |       |      |            |
|   1 |  SORT AGGREGATE                       |               |    1 |    72 |            |          |       |       |       |      |            |
|   2 |   PX COORDINATOR                      |               |      |       |            |          |       |       |       |      |            |
|   3 |    PX SEND QC (RANDOM)                | :TQ10003      |    1 |    72 |            |          |       |       | Q1,03 | P->S |  QC (RAND) |
|   4 |     SORT AGGREGATE                    |               |    1 |    72 |            |          |       |       | Q1,03 | PCWP |            |
|*  5 |      HASH JOIN                        |               | 7466M|   500G|   3430K (3)| 01:54:22 |       |       | Q1,03 | PCWP |            |
|   6 |       PART JOIN FILTER CREATE         | :BF0000       |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|   7 |        PX RECEIVE                     |               |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|   8 |         PX SEND BROADCAST             | :TQ10000      |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | P->P |  BROADCAST |
|   9 |          PX BLOCK ITERATOR            |               |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | PCWC |            |
|  10 |           INDEX STORAGE FAST FULL SCAN| S_PK          |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | PCWP |            |
|* 11 |       HASH JOIN RIGHT OUTER           |               | 7466M|   458G|   3425K (3)| 01:54:12 |       |       | Q1,03 | PCWP |            |
|  12 |        PX RECEIVE                     |               |    1 |    30 |    306  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|  13 |         PX SEND BROADCAST             | :TQ10001      |    1 |    30 |    306  (0)| 00:00:01 |       |       | Q1,01 | P->P |  BROADCAST |
|  14 |          PX BLOCK ITERATOR            |               |    1 |    30 |    306  (0)| 00:00:01 |     1 |     3 | Q1,01 | PCWC |            |
|* 15 |           TABLE ACCESS STORAGE FULL   | PRODUCT       |    1 |    30 |    306  (0)| 00:00:01 |     1 |     3 | Q1,01 | PCWP |            |
|* 16 |        HASH JOIN                      |               | 7466M|   250G|   3420K (3)| 01:54:01 |       |       | Q1,03 | PCWP |            |
|  17 |         PX RECEIVE                    |               | 6903K|   118M|  16379  (1)| 00:00:33 |       |       | Q1,03 | PCWP |            |
|  18 |          PX SEND BROADCAST            | :TQ10002      | 6903K|   118M|  16379  (1)| 00:00:33 |       |       | Q1,02 | P->P |  BROADCAST |
|  19 |           PX BLOCK ITERATOR           |               | 6903K|   118M|  16379  (1)| 00:00:33 |     1 |     3 | Q1,02 | PCWC |            |
|* 20 |            TABLE ACCESS STORAGE FULL  | BOOK          | 6903K|   118M|  16379  (1)| 00:00:33 |     1 |     3 | Q1,02 | PCWP |            |
|  21 |         PX PARTITION LIST JOIN-FILTER |               | 8064M|   135G|   3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWC |            |
|  22 |          TABLE ACCESS STORAGE FULL    | BALANCE       | 8064M|   135G|   3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  5 - access("FCT"."S_VID"="DNSP"."S_VID")
 11 - access("PRDCT"."PRODUCT_ID"(+)="FCT"."PRODUCT_ID")
 15 - storage(NVL("PRDCT"."MODIFY_TS"(+),TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 08:16:34.822981000' AND 
          "PRDCT"."CREATE_TS"(+)TIMESTAMP' 2020-10-14 08:16:34.822981000' AND 
         "PRDCT"."CREATE_TS"(+)<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 16 - access("BOOK_ID"="FCT"."BOOK_ID" AND "VERSION"="FCT"."BOOK_VERSION")
 20 - storage("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
    filter("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
 
Note
-----
  - dynamic sampling used for this statement (level=7)
  - Degree of Parallelism is 4 because of hint

and this:


set pages 999
explain plan for
select count(*)
FROM   prdct
WHERE  prdct.create_ts  <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND    NVL(prdct.modify_ts, DATE '9999-12-31')  > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
   
select * from table(dbms_xplan.display);

Plan hash value: 2639949168
----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |    1 |    14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |    1 |    14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          |  399 |  5586 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  |  399 |  5586 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')

Once you see the problem, you can’t unsee it but for whatever reason, for ages, I was focusing on why there would be such a cardinality difference between A and B for what are basically the single table cardinality estimates for the same table without seeing the explanation right there in front of me.


Note
-----
  - dynamic sampling used for this statement (level=7)

I then got distracted and spent some time investigating why dynamic sampling was kicking in. Parallel query has different rules about when it can kick in. And I found reasons why it might.

But then I put that thought to one side.

Why is dynamic sampling saying this will return 1 row?

This is a small table and the estimate should be accurate and we like dynamic sampling so we want to show the feature to its full capability whenever and wherever it kicks in

Sure enough, as we might expect, if I hint dynamic sampling on the single table, estimate is one.


set pages 999
explain plan for
select /*+ dynamic_sampling(4) */ count(*)
FROM  prdct
WHERE prdct.create_ts                  <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND  NVL(prdct.modify_ts, DATE '9999-12-31')      > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
   
select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |    1 |    14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |    1 |    14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          |    1 |    14 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  |    1 |    14 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 
Note
-----
  - dynamic sampling used for this statement (level=4)

Without access to the sample statements, what suggestive evidence can we find?


select table_name, num_rows, blocks, avg_row_len
from  user_tables
where table_name = 'PRODUCT';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
PRODUCT                               462       4064         193

So that’s not a lot of rows in a lot more blocks than should be necessary.

At this point, we should go off and find out what the reasons might be.

For what it’s worth, for this application, my suspicions are historic direct path maintenance code coupled with table partitioning on status (LATEST/SUPERSEDED) and therefore row movement means an update is a delete + insert, coupled with periods of very low rate of change and small row sizes.

So we end up with old rows moving partition – using a new block for the updated version and leaving behind an empty block where the version was before the update and then relatively few new version rows per run which are using new blocks, and so lots of empty blocks.

i.e. sparsely populated table.

And then dynamic sampling kicks in and samples a bunch of blocks, and finds few actual rows from those samples and comes up with a low estimate and then the whole problem starts. And these were shared tables across the reporting modules so it had a reasonable wide impact.

It would be nice to back this up with some trace files detailing any of the dynamic sampling numbers but I don’t have access by default and mocking this all up outside of work is always a worthwhile exercise but just takes up so much time.

Anyway… move some partitions, rebuild some indexes and 4000 blocks becomes just over 40.

Now our dynamic sampling is looking much better:


----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |   1 |     14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |   1 |     14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          | 297 |     14 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  | 297 |     14 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 
Note
-----
  - dynamic sampling used for this statement (level=4)

Exercise repeated for a whole bunch of tables in a similar situation and everything across the board is a lot healthier.

Negative Cell Offload – One reason

Wed, 2020-04-29 12:12

This is a post that has been sitting around for while… just an illustration of one of the reasons for negative cell offload numbers,

Global Stats
=========================================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency | Cluster  | Buffer | Read | Read  | Write | Write |  Cell   |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes | Offload |
=========================================================================================================================
|    2234 |    1660 |      574 |        0.00 |        0.00 |     0.01 |     5M | 206K |  54GB |  106K |  21GB | -29.87% |
=========================================================================================================================

Parallel Execution Details (DOP=4 , Servers Allocated=8)
===========================================================================================================================================================================================
| Name | Type  | Group# | Server# | Elapsed |   Cpu   |    IO    | Application | Concurrency | Cluster  | Buffer | Read  | Read  | Write | Write |   Cell   |         Wait Events         |
|      |       |        |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) |  Gets  | Reqs  | Bytes | Reqs  | Bytes | Offload  |         (sample #)          |
===========================================================================================================================================================================================
| j012 | QC    |        |         |     173 |     173 |          |        0.00 |             |          |     13 |       |     . |       |     . |     NaN% |                             |
| p032 | Set 1 |      1 |       1 |      23 |      23 |     0.70 |             |             |          |     1M | 13352 |   9GB |       |     . |   51.46% |                             |
| p033 | Set 1 |      1 |       2 |      23 |      22 |     0.73 |             |             |          |     1M | 13057 |   9GB |       |     . |   51.22% | cell smart table scan (1)   |
| p034 | Set 1 |      1 |       3 |      23 |      23 |     0.71 |             |             |          |     1M | 13322 |   9GB |       |     . |   51.22% |                             |
| p035 | Set 1 |      1 |       4 |      23 |      23 |     0.80 |             |             |          |     1M | 13220 |   9GB |       |     . |   51.22% |                             |
| p028 | Set 1 |      2 |       1 |     492 |     349 |      143 |             |        0.00 |     0.00 |   7100 | 38385 |   4GB | 26400 |   5GB | -112.77% | direct path read temp (129) |
|      |       |        |         |         |         |          |             |             |          |        |       |       |       |       |          | direct path write temp (21) |
| p029 | Set 1 |      2 |       2 |     492 |     346 |      146 |             |        0.00 |     0.00 |   6745 | 37827 |   4GB | 26393 |   5GB | -112.77% | direct path read temp (122) |
|      |       |        |         |         |         |          |             |             |          |        |       |       |       |       |          | direct path write temp (17) |
| p030 | Set 1 |      2 |       3 |     491 |     352 |      139 |             |        0.00 |     0.00 |   6590 | 38809 |   4GB | 26390 |   5GB | -108.33% | direct path read temp (109) |
|      |       |        |         |         |         |          |             |             |          |        |       |       |       |       |          | direct path write temp (15) |
| p031 | Set 1 |      2 |       4 |     492 |     349 |      143 |             |        0.00 |     0.00 |   6710 | 38313 |   4GB | 26391 |   5GB | -112.77% | direct path read temp (136) |
|      |       |        |         |         |         |          |             |             |          |        |       |       |       |       |          | direct path write temp (14) |

Here is an illustration I’ve been hanging on to for ages – one reason why you might get negative cell offload numbers – temp.

SQL Monitor uses something like the following for calculating cell offload:

Cell Offload = 100 * ((PHYSICAL_READ_BYTES + PHYSICAL_WRITE_BYTES) – IO INTERCONNECT_BYTES)/NULLIF((PHYSICAL_READ_BYTES + PHYSICAL_WRITE_BYTES),0)

Whilst the PHYSICAL_%_BYTES metrics are measured above the ASM level, ASM then mirrors any writes so these can be tripled for redundancy and counted in IO_INTERCONNECT_BYTES.

So actually it’s often a bit of an apples vs oranges metric…

Negative Offload

Materialize cardinality

Wed, 2020-04-29 11:07

Blowing the dust off the blog with a couple of trivial observations.

Firstly, this little one about one implication of materialize.

I’ve always been a fan of the simplicity of the materialize hint as a quick fix for certain plan stability/performance issues but it comes at a clear cost of actually materialising to temp that subquery and is often used when actually a combination of no_merge, no_unnest and/or no_push_pred might be better choice.

Having been recently working on a platform with a problematic systemic temp addiction, I rarely use it unless I know the cost of materialising that resultset once is cheaper than querying the data the requisite number of times via any alternative method particularly on Exadata where the overhead of querying the data twice might be less than you think (note to self: might be helpful to demo this in a future post).

Here is another implication of materialize on the cardinality of a set of data.

This simulates a real world problem observation where the view contained a materialize hint.

Starting with some data – one day for each of April, five versions per day, between 0 and 5 versions potentially approved each day:

drop table  ref_data_versions;

create table ref_data_versions
(business_date   DATE
,version         NUMBER
,status          VARCHAR2(10));


insert into ref_data_versions
select to_date(20200401,'YYYYMMDD') + days.rn-1
,      versions.rn
,      CASE when versions.rn = round(dbms_random.value(1,5)) then 'APPROVED' ELSE 'UNAPPROVED' END
from   dual
cross join
       (select rownum rn from xmltable('1 to 30')) days
cross join
       (select rownum rn from xmltable('1 to 5')) versions;
       
commit;

select count(*) from ref_data_versions;

The following query represents our view and happens to show the tangential observation that the optimizer does not recognize that the row_number analytic will filter any rows.

explain plan for 
with x as
(select /*+ */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x;
 
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
Plan hash value: 2125428461
 
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |   150 |  6300 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |                   |   150 |  6300 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|                   |   150 |  4350 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | REF_DATA_VERSIONS |   150 |  4350 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("RNK"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

If we add in a predicate on business date, we get:

explain plan for 
with x as
(select /*+ */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x
where  business_date = to_date(20200429,'YYYYMMDD');
 
select * from table(dbms_xplan.display);
Plan hash value: 2125428461
 
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     5 |   210 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |                   |     5 |   210 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|                   |     5 |   145 |     4  (25)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL     | REF_DATA_VERSIONS |     5 |   145 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("RNK"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
   3 - filter("RDV"."BUSINESS_DATE"=TO_DATE(' 2020-04-29 00:00:00', 'syyyy-mm-dd 
              hh24:mi:ss'))
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Note that the query expects to get 5 rows rather than 1, but that’s consistent with what we saw before.

What happens if the subquery uses materialize:

explain plan for 
with x as
(select /*+ materialize */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x
where  business_date = to_date(20200429,'YYYYMMDD');
 
select * from table(dbms_xplan.display);
Plan hash value: 1377080515
 
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                             |   150 |  6300 |     6  (17)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                             |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D787C_3AB51228 |       |       |            |          |
|*  3 |    VIEW                                  |                             |   150 |  6300 |     4  (25)| 00:00:01 |
|*  4 |     WINDOW SORT PUSHED RANK              |                             |   150 |  4350 |     4  (25)| 00:00:01 |
|   5 |      TABLE ACCESS FULL                   | REF_DATA_VERSIONS           |   150 |  4350 |     3   (0)| 00:00:01 |
|*  6 |   VIEW                                   |                             |   150 |  6300 |     2   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL                     | SYS_TEMP_0FD9D787C_3AB51228 |   150 |  6300 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("RNK"=1)
   4 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
   6 - filter("BUSINESS_DATE"=TO_DATE(' 2020-04-29 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The filter at step 6 is now no longer having any effect on the overall cardinality.

There are approaches using cardinality and opt_estimate which you might use to address some of the underlying issues.

However, just another example of why you should think twice about the liberal application of materialize hints (or any hints!).

The system with the problem was 11.2.0.4. Examples above are run on LiveSQL which is currently 19.

Subquery

Thu, 2019-10-24 07:11


with x as (select sysdate from dual)
select * from utter_bollocks.x;

Developing for the Oracle Database

Wed, 2018-05-09 09:49

1 Coding Standards
What this document does not claim to offer is a prescriptive guide on the minutiae of coding standards.

Coding standards and naming conventions, particularly in SQL and PL/SQL, are an emotive and divisive subject.

This is largely the fault of a historical absence of mature IDEs and a lack of any rigorous, generally accepted standards as seen with other languages.

Many developers still hand-craft scripts and routines and the various tools available often have differences, subtle or otherwise, in what built-in formatting they can offer.

Good developers can adapt, good standards can adapt.

The most important objectives for coding standards are to :
• Make development faster and debugging easier
• Make easier the understanding of other people’s code
• Limit bugs

The following expectations support these objectives:
• Where possible use SQL before using PLSQL
• Code will be reasonably formatted and legible, preferably with a consistent style within the module at least.
• It is preferred but not mandated that Oracle keywords be capitalized – e.g. CREATE OR REPLACE PACKAGE, SELECT, UPDATE, DELETE, FROM – and lowercase used for objects and columns, etc
• In SELECT statements, tables should be aliased – this provides a very small benefit to the optimizer when parsing but also prevents bugs particularly in subqueries.
• Procedures, cursors and variables, etc should be properly scoped – e.g. public vs private, global vs local, parameter scope, etc
• Meaningful names will be given for code items
• Reasonable naming might include some sort of prefixed or suffixed indicator of purpose, e.g. k_ for constants, l_ or v_ for local variables, g_ for global variables, p_ for procedure, f_ for function, _pkg for package, i_ for in parameters, o_for out parameters, io_ for in out parameters.
• Package and procedure level comments should indicate why/when a particular program was changed but SVN, or other code respositories, are the appropriate mechanism for code control.
• Code comments should be used when they add value.
• Excessive commenting and stating the obvious should be avoided – these are often more effective when refactoring the code concerned into a private routine (procedure/function) which is named appropriately (e.g. function f_format_swift_string).
• CamelCase is not considered appropriate for the database as all code is stored in the db as uppercase.
• Package headers and bodies should be checked into separate files for clarity and to prevent unnecessary recompilcation of unchanged code and dependencies (version dependent)
• Performance should be built in and evidence of such documented.

2 Writing Optimal SQL

2.1 Key points
Writing Optimal SQL should be relatively simple but many people struggle particularly when making the transition from an object/attribute language to a set-based language like SQL.

The key tenets of performant database code in Oracle are:
• Think in sets.
• Think about how the database might be able to process and apply logic to a set of data with great efficiency.
• Ask the right question in the best way.
• Know your data.

In support, when thinking about database code and SQL operations:

• If the query needs to be long/big, make it long/big.
• Bulk operations are critical, row-by-row operations are a cardinal performance sin.
• Eliminate data at the earliest opportunity.
• Sort on the smallest possible set – if possible avoid aggregations, sorting and distinct operations on the largest sets of data.
• Use bind variable when you require shareable SQL and when bind variables make sense
• Use literals when literals make sense.
• Use a mix of binds and literals if appropriate.
• Avoid PL/SQL in SQL.
• Be careful of applying functions (TRUNC, etc) to columns in the WHERE clause.
• User-defined functions which are called from SQL and which themselves contain SQL are, almost without exception, unacceptable.
• Never rely on implicit datatype conversion. Use the correct datatypes for parameters and where possible convert parameters NOT columns.

2.2 Thinking in Sets

Crucial.

For further reading on thinking in sets, see:
http://explainextended.com/2009/07/12/double-thinking-in-sql/

2.3 What’s the question?

When writing SQL, focus on the question being asked by the SQL statement.

If you put the question into words as a comment before a complex SQL statement, then this can often add value to the next developer.

Often the most performant version of a SQL statement is the one which asks the question at hand in the most natural way.

To this end, proper consideration needs to be given to:
• Subqueries – EXISTS / IN / NOT EXISTS / NOT IN
• Set-operators – MINUS, UNION, UNION ALL, INTERSECT
• Use of DISTINCT is often an indication of a wrong sql statement or poor design
• Common Table Expressions:

Often it can help to use of Common Table Expressions (CTE), aka the WITH clause, for separating the main logic of the query from the subsequent fetching of additional data/attributes, e.g.

WITH main_logic AS
(SELECT …
FROM …
WHERE …)
SELECT ml.*, x.this, y.that, z.something_else
FROM main_logic ml
, …
WHERE …;

2.4 To ANSI or Not To ANSI

Another divisive subject is ANSI SQL vs Oracle Join syntax.

Again, this guide should not seek to be prescriptive on the preference of one over the other.

The bottom line should be that if a developer finds it easier to write a correct and optimal SQL statement using one rather than the other, then that is most important.

There are some SQL statement constructs which are more conveniently written in ANSI – the FULL OUTER JOIN for example.

It is also true that the optimizer always transforms ANSI SQL to the equivalent Oracle syntax and there are some limitations to the optimizer’s other complex query transformations when using ANSI SQL.

And unfortunately there are bugs in both.

2.5 Eliminate Early

Where there are predicates (WHERE clauses) which can significantly reduce the dataset early, check that they are being applied early enough in the execution plan (more information to follow), check whether the SQL statement might be rephrased or reconstructed (CTE/WITH) to make sure they are applied at an appropriate stage.

2.6 Sort / Aggregate on the smallest possible dataset

Similar to eliminate early. Sorting and aggregating requires memory and under certain conditions can spill to expensive (unscalable) disk operations.
Wherever possible, do the sort or aggregation on the smallest set of rows (not necessarily applicable to the order by clause of a query).

2.7 What’s the big deal with PL/SQL functions called from SQL?

The bottom line is that it’s about performance.

We could get in a whole argument about reusability vs performance but performance eventually wins in the end.

Often the correct mechanism for reusability in the Oracle database is not a function but a view joined to appropriately in the main SQL.

Functions cause a relatively expensive context switch between SQL and PLSQL engines.

In the name of reusability, functions encourage row-by-row operations and discourage thinking in sets.

If the function itself contains SQL, then this SQL will not be part of the read consistency mechanism of the calling statements which can be potentially problematic.

If you absolutely have to, have to, have to use functions in SQL, then think again.

Then if you really, really do then please look at deterministic functions and consider wrapping the function call in a (select from dual) to expose the potential benefits of subquery caching for functions called with repeated parameters.

2.8 What about simple functions like TRUNC in the WHERE clause?

Using functions on columns in the WHERE clause can prevent the optimizer from using an index or from pruning a partition unless a function-based index is in place on the column.

For this reason, it is often best to avoid this sort of construct:
WHERE TRUNC(some_date_column) = TO_DATE(’01-NOV-2013’,’DD-MON-YYYY’)

In favour of this:
WHERE some_date_column) >= TO_DATE(’01-NOV-2013’,’DD-MON-YYYY’)
AND some_date_column) < TO_DATE(’02-NOV-2013’,’DD-MON-YYYY’)

2.9 Using the correct datatype, be explicit

Performance problems related to using the incorrect datatypes are common.

The optimizer will implicitly add functions to make sure the datatypes on both sides of the predicate match.

Always convert date-like parameters to DATEs where the column datatype is also DATE.

Never rely on implicit datatype conversion.

3 Execution Plans & Metrics – What we want, why we want it and how to get it
We have flown through some aspects of how to have a better chance of writing an optimal SQL statement.

3.1 How can we tell if it’s optimal?

Run it.

Run it twice to rule out the effect of uncontrollable factors like OS caching, SAN caching, etc.

Run it on representative data.

Run it on current volumes.

Run it on expected future volumes.

Then what?

In order to validate that our SQL statement is likely to have effective performance, what we want is the actual execution plan used and preferably the actual rowsource metrics.

3.2 How?

3.2.1 Serial (i.e. non-parallel) Execution Plans

In general, the following is usually a good approach across a variety of tools – SQL Developer, Toad and SQL*Plus for example:
Alter session set statistics_level = all;
–bind setup
Var bind1 number
Exec :bind1 := …;

–run target sql statement
select ….
— fetch execution plan and metrics
select * from table(dbms_xplan.display_cursor(null,null,’allstats last’));

Then run as a script.

Firstly, for getting actual execution metrics we can do one of two things prior to running the SQL statement concerned:
1. Add the /*+ gather_plan_statistics */ hint to the SQL or
2. In the same session, run alter session set statistics_level = all;

Then run the target SQL and immediately afterwards run this select:

select * from table(dbms_xplan.display_cursor(null,null,’allstats last’));

This is a convenient wrapper to get the execution plan and metrics from V$SQL_PLAN.

The first parameter is SQL_ID and by passing in NULL, we default to the previous SQL_ID run in this session.

The second parameter is CHILD_CURSOR_NO and this should be the previous child_id for the previous sql_id.
The third parameter is the FORMAT and ‘ALLSTATS LAST’ format says to get all statistics for the last execution.

If this works this should produce an output which is examined in more detail in section 6.

3.2.2 What if this doesn’t work?

If you find you don’t have privilege to run these commands – you need access to V$SESSION for example to use DBMS_XPLAN.DISPLAY_CURSOR – then you need privilege. There is no reason for privilege not to be given.

Otherwise the approach above is effective 90% of the time.

For parallel execution plans, see section 3.2.3 below.

However, in addition and specifically to SQL Developer, there are some recursive operations run by the tool which means that SQL Developer runs some internal commands such that when our DBMS_XPLAN statement runs, the previous SQL ID is no longer our target SQL statement.

There is one such example in SQL Developer 3 related to timestamp columns which affects the test script when running everything as a script (F5). In this case, there are two alternatives. Firstly, run the individual commands in SQL Developer as Run Statement (F9 / Ctrl + Enter). Alternatively, just comment out the timestamp columns in the SELECT part of the statement, for the purposes of this exercise.

Furthermore, in SQL Developer 4 there are further changes to recursive operations which seem to affect some statements.

In all such cases, if the output of the DBMS_XPLAN.DISPLAY_CURSOR is not the execution plan of the statement being profiled then the approach should be to identify the SQL statement in the shared pool (Look for matching SQL_TEXT in V$SQL) and plug the specific SQL_ID into the first argument of the DBMS_XPLAN call (no need to rerun the target SQL statement).

3.2.3 Parallel Execution Plans

For parallel execution plans, the approach of using DBMS_XPLAN.DISPLAY_CURSOR with the format of ‘ALLSTATS LAST’ is not appropriate because it fetches the execution metrics from the last execution of the statement – which is the Query Coordinator (QC) and does not include the metrics of the parallel slaves.

A better approach for parallel execution plans is to use real time sql monitoring and the easiest way to do this is to run the following and capture the output report:
select dbms_sqltune.report_sql_monitor(‘’) from dual;

This requires you to identify the SQL_ID of the target sql statement from V$SQL (seek matching text in SQL_TEXT/SQL_

FULLTEXT column).
It may also require you to add the /*+ monitor */ hint to your SQL statement as by default this only kicks in on executions which last longer than a default number of seconds (2?) and for statements which are less than a certain length

3.2.4 When all else fails

Fall back on SQL Trace.
Alter session set sql_trace = true;
SELECT….
Alter session set sql_trace = false;

This produces a trace file on the database server and the trace file name can be identified by:
select * from v$diag_info where name = ‘Default Trace File’;

This can be run through TKPROF to get the execution metrics but TKPROF can also lie about the execution plan so this should be double checked in V$SQL_PLAN or by using DBMS_XPLAN.

In rare circumstances and if all the above alternatives are unavailable or impractical for some reason, only then might EXPLAIN PLAN or AUTOTRACE be acceptable.

For example, in any modern version of Oracle, you can do the following
explain plan for select…;
select * from table(dbms_xplan.display);

Now this is not useless but, for numerous reasons, EXPLAIN PLAN cannot be trusted and is not sufficient for our purposes.

AUTOTRACE also does not tell the truth (because it itself relies on EXPLAIN PLAN).
EXPLAIN PLAN is an estimate of what the execution plan for a SQL statement will be.

It doesn’t peek at binds.

It assumes all binds are VARCHAR2.

3.3 Why are we doing this?

We want this information documented as part of the change, attached to the Jira or whatever tool using for change management and included in any code review.

The most effective mechanism for tuning SQL is “Tuning by Cardinality Feedback”: http://www.centrexcc.com/Tuning%20by%20Cardinality%20Feedback.pdf

This follows the principle that:
“if an access plan is not optimal it is because the cardinality estimate for one or more of the row sources is grossly incorrect”
and
“the cbo (cost-based optimizer) does an excellent job of finding the best access plan for a given sql provided it is able to accurately estimate the cardinalities of the row sources in the plan”

By gathering the actual execution plan and the actual execution metrics, we can show whether the optimizer was accurate in its estimations and if it was accurate, then, from a developer perspective and for the purposes of most code reviews, there is a good likelihood that the SQL is good enough for the optimizer to do a good job with.

4 Interpretation of Execution Plans and Execution Metrics

If we’ve been lucky we should have the actual execution plan and the executions metrics.

4.1 What are we looking for? How do we interpret it?

Providing a thorough guide on how to interpret most variations of execution plans is beyond the scope of this guide, although we ill provide a basic guide in Appendix A.

Essentially, what we want to see in the execution metrics is that the optimizer’s estimates are broadly accurate.

How accurate?

In general, we shouldn’t necessarily be overly concerned until we get to a factor of 10x or even more.

Estimated 100K rows, Actual 1M rows – probably not too bothered.

Estimate of 1 row, Actual 10000 rows – likely to be significant inefficiencies in either join order, join mechanism and/or access path.

And when we are looking at estimates vs actual, we need to consider the “Starts” so what we are looking for is that “Starts * E-rows” is in the right ballpark compared to “A-rows”. For more information, please see Appendix A.

Here are a couple of examples:
SQL_ID fst03j2p1czpb, child number 0
————————————-
select * from t1 , t2 where t1.col1 = t2.col1
Plan hash value: 1838229974

————————————————————————————-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | | 10000 |00:00:00.03 | 1172 |
|* 1 | HASH JOIN | | 1 | 10000 | 10000 |00:00:00.03 | 1172 |
| 2 | TABLE ACCESS FULL| T1 | 1 | 10000 | 10000 |00:00:00.01 | 576 |
| 3 | TABLE ACCESS FULL| T2 | 1 | 10000 | 10000 |00:00:00.01 | 596 |
————————————————————————————-

Predicate Information (identified by operation id):
—————————————————

1 – access(“T1″.”COL1″=”T2”.”COL1″)

In the above, the estimates are accurate so there is a very good chance that this is a good plan.

Here’s another, this time not so good because the estimate of rows in T1 was 1 whereas the actual was 10000.

This led the optimizer to choose an index access path over a full table scan and a NESTED LOOP rather than a HASH JOIN.
SQL_ID 9quvuvkf8tzwj, child number 0
————————————-
select /*+ cardinality(t1 1) */ * from t1 , t2 where t1.col1 =
t2.col1

Plan hash value: 931243032

——————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
——————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 10000 |00:00:00.04 | 12640 |
| 1 | NESTED LOOPS | | 1 | | 10000 |00:00:00.04 | 12640 |
| 2 | NESTED LOOPS | | 1 | 1 | 10000 |00:00:00.02 | 2640 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 1 | 10000 |00:00:00.01 | 596 |
|* 4 | INDEX UNIQUE SCAN | SYS_C00446778 | 10000 | 1 | 10000 |00:00:00.01 | 2044 |
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 10000 | 1 | 10000 |00:00:00.02 | 10000 |
——————————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

4 – access(“T1″.”COL1″=”T2”.”COL1″)

4.2 Do I need to worry about things like NESTED LOOPS vs HASH JOINS?

For the purposes of this exercise, no but the more knowledge the better.

Accuracy of estimates should be sufficient.

The remainder of the information should be attached to the change tool for review.

5 Appendix A: Basic guide to reading an execution plan

Using the following execution plan from a two table join:
SQL_ID 9quvuvkf8tzwj, child number 0
————————————-
select /*+ cardinality(t1 1) */ * from t1 , t2 where t1.col1 =
t2.col1

Plan hash value: 931243032

—————————————————————————————-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
—————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | | 10000 |00:00:00.04 |
| 1 | NESTED LOOPS | | 1 | | 10000 |00:00:00.04 |
| 2 | NESTED LOOPS | | 1 | 1 | 10000 |00:00:00.02 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 1 | 10000 |00:00:00.01 |
|* 4 | INDEX UNIQUE SCAN | SYS_C.. | 10000 | 1 | 10000 |00:00:00.01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 10000 | 1 | 10000 |00:00:00.02 |
—————————————————————————————-

Predicate Information (identified by operation id):
—————————————————

4 – access(“T1″.”COL1″=”T2”.”COL1″)

There are four key elements:
• The SQL statement
• The SQL ID – a hash value of the sql statement, usually consistent between databases and even across versions
• The execution plan
• The predicate section – not to be overlooked. Can highlight issues with implicit functions and datatype conversions amongst other things

For the execution plan itself there are a number of elements to be concerned with:
• Optimizer – all modern version of Oracle use the Cost-based Optimizer (CBO). This uses statistics and cost calculations to choose a best-cost plan for execution.

• Cost – Cost is an estimated indicator of time which the optimizer uses to compare execution plan possibilities, usually choosing the lowest cost plan. However, to all intents and purposes, developers should ignore it.

• Cardinality – An estimate of the number of rows for a particular rowsource, for a particular join, etc. Exposed in the execution plan as E-Rows for estimate and A-Rows for actuals. When comparing E-Rows to A-Rows it is important to take Starts into account, i.e to compare “Starts * E-Rows” to A-Rows. The Nested loop operations for example will have multiple starts for the inner/probed rowsource.

• Parent:child operations – An execution plan is generally a succession of parent:child operations – follow and match the indentation. A join mechanism should have two children.

• Join mechanism – A join mechanism joins two rowsources. There are a variety of mechanisms but in general there are two main methods depending on the cardinalities:

o NESTED LOOP – Essentially a FOR LOOP – For each row in the outer/driving rowsource, probe the inner/probed rowsource. Generally used for low cardinality rowsources.

o HASH JOIN – Hash all the outer/driving rowsource based on the join key(s) then hash all the inner rowsource. Generally used for high cardinality rowsources. If the cardinality estimate is too low, work area sizes used for hashing maybe too small and spill to temp space on disk – slow/unscalable

• Join order – Depending on the cardinalities, the optimizer can choose to join T1 to T2 or T2 to T1. The number of permutations for join order is N! where N is the number of tables being joined. The optimizer will limit itself to a maximum number of permutations to evaluate.

• Access path – how the data is fetched from the table, i.e. by index via various different index access mechanisms or by tablescan, etc.

• Buffers – A measure of logical IO. See below.

5.1 What happens first in the execution plan?
—————————————————————————————-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
—————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | | 10000 |00:00:00.04 |
| 1 | NESTED LOOPS | | 1 | | 10000 |00:00:00.04 |
| 2 | NESTED LOOPS | | 1 | 1 | 10000 |00:00:00.02 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 1 | 10000 |00:00:00.01 |
|* 4 | INDEX UNIQUE SCAN | SYS_C.. | 10000 | 1 | 10000 |00:00:00.01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 10000 | 1 | 10000 |00:00:00.02 |
—————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
4 – access(“T1″.”COL1″=”T2”.”COL1″)

There are a couple of exceptions but in general the execution plan starts at the first operation without a child.

So, following the indentation, the first operation without a child is:
| 3 | TABLE ACCESS FULL | T1 | 1 | 1 | 10000 |00:00:00.01 |

This is considered to be the inner/driving rowsource of the parent operation at:
| 2 | NESTED LOOPS | | 1 | 1 | 10000 |00:00:00.02 |

For each row in this rowsource, we probe the inner rowsource:
|* 4 | INDEX UNIQUE SCAN | SYS_C.. | 10000 | 1 | 10000 |00:00:00.01 |

Which is actually an index lookup on the primary key using the predicate:
4 – access(“T1″.”COL1″=”T2”.”COL1″)

The data produced by this join is then used in the parent operation:
| 1 | NESTED LOOPS | | 1 | | 10000 |00:00:00.04 |

Which uses the rowids from the unique index/primary key for table T2 to get the actual table data from T2:
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 10000 | 1 | 10000 |00:00:00.02 |

5.2 Red flags?

• Row estimates of 1:
o The minimum row estimate is 1 and in some cases this actually means 0.
o If this is not a primary key access and there really isn’t 0/1, then are there any statistics for this object?
o Row estimates of 0/1 where the actual number of rows is significantly more than 1 can cause significant performance problems

• MERGE JOIN CARTESIAN + BUFFER SORT – particularly where the estimate is 1. Can be particularly detrimental if the actual rows are greater than 1. Rarely a good operation but can be symptomatic of a missing join.

• Implicit datatype conversions

• Nested loop operations where the inner/probed table/rowsource is a FULL segment scan.

• VIEW operations – symptomatic of a non-mergeable view which may or may not be a problem

• FILTER operations where the row-by-row operation is significant

5.3 Is there anything else to look out for?

Yes, that buffers column is a measure of logical IO.

When comparing different ways of doing things, when tuning SQL, one of the key measures that should be targeted is a reduction in logical IO.

If one approach uses significantly less logical IO compared to another approach then that is significant. The statement with the lower IO is likely to be better, is more likely to benefit from having more of the data it’s interested in cached and is less likely to impact other queries and the caching of other data.

There should probably be a rule of thumb about the ratio of logical IO to rows fetched. The difficulty is picking the right indicators.

If a query selects 100 rows from 100 million buffer gets and those all-important estimates are reasonably accurate, this should be a strong signal that perhaps the indexes are not optimal for that particular query.

As a rule of thumb, a ratio of a couple of consistent gets or less per row is damn good. 100,000s or millions may well be an indicator of significant inefficiencies.

But, as always, it depends.

It also significantly depends on whether the query itself is fast enough for the business requirement and whether it has the potential to impact other users of the database.

Furthermore, one lone query is unlikely to justify a new index but that is beyond the scope of this guide.

5.4 Further Reading

A 10053 Trace provides a very detailed walkthrough of the optimizer’s process of coming up with the execution plan. Not for the faint-hearted.
http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-explain-the-explain-plan-052011-393674.pdf
Jonathan Lewis: Cost-based Oracle Fundamentals: http://www.apress.com/9781590596364

6 Appendix B Interacting with the database
The Oracle database is a very powerful piece of software.

It’s also likely to be one of the most expensive pieces of software in an application’s tech stack.

The keys to a performant Oracle database also differ significantly from other vendors.

How you do best approach something in Sybase or SQL Server is not necessarily how you should do something in Oracle.

One classic example is the use of temporary tables.

Developers should know how to get the best out of a particular database.

To treat it like a bit bucket or a “slow, dumb backup datastore” is to waste money and resources.

6.1 Vendor-specific Database features

Application developers should not be overly wary of using a feature particular to the Oracle database. Some tools can make it difficult to use vendor-specific features or optimizations but an investment in time and effort to do so can reap significant performance benefits.

Whilst this attitude might be relevant for third-party product developers who have to write software that can be installed on different database vendors, this is largely not true of enterprises writing internal software systems.

It is unlikely that the Oracle database on a particular system will be replaced by another vendor database.

It is far more likely that a Java component interacting with the database will eventually be replaced by a C# component or that the usage of the Oracle database will be deprecated in favour of caching and NOSQL technologies, so if you’re going to use SQL, use Oracle-specific features where they offer benefit.

6.2 Round-tripping

The default fetchsize for JDBC and for SQL*Plus is 10. The default is almost never appropriate for general usage as many SQL statements can be expected to fetch significantly more than 10 rows and therefore significant gains can be made by increasing this setting beyond the default.

The issue is not only about roundtrips across the network, it’s also related to the logical IO that a query needs to do. If you ask for just 10 rows, the database will do all the IO it needs to do to fetch the first ten rows. When you ask for the next 10 rows, the server process on the database might well have to do a logical read of some of the same blocks as the previous fetch which can lead to significant inefficiencies compared to a larger fetchsize.

6.3 Abstraction & Interfacing with the database

Abstraction is a principle that is put on a pedestal in the middle tier and yet often abandoned when interacting with the database.

Put simply if SQL is embedded in Java code then this introduces unnecessary dependencies on the database model and limits the ability to make subtle changes to the SQL or to the model without making a change to the application server code and doing an app server release.

Views, procedures and packages can all provide an interface to the database and the data model.

6.4 It’s all about the data.

Interacting with data appropriately, regardless of database vendor, is crucial.

Think in Sets.

Also consider the success of Engineered Systems like Oracle Exadata.

One of the things that Exadata focuses on, for example, is being able to eliminate redundant data as early as possible.

This means that the logic in the storage cells can eliminate the data before it even gets to the traditional database memory, before it goes anywhere near the network, long before it goes up to the application.

And it can do this with significant degrees of parallelism, usually with far more efficiency than similar processing in application threads.

Why is this relevant?

Eliminate early.

Let the database do the work it was designed to do.

Applications should let the database give them the smallest set of data that they need and should not bring excessive amounts of data into the middle tier for elimination and aggregation there.

Volumes of data are exploding. The best chances of scaling efficiently to deal with these volumes of data are to interact with the data appropriately.

Blocking Prepared XA Transaction

Sat, 2018-01-20 06:20

There was an oracle-l thread last month about blocking sessions which could not be identified.

I replied back about a very niche scenario which I’d seen a few times before. And it’s just happened again so I thought it would be useful to just post some evidence on this.

Here are a couple of articles posted previously on XA transactions:

First, a reminder, from the posts above, that a transaction doesn’t have to have a session.

Normal “vanilla” sessions, there’s a 1:1 relationship. But with an XA transaction, a session can attach and detach – but only one session can be attached to a transaction at any one time.

And a bit of context about these XA transactions.

This is a complex Java application with multiple resources participating in the XA transaction. There was HornetQ involved and HornetQ was apparently timing out and was bounced, the taking down being down probably with a kill -9.

And an XA transaction does what is called a 2-phase commit where the first phase involves the transaction co-ordinator instructing the participating resources to prepare to commit and subsequently the second phase is to commit.

What I have seen many times on this application is that something happens to the transaction co-ordinator in between the PREPARE and the COMMIT and we are left we an orphaned transaction.

This transaction will eventually time-out – depending on the properties set on the transaction by the co-ordinator – at which point it will become an in-doubt transaction and be visible in dba_2pc_pending.

Back to reality… there is nothing in DBA_2PC_PENDING:

select * from dba_2pc_pending;

no rows selected

But we can see there is a transaction in prepared state in V$TRANSACTION, a transaction which started yesterday afternoon:

select * from v$transaction where status = 'PREPARED';
ADDR                 XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC STATUS           START_TIME           START_SCNB START_SCNW START_UEXT START_UBAFIL START_UBABLK START_UBASQN START_UBAREC SES_ADDR               FLAG SPACE RECURSIVE NOUNDO PTX NAME                                                                                                                                                                                                                                                             PRV_XIDUSN PRV_XIDSLT PRV_XIDSQN PTX_XIDUSN PTX_XIDSLT PTX_XIDSQN     DSCN-B     DSCN-W  USED_UBLK  USED_UREC     LOG_IO     PHY_IO     CR_GET  CR_CHANGE START_DATE            DSCN_BASE  DSCN_WRAP  START_SCN DEPENDENT_SCN XID              PRV_XID          PTX_XID        
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- -------------------- ---------- ---------- ---------- ------------ ------------ ------------ ------------ ---------------- ---------- ----- --------- ------ --- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------------------- ---------- ---------- ---------- ------------- ---------------- ---------------- ----------------
0000004A833D6868        691         32     315541         22    1085802     -28624         18 PREPARED         01/19/18 15:39:45    3454176189       2913         13           22      1085802       -28624            9 0000004B82E584A0    4201987 NO    NO        NO     NO                                                                                                                                                                                                                                                                            0          0          0          0          0          0          0          0          1         12     -40287       -163  -66597824      -1385 19-JAN-2018 15:39:45          0          0    1.3E+13             0 B302200095D00400 0000000000000000 0000000000000000 


And this is the script which I mentioned on the oracle-l thread which is one I use to see what locks transaction are holding, when the transaction started and what sessions are attached:

select s.machine
,      lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      ob.owner
,      ob.object_name
,      ob.subobject_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      tx.xid
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr
order by txn_start_time, session_id, object_name;

For privacy reasons and as this is a real-world situation and not an isolated test case, I won’t share the output of the script.

But it shows that:

  • the transaction is holding mode 6 exclusive TX row locks on a number of objects
  • that the transaction is in PREPARED
  • and that there is no session attached to the transaction (although v$locked_object does tell us what the SID was when it was there)

Now, from the application perspective, something has apparently rolled back a message perhaps because now HornetQ has been bounced, everything is back up and running and it seems like the message that resulted in our orphaned transaction is being reattempted and is being blocked by the exclusive locks still being held.

From an ACTIVE SESSION HISTORY perspective, this is what we see from this script from which I’ve removed columns for brevity and privacy but left them in so you can see what I run normally:

select count(*) over (partition by h.sample_time) sess_cnt
--,      h.user_id
--,      (select username from dba_users u where u.user_id = h.user_id) u, h.service_hash
,      xid--, sample_id
, sample_time, session_state, session_id, session_serial#,--sql_id,
sql_exec_id, sql_exec_start, event, --p1,
mod(p1,16), blocking_session,blocking_session_serial#--, current_obj#
--,      (select object_name||' - '||subobject_name from dba_objects where object_id = current_obj#) obj
--,      (select sql_fulltext from v$sql s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--,      (select sql_text from dba_hist_sqltext s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--, h.*
from   v$active_session_history h
where event = 'enq: TX - row lock contention'
order by h.sample_id desc;
XID              SESSION_STATE SESSION_ID SESSION_SERIAL# EVENT                                                            MOD(P1,16) BLOCKING_SESSION BLOCKING_SESSION_SERIAL#
---------------- ------------- ---------- --------------- ---------------------------------------------------------------- ---------- ---------------- ------------------------
4F021700A3C00600 WAITING              232           53035 enq: TX - row lock contention                                             6
FC010B00196E0A00 WAITING              471            5205 enq: TX - row lock contention                                             6
670320004FA50300 WAITING             2652           11791 enq: TX - row lock contention                                             6
640204005BA40500 WAITING             4300           49665 enq: TX - row lock contention                                             6

So, you can see we have four sessions being blocked on exclusive mode 6 row level locks and that the blocking session is null. If I showed you the full history then you would see that these sessions have been repeatedly trying for many hours.

BTW, this is RAC but all these sessions are intentionally on the same instance so there’s none of that jiggery-pokery involved.

I imagine at some point there was an internal conversation in Oracle about whether to report blocking session or blocking transaction.

At this point, it’s just a normal lock held by a “normal” transaction which hasn’t committed yet and actually the application just keeps trying to run the transaction waits for 1 minute until they hit the default distributed transaction timeout:

ORA-02049: timeout: distributed transaction waiting for lock

which will be logged somewhere obscure in the application logs – and there’ll be a brief pause and then it starts all over again.

Anyway at this point what to do?

At this point, the transaction hasn’t timed out.

In a few hours, the transaction will time out and become an in-doubt transaction.

Once it does, the application will receive a message:

ORA-01591: lock held by in-doubt distributed transaction

At which time, it can be manually resolved.

Typically, this is what we usually do:

begin
for txn in (select local_tran_id from dba_2pc_pending)
loop
-- if txn is not already forced rollback
execute immediate 'rollback force '''||txn.local_tran_id||'''';
commit;
dbms_transaction.purge_lost_db_entry(txn.local_tran_id);
commit;
end loop;
end;
/

If we were going to intervene right now and resolve it, what could we do?

We could bounce the database. Then the application would become in-doubt and see above.

But bouncing the database is quite drastic.

A lot of the advice in the manual is about what to do once it is IN-DOUBT. Some of it might work now – not sure.

What I advise is that if we know we want to rollback or commit this transaction now and we don’t want to wait for it to become IN-DOUBT which often we can’t wait for then programatically we can attach to the transaction using DBMS_XA and do something with it.

First of all, we need some information about the transaction.
CAVEAT DBA!!! This isn’t exactly well documented but I have found that what we tend to need is to identify the transactions in V$GLOBAL_TRANSACTION which are in COLLECTING state:

select state
,      UTL_RAW.CAST_TO_BINARY_INTEGER (globalid)
,      UTL_RAW.CAST_TO_BINARY_INTEGER (branchid)
,      t.* 
from v$global_transaction t where state = 'COLLECTING';
STATE                                  UTL_RAW.CAST_TO_BINARY_INTEGER(GLOBALID) UTL_RAW.CAST_TO_BINARY_INTEGER(BRANCHID)   FORMATID GLOBALID                                                                                                                         BRANCHID                                                                                                                           BRANCHES   REFCOUNT PREPARECOUNT STATE                                       FLAGS COUPLING      
-------------------------------------- ---------------------------------------- ---------------------------------------- ---------- -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ------------ -------------------------------------- ---------- ---------------
COLLECTING                                                                   49                                       45     131075 312D2D35363832376566363A393236643A35613562363664363A633738353036                                                                 2D35363832376566363A393236643A35613562363664363A633738353065                                                                              1          1            1 COLLECTING                                      1 TIGHTLY COUPLED 


Then we can replace the formatid, global id and branch id in the script below. Whether you require numbers or raw depends on the signature to DBMS_XA_XID – see documentation.

set serveroutput on
DECLARE
l_xid     DBMS_XA_XID :=
DBMS_XA_XID(131075,
'312D2D35363832376566363A393236643A35613562363664363A633738353036',
'2D35363832376566363A393236643A35613562363664363A633738353065');
l_return  PLS_INTEGER;
BEGIN
l_return := SYS.dbms_xa.xa_rollback(xid =&gt;  l_xid);
dbms_output.put_line(l_return);
END;
/

This approach above comes with no guarantees.
But it has worked for me several times in the past.

Pages