Friday, July 25, 2025

Determining Optimal Index Key Compression Length

In 2008, Richard Foote wrote this still-excellent 4-part guide to index key compression.

He started with this comment, which I think is just as valid now as it was then:

“Index compression is perhaps one of the most under used and neglected index options available. It has the potential to substantially reduce the overall size of non-Unique indexes and multi-column unique indexes, in some scenarios dramatically so... Not only will it potentially save storage, but if the resultant index contains fewer leaf blocks, that’s potentially fewer LIOs and from the Cost Based Optimizer’s point of view, potentially a cheaper execution plan option.”

Index key compression is a highly effective option for reducing index size and improving index performance.  

“Oracle stores each distinct combination of compressed column values found within a specific index leaf block in a ‘Prefix’ table within the leaf block and assigns each combination a unique prefix number.” If the prefix length (the number of leading columns to be compressed) is too great, then the prefix table will contain more entries, ultimately one for every row in the index. The compressed index could end up being larger than the uncompressed index!  If the prefix length is too small, then you might not get as much compression as you might with a longer prefix length.  

In other words, there is a sweet spot where you will achieve optimal compression.  That sweet spot can vary from no compression to compressing all the columns.  It will vary from index to index, from partition to partition, and potentially over time as the data in an index changes.
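The prefix length is specified with the COMPRESS keyword when an index is created or rebuilt.  For example (a sketch; the column list for this PSTREENODE index is an assumption based on the report later in this post):

CREATE INDEX psdpstreenode ON ps_pstreenode (setid, tree_name, effdt, parent_node_num) COMPRESS 3;
ALTER INDEX psdpstreenode REBUILD COMPRESS 2;
ALTER INDEX psdpstreenode REBUILD NOCOMPRESS;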

Test Every Option to Determine Optimal Compression

One way to determine the optimal compression is exhaustive testing: rebuild each index at each possible compression prefix length, compare the resulting index sizes, and test the performance of the application processes.

The following PL/SQL script (available on GitHub) rebuilds each index on a named table at each possible compression length, collects statistics and stores them in a table.

REM index_compression_test.sql
create table gfc_index_compression_stats
(table_name varchar2(128)
,index_name varchar2(128)
,num_rows number
,last_analyzed date
,prefix_length number 
,blevel number 
,leaf_blocks number 
,avg_leaf_blocks_per_key number 
,avg_data_blocks_per_key number 
,clustering_factor number 
,constraint gfc_index_compression_stats_pk primary key (table_name, index_name, prefix_length)
);

DECLARE
  l_table_name VARCHAR2(128) := 'PSTREENODE';
  l_num_cols INTEGER;
  l_sql CLOB;
  e_invalid_compress_length EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_invalid_compress_length,-25194); 
BEGIN
  FOR i IN (
    SELECT table_name, index_name, column_position prefix_length FROM user_ind_columns
    WHERE table_name = l_table_name
    UNION
    SELECT table_name, index_name, 0 FROM user_indexes
    WHERE table_name = l_table_name
    ORDER BY table_name, index_name, prefix_length DESC
  ) LOOP
   IF i.prefix_length > 0 THEN 
     l_sql := 'ALTER INDEX '||i.index_name||' REBUILD COMPRESS '||i.prefix_length;
   ELSE
     l_sql := 'ALTER INDEX '||i.index_name||' REBUILD NOCOMPRESS';
   END IF;

   BEGIN
     dbms_output.put_line(l_sql);
     EXECUTE IMMEDIATE l_sql;
     dbms_stats.gather_index_stats(user,i.index_name);
   
     MERGE INTO gfc_index_compression_stats u
     USING (SELECT * FROM user_indexes WHERE table_name = i.table_name AND index_name = i.index_name) s
     ON (u.table_name = s.table_name AND u.index_name = s.index_name AND u.prefix_length = NVL(s.prefix_length,0))
     WHEN MATCHED THEN UPDATE SET u.num_rows = s.num_rows, u.last_analyzed = s.last_analyzed, u.blevel = s.blevel, u.leaf_blocks = s.leaf_blocks, u.avg_leaf_blocks_per_key = s.avg_leaf_blocks_per_key, u.avg_data_blocks_per_key = s.avg_data_blocks_per_key, u.clustering_factor = s.clustering_factor
     WHEN NOT MATCHED THEN INSERT (table_name, index_name, num_rows, last_analyzed, prefix_length, blevel, leaf_blocks, avg_leaf_blocks_per_key, avg_data_blocks_per_key, clustering_factor)
     VALUES (s.table_name, s.index_name, s.num_rows, s.last_analyzed, NVL(s.prefix_length,0), s.blevel, s.leaf_blocks, s.avg_leaf_blocks_per_key, s.avg_data_blocks_per_key, s.clustering_factor);
   EXCEPTION 
     WHEN e_invalid_compress_length THEN NULL;
   END;  
  END LOOP; 
END;
/
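Once the script has completed, the optimal prefix length for each index is simply the rebuild that produced the fewest leaf blocks.  It can be read back from the table populated above with a query such as this:

REM report the prefix length that produced the fewest leaf blocks for each index
SELECT table_name, index_name, prefix_length, leaf_blocks
FROM   (SELECT s.*
        ,      ROW_NUMBER() OVER (PARTITION BY table_name, index_name
                                  ORDER BY leaf_blocks, prefix_length) ranking
        FROM   gfc_index_compression_stats s)
WHERE  ranking = 1
ORDER BY table_name, index_name
/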
The following chart presents the data collected by the script above for the PSTREENODE table in PeopleSoft.  The number of leaf blocks is graphed against the compression prefix length. The left-hand end of each line shows the uncompressed size of the index. 
For most indexes, the size decreases as the compression prefix length increases until it reaches a minimum.  That is the optimal compression.  Beyond that point, where the prefix columns are (or are almost) unique, the compressed index is larger than the uncompressed index. In my example, only two indexes benefit from having the entire index key compressed.  For all the other indexes, the optimal compression is obtained when some, but not all, of the key columns are compressed.
There are 8 indexes on PSTREENODE, and in all, the script performs 53 index rebuilds.  On a small table such as this, it only takes a few minutes to work through this, but on a larger table, this could easily become time-consuming.

Let Oracle Calculate the Optimal Compression Prefix Length

The alternative is to let Oracle calculate the optimal compression prefix length.
Before Oracle introduced the DBMS_STATS package, we used the ANALYZE command to collect optimizer statistics.  The last remaining use of this command is to validate object structures.
ANALYZE INDEX … VALIDATE STRUCTURE CASCADE;
You can validate an object and all dependent objects (for example, indexes) by including the CASCADE option. The following statement validates the emp table and all associated indexes:
ANALYZE TABLE emp VALIDATE STRUCTURE CASCADE;

For an index, Oracle Database verifies the integrity of each data block in the index and checks for block corruption. This clause does not confirm that each row in the table has an index entry or that each index entry points to a row in the table. You can perform these operations by validating the structure of the table with the CASCADE clause.
Oracle Database also computes compression statistics (optimal prefix compression count) for all normal indexes.
Oracle Database stores statistics about the index in the data dictionary views INDEX_STATS and INDEX_HISTOGRAM.
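The two columns of interest in INDEX_STATS are OPT_CMPR_COUNT (the optimal compression prefix length) and OPT_CMPR_PCTSAVE (the estimated percentage of leaf blocks that would be saved at that length).  For example:

ANALYZE INDEX ps_pstreenode VALIDATE STRUCTURE;

SELECT name, partition_name, lf_blks, opt_cmpr_count, opt_cmpr_pctsave
FROM   index_stats
/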

The following script analyses each index.
INDEX_STATS displays only the results for the last ANALYZE command in the current session, so the script transfers them to a permanent table.  It does not reanalyse indexes for which there is already a result stored.
REM calc_opt_comp.sql
REM (c)Go-Faster Consultancy Ltd. 2014
REM see https://blog.psftdba.com/2016/02/implementing-index-compression-and.html
set serveroutput on autotrace off
clear columns
SPOOL calc_opt_comp

REM DROP TABLE sysadm.gfc_index_stats PURGE;

--create working storage table with same structure as INDEX_STATS
CREATE TABLE sysadm.gfc_index_stats 
AS SELECT * FROM index_stats
WHERE 1=2
/

ALTER TABLE sysadm.gfc_index_stats
MODIFY name NOT NULL
/

CREATE UNIQUE INDEX sysadm.gfc_index_stats
ON sysadm.gfc_index_stats (name, partition_name)
/

undefine table_name
DECLARE
 l_sql        VARCHAR2(100);
 l_owner      VARCHAR2(8) := 'SYSADM';
 l_table_name VARCHAR2(30) := '&&table_name';
BEGIN
 FOR i IN (
  SELECT i.index_name, ip.partition_name
  FROM   all_indexes i
  ,      all_ind_partitions ip
  WHERE  i.index_type like '%NORMAL'
  AND    i.table_owner = l_owner
  AND    i.partitioned = 'YES'
  AND    i.table_name = l_table_name
  AND    ip.index_owner = i.owner
  AND    ip.index_name  = i.index_name
  AND    ip.subpartition_count = 0
  AND    ip.segment_created = 'YES'
  UNION
  SELECT i.index_name, isp.subpartition_name
  FROM   all_indexes i
  ,      all_ind_subpartitions isp
  WHERE  i.index_type like '%NORMAL'
  AND    i.table_owner = l_owner
  AND    i.partitioned = 'YES'
  AND    i.table_name = l_table_name
  AND    isp.index_owner = i.owner
  AND    isp.index_name  = i.index_name
  AND    isp.segment_created = 'YES'
  UNION
  SELECT i.index_name, NULL
  FROM   all_indexes i
  WHERE  i.index_type like '%NORMAL'
  AND    i.table_owner = l_owner
  AND    i.table_name = l_table_name
  AND    i.partitioned = 'NO'
  AND    i.segment_created = 'YES'
  MINUS
  SELECT name, partition_name
  FROM   sysadm.gfc_index_stats
 ) LOOP
  IF i.partition_name IS NULL THEN
    l_sql := 'ANALYZE INDEX '||l_owner||'.'||i.index_name||' VALIDATE STRUCTURE';
  ELSE
    l_sql := 'ANALYZE INDEX '||l_owner||'.'||i.index_name||' PARTITION ('||i.partition_name||') VALIDATE STRUCTURE';
  END IF;

  dbms_output.put_line(l_sql);
  EXECUTE IMMEDIATE l_sql;

  DELETE FROM sysadm.gfc_index_stats g
  WHERE EXISTS(
	SELECT  'x'
	FROM	index_stats i
	WHERE 	i.name = g.name
	AND	(i.partition_name = g.partition_name OR (i.partition_name IS NULL AND g.partition_name IS NULL)));

  INSERT INTO sysadm.gfc_index_stats 
  SELECT i.* FROM index_stats i;
  COMMIT;
 END LOOP;
END;
/
…
The script produces reports of its analysis.  The summary report shows the optimal compression length for each index and lists the columns that are and are not compressed.  We can see that the result of the ANALYZE command agrees with the result of the previous test that rebuilt each index at each compression length and measured the size of the index.
                                                     Summary Report

                                      Opt Comp                         Weighted         Est.                            
                                        Prefix        Num               Average         Comp                            
Table Name         Index Name           Length FREQ Parts       Blocks Saving %       Blocks                            
------------------ ------------------ -------- ---- ----- ------------ -------- ------------                            
Compress Columns                                            Do Not Compress Columns                                     
----------------------------------------------------------- ----------------------------------------------------------- 
PSTREENODE         PSAPSTREENODE             4    1     0        2,048     41.0        1,208                            
SETID, TREE_NAME, EFFDT, TREE_BRANCH                        TREE_NODE, TREE_NODE_NUM, TREE_NODE_NUM_END, TREE_NODE_TYPE 
                                                                                                                        
                   PSBPSTREENODE             8    1     0        1,920     34.0        1,267                            
SETID, TREE_NAME, TREE_BRANCH, TREE_NODE_NUM, TREE_NODE, TR                                                             
EE_NODE_NUM_END, TREE_LEVEL_NUM, TREE_NODE_TYPE                                                                         
                                                                                                                        
                   PSDPSTREENODE             3    1     0        1,280     61.0          499                            
SETID, TREE_NAME, EFFDT                                     PARENT_NODE_NUM                                             
                                                                                                                        
                   PSFPSTREENODE             2    1     0        1,024     67.0          338                            
TREE_NAME, EFFDT                                                                                                        
                                                                                                                        
                   PSGPSTREENODE             2    1     0        2,304     35.0        1,498                            
PARENT_NODE_NAME, TREE_NAME                                 EFFDT, TREE_NODE, SETID                                     
                                                                                                                        
                   PSHPSTREENODE             2    1     0        2,048     24.0        1,556                            
TREE_NODE, TREE_NAME                                        EFFDT, SETID, SETCNTRLVALUE, TREE_NODE_NUM                  
                                                                                                                        
                   PSIPSTREENODE             3    1     0        1,152       .0        1,152                            
SETID, TREE_NAME, EFFDT                                     TREE_NODE, TREE_NODE_NUM, TREE_NODE_NUM_END                 
                                                                                                                        
                   PS_PSTREENODE             4    1     0        1,792     46.0          968                            
SETID, SETCNTRLVALUE, TREE_NAME, EFFDT                      TREE_NODE_NUM, TREE_NODE, TREE_BRANCH                       
                                                                                                                        
******************                                  ----- ------------          ------------                            
                                                                                                                        
                                                                                                                        
sum                                                     0       13,568                 8,486

Compression of Partitioned Indexes

If you partition an index, then the script validates the structure of each physical partition.  The detailed report shows the optimal compression for each partition.  You may find that Oracle determines that the optimal compression is different for different partitions.      
Only a single compression length can be specified for each index.  It is then applied to all the partitions, although compression can be disabled on specific partitions.  A judgement has to be made as to what is the best balance.  
                                                     Detail Report

                                                                     Opt Comp                             Est.          
                                                                       Prefix              Saving         Comp          
Table Name         Index Name         Partition Name                   Length       Blocks      %       Blocks          
------------------ ------------------ ------------------------------ -------- ------------ ------ ------------          
…
                   PSHJRNL_LN         JRNL_LNH201612                        1      143,264  142.0      -60,171          
                                      JRNL_LNH201712                        0       88,192   74.0       22,930          
                                      JRNL_LNH201812                        6       12,240     .0       12,240          
                                      JRNL_LNH201912                        6       11,104     .0       11,104          
…
                                      JRNL_LNH202201                        6       13,752     .0       13,752          
                                      JRNL_LNH202202                        6        5,496     .0        5,496          
                                      JRNL_LNH202203                        6        6,504     .0        6,504          
                                      JRNL_LNH202204                        6        5,920     .0        5,920          
                                      JRNL_LNH202205                        6        6,864     .0        6,864          
                                      JRNL_LNH202206                        6       13,584     .0       13,584          
                                      JRNL_LNH202207                        6       12,408     .0       12,408          
                                      JRNL_LNH202208                        3      212,904  121.0      -44,710          
                                      JRNL_LNH202209                        0      262,472  111.0      -28,872          
                                      JRNL_LNH202210                        3      228,552  102.0       -4,571          
                   ******************                                         ------------        ------------          
                   sum                                                           1,625,328             574,550
NB: ANALYZE INDEX on some partitions predicted a saving greater than 100%, leading to a negative predicted size estimate, which is obviously impossible.  It occurs when the predicted optimal compression length is less than the current compression length of an index that is already compressed (in the past, there was a bug, long since resolved, that caused similar behaviour).  However, this problem does not occur consistently.  
In the above example, 6 columns of PSHJRNL_LN have already been compressed for all partitions.  The script has validated that, for the majority of partitions, this is optimal, and it calculates that no further space saving is available.  However, some partitions require less compression.  
The choice is between:
  • Compressing the entire index at a shorter prefix length.  In which case, most of the partitions will be larger, the exception partitions will be smaller, but the net effect is that the index will be larger.
  • Disabling compression on the exception partitions (see the example below).  Over-compressed indexes are generally only slightly larger than uncompressed indexes, so the benefit is probably only small.
  • Leaving compression at the length that is optimal for most of the partitions, accepting that a few partitions will be over-compressed.  This usually results in the smallest index overall.
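For example, compression could be disabled on an exception partition by rebuilding just that partition (a sketch, using a partition from the report above; the prefix length itself remains defined at index level):

ALTER INDEX pshjrnl_ln REBUILD PARTITION jrnl_lnh202209 NOCOMPRESS;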
On balance, I tend to favour the last option on the basis that an over-compressed index is only slightly larger than an uncompressed index.  

Monday, June 30, 2025

Optimising Journal Line Queries: 5. Conclusion

This is the last of five articles that examine the challenges posed by typical queries on the journal line table (PS_JRNL_LN).

  1. Problem Statement
  2. Exadata System Statistics
  3. Partitioning
  4. Compression
  5. Conclusion

After introducing Exadata system statistics, partitioning, and compression, and also archiving some historical data, we arrived at the point where the execution plan of the statement changes without needing to use hints.

Original Execution Plan (With Hints)

I have used hints to force the original execution plan.  

ALTER SESSION SET STATISTICS_LEVEL=ALL;
SELECT /*+LEADING(A) USE_NL(B) 
          INDEX(B (PS_JRNL_LN.BUSINESS_UNIT PS_JRNL_LN.JOURNAL_ID PS_JRNL_LN.JOURNAL_DATE PS_JRNL_LN.UNPOST_SEQ PS_JRNL_LN.JOURNAL_LINE PS_JRNL_LN.LEDGER))*/
     A.FISCAL_YEAR, A.ACCOUNTING_PERIOD, B.BUSINESS_UNIT, B.JOURNAL_ID, TO_CHAR(B.JOURNAL_DATE,'YYYY-MM-DD'), B.LEDGER, B.ACCOUNT
, B.PRODUCT, B.PROJECT_ID, B.CHARTFIELD1, B.CHARTFIELD2, B.CURRENCY_CD, B.AFFILIATE, SUM( B.MONETARY_AMOUNT), B.FOREIGN_CURRENCY
, SUM( B.FOREIGN_AMOUNT), A.REVERSAL_CD, A.REVERSAL_ADJ_PER, A.JRNL_HDR_STATUS, TO_CHAR(A.POSTED_DATE,'YYYY-MM-DD'), A.OPRID, A.DESCR254
, B.DEPTID, A.SOURCE, B.ALTACCT, TO_CHAR(CAST((A.DTTM_STAMP_SEC) AS TIMESTAMP),'YYYY-MM-DD-HH24.MI.SS.FF'), B.LINE_DESCR 
FROM PS_JRNL_HEADER A, PS_JRNL_LN B 
WHERE (A.BUSINESS_UNIT = B.BUSINESS_UNIT 
AND A.JOURNAL_ID = B.JOURNAL_ID 
AND A.JOURNAL_DATE = B.JOURNAL_DATE 
AND A.UNPOST_SEQ = B.UNPOST_SEQ 
AND A.JRNL_HDR_STATUS IN('P','V','U') 
AND A.FISCAL_YEAR IN (2024) 
AND A.ACCOUNTING_PERIOD BETWEEN 1 AND 12
AND B.CHARTFIELD1 IN ('1234567','1234568','1234569')
AND B.LEDGER IN ('LEDGER')) 
GROUP BY A.FISCAL_YEAR, A.ACCOUNTING_PERIOD, B.BUSINESS_UNIT, B.JOURNAL_ID, B.JOURNAL_DATE, B.LEDGER, B.ACCOUNT, B.PRODUCT, B.PROJECT_ID, B.CHARTFIELD1, 
B.CHARTFIELD2, B.CURRENCY_CD, B.AFFILIATE, B.FOREIGN_CURRENCY, A.REVERSAL_CD, A.REVERSAL_ADJ_PER, A.JRNL_HDR_STATUS, A.POSTED_DATE, A.OPRID, A.DESCR254, 
B.DEPTID, A.SOURCE, B.ALTACCT, A.DTTM_STAMP_SEC, B.LINE_DESCR
/
select * from table(dbms_xplan.display_cursor(null,null,'ADVANCED +ADAPTIVE'));

Rather than get the execution plan from EXPLAIN PLAN, I have executed the query with STATISTICS_LEVEL set to ALL, and then displayed the cursor.

The execution plan starts with the PS_JRNL_HEADER table and uses a nested loop join, probing the unique index on PS_JRNL_LN.  Note that the optimizer costs were produced with Exadata system statistics.

Plan hash value: 4030641493

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                |      1 |        |       |       |  1740K(100)|          |       |       |    209K|00:06:41.85 |     238M|     15M|
|   1 |  HASH GROUP BY                      |                |      1 |    498K|   108M|   129M|  1740K  (1)| 00:01:08 |       |       |    209K|00:06:41.85 |     238M|     15M|
|   2 |   NESTED LOOPS                      |                |      1 |    498K|   108M|       |  1722K  (1)| 00:01:08 |       |       |    495K|01:03:03.80 |     238M|     15M|
|   3 |    NESTED LOOPS                     |                |      1 |    498K|   108M|       |  1722K  (1)| 00:01:08 |       |       |    459M|00:11:20.66 |    5549K|   4259K|
|*  4 |     TABLE ACCESS STORAGE FULL       | PS_JRNL_HEADER |      1 |    430K|    41M|       |  1135   (8)| 00:00:01 |       |       |    430K|00:00:00.34 |   88642 |  88637 |
|   5 |     PARTITION RANGE ITERATOR        |                |    430K|      1 |       |       |     3   (0)| 00:00:01 |   KEY |   KEY |    459M|00:10:38.60 |    5460K|   4170K|
|*  6 |      INDEX RANGE SCAN               | PS_JRNL_LN     |    430K|      1 |       |       |     3   (0)| 00:00:01 |   KEY |   KEY |    459M|00:09:55.80 |    5460K|   4170K|
|*  7 |    TABLE ACCESS BY LOCAL INDEX ROWID| PS_JRNL_LN     |    459M|      1 |   127 |       |     4   (0)| 00:00:01 |     1 |     1 |    495K|00:50:25.33 |     233M|     11M|
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
…
Predicate Information (identified by operation id):
---------------------------------------------------

   4 - storage(("A"."FISCAL_YEAR"=2024 AND INTERNAL_FUNCTION("A"."JRNL_HDR_STATUS") AND "A"."ACCOUNTING_PERIOD"<=12 AND "A"."ACCOUNTING_PERIOD">=1))
       filter(("A"."FISCAL_YEAR"=2024 AND INTERNAL_FUNCTION("A"."JRNL_HDR_STATUS") AND "A"."ACCOUNTING_PERIOD"<=12 AND "A"."ACCOUNTING_PERIOD">=1))
   6 - access("A"."BUSINESS_UNIT"="B"."BUSINESS_UNIT" AND "A"."JOURNAL_ID"="B"."JOURNAL_ID" AND "A"."JOURNAL_DATE"="B"."JOURNAL_DATE" AND
              "A"."UNPOST_SEQ"="B"."UNPOST_SEQ" AND "B"."LEDGER"='LEDGER')
       filter("B"."LEDGER"='LEDGER')
   7 - filter(("B"."CHARTFIELD1"='1234567' OR "B"."CHARTFIELD1"='1234568' OR "B"."CHARTFIELD1"='1234569'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3
---------------------------------------------------------------------------

   1 -  SEL$1
           -  LEADING(A)

   6 -  SEL$1 / B@SEL$1
           -  INDEX(B (PS_JRNL_LN.BUSINESS_UNIT PS_JRNL_LN.JOURNAL_ID PS_JRNL_LN.JOURNAL_DATE PS_JRNL_LN.UNPOST_SEQ PS_JRNL_LN.JOURNAL_LINE PS_JRNL_LN.LEDGER))
           -  USE_NL(B)

The cost of this execution plan depends mainly upon how many journal header rows are selected.  There is a cost of 3 per index lookup, plus another 1 for the table access, making a total of 4 per journal header row.  Here we selected 430K rows from PS_JRNL_HEADER, so 430K rows * 4/row = 1720K.  We got an actual cost of 1722K.  The discrepancy is because the 430K was rounded off by the representation of numbers in the execution plan.  Then the cost of the GROUP BY operation is 18K.  Hence, the overall cost is 1740K.
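In outline:

Join cost     = 430K rows * (3 + 1) per row = 1720K (reported as 1722K)
GROUP BY cost = 18K
Total cost    = 1740K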

The actual execution time of the query was 63 minutes (78 minutes with time taken to fetch 211K rows across the network to the client).  

New Execution Plan (Without Hints)

Without the hints, the execution plan changes.  It still starts with a full scan of PS_JRNL_HEADER, but it then full scans PS_JRNL_LN, filtering the rows by LEDGER and CHARTFIELD1 (operation 12 in the plan) and Bloom filtering on the columns joined to PS_JRNL_HEADER.  PS_JRNL_LN is then hash joined against the rows from PS_JRNL_HEADER to remove false positives returned by the Bloom filter.
Plan hash value: 1053505630

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                             | Name           | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT                      |                |      1 |        |       |       |   535K(100)|          |       |       |    209K|00:00:10.06 |      27M|     27M|
|     1 |  HASH GROUP BY                        |                |      1 |    498K|   108M|   129M|   535K (24)| 00:00:21 |       |       |    209K|00:00:10.06 |      27M|     27M|
|  *  2 |   HASH JOIN                           |                |      1 |    498K|   108M|    46M|   517K (24)| 00:00:21 |       |       |    495K|00:00:09.23 |      27M|     27M|
|     3 |    PART JOIN FILTER CREATE            | :BF0000        |      1 |    498K|   108M|       |   517K (24)| 00:00:21 |       |       |    430K|00:00:00.66 |   88642 |  88638 |
|-    4 |     NESTED LOOPS                      |                |      1 |    498K|   108M|    46M|   517K (24)| 00:00:21 |       |       |    430K|00:00:00.30 |   88642 |  88638 |
|-    5 |      NESTED LOOPS                     |                |      1 |        |       |       |            |          |       |       |    430K|00:00:00.24 |   88642 |  88638 |
|-    6 |       STATISTICS COLLECTOR            |                |      1 |        |       |       |            |          |       |       |    430K|00:00:00.19 |   88642 |  88638 |
|  *  7 |        TABLE ACCESS STORAGE FULL      | PS_JRNL_HEADER |      1 |    430K|    41M|       |  1135   (8)| 00:00:01 |       |       |    430K|00:00:00.13 |   88642 |  88638 |
|-    8 |       PARTITION RANGE ITERATOR        |                |      0 |        |       |       |            |          |   KEY |   KEY |      0 |00:00:00.01 |       0 |      0 |
|- *  9 |        INDEX RANGE SCAN               | PS_JRNL_LN     |      0 |        |       |       |            |          |   KEY |   KEY |      0 |00:00:00.01 |       0 |      0 |
|- * 10 |      TABLE ACCESS BY LOCAL INDEX ROWID| PS_JRNL_LN     |      0 |      1 |   127 |       |   515K (24)| 00:00:21 |     1 |     1 |      0 |00:00:00.01 |       0 |      0 |
|    11 |    PARTITION RANGE JOIN-FILTER        |                |      1 |    498K|    60M|       |   515K (24)| 00:00:21 |:BF0000|:BF0000|    815K|00:00:07.65 |      27M|     27M|
|  * 12 |     TABLE ACCESS STORAGE FULL         | PS_JRNL_LN     |     18 |    498K|    60M|       |   515K (24)| 00:00:21 |:BF0000|:BF0000|    815K|00:00:07.55 |      27M|     27M|
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "A"@"SEL$1")
      FULL(@"SEL$1" "B"@"SEL$1")
      LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
      USE_HASH(@"SEL$1" "B"@"SEL$1")
      USE_HASH_AGGREGATION(@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."JOURNAL_DATE"="B"."JOURNAL_DATE" AND "A"."BUSINESS_UNIT"="B"."BUSINESS_UNIT" AND "A"."JOURNAL_ID"="B"."JOURNAL_ID" AND "A"."UNPOST_SEQ"="B"."UNPOST_SEQ")
   7 - storage(("A"."FISCAL_YEAR"=2024 AND INTERNAL_FUNCTION("A"."JRNL_HDR_STATUS") AND "A"."ACCOUNTING_PERIOD"<=12 AND "A"."ACCOUNTING_PERIOD">=1))
       filter(("A"."FISCAL_YEAR"=2024 AND INTERNAL_FUNCTION("A"."JRNL_HDR_STATUS") AND "A"."ACCOUNTING_PERIOD"<=12 AND "A"."ACCOUNTING_PERIOD">=1))
   9 - access("A"."BUSINESS_UNIT"="B"."BUSINESS_UNIT" AND "A"."JOURNAL_ID"="B"."JOURNAL_ID" AND "A"."JOURNAL_DATE"="B"."JOURNAL_DATE" AND "A"."UNPOST_SEQ"="B"."UNPOST_SEQ"
              AND "B"."LEDGER"='LEDGER')
       filter("B"."LEDGER"='LEDGER')
  10 - filter(("B"."CHARTFIELD1"='1234567' OR "B"."CHARTFIELD1"='1234568' OR "B"."CHARTFIELD1"='1234569'))
  12 - storage(("B"."LEDGER"='LEDGER' AND INTERNAL_FUNCTION("B"."CHARTFIELD1")))
       filter(("B"."LEDGER"='LEDGER' AND INTERNAL_FUNCTION("B"."CHARTFIELD1")))
The actual execution time is just 21 seconds (or 13 minutes, including fetches).  So, this is much faster.  
The optimizer cost drops to just 535K.  This is now the cheapest plan, and therefore Oracle uses it without further intervention.  515K of the cost comes from the full scan of PS_JRNL_LN, the Bloom filter adds just 2K, and the GROUP BY operation a further 18K.  It is an adaptive plan, so Oracle can still switch between the nested loop and the Bloom filter/hash join at run time, on the basis of the statistics collected at run time (at line 6).
The full scan/Bloom filter/hash join of PS_JRNL_LN is cheaper than the nested loop with 430K index probes.  In fact, the tipping point will be wherever the optimizer estimates that it gets at least 129K rows from PS_JRNL_HEADER (515K / 4 per row = 128.75K).

TL;DR

Through a combination of Exadata system statistics, archiving, partitioning and compression, the cost of smart full scanning the PS_JRNL_LN table, which contains several billion rows, has been reduced to the point where Oracle switches to a more efficient execution plan without hints.


Friday, June 27, 2025

Optimising Journal Line Queries: 4. Compression

This is the fourth of five articles that examine the challenges posed by typical queries on the journal line table (PS_JRNL_LN).

  1. Problem Statement
  2. Exadata System Statistics
  3. Partitioning
  4. Compression
  5. Conclusion
If it had been possible to referentially partition PS_JRNL_LN by FISCAL_YEAR, then the sample query shown in earlier posts would have been able to prune partitions by fiscal year.  This would have significantly reduced the cost of the full scan.  
Instead, if the sample query has to full scan PS_JRNL_LN, it has to scan all the partitions and then filter the rows against the keys retrieved from PS_JRNL_HEADER.  The only way to reduce the cost of the full scan is to reduce the number of blocks being scanned, by purging and compressing data.  On Exadata, Hybrid Columnar Compression (HCC) achieves a much higher compression ratio, and unwanted compression units are eliminated efficiently.
To meet reporting requirements, Financials systems inevitably need to keep journal data going back several years.  Nonetheless, purging unneeded historical journals as aggressively as permissible is the first step in improving or at least preserving query performance.

Compression

Compressing the historical journal line partitions will further reduce the number of blocks in the segments and reduce the cost of the full scan in the query, thus making the optimiser more likely to switch away from the nested loop join to the full scan/Bloom filter/hash join.
In Oracle, basic compression and Hybrid Columnar Compression (HCC) are well suited to data warehouse applications, but they also have application in OLTP systems.  Compression occurs during direct-path insert and segment reorganisation operations and does not apply to anything modified by PeopleSoft processes that use conventional DML.  
Basic compression is available on any Oracle database; essentially, it avoids storing the same data values many times in the same data block.  Hybrid Columnar Compression (HCC) is available on Exadata. Different HCC compression levels use different compression algorithms.  DML on compressed data will decompress it.
I do not recommend using Advanced Compression to compress current periods due to the impact on day-to-day processing.

Which Partitions are Static?

DBA_TAB_MODIFICATIONS shows the number of DML operations in each partition since statistics were last collected.
SELECT p.partition_position, m.*, p.high_value
FROM dba_tab_modifications m
  INNER JOIN dba_tab_partitions p 
    ON p.table_owner = m.table_owner AND p.table_name = m.table_name AND p.partition_name = m.partition_name
WHERE m.table_owner = 'SYSADM' AND m.table_name = 'PS_JRNL_LN'
ORDER BY 1
/
This report was generated in December 2024.  Most of the updates are in the current and previous monthly periods.  There are almost no updates that are older than a year.
Part Table                                                                                              Drop                                                                                 
Pos# Owner    TABLE_NAME   PARTITION_NAME       S   INSERTS   UPDATES   DELETES TIMESTAMP           TRU Segs HIGH_VALUE                                                                      
---- -------- ------------ -------------------- - --------- --------- --------- ------------------- --- ---- --------------------------------------------------------------------------------
…
  40 SYSADM   PS_JRNL_LN   JRNL_LN_202212                 0         0         0 13/10/2024 10:08:56 NO     0 TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  41 SYSADM   PS_JRNL_LN   JRNL_LN_202301                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  42 SYSADM   PS_JRNL_LN   JRNL_LN_202302                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  43 SYSADM   PS_JRNL_LN   JRNL_LN_202303                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  44 SYSADM   PS_JRNL_LN   JRNL_LN_202304                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  45 SYSADM   PS_JRNL_LN   JRNL_LN_202305                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  46 SYSADM   PS_JRNL_LN   JRNL_LN_202306                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  47 SYSADM   PS_JRNL_LN   JRNL_LN_202307                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-08-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  48 SYSADM   PS_JRNL_LN   JRNL_LN_202308                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-09-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  49 SYSADM   PS_JRNL_LN   JRNL_LN_202309                 0         0         0 13/10/2024 10:08:56 NO     0 TO_DATE(' 2023-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  50 SYSADM   PS_JRNL_LN   JRNL_LN_202310                 0         0         0 27/10/2024 10:59:45 NO     0 TO_DATE(' 2023-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  51 SYSADM   PS_JRNL_LN   JRNL_LN_202311                 0         0         0 29/09/2024 10:01:16 NO     0 TO_DATE(' 2023-12-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  52 SYSADM   PS_JRNL_LN   JRNL_LN_202312                34       193        34 10/12/2024 14:21:38 NO     0 TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

  53 SYSADM   PS_JRNL_LN   JRNL_LN_202401             42374    127736       210 12/12/2024 05:27:31 NO     0 TO_DATE(' 2024-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  54 SYSADM   PS_JRNL_LN   JRNL_LN_202402             34803     92215         0 12/12/2024 05:26:30 NO     0 TO_DATE(' 2024-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  55 SYSADM   PS_JRNL_LN   JRNL_LN_202403             54940    166263         0 12/12/2024 05:12:29 NO     0 TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  56 SYSADM   PS_JRNL_LN   JRNL_LN_202404              5900     13730         0 13/12/2024 05:29:32 NO     0 TO_DATE(' 2024-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  57 SYSADM   PS_JRNL_LN   JRNL_LN_202405              6151     13869         0 13/12/2024 05:31:06 NO     0 TO_DATE(' 2024-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  58 SYSADM   PS_JRNL_LN   JRNL_LN_202406             18317     58263         0 13/12/2024 16:15:49 NO     0 TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  59 SYSADM   PS_JRNL_LN   JRNL_LN_202407           5067792  14937405         0 13/12/2024 16:02:36 NO     0 TO_DATE(' 2024-08-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  60 SYSADM   PS_JRNL_LN   JRNL_LN_202408           5217744  15378822         0 13/12/2024 18:02:57 NO     0 TO_DATE(' 2024-09-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  61 SYSADM   PS_JRNL_LN   JRNL_LN_202409             65389    243360       160 13/12/2024 12:45:25 NO     0 TO_DATE(' 2024-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  62 SYSADM   PS_JRNL_LN   JRNL_LN_202410             44839    152210         0 13/12/2024 00:28:54 NO     0 TO_DATE(' 2024-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  63 SYSADM   PS_JRNL_LN   JRNL_LN_202411          28279594  53637873  27478940 13/12/2024 18:18:00 NO     0 TO_DATE(' 2024-12-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  64 SYSADM   PS_JRNL_LN   JRNL_LN_202412          34761590  53485631  27484239 13/12/2024 19:16:11 NO     0 TO_DATE(' 2025-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

  65 SYSADM   PS_JRNL_LN   JRNL_LN_202501            137138    473452         0 13/12/2024 19:18:09 NO     0 TO_DATE(' 2025-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  66 SYSADM   PS_JRNL_LN   JRNL_LN_202502                 0         0         0 10/11/2024 10:08:21 NO     0 TO_DATE(' 2025-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  67 SYSADM   PS_JRNL_LN   JRNL_LN_202503               466         0         0 13/12/2024 03:59:20 NO     0 TO_DATE(' 2025-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  68 SYSADM   PS_JRNL_LN   JRNL_LN_202504                 0         0         0 17/11/2024 10:03:01 NO     0 TO_DATE(' 2025-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
  69 SYSADM   PS_JRNL_LN   JRNL_LN_202505                 0         0         0 17/11/2024 10:03:01 NO     0 TO_DATE(' 2025-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
…
Therefore, compressing journal lines in the current or previous fiscal years is not viable, as subsequent updates would destroy the compression. However, I can look at explicitly compressing older partitions.  

Attribute Clustering

Attribute Clustering sorts or clusters the rows by certain columns.  Like compression, it is declarative: it only takes effect during direct-path load or segment reorganisation (such as a compression operation), and not during conventional DML.  It can be defined at table level, and it is then applied as each partition is compressed.
I have defined attribute clustering on PS_JRNL_LN to sort the rows by the unique key columns of its parent record, PS_JRNL_HEADER.
ALTER TABLE ps_jrnl_ln ADD CLUSTERING BY LINEAR ORDER (business_unit, journal_id, journal_date, unpost_seq);
ALTER TABLE ps_jrnl_ln MODIFY CLUSTERING YES ON LOAD YES ON DATA MOVEMENT;
I have found that it produces a small improvement in the resulting compression (approximately 2-5% of the original size), further reducing the cost of the full table scan.  So, it is a marginal gain for no additional cost.

Hybrid Columnar Compression (HCC)

I have chosen to compress partitions older than one complete fiscal year with QUERY LOW, and those older than 3 complete fiscal years with QUERY HIGH.
Each historical partition can be compressed using an online move operation, which can be done in parallel.  It is important to specify UPDATE INDEXES so that the indexes are maintained and remain valid and usable after the operation.  This approach does not lock PS_JRNL_LN, so the application can continue to run during the process.
…
ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202012 COMPRESS FOR QUERY HIGH UPDATE INDEXES ONLINE PARALLEL;
ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202101 COMPRESS FOR QUERY LOW UPDATE INDEXES ONLINE PARALLEL;
…
The statistics on the newly compressed partitions should be updated, as well as the global statistics.  

Statistics

Optimizer statistics have to be regathered on the freshly compressed partitions; otherwise, the reduction in the cost of the full scan will not occur.  
On some systems, statistics collection on the very largest tables, including PS_JRNL_LN, may not be completed within the regular maintenance window.  Then, Oracle may repeatedly attempt to collect statistics on these segments, and other statistics can become stale too.  Instead, I usually lock the statistics on such tables (to remove them from the maintenance window job) and create a specific recurring job to collect statistics at a convenient time (e.g. at the weekend).
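For example, such a job might gather statistics with a call along these lines (a sketch; FORCE is required to override the lock on the statistics):

exec dbms_stats.gather_table_stats(ownname=>'SYSADM',tabname=>'PS_JRNL_LN',granularity=>'AUTO',force=>TRUE);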
An interim option is to simply manually update the number of blocks in the partition statistics to the number of blocks in the compressed segment (if it is lower), and recalculate the total number of blocks in the whole table for the global statistics.
set serveroutput on timi on
DECLARE 
  l_table_name VARCHAR2(18) := 'PS_JRNL_LN';
  l_part_update BOOLEAN := FALSE;
BEGIN
  FOR i IN(
    select t.table_name, t.partition_name, t.num_rows, t.blocks stat_blocks, s.blocks seg_blocks
    ,      s.tablespace_name, p.compress_for, t.num_rows/NULLIF(LEAST(t.blocks,s.blocks),0) rpb
    from user_segments s
      inner join user_tab_partitions p ON p.table_name = s.segment_name AND p.partition_name = s.partition_name
      inner join user_tab_statistics t ON s.segment_name = t.table_name AND s.partition_name = t.partition_name and t.blocks>s.blocks
    where s.segment_type = 'TABLE PARTITION' and p.compress_for IS NOT NULL and s.segment_name = l_table_name 
  ) LOOP
    l_part_update := TRUE;
    dbms_output.put_line(i.table_name||' ('||i.partition_name||') '||i.stat_blocks||' => '||i.seg_blocks||' blocks');
    dbms_stats.set_table_stats(ownname=>'SYSADM',tabname=>i.table_name,partname=>i.partition_name,numblks=>i.seg_blocks,force=>TRUE);
  END LOOP;
  IF l_part_update THEN
    FOR i IN (
      SELECT table_name, sum(blocks) blocks
      FROM user_tab_statistics
      WHERE table_name = l_table_name AND partition_name IS NOT NULL 
      GROUP BY table_name
    ) LOOP
      dbms_output.put_line(i.table_name||' = '||i.blocks||' blocks');
      dbms_stats.set_table_stats(ownname=>'SYSADM',tabname=>i.table_name,numblks=>i.blocks,force=>TRUE);
    END LOOP;
  ELSE
    dbms_output.put_line(l_table_name||' - no action required');
  END IF;
END;
/

Orphaned Index Entries & Space Recovery

One side effect of the table segment compression operation (or any other segment reorganisation operation) is that we get orphaned entries in any global indexes.  I could rebuild these indexes.
ALTER INDEX psdjrnl_ln REBUILD ONLINE TABLESPACE psindex PARALLEL;
Or, I could wait for the PMO_DEFERRED_GIDX_MAINT_JOB job (scheduled by default during the maintenance window) to clean out orphaned index entries from all currently impacted global indexes.  Another alternative is to manually run the dbms_part.cleanup_gidx procedure that is, in turn, called by this job (see also Richard Foote: 12c Asynchronous Global Index Maintenance Part II).
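The procedure can be invoked directly for a single schema or table.  A sketch (I believe these are the parameter names, but verify against the documentation for your version):

exec dbms_part.cleanup_gidx(schema_name_in=>'SYSADM', table_name_in=>'PS_JRNL_LN');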
However, another side effect is that global indexes can grow as they are maintained by the table partition compression operations.  These indexes can be coalesced, cleaned up, and then shrunk.  The shrink implies a coalesce (ALTER INDEX … SHRINK SPACE COMPACT is equivalent to ALTER INDEX … COALESCE; see Jonathan Lewis's Oracle Scratchpad: Shrinking Indexes).  However, the coalesce can be done in parallel, while the shrink cannot.  Therefore, I have chosen to do a parallel coalesce that includes a clean-up of orphaned entries, and then shrink the segment.
ALTER INDEX psdjrnl_ln COALESCE CLEANUP PARALLEL;
ALTER INDEX psdjrnl_ln SHRINK SPACE;
I also rebuild corresponding partitions in locally partitioned indexes.
…
ALTER INDEX ps_jrnl_ln REBUILD PARTITION jrnl_ln_202012 ONLINE;
ALTER INDEX ps_jrnl_ln REBUILD PARTITION jrnl_ln_202101 ONLINE;
…
At the end of this process, space has been released back to the tablespace, but the free space will be distributed throughout the tablespace, so it probably won't be possible to release space back to the file system.
If you are compressing many partitions, perhaps because you are doing this for the first time, and they are in a dedicated tablespace, then I would suggest completely rebuilding the objects into new tablespaces.  Tablespaces can be renamed as required.
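For example, each partition could be moved, compressed and relocated in a single operation (a sketch; the target tablespace name is hypothetical):

ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202012 TABLESPACE psjrnl_new COMPRESS FOR QUERY HIGH UPDATE INDEXES ONLINE PARALLEL;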

Thursday, June 26, 2025

Optimising Journal Line Queries: 3. Partitioning

This is the third of a series of five articles that examine the challenges posed by typical queries on the journal line table (PS_JRNL_LN).

  1. Problem Statement
  2. Exadata System Statistics
  3. Partitioning
  4. Compression
  5. Conclusion

PeopleSoft does not partition tables by default.  Application Designer does not support partitioning, mainly because different databases implement partitioning differently.  Thus, it is always left to the customer to implement as a customisation.  In this article, I am only going to discuss partitioning on Oracle.

Interval Partitioning

This is the sample query that I started with.

SELECT A.FISCAL_YEAR, A.ACCOUNTING_PERIOD, B.BUSINESS_UNIT, B.JOURNAL_ID, TO_CHAR(B.JOURNAL_DATE,'YYYY-MM-DD'), B.LEDGER, B.ACCOUNT
, B.PRODUCT, B.PROJECT_ID, B.CHARTFIELD1, B.CHARTFIELD2, B.CURRENCY_CD, B.AFFILIATE, SUM( B.MONETARY_AMOUNT), B.FOREIGN_CURRENCY
, SUM( B.FOREIGN_AMOUNT), A.REVERSAL_CD, A.REVERSAL_ADJ_PER, A.JRNL_HDR_STATUS, TO_CHAR(A.POSTED_DATE,'YYYY-MM-DD'), A.OPRID, A.DESCR254
, B.DEPTID, A.SOURCE, B.ALTACCT, TO_CHAR(CAST((A.DTTM_STAMP_SEC) AS TIMESTAMP),'YYYY-MM-DD-HH24.MI.SS.FF'), B.LINE_DESCR 
FROM PS_JRNL_HEADER A, PS_JRNL_LN B 
WHERE (A.BUSINESS_UNIT = B.BUSINESS_UNIT 
AND A.JOURNAL_ID = B.JOURNAL_ID 
AND A.JOURNAL_DATE = B.JOURNAL_DATE 
AND A.UNPOST_SEQ = B.UNPOST_SEQ 
AND A.JRNL_HDR_STATUS IN('P','V','U') 
AND A.FISCAL_YEAR IN (2024) 
AND A.ACCOUNTING_PERIOD BETWEEN 1 AND 12 
AND B.CHARTFIELD1 IN ('1234567','1234568','1234569')
AND B.LEDGER IN ('LEDGER')) 
GROUP BY A.FISCAL_YEAR, A.ACCOUNTING_PERIOD, B.BUSINESS_UNIT, B.JOURNAL_ID, B.JOURNAL_DATE, B.LEDGER, B.ACCOUNT, B.PRODUCT
, B.PROJECT_ID, B.CHARTFIELD1, B.CHARTFIELD2, B.CURRENCY_CD, B.AFFILIATE, B.FOREIGN_CURRENCY, A.REVERSAL_CD, A.REVERSAL_ADJ_PER
, A.JRNL_HDR_STATUS, A.POSTED_DATE, A.OPRID, A.DESCR254, B.DEPTID, A.SOURCE, B.ALTACCT, A.DTTM_STAMP_SEC, B.LINE_DESCR

It would have been desirable to partition PS_JRNL_LN by FISCAL_YEAR.  However, that column is only present on the parent table, PS_JRNL_HEADER.  Oracle can do reference partitioning, where the child table is partitioned by a column of the parent table.  The parent table must itself be partitioned by that column, thus producing a 1:1 mapping of partitions between the parent and child tables.  However, this feature requires the presence of an enforced foreign key constraint between the parent and child tables.  

PeopleSoft has never implemented or supported database-enforced referential integrity constraints (again, mainly because it was done differently on different databases).  Although it is tempting to add a foreign key constraint between these tables, that would be a customisation to PeopleSoft that Oracle would not support.  The application would then have to insert parent rows before child rows and delete child rows before deleting parent rows.  It has never been tested against these constraints.  

Therefore, it is only possible to consider partitioning by a column on PS_JRNL_LN.  A column in the unique key is an obvious choice.

  • Depending on how BUSINESS_UNIT is set up and used, you might be able to list-subpartition by this column, splitting the journal lines across several subpartitions.  However, it is almost inevitable that the volumes will be heavily skewed.
  • It is tempting to range partition on JOURNAL_ID.  Although this column usually contains an entirely numeric value, it is in fact defined as a character (VARCHAR2) data type.  Therefore, it is not possible to interval partition upon it.  Periodically, it would be necessary to add partitions manually.
  • The alternative is to interval range partition on JOURNAL_DATE.  I chose to define a monthly interval.  I specified the first few partitions for whole years because at this customer, these partitions contained less data after archiving.  Thereafter, Oracle automatically creates monthly partitions as data is inserted.

CREATE TABLE PS_JRNL_LN 
(BUSINESS_UNIT VARCHAR2(5 CHAR) NOT NULL
,JOURNAL_ID VARCHAR2(10 CHAR) NOT NULL 
,JOURNAL_DATE DATE NOT NULL 
,UNPOST_SEQ NUMBER(*,0) NOT NULL 
,JOURNAL_LINE NUMBER(9,0) NOT NULL 
,LEDGER VARCHAR2(10 CHAR) NOT NULL 
...
) PARTITION BY RANGE (JOURNAL_DATE) INTERVAL (NUMTOYMINTERVAL(1, 'MONTH')) 
(PARTITION JRNL_LN_2016 VALUES LESS THAN (TO_DATE('2017-01-01', 'YYYY-MM-DD'))
,PARTITION JRNL_LN_2017 VALUES LESS THAN (TO_DATE('2018-01-01', 'YYYY-MM-DD'))
,PARTITION JRNL_LN_2018 VALUES LESS THAN (TO_DATE('2019-01-01', 'YYYY-MM-DD'))
,PARTITION JRNL_LN_2019 VALUES LESS THAN (TO_DATE('2020-01-01', 'YYYY-MM-DD')) 
) 
/
Partitioning on JOURNAL_DATE still arranges the journal line data into current and historical partitions. We find that the various financial processes only create journal lines in the current and previous fiscal years.  Therefore, earlier fiscal years are effectively static.  This presents an opportunity to compress these partitions because nothing will subsequently update them that would decompress compressed rows.  Thus, partitioning and compression go together.

Renaming Partitions

By default, interval partitions are given system-generated names.  I find it convenient to rename them to something more meaningful.  This has no bearing on performance.  In this case, I used something based on the date to which the partition relates.  There are 2 implicit cursors in the following PL/SQL block.  The first renames table partitions, and the second renames index partitions.  This script should be run periodically as new partitions are added. It is available on GitHub.
rem rename_jrnl_ln_partitions.sql    
rem requires https://github.com/davidkurtz/psscripts/blob/master/psftapi.sql
spool rename_jrnl_ln_partitions.lst
set serveroutput on
DECLARE
  l_high_value DATE;
  l_sql CLOB;
  l_new_partition_name VARCHAR2(30);
BEGIN
  psft_ddl_lock.set_ddl_permitted(TRUE);
  FOR i IN (
    select /*+LEADING(r upt upkc utc)*/ r.recname, upt.table_name, utp.partition_name, utp.high_value, upt.interval interval_size
    from sysadm.psrecdefn r 
      INNER JOIN user_part_tables upt ON upt.table_name = DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) 
        AND upt.partitioning_type = 'RANGE' and upt.interval IS NOT NULL
      INNER JOIN user_part_key_columns upkc ON upkc.name = upt.table_name AND upkc.object_Type = 'TABLE' and upkc.column_position = 1
      INNER JOIN user_tab_columns utc ON utc.table_name = upkc.name AND utc.column_name = upkc.column_name
      INNER JOIN user_tab_partitions utp ON utp.table_name = upt.table_name AND utp.partition_name like 'SYS_P%'
    WHERE r.recname = 'JRNL_LN' AND r.rectype = 0
    AND (utc.data_type = 'DATE' OR utc.data_type like 'TIMESTAMP%')
  ) LOOP
    l_sql := 'SELECT '||i.high_value||'-'||i.interval_size||' FROM DUAL';
    EXECUTE IMMEDIATE l_sql INTO l_high_value;
    l_new_partition_name := i.recname||'_'||TO_CHAR(l_high_value,'YYYYMM');
    l_sql := 'ALTER TABLE '||i.table_name||' RENAME PARTITION '||i.partition_name||' TO '||l_new_partition_name;
    IF i.partition_name != l_new_partition_name THEN
      dbms_output.put_line(l_sql);
      EXECUTE IMMEDIATE l_sql;
    END IF;
  END LOOP;

  FOR i IN (
    select /*+LEADING(r upi upkc utc)*/ r.recname, upi.index_name, uip.partition_name, uip.high_value, upi.interval interval_size
    from sysadm.psrecdefn r 
      INNER JOIN user_part_indexes upi ON upi.table_name = DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) 
        AND upi.partitioning_type = 'RANGE' and upi.interval IS NOT NULL
      INNER JOIN user_part_key_columns upkc ON upkc.name = upi.index_name AND upkc.object_Type = 'INDEX' and upkc.column_position = 1
      INNER JOIN user_tab_columns utc ON utc.table_name = upi.table_name AND utc.column_name = upkc.column_name
      INNER JOIN user_ind_partitions uip ON uip.index_name = upi.index_name 
        AND (uip.partition_name like 'SYS_P%' OR SUBSTR(uip.partition_name,1+LENGTH(r.recname),1) != SUBSTR(upi.index_name,3,1))
    WHERE r.recname = 'JRNL_LN' AND r.rectype = 0
    AND (utc.data_type = 'DATE' OR utc.data_type like 'TIMESTAMP%')
  ) LOOP
    l_sql := 'SELECT '||i.high_value||'-'||i.interval_size||' FROM DUAL';
    EXECUTE IMMEDIATE l_sql INTO l_high_value;
    l_new_partition_name := i.recname||SUBSTR(i.index_name,3,1)||TO_CHAR(l_high_value,'YYYYMM');
    l_sql := 'ALTER INDEX '||i.index_name||' RENAME PARTITION '||i.partition_name||' TO '||l_new_partition_name;
    IF i.partition_name != l_new_partition_name THEN
      dbms_output.put_line(l_sql);
      EXECUTE IMMEDIATE l_sql;
    END IF;
  END LOOP;
  psft_ddl_lock.set_ddl_permitted(FALSE);
END;
/
spool off

Tuesday, June 24, 2025

Optimising Journal Line Queries: 2. Exadata System Statistics

This is the second of a series of five articles that examine the challenges posed by typical queries on the journal line table (PS_JRNL_LN).

  1. Problem Statement
  2. Exadata System Statistics
  3. Partitioning
  4. Compression
  5. Conclusion

Exadata System Statistics 

Many other people have written about how Oracle's optimizer costs a full table scan.

Roughly speaking, the cost calculated by the optimizer that we see in an execution plan is an estimate of the time taken to perform an operation, where the unit of time is the duration of a single block read.  That statement is an oversimplification: there are various guesses and assumptions built into the optimizer's calculation.  The cost-based optimizer looks for the cheapest plan, which ought to be the fastest to execute.  However, in many cases, cost does not correspond to execution time.

Full Scan Cost

The cost of a full table scan is made up of an I/O cost (the time taken to read the blocks from disk) and a CPU cost (the time taken to process the rows).  The I/O cost is the number of multi-block read operations, multiplied by the ratio of the duration of a multi-block read to a single-block read.

  • IO Cost = (HWM / MBRC) * (MREADTIM / SREADTIM)

Where

  • HWM = the high water mark of the segment, expressed as a number of blocks
  • MBRC = average multi-block read count for sequential read, in blocks (see parameter DB_FILE_MULTIBLOCK_READ_COUNT)
  • MREADTIM = average time to perform a multi-block (sequential) read, in milliseconds
  • SREADTIM = average time to perform a single-block (random) read, in milliseconds

See PL/SQL Packages and Types Reference: DBMS_STATS

The single and multi-block read times are derived from two further system statistics, IOSEEKTIM and IOTFRSPEED, together with the block size and the multi-block read count.

  • SREADTIM = IOSEEKTIM + DB_BLOCK_SIZE / IOTFRSPEED
  • MREADTIM = IOSEEKTIM + (DB_BLOCK_SIZE * MBRC) / IOTFRSPEED

Where

  • IOSEEKTIM = Seek time + latency time + operating system overhead time, in milliseconds (default 10ms).
  • IOTFRSPEED = I/O transfer speed in bytes per millisecond (or if you prefer KBytes/second)
  • DB_BLOCK_SIZE = block size of the segment (usually 8Kb)

System statistics can be gathered from actual system behaviour, or set to pre-defined values, using DBMS_STATS.GATHER_SYSTEM_STATS.  Over the years, many blogs, forums and presentations have discussed the merits or otherwise of collecting or setting system statistics.  
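The current values can be queried from SYS.AUX_STATS$, where DBMS_STATS stores them:

SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';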

Oracle's position is set out in the Oracle Optimizer Blog: Should You Gather System Statistics?  It can be summarised as: 

  • Do not gather your own system statistics.
  • Use the Oracle-provided defaults.
  • Except on Exadata, where you can consider using the Exadata defaults, and perhaps not even then on a mixed workload. You will have to test this for yourself.

On any Oracle system, the default system statistics can be reset with 

exec DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD');

This sets the system statistics as follows:

  • MBRC=8
  • IOSEEKTIM=10
  • IOTFRSPEED=4096

Thus:

SREADTIM = IOSEEKTIM + DB_BLOCK_SIZE / IOTFRSPEED
         = 10 + 8192 / 4096
         = 12 (ms)

MREADTIM = IOSEEKTIM + (DB_BLOCK_SIZE * MBRC) / IOTFRSPEED
         = 10 + (8192 * 8 ) / 4096
         = 10 + 16 
         = 26 (ms)

However, on Exadata, you can gather system statistics that 'take into account the unique capabilities of Oracle Exadata, such as large I/O size and high I/O throughput':

exec DBMS_STATS.GATHER_SYSTEM_STATS('EXADATA');

Some system statistics are then set differently:

  • MBRC=128
  • IOSEEKTIM=10
  • IOTFRSPEED=204800

Thus:

SREADTIM = IOSEEKTIM + DB_BLOCK_SIZE / IOTFRSPEED
         = 10 + 8192 / 204800
         = 10.04 (ms)

MREADTIM = IOSEEKTIM + (DB_BLOCK_SIZE * MBRC) / IOTFRSPEED
         = 10 + (8192 * 128) / 204800
         = 10 + 5.12
         = 15.12 (ms)

Now, I can put these numbers back into the formula Oracle uses to calculate the I/O cost of a full scan.

  • IO Cost = (HWM / MBRC) * (MREADTIM / SREADTIM)

Let us suppose that we are going to read 100M blocks.  The I/O cost of that scan will be very different with Exadata system statistics rather than the normal default system statistics.

Normal IO Cost  = (100000000 / 8) * (26 / 12)
                = 27,083,333.3

Exadata IO Cost = (100000000 / 128) * (15.12 / 10.04)
                = 1,176,543.8

Thus, introducing Exadata system statistics significantly reduces the cost of the full scan, making the database more likely to use a full scan than index lookups.  That may or may not be desirable.

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                |   428K|    93M|       |  2834K  (8)| 00:01:51 |       |       |
|   1 |  HASH GROUP BY                 |                |   428K|    93M|   111M|  2834K  (8)| 00:01:51 |       |       |
|*  2 |   HASH JOIN                    |                |   428K|    93M|    46M|  2819K  (8)| 00:01:51 |       |       |
|   3 |    JOIN FILTER CREATE          | :BF0001        |   428K|    41M|       |  1476   (7)| 00:00:01 |       |       |
|   4 |     PART JOIN FILTER CREATE    | :BF0000        |   428K|    41M|       |  1476   (7)| 00:00:01 |       |       |
|*  5 |      TABLE ACCESS STORAGE FULL | PS_JRNL_HEADER |   428K|    41M|       |  1476   (7)| 00:00:01 |       |       |
|   6 |    JOIN FILTER USE             | :BF0001        |  1120K|   136M|       |  2817K  (8)| 00:01:51 |       |       |
|   7 |     PARTITION RANGE JOIN-FILTER|                |  1120K|   136M|       |  2817K  (8)| 00:01:51 |:BF0000|:BF0000|
|*  8 |      TABLE ACCESS STORAGE FULL | PS_JRNL_LN     |  1120K|   136M|       |  2817K  (8)| 00:01:51 |:BF0000|:BF0000|
-------------------------------------------------------------------------------------------------------------------------

If I look at the same example query that I used earlier, then with the Exadata default system statistics, the cost has come down significantly (from 66M to 2817K).  However, it is still greater than the cost of the nested loop (1730K).  Therefore, for this query, I still only get this execution plan if I hint the statement to force it.  I still need to make the full scan cheaper.

Different queries will have different costs and will flip between the nested loop and full scan/Bloom filter/hash join at different points.

Non-Exadata System Statistics 

If you are not running on Exadata, then the advice from Oracle is clear and unambiguous: use the default system statistics that can be reset with 
exec DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD');
You will have to consider other techniques to reduce the cost of the full table scan.