Monday, July 6, 2015

Beware of empty partitions!

Hey crew,

Oracle partitioning is a great feature to use for your data warehouse - particularly for your fact table.  A common, best practice approach is to RANGE partition your fact table by a DATE column - choosing a range with a Year, Quarter, Month, Week, or Day interval.
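For example - a weekly RANGE-partitioned fact table might look something like this (the column list here is made up for illustration - only the partitioning pattern matters):

```sql
-- Hypothetical weekly RANGE partitioning by a DATE column
CREATE TABLE wh.point_of_sale_facts
( date_id      DATE           NOT NULL
, store_id     NUMBER         NOT NULL
, product_id   NUMBER         NOT NULL
, sales_amount NUMBER (12, 2)
)
PARTITION BY RANGE (date_id)
( PARTITION p_2015_w01 VALUES LESS THAN (TO_DATE ('2015-01-05', 'YYYY-MM-DD'))
, PARTITION p_2015_w02 VALUES LESS THAN (TO_DATE ('2015-01-12', 'YYYY-MM-DD'))
, PARTITION p_2015_w03 VALUES LESS THAN (TO_DATE ('2015-01-19', 'YYYY-MM-DD'))
);
```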

One thing that we learned the "hard way" was that having a large number of empty partitions in our fact table(s) caused very poor performance in our Exadata queries - the very same queries ran MUCH better once there were no empty partitions.

As mentioned in my previous post - we typically copy only a few weeks of transaction data down to our development environments because we do not have nearly as much space or compute resources there to house/process it.

One mistake we made when exporting our fact table (while filtering down to subsets of dates) - using Oracle's Data Pump Export tool (expdp) - is that we exported the entire table DDL - meaning all partitions got created when we imported into development.
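For reference - a filtered export might use a parameter file along these lines (object names, dates, and file names here are illustrative). Note that a table-mode export like this still carries the complete partitioning DDL along with it:

```
# Illustrative expdp parameter file - exports only an April 2015 slice of rows,
# but the table DDL (with ALL of its partitions) comes along for the ride
TABLES=WH.POINT_OF_SALE_FACTS
QUERY=WH.POINT_OF_SALE_FACTS:"WHERE date_id >= DATE '2015-04-01' AND date_id < DATE '2015-05-01'"
DIRECTORY=DATA_PUMP_DIR
DUMPFILE=pos_facts_subset.dmp
LOGFILE=pos_facts_subset.log
```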

While all partitions were there - only a small number were actually populated.  We partition by DATE range - typically with partitions that hold one week of point-of-sale transactional data.

So - in our case - we had like 8 populated partitions, with over 100 empty ones in development (all 108 would be populated in Production).

We did all of the stuff you are supposed to do - we ensured that stats were up to date - but developers still complained that "Dev was too slow...  We must work in Production!".

I was immediately suspicious - Dev had way LESS data than Production - so even with reduced resources - it shouldn't be that bad...

While investigating SQL Monitor reports from Dev and comparing them to similar reports from Production - I noticed that in Development - the Cost Based Optimizer estimated VERY LOW row counts from the fact table - while Production got it much closer.

Our queries spanned multiple partitions - so they used global stats - which were accurate in both places.
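If you want to see the global vs. partition-level stats for yourself, a query along these lines should do it (adjust the owner/table filters for your environment):

```sql
-- Global stats appear as a row with a NULL partition_name;
-- partition-level stats appear as one row per partition
SELECT partition_name
     , num_rows
     , last_analyzed
  FROM all_tab_statistics
 WHERE owner = 'WH'
   AND table_name = 'POINT_OF_SALE_FACTS'
 ORDER BY partition_position NULLS FIRST;
```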

What I deduced from this was that Oracle was actually taking the total row count of the table (global stats) divided by the total number of partitions to "guess" the cardinality that would be returned from each partition of the fact table.

In development - that formula had a much smaller numerator (due to smaller overall row count) - but a denominator that was just as large (108) as production's.
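To make that concrete with made-up numbers - say Dev holds 10 million rows that actually live in only 8 of the 108 partitions:

```sql
-- Hypothetical numbers - the optimizer's per-partition guess vs. reality
SELECT ROUND (10000000 / 108) AS per_partition_guess    -- ~92,593 rows
     , ROUND (10000000 / 8)   AS actual_per_populated   -- 1,250,000 rows
  FROM dual;
```

With estimates off by more than a factor of ten, it's no surprise the optimizer reached for nested loops.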

When I dropped the empty partitions - something magical happened - Oracle got the cardinality estimates much closer to reality.  Performance picked up because some bad nested loop joins turned back into hash joins (as they should have been, based upon the actual cardinality), and queries performed in a manner consistent with Production.

I wanted something in my toolbox to help prevent this in the future - so I developed this script to search the partitions of the fact table for data, as well as to generate "ALTER TABLE x DROP PARTITION y;" commands to drop those empty partitions.  Please be VERY CAREFUL with this - test it in development to ensure it works properly for you...

-- Generate statements to drop empty partitions from the fact table(s)
WITH t AS (
SELECT   table_owner
       , table_name
       , partition_name
       , partition_position
       , TO_NUMBER (
            EXTRACTVALUE (
               DBMS_XMLGEN.getxmltype (
                     'SELECT COUNT (*) AS rows_exist FROM '
                  || DBMS_ASSERT.enquote_name (str => table_owner)
                  || '.'
                  || DBMS_ASSERT.enquote_name (str => table_name)
                  || ' PARTITION ('
                  || DBMS_ASSERT.enquote_name (str => partition_name)
                  || ') WHERE ROWNUM <= 1')
             , '/ROWSET/ROW/ROWS_EXIST'
            )) AS rows_exist
    FROM all_tab_partitions
   WHERE table_owner = 'WH'
     AND table_name IN ('POINT_OF_SALE_FACTS')
ORDER BY table_owner
       , table_name
       , partition_position
)
SELECT   'ALTER TABLE '
      || DBMS_ASSERT.enquote_name (str => table_owner)
      || '.'
      || DBMS_ASSERT.enquote_name (str => table_name)
      || ' DROP PARTITION '
      || DBMS_ASSERT.enquote_name (str => partition_name)
      || ';' AS stmt
     , t.*
  FROM t
 WHERE rows_exist = 0
Basically - this script performs dynamic SQL from within SQL (pretty cool) using DBMS_XMLGEN.  It lets you "probe" each partition for data - it does NOT rely on stats, which can get stale...
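To see the technique in isolation - here is a single-partition "probe" (the partition name is made up - substitute one of your own):

```sql
-- Returns 1 if the partition has at least one row, 0 if it is empty
SELECT TO_NUMBER (
          EXTRACTVALUE (
             DBMS_XMLGEN.getxmltype (
                'SELECT COUNT (*) AS rows_exist
                   FROM wh.point_of_sale_facts PARTITION ("P_2015_W01")
                  WHERE ROWNUM <= 1')
           , '/ROWSET/ROW/ROWS_EXIST')) AS rows_exist
  FROM dual;
```

The ROWNUM <= 1 predicate is what keeps the probe cheap - Oracle stops scanning the partition as soon as it finds a single row.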

I hope this is helpful for your efforts in finding/dropping empty partitions...

God bless... 

Finding Contiguous Date Ranges in your Data - using Recursive SQL

Hey team,

I found myself needing to write a query to detect the contiguous date ranges available in my warehouse's Point of Sale fact data table (in Development).

While we have data for every day in Production, in Development we use a technique that only copies a few weeks from the same month of two consecutive years - this lets us calculate "Year-over-Year" growth - while limiting the data segment size.

In such cases - finding the MIN/MAX is not acceptable, because it will look like I have data for about 13 months - while in actuality I only have data for two months (April 2014 and April 2015 for example).

I decided not to "google" my problem because I wanted a brain-teaser SQL problem (I love those) - but I think I came up with something kind of cool - a recursive SQL solution using Oracle's CONNECT BY syntax.  I haven't yet tackled it with ANSI recursive SQL - but I plan to convert this approach soon.

Anyway - here goes:
WITH dates AS (
 SELECT TRUNC (date_id, 'DD') AS date_id
   FROM (SELECT TO_DATE ('01-APR-2014', 'DD-MON-YYYY') + LEVEL - 1 AS date_id
           FROM dual
          CONNECT BY LEVEL <= 60
         UNION ALL
         SELECT TO_DATE ('01-APR-2015', 'DD-MON-YYYY') + LEVEL - 1 AS date_id
           FROM dual
          CONNECT BY LEVEL <= 90
        ) fact_table /* Here we are simulating a fact table having unbalanced date ranges available */
)
, recursive_dates AS (
 SELECT dates.date_id
        /* CONNECT_BY_ISLEAF tells me that I am at the bottom of the hierarchy in the recursive join for a branch */
      , CONNECT_BY_ISLEAF AS date_id_is_leaf_flag
        /* CONNECT_BY_ROOT always returns the top node in the recursive join - where "LEVEL" = 1 */
      , CONNECT_BY_ROOT date_id AS root_date_id
        /* The LEVEL expression tells me how many levels deep I am in the recursive join */
      , LEVEL AS date_id_level
   FROM dates
  /* Here we are finding any dates which do NOT have a preceding date (one day prior) in our "DATES" data set */
  START WITH date_id NOT IN (SELECT date_id + 1
                               FROM dates)
  /* Here we are recursively joining the data set to itself to find the next date that is one day in the future from the current row's date */
  CONNECT BY PRIOR date_id = /* NEXT */ (date_id - 1)
)
SELECT root_date_id AS min_date_id
     , date_id AS max_date_id
  FROM recursive_dates rd
 WHERE date_id_is_leaf_flag = 1
 ORDER BY min_date_id


I hope this is helpful - can you think of other ways to tackle it?
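One alternative I can offer to answer my own question: the analytic-function approach sometimes called "Tabibitosan".  Subtracting ROW_NUMBER from each date yields a value that stays constant within each contiguous run of dates, so you can GROUP BY it.  Reusing the same simulated data set from above:

```sql
WITH dates AS (
 SELECT TO_DATE ('01-APR-2014', 'DD-MON-YYYY') + LEVEL - 1 AS date_id
   FROM dual
  CONNECT BY LEVEL <= 60
 UNION ALL
 SELECT TO_DATE ('01-APR-2015', 'DD-MON-YYYY') + LEVEL - 1 AS date_id
   FROM dual
  CONNECT BY LEVEL <= 90
)
SELECT MIN (date_id) AS min_date_id
     , MAX (date_id) AS max_date_id
  FROM (SELECT date_id
             /* date minus its row number is constant within a contiguous run */
             , date_id - ROW_NUMBER () OVER (ORDER BY date_id) AS grp
          FROM dates)
 GROUP BY grp
 ORDER BY min_date_id;
```

This avoids the recursive self-join entirely - a single pass with one analytic sort.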