pink cylinders

More than Four

...there's an axis for that.

Selecting a timestamp AT TIME ZONE - a gotcha

So there's this Oracle database instance in the midwest, in US/Central time zone.

So if I use PuTTY to open a terminal window on the host and open a SQL*Plus session, if I execute this statement I'll get the following results:

SQL> select
max( CHANGE_TS ) as MAX_CHANGE,
max( CHANGE_TS at time zone 'US/Central' ) as MAX_CST_CHANGE
from tRetired_Mst;

MAX_CHANGE MAX_CST_CHANGE
-------------------------------- ---------------------------------------------
05-JAN-17 02.33.17.613000 PM 05-JAN-17 02.33.17.613000 PM US/CENTRAL

As expected, they are the same, because the local time zone of the session (terminal window) is the same as the DB,  US/Central.

However:

If I run PL/SQL Developer on my Windows Laptop here in California, and connect to the same Oracle instance, then the results are different:

SQL> select
max( CHANGE_TS ) as MAX_CHANGE,
max( CHANGE_TS at time zone 'US/Central' ) as MAX_CST_CHANGE
from tRetired_Mst;

MAX_CHANGE MAX_CST_CHANGE
----------------------------- ----------------------------------------------
2017-01-05 14:33:17 05-JAN-17 04.33.17.613000 PM US/CENTRAL

Observe thatthe MAX_CST_CHANGE value is off by two hours! Because the timestamp value was evaluated as if it were in the local session time of my laptop, which is US/Pacific.

For code executing in PL/SQL packages in the database, this is probably not an issue... but if you are using this technique to convert values in ad-hoc SQL, you should be aware of this trap.

Using recursive Common Table Expressions to generate lists

Say we want create a generated list of values - sequential dates, for example - as a set of rows. We can do this using recursive common table expressions.

Here's an example of a common table expression to generate a set of dates:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select date '2012-01-02' as DATE_VALUE from dual
union all
select date '2012-01-03' as DATE_VALUE from dual
)
select * from dates;

Using this technique, we could create a set of rows with sequential date values... but it is not very flexible. Ideally, we want a technique that expresses the intent algorithmically.

In 11g R2 and later, we can use a reference to the outer table recursively from inside the definition! This sounds crazy but it works:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

In this example, we are UNIONING the first row with a sub-select against itself, adding a new row that increments the date value by 1.

The first time I tried this technique, I got the following error:

ORA-32039: recursive WITH clause must have column alias list

Some research on the web gave me a page that explained what the problem was. The error message is actually very clear: we need to add a list of columns at the top of the WITH <alias> clause:

with dates ( DATE_VALUE ) as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

And now, it just works:
 
SQL>
DATE_VALUE
-----------
2012-01-01
2012-01-02
2012-01-03
2012-01-04
2012-01-05
2012-01-06
2012-01-07
2012-01-08
2012-01-09
2012-01-10
2012-01-11
:

The nice thing about this technique is that we can control the end date, and the increment size, by variables:

define _increment = '5';
define _start = date '2012-01-01';
define _end = date '2012-02-01';

with dates ( DATE_VALUE ) as
(
select &_start as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '&_increment' day as DATE_VALUE
from dates d
where d.DATE_VALUE < &_end
)
select * from dates;

This gives us a list of dates 5 days apart.

 

Using this technique in a PL/SQL block

If you're in a PL/SQL block, trying to generate a list dynamically based on parameter values, the code will need to look slightly different:

declare
p_start_time date;
p_interval_days integer;
p_end_date date;
--
l_interval interval day to second;
begin
l_interval := numtodsinterval( p_interval_days, 'day' );
insert into My_List_Of_Dates
with
dates ( DATE_VALUE ) as
(
select
p_start_time as DATE_VALUE
from dual
union all
select
DATE_VALUE + l_interval as DATE_VALUE
from dates d
where d.END_TIME < p_end_date
)
select DATE_VALUE from dates;
end;

 

 


 

Dates and Time Zone conversion

It's time to take another look at some date/time conversion techniques...

Consider reading a DATE value from some source system, in a known time zone, and needing to convert it into some other time zone before saving into the destination table:

with
src_date as (
select
-- ---------------------------------------------------------------------
-- Here's a date value from the source system. From the value itself
-- we can't tell anything significant about it:
-- ---------------------------------------------------------------------
to_date('2016-10-03 23:20','yyyy-mm-dd hh24:mi') as INPUT_DT
from dual
),
assume_data as (
select
INPUT_DT,
-- --------------------------------------------------------------------------
-- If we know the value is a US/Pacific time zone value, we can declare that:
-- --------------------------------------------------------------------------
from_tz( cast( INPUT_DT as TIMESTAMP) , 'US/Pacific' ) as PACIFIC_TSTZ,
-- --------------------------------------------------------------------------
-- Alternatively, what if we know it is in China Standard Time ?
-- --------------------------------------------------------------------------
from_tz( cast( INPUT_DT as TIMESTAMP) , '+8:00' ) as CHINA_TSTZ
from src_date
)
select
-- ----------------------------------------------------------------
-- Now we have values that declare what time zone they belong to,
-- we can easily transform the value into datetimes for a different time zone:
-- ----------------------------------------------------------------
INPUT_DT,
PACIFIC_TSTZ,
CHINA_TSTZ,
-- ----------------------------------------------------------------
-- convert the China Standard Time value into US/Central:
-- ----------------------------------------------------------------
CHINA_TSTZ at time zone 'US/Central' as CENTRAL_TSTZ ,
-- ----------------------------------------------------------------
-- convert the US/Pacific time into UTC:
-- ----------------------------------------------------------------
PACIFIC_TSTZ at time zone 'UTC' as UTC_TSTZ,
PACIFIC_TSTZ at time zone '+0:00' as UTC_TSTZ_ALT
from assume_data;

Results:

image

For example, when loading DATE values from C2R (Pacific time) and storing them in EDW (UTC) we could use the following expression:

cast( from_tz(cast( INPUT_DT as TIMESTAMP), 'US/Pacific') at time zone 'UTC' as DATE )

That is, if we really needed a DATE data type. I believe Oracle DB would handle the outer cast automatically if we attempted to save the TIMESTAMP+TZ value into a destination table's DATE column

For more fun:

An exercise in Interval Partitioning

Consider a table, LOG_DATA, with lots of rows:

select trunc( LOG_TIME, 'MM' ) as MONTH, count(1) as Count
from Log_Data
group by trunc(LOG_TIME,'MM')
order by trunc(LOG_TIME,'MM');

image

Perhaps this would be better served by a table with RANGE partitioning, using the INTERVAL syntax of 11g to automatically create partitions as data accumulates?

Unfortunately, LOG_TIME is data type timestamp with time zone, and we can't create an interval partition on timestamp columns (yet!)

Also, we can't use an expression in the interval definition. So, we need a DATE column:

create table Log_Data_P
(
ID integer,
PARENT_ID integer,
LOG_TIME timestamp with time zone default systimestamp,
STACK_LEVEL integer,
PACKAGE_NAME varchar2(30),
PROCEDURE_NAME varchar2(30),
LOG_LABEL varchar2(100),
LOG_VALUE varchar2(1000),
LOG_DURATION number,
PROC_DURATION number,
STEP_DURATION number,
LOG_CLOSED integer,LOG_DATE date default sysdate
)
partition by range ( LOG_DATE )
interval ( numtoyminterval( 1, 'MONTH' ) )
(
partition part_01 values less than ( date '2016-05-01' )
);

Here, we have added a LOG_DATE column to the end of the table structure. I've given it a default value and checked the code that writes to the table, to ensure there are no conflicts (there aren't).

Now let's insert into the new table, with the rows from the original (spanning several months):

insert into Log_Data_P
select
ID, PARENT_ID, LOG_TIME, STACK_LEVEL, PACKAGE_NAME, PROCEDURE_NAME, LOG_LABEL, LOG_VALUE,
LOG_DURATION, PROC_DURATION, STEP_DURATION, LOG_CLOSED, cast( LOG_TIME as DATE )
from Log_Data;
commit;

No errors... let's see what partitions we have, at the end of this process:

select PARTITION_NAME, INTERVAL, HIGH_VALUE
from DBA_Tab_Partitions
where TABLE_NAME = 'LOG_DATA_P'
order by PARTITION_POSITION;

PARTITION_NAME INTERVAL HIGH_VALUE
------------------ ---------- -----------------------------------------------------------
PART_01 NO TO_DATE(' 2016-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15277 YES TO_DATE(' 2016-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15278 YES TO_DATE(' 2016-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15280 YES TO_DATE(' 2016-08-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15281 YES TO_DATE(' 2016-09-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15279 YES TO_DATE(' 2016-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )
SYS_P15282 YES TO_DATE(' 2016-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS' )

7 rows selected.

Interesting... we have 6 new partitions over and above the original "PART_01" partition that we created. Obviously, Oracle has allocated some automatically-generated names to the partitions.

OK, so now the time has come to perform some pruning on our log table. Let's delete the oldest partition:

SQL> alter table Log_Data_P drop partition PART_01;

ORA-14758: Last partition in the range section cannot be dropped

Uh-oh. It's not as simple as we'd hoped. More about dropping partitions in this excellent post here:

Follow up #1 - OK, so try something sneaky, and drop the second-to-last partition

I had an idea. What if we define the table with an artificially small initial partition, with a start date that is earlier than any data in the table?

If we take a look at our Log_Data table today:

SQL> select count(1), min(LOG_TIME) from Log_Data;

COUNT(1) MIN(LOG_TIME)
---------- -----------------------------------
1172660 12-OCT-16 10.57.17.516528 PM -05:00

1.1 million rows, since October 12.

Now let's define the Log_Data_P table again, with a start date of 2016-05-01 as described above, and insert the rows from our Log_Data table.

What do the partitions look like now?

select PARTITION_NAME, INTERVAL, HIGH_VALUE
from DBA_Tab_Partitions
where TABLE_NAME = 'LOG_DATA_P'
order by PARTITION_POSITION;

PARTITION_NAME INTERVAL HIGH_VALUE
---------------- -------- -------------------------------------
PART_01 NO TO_DATE(' 2016-05-01 00:00:00', ...
SYS_P17402 YES TO_DATE(' 2016-11-01 00:00:00', ...
SYS_P17403 YES TO_DATE(' 2016-12-01 00:00:00', ...
SYS_P17404 YES TO_DATE(' 2017-01-01 00:00:00', ...

"PART_01" won't have any rows in it, but it defines the interval for future partitions.  Now imagine some time goes by, and we don't want the oldest month of data... so, let's drop October, 'SYS_P17402':

SQL> alter table Log_Data_P drop partition SYS_P17402;

Table altered

select PARTITION_NAME, INTERVAL, HIGH_VALUE
from DBA_Tab_Partitions
where TABLE_NAME = 'LOG_DATA_P'
order by PARTITION_POSITION;

PARTITION_NAME INTERVAL HIGH_VALUE
---------------- -------- ------------------------------------
PART_01 NO TO_DATE(' 2016-05-01 00:00:00', ...
SYS_P17403 YES TO_DATE(' 2016-12-01 00:00:00', ...
SYS_P17404 YES TO_DATE(' 2017-01-01 00:00:00', ...

What's the earliest row in the table?

SQL> select count(1), min(LOG_TIME) from Log_Data_P;

COUNT(1) MIN(LOG_TIME)
---------- -----------------------------------
859906 01-NOV-16 12.00.02.397712 AM -05:00

Success!

Now we just need a way to calculate the name of the partition based on, say, SYSDATE-180 (or whatever our aging specification is) and we're golden.

Converting a TIMESTAMP WITH TZ in UTC to Central

At ABACAB we have - very correctly - a TIMESTAMP WITH TIME ZONE column called ONLINE_ENROLLMENT_TIME that has been populated - via an ETL process - with values in UTC.

This is good, because it is a completely unambiguous value. Oh, if only all our date times were stored this way.

Unfortunately, the convention up until now has been:

  • assume date times are in Central time zone
  • convert to integer "date key" and "time key" data types (primary key for date and time dimensions).

This loses any time zone information, which is not great. It means that all client processes using this data also need to assume Central time zone.

The following SQL takes our time zone-aware column and transforms it:

select
ID,
ONLINE_ENROLLMENT_TIME,
(ONLINE_ENROLLMENT_TIME at time zone 'US/Central') as ONLINE_ENROLLMENT_TIME_C,
trunc( cast( (ONLINE_ENROLLMENT_TIME at time zone 'US/Central') as DATE )) as ONLINE_ENROLLMENT_DATE
from ZYXX.D_Person
where ONLINE_ENROLLMENT_TIME
is not null;

Results:

image

More pitfalls with NVARCHAR2 vs VARCHAR2

Consider the following POSTAL_CODE column:

create table Test1 (
POSTAL_CODE NVARCHAR2(10)
);

insert into Test1 values ('12345-1234');
commit;

select * from Test1;

POSTAL_CODE
--------------------
12345-123

That's right - the NVARCHAR2(10) column appears to lop of 10th character and doesn't store it.

This behavior must be related to multi-byte character, and Unicode, somehow. The Oracle DB docs on Globalization Support can get quite detailed, but I still haven't found a concise explanation of this silent truncation behavior.

So, just make the column larger, right?

We tried that, altering the column from 10 to 15 characters. Our column can now correctly store the full 10 characters of a postal code string.

But then, another problem showed up when we needed to match on the first 5 characters in the ZIP code.

Consider:

create table Test2 ( V1 varchar2(15), N1 nvarchar2(15) ) ;
insert into Test2 values ( '1234567', '1234567' );
insert into Test2 values ( '123456', '123456' );
insert into Test2 values ( '12345', '12345' );
insert into Test2 values ( '1234', '1234' );
insert into Test2 values ( '123', '123' );
commit;

SQL> select V1, substr( V1, 1,5 ), N1, substr( N1, 1, 5 ) from Test2;

V1 SUBSTR(V1,1,5) N1 SUBSTR(N1,1,5)
--------------- -------------- ------------------------------ --------------
1234567 12345 1234567 1234
123456 12345 123456 1234
12345 12345 12345 1234
1234 1234 1234 1234
123 123 123 123

Yup, that's right. SUBSTR( NVATCHAR2 ) does not return the full 5 characters!

As a workaround, we can use SUBSTRC():

SQL> select V1, substrc( V1, 1,5 ), N1, substrc( N1, 1, 5 ) from Test2;

V1 SUBSTRC(V1,1,5) N1 SUBSTRC(N1,1,5)
--------------- --------------- ------------------------------ ------------------------------
1234567 12345 1234567 12345
123456 12345 123456 12345
12345 12345 12345 12345
1234 1234 1234 1234
123 123 123 123

Now the values retrieved from the NVARCHAR2 column are working as expected.

Bottom Line:

  • Don't use NVARCHAR2 for size-limited columns, unless you absolutely have to
  • If you have to use NVARCHAR2, us SUBSTRC() for partials.

Enterprise Architect: Importing or Refreshing DB schemas

You will need:

  • a DB instance to read the schema from
  • an Oracle ODBC driver that EA can use
  • an ODBC configuration pointing to the DB source schema
  • user credentials for the DB source schema
  • an EA project to import into

Oracle DB instance

We could point at the Production system, but this is not a great idea, especially when you're testing. I use a local virtual machine (running Oracle Linux 6.6) with an instance of Oracle Database 12c. This is where I do all my test builds, pulled from the latest branch in source control.

image

Oracle ODBC Driver

I'm going to assume you've got this covered... otherwise, this is a good starting reference:

(We could try using the default Microsoft ODBC Driver for Oracle, but I've never got it to work.)

ODBC and Enterprise Architect

My local operating system is 64-bit Windows. However, my installation of Enterprise Architect is 32-bit. This means that it will use the 32-bot ODBC components, and I need to use the 32-bit ODBC Data Source Administrator to configure a data source (DSN).

And if Life weren't complicated enough:

But, the bottom line is:

On a 64bit machine when you run “ODBC Data Source Administrator” and created an ODBC DSN, actually you are creating an ODBC DSN which can be reachable by 64 bit applications only.

But what if you need to run your 32bit application on a 64 bit machine ? The answer is simple, you’ll need to run the 32bit version of “odbcad32.exe” by running “c:\Windows\SysWOW64\odbcad32.exe” from Start/Run menu and create your ODBC DSN with this tool.

Got that?

  • 64-bit ODBC Administrator:    c:\windows\System32\odbcad32.exe
  • 32-bit ODBC Administrator:    c:\windows\SysWOW64\odbcad32.exe

It's mind-boggling. The UI looks identical too.

Configuring a Data Source (DSN)

  • Run the 32-bit ODBC Administrator
  • Select User DSN
  • Click on "Add..."
  • Select the Oracle driver:

image

  • Click "Finish"

We're not finished. We need to enter some details:

image

The only critical parameter here is the TNS Service Name which needs to match whatever you've set up in the TNSNAMES.ORA config file, for your target DB instance.

Here, I've used a user name of SYSTEM because this is my test Oracle instance. Also, it will allow me to read from any schema hosted by the DB, which means I can use the DSN for any test schema I build on the instance.

Now that the DSN is created, we can move on to working in Enterprise Architect.

Case Study: Importing into a clean project

Note: I'm using images from a Company-Internal How-To guide that I authored. I feel the need to mask out some of the details. Alas I am not in a position to re-create the images from scratch. I debated omitting the images entirely but that might get confusing.

In Enterprise Architect, we have a clean, empty project, and we didn't use Wizards to create template objects. It is really just a simple folder hierarchy:

image

Right-click on the folder and select from the cascading drop-down menu of options:

  • Code Engineering  >  Import DB schema from ODBC

This will bring up the Import DB schema from ODBC source dialog.

Click on the chooser button on the right side of the "Database" field to bring up the ODBC Data Source chooser:

image

The DSN we created should be available under the Machine Data Source tab. Select it, and click OK. We should be prompted to enter a password for our pre-entered User Name, after which Enterprise Architect will show us all the schemas to which we have access on the DB:

image

Previously, on this instance, I ran the database build scripts and created a set of test schemas using the T_ name prefix.

We're going to import the contents of the T_IHD schema into our project, so we check the "T_IHD" schema name.

Our intention is to import (create) elements under the package folder, for each table in the T_IHD schema.

Review the filter options carefully!

The default settings are probably correct, if we are only interested in creating elements for each table. Note the Synchronization options:

image

The package folder is empty, so you might think that we need to change this to "Import as New objects". Don't worry. New objects will be created if they don't already exist.

Now click on "Import" button at the top right of the dialog. After a sort wait, during which EA is retrieving metadata from the database, the contents of the T_IHD schema will be presented to us in the "Select Database Objects to Import" dialog:

image

We are only interested in the tables, so check the [x] Tables checkbox and all the contained tables will be selected.

Are there objects we don't need to import?

Often there are objects we're not interested in including in the Data Model. For example, the TEMP_EVENT table might only be used as an interim location for data, during some business process, and not worth complicating the model with.

We can clear the checkbox next to the TEMP_EVENT table name to skip it.

Now, having validated our selection, we can press the OK button to start the import.

image

The process may take some time... when it completes, we can close the Import dialog, or select another schema and destination folder and import a different schema.

The table elements should now be visible in the Project hierarchy:

image

Case Study: Refreshing a schema

We need to refresh the RIX schema tables in the model. This schema is already in the model, with lots of additional element and attribute-specific notes that we need to retain.

Two important things to note:

  • The schema contents are organized differently (temp tables were imported into the model but then moved into a sub-folder);
  • There are some minor structure changes to columns, in the source database

image

We need to refresh the schema in the model, with the new structure from the DB. The process to follow is almost exactly the same as the clean import described earlier:

Select the package folder

  • Right Click > Code Engineering > Import the DB schema
  • select the T_RIX schema
  • Make sure the Synchronization options are "(o) Synchronize existing classes" and "[  ] Overwrite object comments". We do not want to remove any existing notes entered against elements and attributes in the model.

Now, in this example we are only interested in refreshing those tables that are already in the model, in the current folder. That represents a subset of the total set of tables that are in the current DB schema:

image

In the image above, LOG_CTL is a table we don't care about (not in the model); and the T_* tables are temporary tables that, in the model, are in a different folder. We'll do those next (see below).

After the import process is completed, we should be able to drill down and see the schema changes reflected in the updated model.

We can repeat the process for the Temp Tables sub-package, this time selecting only the T_* temporary tables in the import selection.

Summary:

  • Table objects in the model might be moved around into different folders, per type, for documentation purposes.
  • The schema import process wants to import all selected tables into the current folder
  • Therefore, if you have separated out tables in a single schema, you should refresh them on a folder-by-folder basis, selecting only the objects you want to be in each folder in the model.

With careful set-up, it is possible to import and refresh table structures from database schemas into a project in Enterprise Architect, without over-writing existing documentation and attribute notes.

Crashing a server process with Error 14508

We have a SQL query that accepts  WHERE clause parameters of date range. 1 year, 2 years, all work predictably. When I use a very large date range (i.e. "everything"), the Oracle dedicated server process crashes.

From the trace file:

*** 2016-05-20 22:06:12.735
*** SESSION ID:(630.42649) 2016-05-20 22:06:12.735
*** CLIENT ID:() 2016-05-20 22:06:12.735
*** SERVICE NAME:(SYS$USERS) 2016-05-20 22:06:12.735
*** MODULE NAME:(rix) 2016-05-20 22:06:12.735
*** CLIENT DRIVER:() 2016-05-20 22:06:12.735
*** ACTION NAME:(prepare_series(a)) 2016-05-20 22:06:12.735

Block Checking: DBA = 8236992, Block Type = Unlimited undo segment header
ERROR: Undo Segment Header Corrupted. Error Code = 14508
ktu4shck: starting extent(0xffff8000) of txn slot #0x6b is invalid.

Searching Google brought up a Russian blog page, that seems to be citing exactly the same issue:

*** MODULE NAME:(e:AR:bes:oracle.apps.xla.accounting.extract) 2015-10-01 06:37:00.659 -- станд.модуль OEBS
...

Block Checking: DBA = 14597504, Block Type = Unlimited undo segment header -- проверка UNDO
ERROR: Undo Segment Header Corrupted. Error Code = 14508 -- падает со специфичной для ANALYZE ошибкой (***)
ktu4shck: starting extent(0xffff8000) of txn slot #0x21 is invalid.
valid value (0 - 0x8000)

The gist of the post seems to indicate that they are using Oracle 12.1.0.2 and had set TEMP_UNDO_ENABLED=TRUE (a new feature in 12c).

This is interesting because we set this parameter true also, in our code.

Google Translate failed to come through with a useful translation, but it could be that they set this parameter FALSE in order to resolve the problem.

Previously, our test failed after ~2 hours. So far, having made the change to the parameter, it is still running after 4 hours. So, we're optimistic.