ORA-07445 error in a merge query results in terminated connection and server-side core dump

ORA-07445 Error in a MERGE query

ORA-03113: end-of-file on communication channel. This was the error that greeted me after many hours work converting some PL/SQL scripts to run as a package of functions, views and stored procedures.

My first thought was actually that something or someone had terminated my connection to the database. Given that I was working on a DEV/TEST instance, I wouldn’t necessarily have expected to be notified about a momentary disruption to service. After reconnecting and repeating the same and other queries, it became clear the only thing causing this was one of the procedures in a package I had written.

I tried a few changes without any success and then decided to call it a day, with the intention of speaking to the one of the DBAs the following day to see if they could provide any further information.

As it turned out, I didn’t have get in touch with the DBAs… The next morning they were already looking for me!

It turns out that the error which had been unexpectedly terminating my connection had been causing an access violation followed by a core dump on the server. Of course this is the kind of thing that raises all kind of alerts even on a development/test instance.

The DBA who got in touch provided me with a section of the server alert log, which contained the following error:

When running from PL/SQL developer, the error reported on the client side was just ORA-03113: end-of-file on communication channel (together with the process and session IDs).
Running the same from other clients gave the even less helpful message: “Error report: No more data to read from socket”.
Even with the ORA-07445 exception alert message from the server logs, the cause was by no means obvious.
My web searches didn’t turn up any useful suggestions either. It was beginning to look like I would have to get the DBA to raise a support incident with Oracle and wait to hear back.

At the customer’s site, I had been working on an Oracle 11g Standard Edition (11.2.0.4.0, Windows, 64bit).
Back home in the evening, I decide to write a test based on the same query, but simplified as much as possible while still retaining the essential aspects of what the original had been doing.
I ran it on a personal instance of Oracle Database 12c (12.1.0.1.0, Windows, 64bit).

The response was exactly the same:

With the freedom to experiment more freely on my own instance, I could try as many variations as I needed without risk of incurring the wrath of the DBAs over the CPU and disk load that occurs during a core dump.

I managed to reduce the code to a simple example which still produced the same error. With a further bit of trial and error, I was able discover the cause (and thus the workaround).

Before showing you what I had done wrong, let’s take a look at the fixed version which demonstrates a very simple use of the MERGE statement.

First, a table and initial test data:

In case you’re not familiar with the MERGE statement, the TL;DR on it is that it allows you to perform UPDATE, INSERT, and DELETE actions on the same table in single SQL statement.

For this example, I chose only to insert and update.
Using a list of Ids, the example will either insert a new row if the value does not exist in the column ut_id, and if it does exist then it will update the value of some_text.

The input for the merge will be:

That simple sequence can be produced with the following query:

Our desired output for this test will be:

All of which can be done in a single SQL statement:

So, now that we have a simple working example, lets look at what I did differently and what caused it to fail.

The first difference is that my query was coded in a package, using a refcursor passed in as a parameter, and then performing the merge in in a loop using a bulk collect.
Here is an example using a local cursor which, though simplified from the original code, is sufficient to illustrate the issue:

This code works, and performs the merge with the same results as the first example.

Now, to the problem:

When writing the code that failed, I had been converting it from legacy code that performed the complete set of inserts and then the updates in separate loops and separate transactions.

The insert statement had also included query hints for a Direct-Path insert, which I retained on the MERGE statement:
(the highlighted line)

This was what caused the exception.

From the client side, the only error information you receive is:

It is only the alert logs and incident report that show the ORA-07445 error code and associated details.

What happens if we try the same hint on our original single-statement merge query?

It works, without an error:

If, prior to committing those changes we attempt to run the same statement again, or even just select from the table, then we hit the following error:

Beware of query hints!
(…and the Oracle bugs you haven’t found yet)

Leave a Reply