Ask the Expert

NULL columns in a LEFT OUTER JOIN

When I run the following query I am getting 10 rows:

SELECT A.COL1,A.COL_BEG_DT,A.COL_END_DT
FROM SCHEMA1.TABL_1 A
INNER JOIN
SCHEMA1.TABL_2 B
ON A.ID = B.ID
LEFT OUTER JOIN
SCHEMA1.TABL_C
ON A.ID = C.ID
LEFT OUTER JOIN
SCHEMA1.TABL_D
ON A.ID = D.ID
WHERE A.COL_BEG_DT < 1999
AND B.COL_CD = 'X'
AND C.ID IS NULL
AND D.ID IS NULL

If I change the NULL checking clause as below, then my result set is 33 rows:

SELECT A.COL1,A.COL_BEG_DT,A.COL_END_DT
FROM SCHEMA1.TABL_1 A
INNER JOIN
SCHEMA1.TABL_2 B
ON A.ID = B.ID
LEFT OUTER JOIN
SCHEMA1.TABL_C
ON A.ID = C.ID

AND C.ID IS NULL
LEFT OUTER JOIN
SCHEMA1.TABL_D
ON A.ID = D.ID
AND D.ID IS NULL
WHERE A.COL_BEG_DT < 1999
AND B.COL_CD = 'X'

I am not sure why the difference in result sets due to the change in position of NULL checking. Your help regarding this is much appreciated.

    Requires Free Membership to View

To understand why this is happening, you have to know how a left outer join works. In a left outer join, all the rows from the left table(s) will be returned, together with matching rows from the right table, if any.

It is the "matching" specifications that are important here.

In your first query, the matching specifications are given by the ON conditions. You are specifically looking for rows from the C and D tables with A.ID=C.ID and A.ID=D.ID. Any A and B rows which do not have matching C or D rows will be returned, but all columns in the result set which would have come from C or D will be NULL. Then, when you test for C.ID IS NULL and D.ID IS NULL in the WHERE clause, you are in effect asking for only the result set rows which had neither a matching C nor a matching D row. It appears that there are 10 A and B rows which have no matching C or D row.

In your second query, the matching specifications given by the ON conditions are slightly different. In order to satisfy the ON conditions, the C.ID and D.ID columns must be NULL. This is highly unlikely, given that they are probably keys. Furthermore, it means that in order for there to be a match, A.ID would have to be equal to NULL. And as we all know, nothing is equal to NULL, not even another NULL. So since there is no matching row in the C or D table with a NULL ID, then there is no match. This will occur for every A and B, not just for those which have no non-null match, but for all of them. This is why the second query returns more rows. But note that for all of the returned rows, the columns in the result set that would've come from C and D will be NULL, because no matching rows were found.

This was first published in November 2004

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: