0

I've been working on building some indexes for a DB2 LUW database. We've implemented some new queries for a landing page and I'm trying to get performance up. I've found a few indexes on some tables that do not appear to be optimal in their ordering, ie, columns with very low selectivity are coming earlier than those with high selectivity. I'm looking to replace them with better versions, but I'm having a bit of confusion on join indexes.

For a bit of background, the queries aren't anything complicated, although they can be a bit large:

SELECT 
--About a dozen fields from TABLE A--
--A few fields from joined tables--
FROM
TABLE A
--A few inner join/left joins, mostly on A.ID1 and A.ID2, BIGINT generated keys--
WHERE
A.ONE = :x
AND A.TWO IN (:y)
AND A.THREE IN (--uncorrelated suquery--)
AND A.FOUR IS NULL
AND (A.FIVE BETWEEN :date1 AND :date2
OR
A.SIX = 'STUFF')
ORDER BY A.SEVEN

You get the idea. The cardinality on most of these columns is pretty apparent and it's easy to structure the index in terms of selectivity. Indexing on all of the fields used in the WHERE clause with the proper order has been quite successful in speeding things up. However, the join columns are a bit confusing.

A number of columns have already been indexed by themselves, including A.ID1 and A.ID2, which together form the primary key of the table. I presume that this is a clustered index. There are also some foreign key ID pairs indexed by themselves as well. What I'm wondering is if it is necessary or even useful to include these columns used in the joins within the index that covers the WHERE clause fields. I've heard it said plenty that joined columns should be indexed, WHERE clause columns should be indexed, and they are, but separately. I haven't really been able to find anything definitive (or "usually a good idea, but not always") on the subject. What is the general practice for this sort of thing? Separate indexes or put them all together if the query is important?

In addition, A.SEVEN is a column with unique values, but we're only using it in ORDER BY. Again, I haven't exactly found anything definitive, but does the fact that it's only being used in ORDER BY (well, and in the SELECT statement) affect its placement within the index regardless of cardinality (ie, it is placed at the end of the index as it will not be used for filtering, only sorting, or place it at the beginning due to uniqueness), or should it also be left in a separate index?

And as an afterthought, the column A.FOUR is only checked for null. Would this mean that the cardinality of any non-null data is irrelevant and it should be placed late in the index as we're only looking for null values? A.FOUR is likely to be mostly nulls, but will be largely unique when it isn't null.

user1017413
  • 2,023
  • 4
  • 26
  • 41
  • Just for fun, take "and a.four is null" out of your query and see if it helps. – Dan Bracuk Apr 19 '13 at 16:31
  • As a side issue, please [don't use `BETWEEN` (especially with date/time/timestamps)](https://sqlblog.org/2011/10/19/what-do-between-and-the-devil-have-in-common). Although the blog post is about SQL Server, LUW can run into roughly the same issue, as you can specify how many decimal places to store seconds to. It's really also the wrong way to think about ranges in general - for example, an account is active so long as its status isn't some form of 'cancelled' yet: always query ranges as if they can be infinitely precise. – Clockwork-Muse Apr 19 '13 at 16:32
  • Taking out "a.four is null" made the explain plan considerably more costly. – user1017413 Apr 19 '13 at 16:41

1 Answers1

0

In general, database indexes are like book indexes: when you want to find something, you start looking from the left-hand side of the search term, not in the middle. So if you have a compound Index (last name, first name), for example, you can expect the compound index to work properly on last name only, but not first name only. If you want to join first name only, you would need to index first name separately.

See Also
https://stackoverflow.com/a/2228233

Community
  • 1
  • 1
Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
  • I'm aware of this. All fields in the index will be utilized. It's more a question of whether to combine indexes based on various criteria, leave them separated, and how to determine cardinality based on how the column is filtered. – user1017413 Apr 19 '13 at 16:02
  • Then I've apparently missed the point of your question. You've got a lot in one post; perhaps you should try tackling each issue in a separate question. – Robert Harvey Apr 19 '13 at 16:03