Teradata Top 10 Tips #2

#2 – Keep It Simple

We all know that Teradata SQL is easy to write, right? Sadly, bad SQL is even easier to write. By ‘bad’ SQL we typically mean SQL that generates an inefficient query plan.

No matter how we submit an SQL query, whether we use SQL Assistant or BTEQ or some other SQL tool, we are telling the Teradata system what we want.

The Teradata optimiser has the often unenviable task of turning the submitted SQL query into an execution plan. This describes how the Teradata system will physically execute the SQL query against the database.

To see the execution plan for any query simply prefix the SQL with the ‘EXPLAIN’ modifier and re-submit to Teradata. The response is how Teradata will execute the query. But you knew that, right?

It can be quite a challenge for the optimiser to turn a piece of arbitrary SQL into an efficient execution plan. The Teradata optimiser is ‘cost based’ and aims to generate the most efficient query plan, as you might expect.

Although the Teradata optimiser is undoubtedly one of the best in the market, sometimes even Teradata generates bad query plans. The difference between a ‘good’ (efficient) plan and a ‘bad’ (inefficient) plan can be the difference between a query that takes seconds/minutes and one that takes hours…or even days! A bad plan will simply consume resources and keep running until it hits a limit of some sorts – often spool.

A lot of what the Teradata optimiser has to do is to decide the sequence of table joins in the query. This is largely a function of the number of tables referenced in the query.

Consider the following relationship between the number of tables in a query and the number of ways in which they can be joined:

Number of Tables	Possible Join Orders
4	24
5	120
6	720
7	5,040
8	40,320
9	362,880
10	3,628,800

The number of possible join orders is the factorial of the number or tables in the query. As more tables are added, even one at a time, the number of possible join orders that the optimiser has to consider soon becomes very large.

We can help the optimsier in its quest to generate efficient query plans by not submitting massive SQL queries that join lots and lots of tables together just because we can – you know who you are ;-)

With this in mind, may we suggest the following:

• DO NOT join more than 6 fact (non-reference) tables together in a single SQL statement
• DO use work tables to store intermediate results – break that massive join into multiple smaller joins with intermediate results stored in tables…it’s what the optimiser does anyway under the covers via spool tables

It is not big or clever to write monster SQL queries, just because we can, and ‘let the optimiser figure it out’. That will only work up to a point, after which the chances of an inefficient plan become too high.

As the man said:

“Simplicity is the ultimate sophistication”
Leonardo Da Vinci (1452-1519)

Who are we to argue?

The post Teradata Top 10 Tips #2 appeared first on VLDB Solutions Blog.