{ET }
Since the terminal is used most of the time to access the database, take advantage of it and use the terminal for your HELP commands.
Tools like Queryman also have a variety of HELP commands and individual menus. Always look for ways to make the task easier.
SHOW commands
There are times when you need to recreate a table, view, or macro that you already have, or you need to create another object of the same type that is either identical or very similar to an object that is already created. When this is the case, the SHOW command is a way to accomplish what you need.
We will be discussing all of these object types and their associated Data Definition Language (DDL) commands later in this course.
The intent of the SHOW command is to output the CREATE statement that could be used to recreate the object of the type specified.
Figure 3-3
To see the CREATE TABLE command for the Employee table, we use the command:
SHOW TABLE Employee ; 13 Rows Returned
CREATE SET TABLE MJL.Employee ,NO FALLBACK , NO BEFORE JOURNAL,
NO AFTER JOURNAL (
Emp_Nbr INTEGER,
Last_Name CHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, First_Name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, Social_Sec_Nbr INTEGER,
Birth_Date DATE NOT NULL, Department_Nbr SMALLINT, Job_Nbr INTEGER,
Salary DECIMAL(10,2) NOT NULL)
UNIQUE PRIMARY INDEX ( Emp_Nbr );
To see the CREATE VIEW command, we use a command like:
SHOW VIEW TODAY ; 3 Rows Returned
CREATE VIEW MJL.TODAY AS
SELECT * FROM SYS_CALENDAR.CALENDAR WHERE CALENDAR_DATE = '2001-09-21';
To see the CREATE MACRO command for the macro called MYREPORT, we use a command like:
SHOW MACRO MYREPORT ; 9 Rows Returned
CREATE MACRO MJL01.MYREPORT (INPARM1 INTEGER, INPARM2 CHAR(10)) AS (SELECT DEPT,
DAY_OF_WEEK, AVG(SAL)
FROM SYS_CALENDAR.CALENDAR SC, MYTABLE
WHERE CALENDAR_DATE = :INPARM2 (DATE, FORMAT 'YYYYMMDD') AND DEPT = :INPARM1
GROUP BY 1,2; );
To see the CREATE TRIGGER command for AVG_SAL_T, we use:
SHOW TRIGGER AVG_SAL_T ; 20 Rows Returned
CREATE TRIGGER MJL.AVG_SAL_T
AFTER UPDATE OF (SALARY) ON MJL.EMPLOYEE REFERENCING OLD AS OLDROW
NEW AS NEWROW FOR EACH ROW
WHEN (NEWROW.SALARY >
(SELECT AVG(BUDGET) * .10 (DECIMAL(10,2)) FROM MJL01.DEPARTMENT ) )
(INSERT INTO MJL01.GREATER_10_PERCENT (EMP_NUM
,SAL_DATE ,OLDSAL ,NEWSAL
,PERC_OF_BUDGET )
VALUES (NEWROW.EMP_NBR ,CURRENT_DATE
,OLDROW.SALARY ,NEWROW.SALAR);
) ;
Since the SHOW command returns the DDL, it can be a real time saver. It is a very helpful tool when a database object needs to be recreated, a copy of an existing object is needed, or another object is needed that has similar characteristics to an existing object. Plus, what a great way to get a reminder on the syntax needed for creating a table, view, macro, or trigger.
It is a good idea to save the output of the SHOW command in case it is needed at a later date. However, if the object's structure changes, the SHOW command should be reexecuted and the new output saved. It returns the DDL that can be used to create a new table exactly the same as the current table. Normally, at a minimum, the table name is changed before executing the command
EXPLAIN
The EXPLAIN command is a powerful tool provided with the Teradata database. It is designed to provide an English explanation of what steps the AMP must complete to satisfy the SQL request. The EXPLAIN is based on the PE's execution plan.
The Parsing Engine (PE) does the optimization of the submitted SQL, the creation of the AMP steps and the dispatch to any AMP involved in accessing the data. The EXPLAIN is an SQL modifier; it modifies the way the SQL operates.
When an SQL statement is submitted using the EXPLAIN, the PE still does the same optimization step as normal. However, instead of building the AMP steps, it builds the English explanation and sends it back to the client software, not to the AMP. This gives users the ability to see resource utilization, use of indices, and row and time estimates.
Therefore, it can predict a Cartesian product join in seconds, instead of hours later when the user gets suspicious that the request should have been finished. The EXPLAIN should be run every time changes to an object's structure occur, when a request is first put into production and other key times during the life of an application. Some companies require that the EXPLAIN always be run before execution of any new queries.
The syntax for using the EXPLAIN is simple: just type the EXPLAIN keyword preceding your valid SQL statement. For example:
EXPLAIN
<SQL-command>
;
The EXPLAIN can be used to translate the actions for all valid SQL. It cannot provide a translation when syntax errors are present. The SQL must be able to execute in order to be explained.
Figure 3-4
Figure 3-4
Once you attain more experience with Teradata and SQL, these terms lead you to a more detailed understanding of the work involved in any SQL request. However, at this stage, there are two primary pieces of information on which to concentrate.
The first is the estimated number of rows that will be returned. This number is an educated guess that the PE has made based on information available at the time of the EXPLAIN.
This number may or may not be accurate. If there are current STATISTICS on the table, the numbers are more accurate. Otherwise, the PE calculates a guess by asking a random AMP for the number of rows it contains. Then, it multiples the answer by the number of AMPs to guess a "total row count." At the same time, it lets you know how accurate the number provided might be using the terms in the next chart.
Figure 3-5
The second area to check in the output of the EXPLAIN is the estimated cost, expressed in time, to complete the SQL request. Although it is expressed in time, do not confuse it with either wall-clock or CPU time. It is strictly a cost factor calculated by the optimizer for comparison purposes only. It does not take the number of users, the current workload or other system related factors into account. After looking at the potential execution plans, the plan with the lowest cost value is selected for execution. Once these two values are checked, the question that should be asked is: Are these values reasonable?
For instance, if the table contains one million rows and the estimate is one million rows in 45 seconds, that is probably reasonable if there is not a WHERE clause. However, if the table
contains a million rows and is being joined to a table with two thousand rows and the estimate is that two hundred trillion rows will be returned and it will take fifty days, this is not reasonable.
The following EXPLAIN is for a full table scan of the Student Table:
EXPLAIN
SELECT * FROM Student_table
;
12 Rows Returned Explanation _
1. First, we lock a distinct MIKEL."pseudo table" for read on a RowHash to prevent global deadlock for MIKEL.Student_table.
2. Next, we lock MIKEL.Student_table for read.
3. We do an all-AMPs RETRIEVE step from MIKEL.Student_table by way of an all-rows scan with no residual conditions into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 8 rows.
The estimated time for this step is 0.15 seconds.
4. Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.
The total estimated time is 0.15 seconds.
The EXPLAIN estimates, 8 rows and .15 seconds. Since there are 10 rows in the table, the EXPLAIN is slightly off in its estimate. However, this is reasonable based on the contents of the table and the SELECT statement submitted.
The next EXPLAIN is for a join that has an error in it, can you find it?:
EXPLAIN SELECT *
FROM Student_table S, Course_table C, Student_Course_table SC WHERE s.student_id = sc.student_id
;
Explanation _
1. First, we lock a distinct MIKEL."pseudo table" for read on a RowHash to prevent global deadlock for MIKEL.SC.
2. Next, we lock a distinct MIKEL."pseudo table" for read on a RowHash to prevent global deadlock for MIKEL.C.
3. We lock a distinct MIKEL."pseudo table" for read on a RowHash to prevent global deadlock for MIKEL.S.
4. We lock MIKEL.SC for read, we lock MIKEL.C for read, and we lock MIKEL.S for read.
5. We do an all-AMPs JOIN step from MIKEL.SC by way of a RowHash match scan with no residual conditions, which is joined to MIKEL.S. MIKEL.SC and MIKEL.S are joined using a merge join, with a join condition of
("MIKEL.S.Student_ID = MIKEL.SC.Student_ID"). The result goes into Spool 2, which is duplicated on all AMPs. The size of Spool 2 is estimated with low confidence to be 128 rows. The estimated time for this step is 0.19 seconds.
6. We do an all-AMPs JOIN step from MIKEL.C by way of an all-rows scan with no residual conditions, which is joined to Spool 2 (Last Use). MIKEL.C and Spool 2 are joined using a product join, with a join condition of ("(1=1)"). The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is
estimated with low confidence to be 512 rows. The estimated time for this step is 0.20 seconds.
7. Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.
The total estimated time is 0.39 seconds.
The EXPLAIN estimates nearly 512 rows will be returned and it will take .39 seconds.
Although the time estimate sounds acceptable, this is a very small table. Looking at the number of rows returned as 512 with only 14 rows in the largest of these tables. This is not reasonable based on the contents of the tables.
Upon further examination, the product join in step 6 is using (1=1) as the join condition where it should be a merge join. Therefore, this is a Cartesian product join. A careful
analysis of the SELECT shows a single join condition in the WHERE clause. However, this is a three-table join and should have two join conditions. The WHERE clause needs to be fixed and by using the EXPLAIN we have saved valuable time.
If you can get to the point of using the EXPLAIN in this manner, you are way ahead of the game. No one will ever have to slap your hand for writing SQL that runs for days, uses up large amounts of system resources and accomplishes absolutely nothing. You say, "Doctor, it hurts when I do this." The Doctor says, "Don't do that." We are saying, "Don't put extensive SELECT requests into production without doing an EXPLAIN on it.
Remember, always examine the EXPLAIN for reasonable results. Then, save the EXPLAIN output as a benchmark against any future EXPLAIN output. Then, if the SQL starts
executing slower or using more resources, you have a basis for comparison. You might also use the benchmark if you decide to add a secondary index. This prototyping allows you to see exactly what your SQL is doing.
Some users have quit using the EXPLAIN because they have gotten inaccurate results.
From our experience, when the numbers are consistently different than the actual rows being returned and the cost estimate is completely wrong, it is normally an indicator that STATISTICS should be collected or updated on the involved tables.