Although the original syntax still works, there is a newer version of the join using the INNER JOIN syntax. It works exactly the same as the original join, but is written slightly different.
The following syntax is for a two-table INNER JOIN:
SELECT [<table-name1>.]<column-name>
[[,<table-name2>.]<column-name> ]
FROM <table-name1> [AS <alias-name>] [INNER] JOIN <table-name2> [AS <alias-name>]
ON [<table-name1>.]<column-name> = [<table-name2>.]<column-name>
[ WHERE <condition-test> ]
;
There are two primary differences between the new INNER JOIN and the original join syntax. The first difference is that a comma (,) no longer separates the table names. Instead of a comma, the words INNER JOIN are used. As shown in the above syntax format, the word INNER is optional. If only the JOIN appears, it defaults to an INNER JOIN.
The other difference is that the WHERE clause for the join condition is changed to an ON to declare an equal comparison on the common domain columns. If the ON is omitted, a syntax error is reported and the SELECT does not execute. So, the result is not a Cartesian product join as seen in the original syntax. Therefore, it is safer to use.
Although the INNER JOIN is a slightly longer SQL statement to code, it does have advantages. The first advantage, mentioned above, is fewer accidental Cartesian product joins because the ON is required. In the original syntax, when the WHERE is omitted the syntax is still correct. However, without a comparison, all rows of both tables are joined with each other and results in a Cartesian product.
The last and most compelling advantage of the newer syntax is that the INNER JOIN and OUTER JOIN statements can easily be combined into a single SQL statement. The OUTER JOIN syntax, explanation and significance are covered in this chapter.
The following is the same join that was performed earlier using the original join syntax. Here, it has been converted to use an INNER JOIN:
SELECT cust.Customer_number ,Customer_name
,Order_number
,Order_total (FORMAT'$$$,$$9.99' )
FROM Customer_table AS cust INNER JOIN Order_table AS ord ON cust.customer_number = ord.customer_number
ORDER BY 2 ;
Like the original syntax, more than two tables can be joined in a single INNER JOIN. Each consecutive table name follows an INNER JOIN and associated ON clause to tell which columns to match. Therefore, a ten-table join has nine JOIN and nine ON clauses to identify each table and the columns being compared. There is always one less JOIN / ON
combination than the number of tables referenced in the FROM.
The following syntax is for an INNER JOIN with more than two tables:
SELECT [<table-name1>.]<column-name>
[, <table-name2>.]<column-name> ]]
[, <table-nameN>.]<column-name> ]]
FROM <table-name1> [AS <alias-name1> ]
[INNER] JOIN <table-name2> [AS <alias-name2> ]
ON [<table-name1>.]<column-name> = [<table-name2>.]<column-name>
[INNER] JOIN <table-nameN> [AS <alias-nameN> ]
ON [<table-name2>.]<column-name> = [<table-nameN>.]<column-name>
[WHERE <condition-test> ]
;
The <table-nameN> reference above is intended to represent a variable number of tables. It might be a 3-table, a 10-table or up to a 64-table join. The same approach is used
regardless of the number of tables being joined together in a single SELECT.
The other difference between these two join formats is that regardless of the number of tables in the original syntax, there was only a single WHERE clause. Here, each additional INNER JOIN has its own ON condition.
If one ON is omitted from the INNER JOIN, an error code of 3706 will be returned. This error keeps the join from executing, unlike the original syntax, where a forgotten join condition in the WHERE is allowed, but creates an accidental Cartesian product join.
The next INNER JOIN is converted from the 3-table join seen earlier:
SELECT Last_name (Title 'Last Name') ,First_name AS First
,S.Student_ID
,Course_name AS Course
FROM Student_table AS S INNER JOIN Student_Course_table AS SC ON S.student_id = SC.student_id
INNER JOIN Course_table AS C ON C.course_id = SC.course_id AND course_name LIKE '%V2R3%' ORDER BY Course, Last_name;
3 Rows Returned Last
Name First Student_ID Course
Bond Jimmy 322133 V2R3
SQL Feature s
Hanson Henry 125634 V2R3
SQL Feature
Last
The INNER JOIN syntax can use a WHERE clause instead of a compound ON comparison.
It can be used to add one or more residual conditions. A residual condition is a comparison that is in addition to the join condition. When it is used, the intent is to potentially eliminate rows from one or more of the tables.
In other words, as rows are read the WHERE clause compares each row with a condition to decide whether or not it should be included or eliminated from the join processing. The WHERE clause is applied as rows are read, before the ON clause. Eliminated rows do not participate in the join against rows from another table. For more details, read the section on WHERE clauses at the end of this chapter.
The following is the same SELECT using a WHERE to compare the Course name as a residual condition instead of a compound (AND) comparison in the ON:
SELECT Last_name (Title 'Last Name') ,First_name AS First
,S.Student_ID ,Course_name AS Course
FROM Student_table AS S INNER JOIN Student_Course_table AS SC ON S.student_id = SC.student_id
INNER JOIN Course_table AS C ON C.course_id = SC.course_id WHERE course_name LIKE '%V2R3%' ORDER BY Course, Last_name;
As far as the INNER JOIN processing is concerned, the PE will normally optimize both of these last two joins exactly the same. The EXPLAIN is the best way to determine how the optimizer uses specific Teradata tables in a join operation.