Monday, June 18, 2007

[DB]The difference between ON and WHERE

"Note: The join condition of an inner join can be written either in the WHERE clause or in the JOIN clause. For example, these table expressions are equivalent: FROM a, b WHERE a.id = b.id AND b.val > 5 and FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5 or perhaps even FROM a NATURAL JOIN b WHERE b.val > 5 Which one of these you use is mainly a matter of style. The JOIN syntax in the FROM clause is probably not as portable to other SQL database management systems. For outer joins there is no choice in any case: they must be done in the FROM clause. An ON/USING clause of an outer join is not equivalent to a WHERE condition, because it determines the addition of rows (for unmatched input rows) as well as the removal of rows from the final result. "

The above document comes from Postgresql's documentation. In short, that means for INNER JOIN, the conditions under ON or WHERE have the same effect; but for OUTER JOIN, the conditions under ON and WHERE are different.

After some thought, I think conditions under ON will be applied during or before table-joining, but conditions under WHERE will be applied after the table-joining. The time when conditions are applied is the key point.

For example: FROM a, b WHERE a.id = b.id AND b.val > 5 [1] and FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5 [2] are equivalent,but FROM a LEFT JOIN b ON a.id = b.id AND b.val > 5 [3] and FROM a LEFT JOIN b ON a.id = b.id WHERE b.val > 5 [4] are different.

For query [3], condition b.val > 5 will be applied to table b before table a joins table b;For query [4], condition b.val > 5 will be applied to the joined table after table a joins table b;

PS: After passing the WHERE filter, the derived input table may be subject to grouping, using the GROUP BY clause, and elimination of group rows using the HAVING clause.

No comments: