• No results found

TRY IT OUT Counting NULL(s) and Zeroes

In document Discovering SQL (Page 172-176)

---50.4875

The AVG function takes into consideration only records for which data exist. It is a good idea to ignore the record and not implicitly convert NULL into zero.

TRY IT OUT

Counting NULL(s) and Zeroes

In order to explore how AVG function treats NULL, we will update some of our records with NULL values and then calculate averages. We will use Microsoft SQL Server 2008 to run the queries, but the behavior is identical across all RDBMSs in this book.

1.

Open Microsoft SQL Server Management Studio and connect to your database using Windows authentication.

2.

In the upper-left corner click the New Query button.

3.

In the opened query window (the middle pane), enter the following SQL query:

SELECT AVG(bk_price) AS average_price FROM books;

average_price ---29.5758

4.

Update the BK_PRICE column for one of the records; for example, record with BK_ID=12 to NULL:

UPDATE books SET bk_price = NULL WHERE bk_id = 12;

5.

Run the query from Step 3 again by pressing the Execute button on the toolbar of SQL Server Management Studio:

SELECT AVG(bk_price) AS average_price FROM books;

average_price ---31.3372

c05.indd 138

c05.indd 138 3/15/2011 12:23:52 PM3/15/2011 12:23:52 PM

www.it-ebooks.info

6.

Let’s update the same record (BK_ID=12) with zero value:

UPDATE books SET bk_price = 0 WHERE bk_id = 12;

7.

Run the query from Step 3 again:

SELECT AVG(bk_price) AS average_price FROM books;

average_price ---28.7258

8.

Restore the original value for the record 12:

UPDATE books SET bk_price = 10.20 WHERE bk_id = 12;

How It Works

The mathematical average is calculated as the sum of all values, divided by their count. When all 12 records have numeric values in the BK_PRICE column, the average is 29.5758.

When the BK_ID ⫽ 12 record is updated with NULL value, it gets excluded both from the sum of the values (dividend) and from the record count (divisor); only 11 records and their values will be used to calculate the average. The result is therefore is bit higher (31.3372) because the price taken out of the equation is relatively small ($10.20).

When we update the record with the zero value, both the zero and the record are counted toward the averages. The result is different because now this book IS counted. In fact, it is counted as having cost nothing, which brings the average down some.

Finally, if we tried to compute the AVG on a column where there were no data in ANY rows (all NULLs, all the way down), we would get a NULL as our result. No data means no data!

This behavior is identical across all RDBMSs.

COUNT()

Counting the number of records in your data set is easy with the COUNT function. There are two primary ways this function can be used: counting everything or just specifi c columns:

SELECT COUNT(*) AS records_count FROM location;

record_count ---12

Used in this way, the function allows us to compute the number of rows in the table. The asterisk symbol (meaning “all”) can be substituted for a specifi c column name, say, BK_ID:

c05.indd 139

c05.indd 139 3/15/2011 12:23:52 PM3/15/2011 12:23:52 PM

www.it-ebooks.info

SELECT COUNT(bk_d) AS records_count FROM location;

record_count ---12

The results are identical because every row in the column BK_ID has a value in it. This is not the case when the column contains NULL (“no data here”), but if we NULL-out some column value in the BOOKS table and run our COUNT on it, the results would be quite different:

UPDATE books SET bk_price = NULL where bk_id = 12;

SELECT COUNT(bk_price)AS counting_prices FROM books;

counting_prices ---11

Now we have a different result! What’s the difference? This time we have asked the function to compute the number of rows, where there is a value in the given column (bk_price in this case; don’t forget to restore the value as in Step 8 of the fi rst Try It Out in this chapter).

Use of aggregate functions is not limited to the SELECT, UPDATE, and INSERT lists; they just as successfully can be used in the WHERE clause, though additional restrictions apply. (This is discussed later in this chapter.)

Again, we can put conditions on this query: for instance, how many books do we have in the right-most bookcase in the living room?

SELECT COUNT(*) AS counting_books FROM books

WHERE (bk_price > 30);

counting_books

---4

Looks like only four out of 12 books cost more than 30 dollars, which might indicate that either we do not buy expensive books or we just forgot to enter the prices. Remember that the NULL value is not counted.

MAX()

This function returns the highest value in the specifi ed column:

SELECT MAX(bk_price) AS max_price FROM books;

max_price

---69.26

c05.indd 140

c05.indd 140 3/15/2011 12:23:53 PM3/15/2011 12:23:53 PM

www.it-ebooks.info

What would happen if there were several books with identical prices? Only one number will be returned regardless of how many there might be. When the column is the numeric data type, the greatest number will be returned and applied to a text fi eld. It will return the value that sorts last in the alphabetical order, and it cannot be applied to either binary or XML data type columns. For instance, consider the following query:

SELECT MAX(bk_title) AS max_title FROM books;

max_title

---Wiley Pathways: Introduction to Database Management

The title was returned because it is last on the alphabetically sorted titles list.

MIN()

This function returns the lowest value in the specifi ed column:

SELECT MIN(bk_price) AS min_price FROM books;

min_price

---8.89

Applied to a text fi eld, it will return the value that sorts fi rst in the alphabetical order:

SELECT MIN(bk_title) AS min_title FROM books;

min_title

---A La Recherche du Temps Perdu

As with every other aggregate function, columns containing NULL values are simply ignored.

Both MAX and MIN functions work on single-value columns, and you cannot ask them to return, say, a greatest/least number from a list of numbers; you would have to use a different function.

Both Oracle and MySQL offer GREATEST and LEAST functions; for instance, the following query would be perfectly legal in Oracle (but would not work in IBM DB2, SQL Server, Microsoft Access, or PostgreSQL):

SELECT

LEAST(1,2,3,4,5,6,7,8) AS min_num ,GREATEST1,2,3,4,5,6,7,8) AS max_num FROM dual;

min_num max_num ---1 8

There are ways to emulate this functionality in other RDBMSs: by creating custom functions or by delving into advanced SQL data manipulation, which is outside the scope of this book.

c05.indd 141

c05.indd 141 3/15/2011 12:23:53 PM3/15/2011 12:23:53 PM

www.it-ebooks.info

SUM()

To calculate totals, SQL supplies the SUM function; this function returns the total of all numeric values in a column:

SELECTSUM(bk_price) total_costs FROM books

total_costs ---354.91

Similar to AVG, any rows without data in the given column (showing NULL in it) are ignored. An attempt to SUM the column that has no data in any row results in NULL. If you try to run SUM on a non-numeric (text) column, you will get an error.

Table 5-1 shows common aggregate functions in use.

TABLE 5-1: Commonly Used Aggregate Functions

SQL FUNCTION DESCRIPTION

AVG Returns the calculated average of the list of numbers; NULL values are ignored.

COUNT Returns the number of row s retrieved by the query; NULL values are ignored.

MAX Returns a single maximum value for a given column ; NULL values are ignored.

MIN Returns a single minimum value for a given column ; NULL values are ignored.

SUM Returns the sum of the numeric values in a column; NULL values are ignored.

In document Discovering SQL (Page 172-176)