Group by statement tells SQL server, pull out the distinct values for the columns specified after group by. The Select says, once you pull out the distinct values, give me this aggregate.
If I want to find the number of people from each department, what I will tell SQL server is to group by department.
Name
|
Department
|
Salary
|
Gender
|
Alan
|
electrical
|
1000
|
M
|
Sue
|
electrical
|
1500
|
F
|
Sam
|
electrical
|
1400
|
M
|
Jamie
|
computer
|
2500
|
F
|
John
|
computer
|
3000
|
M
|
Sandra
|
HR
|
1100
|
F
|
Sharon
|
HR
|
1200
|
F
|
SELECT department,
Count(*)
FROM employees
GROUP BY department
The above statement can be read as, fetch distinct values of department, then tell me the number of rows in which each department appears.
The results for the above statement are
department
|
(No column name)
|
computer
|
2
|
electrical
|
3
|
HR
|
2
|
Suppose I want the maximum salary for each department, my SQL statement would be
select department,MAX(salary)
from Employees
group by
department
department
|
(No column name)
|
computer
|
3000
|
electrical
|
1500
|
HR
|
1200
|
These are the commonly used aggregate functions in SQL
COUNT()
|
Returns the number of rows
|
AVG()
|
Returns
the average value
|
MAX()
|
Returns
the highest value
|
MIN()
|
Returns
the lowest value
|
SUM()
|
Returns
the sum
|
FIRST()
|
Returns
the first value
|
LAST()
|
Returns
the last value
|
Check this out for a complete reference
Having Clause
When you use Group by, you cannot use where on aggregate functions. For example
SELECT department,
Count(*)
FROM employees
GROUP BY department
where Count(*) >2
will fail. For aggregate functions you can use having instead
SELECT department,
Count(*)
FROM employees
GROUP BY department
having Count(*) >2