SQL GROUP BY Clause

What is the GROUP BY Clause?

The GROUP BY clause is a SQL statement that allows you to group rows in a table based on one or more columns. It is commonly used in combination with aggregate functions to perform calculations on grouped data. The GROUP BY clause helps in generating summary reports and analyzing data at a higher level of abstraction.

Syntax of the GROUP BY Clause : 

  • The syntax of the GROUP BY clause is as follows:

Syntax :

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1

The GROUP BY clause comes after the WHERE clause (if used) and before the ORDER BY clause (if used). It specifies the columns by which the result set should be grouped.

Example 1: Grouping Data by a Single Column

  • Let’s consider a table called employees with the following structure and data:

Table: Employees

employee_idfirst_namelast_namedepartment_id
1JohnDoe101
2JaneSmith102
3DavidJohnson101
4EmilyBrown103
5MichaelDavis102
6SarahMiller103
  • Now, let’s write a SQL query that groups the employees by the department_id column:

Syntax :

SELECT department_id, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id;

Output Table 1 :

department_id num_employees
101 2
102 2
103 2

Example 2: Grouping Data by Multiple Columns 

  • You can also group data by multiple columns. Let’s modify the previous query to group the employees by both department_id and first_name columns:

Syntax :

SELECT department_id, first_name, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id, first_name;

Output Table 2 :

department_idfirst_namenum_employees
101John1
101David1
102Jane1
102Michael1
103Emily1
103Sarah1

Example 3: Using Aggregate Functions with GROUP BY

  • Aggregate functions can be used along with the GROUP BY clause to perform calculations on grouped data. Let’s modify our previous query to include the MAX function to find the maximum employee ID in each department:

Syntax :

SELECT department_id, MAX(employee_id) AS max_employee_id
FROM employees
GROUP BY department_id;

Output Table  3 :

department_idmax_employee_id
1013
1025
1036

Example 4: Filtering Grouped Data with HAVING Clause

  • The HAVING clause is used to filter grouped data based on conditions. Let’s modify our previous query to include the HAVING clause and find departments with more than one employee:

Syntax :

SELECT department_id, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 1;

Output Table  4 :

department_idnum_employees
1012
1022
1032

Example 5: GROUP BY with JOINs

  • The GROUP BY clause can be used in combination with JOINs to group data from multiple tables. Let’s consider two tables, employees and departments, and write a query to group employees by department name:

 Table : departments

department_iddepartment_name
101Sales
102Marketing
103Finance

Syntax :

SELECT d.department_name, COUNT(*) AS num_employees
FROM employees e
JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name;

Output Table  5 :

department_namenum_employees
Sales2
Marketing2
Finance2

Most Frequently Asked Questions (FAQs) :

Question 1. What is the purpose of the GROUP BY clause in SQL?

The GROUP BY clause in SQL is used to group rows in a table based on one or more columns. It allows for the aggregation of data and the generation of summary reports.

Question 2. Can I use aggregate functions without the GROUP BY clause?

Yes, you can use aggregate functions without the GROUP BY clause. However, in such cases, the aggregate function will consider the entire result set as a single group.

Question 3. Can I use multiple aggregate functions in the same SELECT statement?

Yes, you can use multiple aggregate functions in the same SELECT statement. This allows you to perform different calculations on the grouped data.

Question 4. Can I use the WHERE clause with the GROUP BY clause?

Yes, you can use the WHERE clause to filter the rows before they are grouped by the GROUP BY clause.

Question 5.  Can I use the ORDER BY clause with the GROUP BY clause?

Yes, you can use the ORDER BY clause to sort the output of the GROUP BY clause based on specific columns.

Get over 200+ course One Subscription

Courses like AI/ML, Cloud Computing, Ethical Hacking, C, C++, Java, Python, DSA (All Languages), Competitive Coding (All Languages), TCS, Infosys, Wipro, Amazon, DBMS, SQL and others

Checkout list of all the video courses in PrepInsta Prime Subscription

Checkout list of all the video courses in PrepInsta Prime Subscription