Navigate by Range

SQL & MySQL Interview Questions 16-20 (Data Manipulation & Advanced Queries)

Hello! In our previous lessons, we covered the fundamentals of database structure, schemas, and the indexes that make queries fast. Now, we'll move into the practical day-to-day work of a developer: manipulating data and writing complex queries.

This lesson covers some of the most classic and frequently asked SQL interview questions. We'll start by understanding the different ways to delete data. Then we'll write queries to find the "Nth highest" record, aggregate data using `GROUP BY`, and finally, tackle the common problem of finding and removing duplicate data. Let's get started.

16. What is the difference between DELETE, TRUNCATE, and DROP commands?

This is a fundamental question about data manipulation and definition. All three commands remove data, but they work in very different ways and have different consequences.

Analogy: Think of a book in a library.

DELETE: This is like taking a pencil and an eraser to the book. You go row-by-row (line-by-line) and erase the specific sentences (`WHERE` clause) you don't want. It's slow, and because it's logged, you can (often) `ROLLBACK` (undo) the change.
TRUNCATE: This is like ripping all the pages out of the book at once and throwing them away, but keeping the book cover (the table structure). It's incredibly fast, but you can't undo it. It also resets any page numbering (auto-increment counters).
DROP: This is like taking the entire book, cover and all, and throwing it into a fire. The entire table (data, structure, indexes) is gone forever.

Here is a more technical breakdown:

Feature	DELETE	TRUNCATE	DROP
Type	DML (Data Manipulation)	DDL (Data Definition)	DDL (Data Definition)
`WHERE` Clause	Yes (Can delete rows)	No (Deletes all rows)	No (Deletes table)
Speed	Slow (Logs each row)	Very Fast	Very Fast
Rollback?	Yes	No (Implicit commit)	No (Implicit commit)
Resets Auto-Increment	No	Yes	N/A (Table is gone)

Code Examples:

-- Create a demo table
CREATE TABLE Demo (id INT, name VARCHAR(10));
INSERT INTO Demo VALUES (1, 'A'), (2, 'B'), (3, 'C');

-- 1. DELETE: Removes a specific row
DELETE FROM Demo WHERE id = 1;
-- Output: Query OK, 1 row affected

-- 2. TRUNCATE: Removes all remaining rows
TRUNCATE TABLE Demo;
-- Output: Query OK, 0 rows affected (or 2 rows)

-- 3. DROP: Deletes the table itself
DROP TABLE Demo;
-- Output: Query OK, 0 rows affected

17. Write SQL to find the nth highest salary.

This is a classic query problem that tests your knowledge of sorting, limiting, and (in the modern solution) window functions. Let's find the 3rd highest salary.

First, let's create our sample data. Notice we include a duplicate salary for 'Carol' and 'David'. This is important.

CREATE TABLE Employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    salary INT
);

INSERT INTO Employees VALUES
(1, 'Alice', 90000),
(2, 'Bob', 85000),
(3, 'Carol', 100000),
(4, 'David', 100000),
(5, 'Eve', 70000),
(6, 'Frank', 110000);

Sample Output (for INSERTs):

Query OK, 6 rows affected.

Method 1: Modern (Window Function) - Recommended

This is the best and most reliable method. We use `DENSE_RANK()`. `DENSE_RANK()` is perfect because it handles ties gracefully (Carol and David will both get rank 2) and doesn't skip numbers (the next rank will be 3).

-- Let's find the 3rd highest salary (N=3)
WITH SalaryRanks AS (
    SELECT 
        name,
        salary,
        DENSE_RANK() OVER (ORDER BY salary DESC) as salary_rank
    FROM Employees
)
SELECT name, salary
FROM SalaryRanks
WHERE salary_rank = 3;

Expected Output:

+-------+--------+
| name  | salary |
+-------+--------+
| Alice |  90000 |
+-------+--------+
1 row in set

Method 2: Classic (LIMIT and OFFSET)

This method is also common. We `SELECT` all distinct salaries, sort them, and use `LIMIT` to pick the Nth one. `OFFSET 2` means "skip the first 2," and `LIMIT 1` means "give me the next 1." (N-1 = 2).

-- Find the 3rd highest salary (N=3)
SELECT DISTINCT salary
FROM Employees
ORDER BY salary DESC
LIMIT 1 OFFSET 2;

Expected Output:

+--------+
| salary |
+--------+
|  90000 |
+--------+
1 row in set

18. Write SQL to count employees in each department.

This is the classic "Hello, World!" for SQL aggregation. The key phrase is "in each department," which tells you to use `GROUP BY department`.

-- Create a simple table
CREATE TABLE EmployeesDept (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50)
);

-- Insert sample data
INSERT INTO EmployeesDept VALUES
(1, 'Alice', 'Engineering'),
(2, 'Bob', 'Sales'),
(3, 'Carol', 'Engineering'),
(4, 'David', 'Marketing'),
(5, 'Eve', 'Sales'),
(6, 'Frank', 'Engineering');

-- Run the query
SELECT 
    department,
    COUNT(*) AS employee_count
FROM EmployeesDept
GROUP BY department;

Expected Output:

+-------------+----------------+
| department  | employee_count |
+-------------+----------------+
| Engineering |              3 |
| Sales       |              2 |
| Marketing   |              1 |
+-------------+----------------+
3 rows in set

19. Write SQL to find duplicate records in a table.

This builds directly on `GROUP BY`. After grouping, we use the `HAVING` clause to filter the groups (not the rows). We are looking for groups `HAVING` a count greater than 1.

Let's find users who have a duplicate `email`.

CREATE TABLE Users (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    email VARCHAR(50)
);

INSERT INTO Users VALUES
(1, 'Alice', 'alice@example.com'),
(2, 'Bob', 'bob@example.com'),
(3, 'Carol', 'carol@example.com'),
(4, 'David', 'alice@example.com'), -- Duplicate email
(5, 'Eve', 'eve@example.com'),
(6, 'Frank', 'bob@example.com');   -- Duplicate email

-- Find the emails that are duplicated
SELECT
    email,
    COUNT(*) AS email_count
FROM Users
GROUP BY email
HAVING email_count > 1;

Expected Output:

+-------------------+-------------+
| email             | email_count |
+-------------------+-------------+
| alice@example.com |           2 |
| bob@example.com   |           2 |
+-------------------+-------------+
2 rows in set

20. Write SQL to delete duplicate records while keeping only one.

This is an advanced problem that combines everything. How do you delete the duplicates found in the previous question, but keep one original?

The safest and most modern way is to use a Common Table Expression (CTE) with a window function like `ROW_NUMBER()`.

Analogy: We line up everyone who has the same email. We give them a number (`ROW_NUMBER()`) based on their `id` (e.g., the lowest `id` gets 1, the next gets 2, etc.). Then, we just delete everyone whose number is greater than 1.

-- We'll use the same 'Users' table from Q19.

-- First, let's SELECT the rows we *want* to delete.
-- This is a safe way to test our logic.
WITH RankedUsers AS (
    SELECT
        id,
        email,
        -- Create a "rank" for each email group,
        -- ordering by 'id' to keep the earliest one.
        ROW_NUMBER() OVER(
            PARTITION BY email 
            ORDER BY id
        ) AS row_num
    FROM Users
)
SELECT * FROM RankedUsers WHERE row_num > 1;

-- This is the delete query.
-- (Note: MySQL needs a slightly different syntax
-- to delete from a CTE)

DELETE u FROM Users u
INNER JOIN (
    SELECT
        id,
        ROW_NUMBER() OVER(
            PARTITION BY email 
            ORDER BY id
        ) AS row_num
    FROM Users
) AS RankedUsers ON u.id = RankedUsers.id
WHERE RankedUsers.row_num > 1;

Output (from the `SELECT` test):

+----+-------------------+---------+
| id | email             | row_num |
+----+-------------------+---------+
|  4 | alice@example.com |       2 |
|  6 | bob@example.com   |       2 |
+----+-------------------+---------+
2 rows in set

Output (from the `DELETE` query):

Query OK, 2 rows affected.

Tip: The Nth highest salary and the delete duplicates problem are very common. Practicing window functions like `DENSE_RANK()` and `ROW_NUMBER()` is one of the best ways to stand out in a SQL interview.

Previous Next

Chat about this topic?