Is the SQL query correctly written to display clients with active (unpaid) loans?
Task
It is necessary to output a list of clients with active (unpaid) loans using three tables:
- t1 (credit contracts)
- t2 (deposit contracts)
- t3 (clients)
Table Structure (MySQL 5.7)
Table t1 (credit contracts)
CREATE TABLE t1 (
CREDIT_CONTRACT_ID INT,
START_DATE DATE,
END_DATE DATE,
CUSTOMER_ID INT,
CREDIT_AMOUNT INT
);
Table t3 (clients)
CREATE TABLE t3 (
CUSTOMER_ID INT,
NAME VARCHAR(100),
BIRTH_DATE DATE,
CITY VARCHAR(100)
);
Sample Data
Data for table t1
INSERT INTO t1 VALUES
(1, '2015-10-15', '2015-12-30', 1, 1000),
(2, '2015-12-14', NULL, 2, 2000),
(3, '2015-10-20', NULL, 3, 15000),
(7, '2015-12-28', NULL, 4, 80000),
(8, '2015-10-07', NULL, 5, 3000),
(9, '2015-06-30', NULL, 6, 29000),
(10, '2015-04-09', NULL, 7, 8000),
(11, '2015-01-20', NULL, 8, 68000),
(12, '2015-10-27', NULL, 9, 7000);
Data for table t3
INSERT INTO t3 VALUES
(1, 'Серик', '1975-09-20', 'Алматы'),
(2, 'Динара', '1977-07-22', 'Алматы'),
(3, 'Марат', '1969-01-02', 'Астана'),
(4, 'Олег', '1967-09-27', 'Астана'),
(5, 'Марина', '1969-10-12', 'Алматы'),
(6, 'Дильшат', '1979-10-08', 'Караганда'),
(7, 'Ануар', '1982-04-17', 'Кызылорда'),
(8, 'Михаил', '1980-12-11', 'Атырау'),
(9, 'Екатерина', '1987-02-12', 'Актобе');
Proposed SQL Query
SELECT
t3.NAME,
t3.BIRTH_DATE,
t3.CITY,
t1.CREDIT_CONTRACT_ID,
t1.START_DATE,
t1.END_DATE,
t1.CREDIT_AMOUNT
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE t1.END_DATE IS NULL;
Question
Is the SQL query correctly written to display clients with active (unpaid) loans?
In theory, I’m looking for loans where END_DATE is not specified (meaning the loan hasn’t been paid off yet). The query result seems logical, but I want to confirm if this is a correct and optimal approach. Are there any nuances or alternative methods I should consider?
The SQL query is correct for retrieving clients with active loans. It uses the END_DATE IS NULL condition to identify active loans and joins the credit contract table (t1) to the client table (t3) on CUSTOMER_ID.
Contents
- Analysis of the current SQL query
- Logic correctness check
- Performance optimization
- Alternative approaches
- Additional recommendations
Analysis of the current SQL query
Your query is functionally correct and solves the stated task:
SELECT
t3.NAME,
t3.BIRTH_DATE,
t3.CITY,
t1.CREDIT_CONTRACT_ID,
t1.START_DATE,
t1.END_DATE,
t1.CREDIT_AMOUNT
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE t1.END_DATE IS NULL;
Advantages of the current approach:
- Uses JOIN (an inner join) to link each loan contract to its client
- Applies WHERE t1.END_DATE IS NULL to keep only active loans
- Returns the fields needed to analyze active loans
Important: END_DATE IS NULL identifies active loans only because this schema leaves END_DATE empty until a loan is closed. Under that convention IS NULL is the right test; a comparison written as END_DATE = NULL is never true and would return no rows.
Logic correctness check
To fully verify the correctness of the query, let’s analyze it step by step:
- JOIN operation: JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID links each loan to its corresponding client
- Filtering active loans: WHERE t1.END_DATE IS NULL selects only those loans that have not yet been repaid
- Data selection: the query returns the client and contract fields needed to analyze active loans
Query results on the provided data:
The query returns eight rows, one for each of the clients with IDs 2, 3, 4, 5, 6, 7, 8, 9, whose loans all have a NULL END_DATE. Client 1 is excluded because contract 1 ended on 2015-12-30.
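The result can be checked by loading the sample rows into a scratch database. The sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL 5.7 (an assumption made only so the example runs on its own). The query is the proposed one, with CUSTOMER_ID added to the output and an ORDER BY for a stable listing.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 5.7 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (CREDIT_CONTRACT_ID INT, START_DATE DATE, END_DATE DATE,
                 CUSTOMER_ID INT, CREDIT_AMOUNT INT);
CREATE TABLE t3 (CUSTOMER_ID INT, NAME VARCHAR(100), BIRTH_DATE DATE,
                 CITY VARCHAR(100));
INSERT INTO t1 VALUES
 (1, '2015-10-15', '2015-12-30', 1, 1000),
 (2, '2015-12-14', NULL, 2, 2000),
 (3, '2015-10-20', NULL, 3, 15000),
 (7, '2015-12-28', NULL, 4, 80000),
 (8, '2015-10-07', NULL, 5, 3000),
 (9, '2015-06-30', NULL, 6, 29000),
 (10, '2015-04-09', NULL, 7, 8000),
 (11, '2015-01-20', NULL, 8, 68000),
 (12, '2015-10-27', NULL, 9, 7000);
INSERT INTO t3 VALUES
 (1, 'Серик', '1975-09-20', 'Алматы'),
 (2, 'Динара', '1977-07-22', 'Алматы'),
 (3, 'Марат', '1969-01-02', 'Астана'),
 (4, 'Олег', '1967-09-27', 'Астана'),
 (5, 'Марина', '1969-10-12', 'Алматы'),
 (6, 'Дильшат', '1979-10-08', 'Караганда'),
 (7, 'Ануар', '1982-04-17', 'Кызылорда'),
 (8, 'Михаил', '1980-12-11', 'Атырау'),
 (9, 'Екатерина', '1987-02-12', 'Актобе');
""")

# The proposed query: loans with no end date, joined to their clients.
active = conn.execute("""
    SELECT t3.CUSTOMER_ID, t3.NAME, t1.CREDIT_CONTRACT_ID, t1.CREDIT_AMOUNT
    FROM t1
    JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
    WHERE t1.END_DATE IS NULL
    ORDER BY t3.CUSTOMER_ID
""").fetchall()

for row in active:
    print(row)

# "= NULL" is never true in SQL, so this variant matches no rows.
wrong = conn.execute(
    "SELECT COUNT(*) FROM t1 WHERE END_DATE = NULL"
).fetchone()[0]
print(len(active), wrong)  # 8 0
```

Client 1 does not appear because its only contract has an end date.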
Potential issues:
- The query does not use the t2 table (deposit contracts); this is not an issue here, since only loans are of interest
- A client with several active loans appears once per loan, because the query does not collapse such rows
Performance optimization
Although the current query is functionally correct, it can be optimized for better performance:
1. JOIN type selection
-- INNER JOIN (current option)
-- Returns only clients with active loans, which makes sense for this task
-- LEFT JOIN (if all clients are needed, even those without loans)
-- But this is not required in this case
2. Indexes for improved performance
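-- Index the columns used in JOIN and WHERE
CREATE INDEX idx_t1_customer_id ON t1(CUSTOMER_ID);
CREATE INDEX idx_t1_end_date ON t1(END_DATE);
-- t3 is declared without a primary key, so the client side of the join needs a key too
-- (assumes CUSTOMER_ID is unique and non-NULL in t3)
ALTER TABLE t3 ADD PRIMARY KEY (CUSTOMER_ID);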
3. Field selection optimization
-- If only main fields are needed, you can limit the selection
SELECT
t3.NAME,
t1.CREDIT_CONTRACT_ID,
t1.CREDIT_AMOUNT,
t1.START_DATE
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE t1.END_DATE IS NULL;
Alternative approaches
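1. Using COALESCE when END_DATE can hold a future date
SELECT
t3.NAME,
t1.CREDIT_CONTRACT_ID,
t1.CREDIT_AMOUNT
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE COALESCE(t1.END_DATE, '9999-12-31') > CURRENT_DATE;
This variant is useful only if END_DATE may also store a planned end date in the future. The fallback has to be a far-future date: with CURRENT_DATE as the fallback, a NULL END_DATE becomes today, the comparison today > today is false, and every open-ended loan is filtered out. Wrapping the column in a function can also prevent MySQL from using an index on END_DATE, so for the schema as given END_DATE IS NULL remains the better choice.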
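With a COALESCE-based filter, the fallback value decides whether open-ended loans survive. A minimal sketch, again assuming an in-memory SQLite database as a stand-in for MySQL 5.7; contract 99 and the helper contract_ids are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (CREDIT_CONTRACT_ID INT, START_DATE DATE, END_DATE DATE,
                 CUSTOMER_ID INT, CREDIT_AMOUNT INT);
INSERT INTO t1 VALUES
 (1,  '2015-10-15', '2015-12-30', 1, 1000),  -- closed (sample data)
 (2,  '2015-12-14', NULL,         2, 2000),  -- open-ended (sample data)
 (99, '2015-11-01', '9998-01-01', 3, 500);   -- hypothetical: planned future end
""")

def contract_ids(where):
    """Return the contract IDs matching a WHERE condition (illustration only)."""
    sql = f"SELECT CREDIT_CONTRACT_ID FROM t1 WHERE {where} ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]

# Fallback = today: a NULL end date becomes today, and today > today is false.
fallback_today = contract_ids("COALESCE(END_DATE, CURRENT_DATE) > CURRENT_DATE")

# Fallback = far-future sentinel: open-ended loans are kept.
fallback_sentinel = contract_ids("COALESCE(END_DATE, '9999-12-31') > CURRENT_DATE")

# Equivalent condition without wrapping the column in a function.
explicit = contract_ids("END_DATE IS NULL OR END_DATE > CURRENT_DATE")

print(fallback_today)     # [99]
print(fallback_sentinel)  # [2, 99]
print(explicit)           # [2, 99]
```

The explicit IS NULL OR form gives the same rows as the sentinel version and leaves the column unwrapped.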
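2. Using NOT EXISTS for complex conditions
SELECT
t3.NAME,
t1.CREDIT_CONTRACT_ID,
t1.CREDIT_AMOUNT
FROM t3
JOIN t1 ON t3.CUSTOMER_ID = t1.CUSTOMER_ID
WHERE NOT EXISTS (
SELECT 1 FROM t1 closed_loans
WHERE closed_loans.CUSTOMER_ID = t3.CUSTOMER_ID
AND closed_loans.END_DATE IS NOT NULL
);
Note that this answers a narrower question than the original query: it returns loans only for clients who have never closed a loan, so a client with one repaid and one active loan is dropped entirely. The subquery alias is closed_loans rather than t2, because t2 is already the name of the deposit contracts table.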
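A related pattern, EXISTS with the END_DATE IS NULL test inside the subquery, lists each client once no matter how many active loans the client has. The sketch below contrasts it with the JOIN version on an in-memory SQLite database (a stand-in for MySQL 5.7, by assumption); contract 100 is a hypothetical second loan added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (CREDIT_CONTRACT_ID INT, START_DATE DATE, END_DATE DATE,
                 CUSTOMER_ID INT, CREDIT_AMOUNT INT);
CREATE TABLE t3 (CUSTOMER_ID INT, NAME VARCHAR(100), BIRTH_DATE DATE,
                 CITY VARCHAR(100));
INSERT INTO t3 VALUES
 (1, 'Серик',  '1975-09-20', 'Алматы'),
 (2, 'Динара', '1977-07-22', 'Алматы');
INSERT INTO t1 VALUES
 (1,   '2015-10-15', '2015-12-30', 1, 1000),  -- closed (sample data)
 (2,   '2015-12-14', NULL,         2, 2000),  -- active (sample data)
 (100, '2016-01-10', NULL,         2, 500);   -- hypothetical second active loan
""")

# JOIN: one row per active loan, so client 2 appears twice.
join_names = [row[0] for row in conn.execute("""
    SELECT c.NAME
    FROM t1 cr
    JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
    WHERE cr.END_DATE IS NULL
""")]

# EXISTS: one row per client that has at least one active loan.
exists_names = [row[0] for row in conn.execute("""
    SELECT c.NAME
    FROM t3 c
    WHERE EXISTS (
        SELECT 1 FROM t1 cr
        WHERE cr.CUSTOMER_ID = c.CUSTOMER_ID
          AND cr.END_DATE IS NULL
    )
""")]

print(join_names)    # ['Динара', 'Динара']
print(exists_names)  # ['Динара']
```

The trade-off is that the EXISTS form cannot return per-loan columns such as CREDIT_AMOUNT; it fits when only the list of clients is needed.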
3. Aggregation for clients with multiple loans
SELECT
t3.NAME,
COUNT(t1.CREDIT_CONTRACT_ID) AS active_credits_count,
SUM(t1.CREDIT_AMOUNT) AS total_credit_amount
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE t1.END_DATE IS NULL
GROUP BY t3.NAME, t3.CUSTOMER_ID;
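Grouping by CUSTOMER_ID as well as NAME matters when two clients share a name. A small sketch of the aggregation, assuming an in-memory SQLite database as a stand-in for MySQL 5.7; client 10 and contracts 100 and 101 are hypothetical rows added to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (CREDIT_CONTRACT_ID INT, START_DATE DATE, END_DATE DATE,
                 CUSTOMER_ID INT, CREDIT_AMOUNT INT);
CREATE TABLE t3 (CUSTOMER_ID INT, NAME VARCHAR(100), BIRTH_DATE DATE,
                 CITY VARCHAR(100));
INSERT INTO t3 VALUES
 (2,  'Динара', '1977-07-22', 'Алматы'),  -- sample data
 (10, 'Динара', '1990-03-05', 'Астана');  -- hypothetical namesake
INSERT INTO t1 VALUES
 (2,   '2015-12-14', NULL, 2,  2000),     -- sample data
 (100, '2016-01-10', NULL, 2,  500),      -- hypothetical second loan
 (101, '2016-02-01', NULL, 10, 700);      -- hypothetical loan of the namesake
""")

# Grouping by the client key keeps the two namesakes apart.
per_client = conn.execute("""
    SELECT c.CUSTOMER_ID, c.NAME,
           COUNT(cr.CREDIT_CONTRACT_ID) AS active_credits_count,
           SUM(cr.CREDIT_AMOUNT)        AS total_credit_amount
    FROM t1 cr
    JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
    WHERE cr.END_DATE IS NULL
    GROUP BY c.CUSTOMER_ID, c.NAME
    ORDER BY c.CUSTOMER_ID
""").fetchall()

# Grouping by NAME alone merges them into a single row.
per_name = conn.execute("""
    SELECT c.NAME, COUNT(*), SUM(cr.CREDIT_AMOUNT)
    FROM t1 cr
    JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
    WHERE cr.END_DATE IS NULL
    GROUP BY c.NAME
""").fetchall()

print(per_client)  # [(2, 'Динара', 2, 2500), (10, 'Динара', 1, 700)]
print(per_name)    # [('Динара', 3, 3200)]
```

Selecting CUSTOMER_ID alongside NAME also makes the two output rows distinguishable.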
Additional recommendations
1. Data integrity check
Before using the query, it is recommended to check data integrity:
-- Check for NULL values in CUSTOMER_ID
SELECT COUNT(*) FROM t1 WHERE CUSTOMER_ID IS NULL;
-- Check for uniqueness of CUSTOMER_ID in the client table
SELECT COUNT(*) FROM t3;
SELECT COUNT(DISTINCT CUSTOMER_ID) FROM t3;
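Two integrity problems affect the join directly: a duplicated CUSTOMER_ID in t3 multiplies loan rows, and a loan whose CUSTOMER_ID has no client silently disappears from an inner join. The sketch below shows one way to surface both, assuming an in-memory SQLite database as a stand-in for MySQL 5.7; all rows are hypothetical and deliberately broken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (CREDIT_CONTRACT_ID INT, START_DATE DATE, END_DATE DATE,
                 CUSTOMER_ID INT, CREDIT_AMOUNT INT);
CREATE TABLE t3 (CUSTOMER_ID INT, NAME VARCHAR(100), BIRTH_DATE DATE,
                 CITY VARCHAR(100));
-- Hypothetical, deliberately inconsistent data.
INSERT INTO t3 VALUES
 (2, 'Динара', '1977-07-22', 'Алматы'),
 (2, 'Динара', '1977-07-22', 'Алматы');   -- duplicated client row
INSERT INTO t1 VALUES
 (2,   '2015-12-14', NULL, 2,  2000),
 (101, '2016-02-01', NULL, 42, 700);      -- no client with ID 42
""")

# Client IDs that occur more than once in t3.
duplicates = conn.execute("""
    SELECT CUSTOMER_ID, COUNT(*)
    FROM t3
    GROUP BY CUSTOMER_ID
    HAVING COUNT(*) > 1
""").fetchall()

# Loans whose CUSTOMER_ID matches no client (dropped by an inner join).
orphans = [row[0] for row in conn.execute("""
    SELECT cr.CREDIT_CONTRACT_ID
    FROM t1 cr
    LEFT JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
    WHERE c.CUSTOMER_ID IS NULL
""")]

# Effect on the active-loan query: contract 2 is doubled, contract 101 is missing.
joined = [row[0] for row in conn.execute("""
    SELECT cr.CREDIT_CONTRACT_ID
    FROM t1 cr
    JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
    WHERE cr.END_DATE IS NULL
    ORDER BY cr.CREDIT_CONTRACT_ID
""")]

print(duplicates)  # [(2, 2)]
print(orphans)     # [101]
print(joined)      # [2, 2]
```

A primary key on t3.CUSTOMER_ID and a foreign key from t1 would rule out both cases at the schema level.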
2. Adding sorting for easier reading
SELECT
t3.NAME,
t1.CREDIT_CONTRACT_ID,
t1.CREDIT_AMOUNT,
t1.START_DATE
FROM t1
JOIN t3 ON t1.CUSTOMER_ID = t3.CUSTOMER_ID
WHERE t1.END_DATE IS NULL
ORDER BY t3.NAME, t1.START_DATE DESC;
3. Using aliases for better readability
SELECT
c.NAME AS customer_name,
cr.CREDIT_CONTRACT_ID,
cr.CREDIT_AMOUNT,
cr.START_DATE
FROM t1 cr
JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
WHERE cr.END_DATE IS NULL
ORDER BY c.NAME;
4. Edge case handling
-- Optional sanity filters; rows that fail them are silently left out of the result
SELECT
c.NAME,
cr.CREDIT_CONTRACT_ID,
cr.CREDIT_AMOUNT
FROM t1 cr
JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
WHERE cr.END_DATE IS NULL
AND cr.START_DATE <= CURRENT_DATE
AND (cr.CREDIT_AMOUNT > 0 OR cr.CREDIT_AMOUNT IS NULL);
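Recommended final query (the START_DATE and CREDIT_AMOUNT conditions are optional sanity filters; they exclude loans with a future start date or with a zero, negative, or NULL amount):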
SELECT
c.NAME AS customer_name,
c.CITY,
cr.CREDIT_CONTRACT_ID,
cr.START_DATE,
cr.CREDIT_AMOUNT
FROM t1 cr
JOIN t3 c ON cr.CUSTOMER_ID = c.CUSTOMER_ID
WHERE cr.END_DATE IS NULL
AND cr.START_DATE <= CURRENT_DATE
AND cr.CREDIT_AMOUNT > 0
ORDER BY c.NAME, cr.START_DATE DESC;
Conclusion
- The main query is correct for identifying clients with active loans by checking END_DATE IS NULL
- Indexes on the CUSTOMER_ID and END_DATE fields are recommended to improve performance with large data volumes
- For better readability, use meaningful table aliases instead of t1, t2, t3
- Data integrity checks and edge case handling should be added in production environments
- For analytical tasks, aggregate functions can summarize clients with multiple loans
Your initial approach is an excellent starting point, and the suggested optimizations will help make the query more reliable and performant.