A common problems in SQL are to find the data for the row with some max value in a column per some group identifier. Another similar problems are to find the first, max, least per group in SQL. This is fairly difficult task for newbie and trivial for experienced master in SQL.

For example let have this table( table creation code and data is at the end):

id username detail info column_date
1 John Done full 2017-12-04 16:30:57
2 Mike Done half 2016-12-03 16:30:57
3 Joe Process full 2016-12-03 16:30:58
4 Kate Pending none 2016-12-03 16:30:59
5 John Process half 2016-12-03 16:31:17
6 Mike Done full 2016-12-03 16:31:18
7 Joe Pending half 2016-12-03 16:31:18
8 Kate Pending full 2016-12-03 16:31:18
9 John Process half 2016-12-03 16:31:26
10 Mike Process none 2016-12-03 16:31:26

Find max per group and return all values

If we need all values for max date grouping by column1:

1 John Done full 2017-12-04 16:30:57 2017-12-04 16:30:57 John
7 Joe Pending half 2016-12-03 16:31:18 2016-12-03 16:31:18 Joe
8 Kate Pending full 2016-12-03 16:31:18 2016-12-03 16:31:18 Kate
10 Mike Process none 2016-12-03 16:31:26 2016-12-03 16:31:26 Mike

Inner join

This SQL query would do the job:

SELECT *
FROM table1 t1
INNER JOIN(
SELECT MAX(tab2.column_date) AS maxval,
 tab2.username
FROM table1 tab2
-- WHERE tab2.column = 'value'--if new need a constraint
GROUP BY tab2.username
) temp ON t1.username = temp.username AND t1.column_date = temp.maxval

The same query shorter form

SELECT *
FROM table1 t1
INNER JOIN(
SELECT username, MAX(column_date) maxval
FROM table1
GROUP BY username
) temp ON t1.username = temp.username AND t1.column_date = temp.maxval

Left join

Query using left outer join to do the same result. Return all values for a row with biggest values in col1.

SELECT t1.*
FROM table1 t1
LEFT OUTER
JOIN table1 t2 ON (t1.username = t2.username AND t1.column_date < t2.column_date)
WHERE t2.id IS NULL;

This two examples are tested in Oracle and MySQL but they should work with standard RDBMS(working with standard SQL).

Select top N rows per group

If you need to select top N (in query below 2) rows per group. This can be done by using subquery:

SELECT *
FROM table1
WHERE (
SELECT COUNT(*)
FROM table1 AS t1
WHERE t1.username = table1.username AND t1.column_date <= table1.column_date
) <= 2;
3 Joe Process full 2016-12-03 16:30:58
7 Joe Pending half 2016-12-03 16:31:18
5 John Process half 2016-12-03 16:31:17
9 John Process half 2016-12-03 16:31:26
4 Kate Pending none 2016-12-03 16:30:59
8 Kate Pending full 2016-12-03 16:31:18
2 Mike Done half 2016-12-03 16:30:57
6 Mike Done full 2016-12-03 16:31:18

Find top records from two tables

For example let say that we have two tables: games and predictions. And we want to get top teams(visitors) for a user for which he has 1 point from his predictions. The first column return how many times/rows this player has in table prediction:

Tables( below you have the creation code and part of this data sample):

  • Game
id teama teamb
99 Fiorentina AC Milan
75 Manchester City AFC Bournemouth
24 Malaysia Afghanistan
51 Gibraltar U19 Albania U19
22 Elche Alcorcon
  • Prediction
game_id scorea scoreb points
113 0 0 0
114 0 0 0
115 0 0 1
116 0 0 0
113 1 1 0

In table game we have information about the game and in table prediction we have information about the predictions for this team.

SELECT COUNT(*),tb1.team
FROM (
SELECT g.teamb AS 'team'
FROM game g, prediction p
WHERE g.id = p.game_id AND points = 1
) AS tb1
GROUP BY tb1.team
ORDER BY 1 DESC;
  • Result
COUNT(*) team
4 Manchester City
3 Granada
3 Arsenal
3 Real Madrid
2 SSC Napoli
1 Burnley
1 Fiorentina
1 Dynamo Kyiv
1 FC Rostov
1 Bayern Munich
1 Liverpool
1 Pro Vercelli

Using a union would give us the best team - visitor and host:

SELECT COUNT(*),tb1.team
FROM (
SELECT g.teama AS 'team'
FROM game g, prediction p
WHERE g.id = p.game_id AND points = 1 
UNION ALL
SELECT g.teamb AS 'team'
FROM game g, prediction p
WHERE g.id = p.game_id AND points = 1
) AS tb1
GROUP BY tb1.team
ORDER BY 1 DESC;

Grouping, count and concatenation

If we need to get all predictions and find which scores are most likely to happen best on our data. We could do:

SELECT concat(p.scorea, ' : ', p.scoreb, ' - '), count(*)
FROM prediction p
WHERE points = 3
group by p.scorea, p.scoreb
order by 2 desc
concat(p.scorea, ' : ', p.scoreb, ' - ') count(*)
0 : 1 - 3
3 : 3 - 2
3 : 2 - 2
0 : 0 - 1

Table used in the examples

First example table

CREATE TABLE `table1` (
	`id` INT(11) NOT NULL AUTO_INCREMENT,
	`username` VARCHAR(50) NULL DEFAULT NULL,
	`detail` VARCHAR(50) NULL DEFAULT NULL,
	`info` VARCHAR(50) NULL DEFAULT NULL,
	`column_date` DATETIME NULL DEFAULT NULL,
	PRIMARY KEY (`id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=11
;

INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	1,	'John',	'Done',	'full',	'2017-12-04 16:30:57');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	2,	'Mike',	'Done',	'half',	'2016-12-03 16:30:57');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	3,	'Joe',	'Process',	'full',	'2016-12-03 16:30:58');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	4,	'Kate',	'Pending',	'none',	'2016-12-03 16:30:59');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	5,	'John',	'Process',	'half',	'2016-12-03 16:31:17');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	6,	'Mike',	'Done',	'full',	'2016-12-03 16:31:18');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	7,	'Joe',	'Pending',	'half',	'2016-12-03 16:31:18');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	8,	'Kate',	'Pending',	'full',	'2016-12-03 16:31:18');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	9,	'John',	'Process',	'half',	'2016-12-03 16:31:26');
INSERT INTO `test`.`table1` (`id`, `username`,  `detail`,`info`, `column_date`) VALUES(	10,	'Mike',	'Process',	'none',	'2016-12-03 16:31:26');

Second example tables

CREATE TABLE `game` (
	`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
	`teama` VARCHAR(255) NOT NULL,
	`teamb` VARCHAR(255) NOT NULL
	PRIMARY KEY (`id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=37930
;

CREATE TABLE `prediction` (
	`game_id` BIGINT(20) NOT NULL,
	`points` INT(11) NOT NULL,
	`scorea` VARCHAR(255) NOT NULL,
	`scoreb` VARCHAR(255) NOT NULL,
	PRIMARY KEY (`id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=2569
;


INSERT INTO `game` VALUES (4386, 'Palermo', 'AC Milan');
INSERT INTO `game` VALUES (4759, 'Ludogorets Razgrad', 'Arsenal');
INSERT INTO `game` VALUES (3375, 'Sunderland', 'Arsenal');
INSERT INTO `game` VALUES (4763, 'Manchester City', 'Barcelona');
INSERT INTO `game` VALUES (4415, 'Sevilla', 'Barcelona');
INSERT INTO `game` VALUES (4767, 'Tottenham Hotspur', 'Bayer Leverkusen');
INSERT INTO `game` VALUES (4765, 'PSV Eindhoven', 'Bayern Munich');
INSERT INTO `game` VALUES (4277, 'Hamburger SV', 'Borussia Dortmund');
INSERT INTO `game` VALUES (4052, 'Hertha Berlin', 'Borussia Moenchengladbach');

INSERT INTO `prediction` VALUES (3375, '2', '3', 1);
INSERT INTO `prediction` VALUES (3376, '2', '0', 0);
INSERT INTO `prediction` VALUES (3380, '2', '0', 0);
INSERT INTO `prediction` VALUES (3381, '1', '2', 1);
INSERT INTO `prediction` VALUES (3450, '2', '2', 0);
INSERT INTO `prediction` VALUES (3451, '1', '0', 1);
INSERT INTO `prediction` VALUES (3452, '3', '2', 1);
INSERT INTO `prediction` VALUES (3470, '2', '3', 1);
INSERT INTO `prediction` VALUES (3472, '3', '0', 1);
;