Sunday, April 09, 2006

Sorting data by set of keys

This seems to be a popular problem, since I saw it twice in the past few weeks in two different newsgroups. Somebody complains that, having query with a set of keys like this
SELECT * from mytable where id IN (200, 2 ,100)
they get the results in a order that ius different from the one they specified. For example, they may get something like
select
*
from
main_table
where
id in (200,2,100);
+-----+----------+
| id | contents |
+-----+----------+
| 2 | b |
| 100 | aa |
| 200 | bb |
+-----+----------+
One of the posters complained that the result should come in the specified order [200,2,100], and that MySQL was arbitrarily sorting its resultset.
When I see such requests, I usually explain that there are two misconceptions. The first being that MySQL sorts results without asking. That it does not do, because a DBMS engine is basically lazy by design, and it won't do anything unless explicitly asked for. The second is that you are asking for a sort order in your set of keys. Such list is just an indication of which keys to consider for filtering the results, and it has no influence at all on ordering. If you want ordering, you need to say so explicitly, using the ORDER BY clause.
Once I clarified what the problem is, let's see the solutions.
You can sort it on-the-fly, if your values are reasonably few
select
*
from
main_table
where
id in (200, 2, 100)
order by
case id
when 200 then 1
when 2 then 2
when 100 then 3
end ;
+-----+----------+
| id | contents |
+-----+----------+
| 200 | bb |
| 2 | b |
| 100 | aa |
+-----+----------+
This solution looks like a hack, and in fact it can negatively affect performance, and it is difficult to handle when you need to change the sort order. The best way is to use a support table providing the required sort order.
create table sorting_table (
sort_order int not null auto_increment primary key,
fk_id int,
key (fk_id)
);
# here we insert the values in the order we want.
# the auto incremented key will insure that this is the correct one
insert into sorting_table (fk_id) values (200), (2), (100);
select * from sorting_table;
+------------+-------+
| sort_order | fk_id |
+------------+-------+
| 1 | 200 |
| 2 | 2 |
| 3 | 100 |
+------------+-------+
Now to get the wanted order we use a join between the main table and the secondary one.
select
main_table.*
from
main_table
inner join sorting_table on (id=fk_id)
where
id in (200,2, 100)
order by
sort_order ;
+-----+----------+
| id | contents |
+-----+----------+
| 200 | bb |
| 2 | b |
| 100 | aa |
+-----+----------+
This is a much better solution, which offers flexibility and efficiency at the same time.

No comments: