Wednesday, December 19, 2007

Data from nothing - solution to pop quiz

My latest post on generating one million records received many comments, with interesting solutions.
The challenge required the insertion of 1 million records in a simple table, with a few constraints.
create table t1 (
dt datetime not null,
primary key (dt)
);

The solution

The official solution is straightforward:
create view v3 as select null union all select null union all select null;
create view v10 as select null from v3 a, v3 b union all select null;
create view v1000 as select null from v10 a, v10 b, v10 c;
set @n = 0;
insert into t1 select now()-interval @n:=@n+1 second from v1000 a,v1000 b;
You can appreciate the title here. Data from nothing. Generate 1 million valid records from a view made of NULLs!
The principle is easy.
  1. First create a few values in a view, by means of UNION ALL queries.
  2. Then, using Cartesian products, generate a larger view.
  3. And a larger one, containing 1,000 self generating rows.
  4. The final step is a simple Cartesian product in a self join.
This is something that they told you not to do in Database 101 classes, but it can be handy sometimes!

It can be reduced to 4 lines:
create view v3 as select 1 n union all select 1 union all select 1;
create view v as select 1 n from v3 a, v3 b union all select 1;
set @n = 0;
insert t1 select now()-interval @n:=@n+1 second from v a,v b,v c,v d,v e,v;

The other solutions

The solution that comes closer to the intended one comes from Shane Bester.
set @a:=1167602400,@b:=1168602400;
create view v as select 1 union select 2 union select 3 union select 4;
create view x as select 1 from v,v a,v b,v c,v d;
insert t1 select(from_unixtime(@a:=@a+1))from x,x a where @a<@b;
Shane also produced the fastest solution.
create view v as select a.dt from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f,t1 g;
replace t1 values('990101'),('980101'),('970101'),(@a:=1);
replace t1 select (adddate(a.dt,@a:=@a+1)) from v,v a limit 999996;

B.Steinbrink produced the shortest one.
  INSERT t1 VALUES(@a:=72620211),(101),(102),(103),(104),(105),(106),(107);
replace t1 select@a:=adddate(@a,1)from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f;

Roland Bouman produced the most wicked one. And a method that caught me by surprise.
# WARNING! DON'T RUN IF YOUR SERVER DOES NOT HAVE AT LEAST 4 GB RAM !!!
set max_allowed_packet := 1073741824;
set @s=concat('insert into t1 values(@d:=10000101+@i:=1)');
set @s=concat(@s,repeat(',(date_add(@d,interval @i:=@i+1 day))',999999));
prepare s from @s;
execute s;
In 5 lines, he is generating a 37 MB query containing 1 million records!
Not very efficient, but it works, if you have a huge amount of RAM. Otherwise, it can crash your server!
Kai Voigt came up with the same idea, with a faster execution.
set @a=concat(repeat(concat(@s:="select 1 ","union all "),99),@s),@x:=0;
set @c=concat(@s,"from (",@a,")a join(",@a,")b join(",@a,")c"),@i="inser";
set @d=concat(@i,"t into t1 select from_unixtime(@x:=@x+1) from(",@c,")c");
prepare s from @d;
execute s;
Jedy solution was the fastest until Shane's arrived.
insert into t1 values (now()-2),(now()-1),(now()),(now()+(@a:=1));
insert into t1 select adddate(now(),@a:=@a+1) from t1 a,t1 b,t1 c,t1 d,t1;
insert into t1 select adddate(now(),@a:=@a+1) from t1,t1 b limit 998972;
You can see all the other solutions as comments to the previous post.
This post comes earlier than promised, because of an unexpected development. A new quiz is coming. Stay tuned!

4 comments:

B.Steinbrink said...

Hey, I thought I had some more time and decided to finish my solution today :-(

Anyway, here's my solution, just 2 statements:

INSERT t1 VALUES(@a:=72620211),(101),(102),(103),(104),(105),(106),(107);

replace t1 select@a:=adddate(@a,1)from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f;

r937 said...

i call shenanigans!!!!

your original challenge said "Use a maximum of 5 (five) SQL statements"

i would like to point out that anything involving the variable @a or the SET command are ~not~ SQL statements

they may be MySQL statements, but they are not SQL statements

shame on all solutions that did not follow the rules!!!!!

dbwiz said...

r937,
get a life!
The rules mentioned "use of MySQL client".
So in which language did you expect us to write the solution? Latin?
Take your shame comments elsewhere.

Andru said...

any chance of a MS SQL solution or a port of an existing solution. I almost got it for the winning solution except when i stmbled upon INTERVAL and SECOND cryptic functions ...

Vote on Planet MySQL