Wednesday, December 19, 2007

Data from nothing - solution to pop quiz

My latest post on generating one million records received many comments, with interesting solutions.
The challenge required the insertion of 1 million records in a simple table, with a few constraints.
create table t1 (
dt datetime not null,
primary key (dt)
);

The solution

The official solution is straightforward:
create view v3 as select null union all select null union all select null;
create view v10 as select null from v3 a, v3 b union all select null;
create view v1000 as select null from v10 a, v10 b, v10 c;
set @n = 0;
insert into t1 select now()-interval @n:=@n+1 second from v1000 a,v1000 b;
You can appreciate the title here. Data from nothing. Generate 1 million valid records from a view made of NULLs!
The principle is easy.
  1. First create a few values in a view, by means of UNION ALL queries.
  2. Then, using Cartesian products, generate a larger view.
  3. And a larger one, containing 1,000 self generating rows.
  4. The final step is a simple Cartesian product in a self join.
This is something that they told you not to do in Database 101 classes, but it can be handy sometimes!

It can be reduced to 4 lines:
create view v3 as select 1 n union all select 1 union all select 1;
create view v as select 1 n from v3 a, v3 b union all select 1;
set @n = 0;
insert t1 select now()-interval @n:=@n+1 second from v a,v b,v c,v d,v e,v;

The other solutions

The solution that comes closer to the intended one comes from Shane Bester.
set @a:=1167602400,@b:=1168602400;
create view v as select 1 union select 2 union select 3 union select 4;
create view x as select 1 from v,v a,v b,v c,v d;
insert t1 select(from_unixtime(@a:=@a+1))from x,x a where @a<@b;
Shane also produced the fastest solution.
create view v as select a.dt from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f,t1 g;
replace t1 values('990101'),('980101'),('970101'),(@a:=1);
replace t1 select (adddate(a.dt,@a:=@a+1)) from v,v a limit 999996;

B.Steinbrink produced the shortest one.
  INSERT t1 VALUES(@a:=72620211),(101),(102),(103),(104),(105),(106),(107);
replace t1 select@a:=adddate(@a,1)from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f;

Roland Bouman produced the most wicked one. And a method that caught me by surprise.
# WARNING! DON'T RUN IF YOUR SERVER DOES NOT HAVE AT LEAST 4 GB RAM !!!
set max_allowed_packet := 1073741824;
set @s=concat('insert into t1 values(@d:=10000101+@i:=1)');
set @s=concat(@s,repeat(',(date_add(@d,interval @i:=@i+1 day))',999999));
prepare s from @s;
execute s;
In 5 lines, he is generating a 37 MB query containing 1 million records!
Not very efficient, but it works, if you have a huge amount of RAM. Otherwise, it can crash your server!
Kai Voigt came up with the same idea, with a faster execution.
set @a=concat(repeat(concat(@s:="select 1 ","union all "),99),@s),@x:=0;
set @c=concat(@s,"from (",@a,")a join(",@a,")b join(",@a,")c"),@i="inser";
set @d=concat(@i,"t into t1 select from_unixtime(@x:=@x+1) from(",@c,")c");
prepare s from @d;
execute s;
Jedy solution was the fastest until Shane's arrived.
insert into t1 values (now()-2),(now()-1),(now()),(now()+(@a:=1));
insert into t1 select adddate(now(),@a:=@a+1) from t1 a,t1 b,t1 c,t1 d,t1;
insert into t1 select adddate(now(),@a:=@a+1) from t1,t1 b limit 998972;
You can see all the other solutions as comments to the previous post.
This post comes earlier than promised, because of an unexpected development. A new quiz is coming. Stay tuned!

4 comments:

Unknown said...

Hey, I thought I had some more time and decided to finish my solution today :-(

Anyway, here's my solution, just 2 statements:

INSERT t1 VALUES(@a:=72620211),(101),(102),(103),(104),(105),(106),(107);

replace t1 select@a:=adddate(@a,1)from t1,t1 a,t1 b,t1 c,t1 d,t1 e,t1 f;

Anonymous said...

i call shenanigans!!!!

your original challenge said "Use a maximum of 5 (five) SQL statements"

i would like to point out that anything involving the variable @a or the SET command are ~not~ SQL statements

they may be MySQL statements, but they are not SQL statements

shame on all solutions that did not follow the rules!!!!!

Unknown said...

r937,
get a life!
The rules mentioned "use of MySQL client".
So in which language did you expect us to write the solution? Latin?
Take your shame comments elsewhere.

White Kite said...

any chance of a MS SQL solution or a port of an existing solution. I almost got it for the winning solution except when i stmbled upon INTERVAL and SECOND cryptic functions ...