Tuesday, September 15, 2009

Partitioning with non integer values using triggers

Looking at Bug#47310, which is a feature request that I hear frequently when I talk about partitions, I wrote a comment, suggesting triggers to work around the limitation.
The reason for the limitation is that allowing arbitrary functions for partitioning was too complex and it was provoking crashes and other unpleasant side effects (see the discussion under bug#18198).
But if you use a trigger, the resulting column is a plain integer, and many of the side effects disappear. The drawback is that you need to add a column to your table, and you need to use that column when searching for data. With that in mind, you can implement the workaround quite easily.

USE test;
DROP TABLE IF EXISTS users;

CREATE TABLE users (
user_id int(10) NOT NULL,
username varchar(25) DEFAULT NULL,
dummy INT not null,
PRIMARY KEY (user_id, dummy),
UNIQUE KEY username(username,dummy)
) ;

CREATE TRIGGER users_bi
BEFORE INSERT ON users
FOR EACH ROW
SET NEW.dummy = ASCII(LOWER(LEFT(NEW.username,1)));

ALTER TABLE users PARTITION BY RANGE (dummy) (
PARTITION p0 VALUES LESS THAN (96), #being f
PARTITION p1 VALUES LESS THAN (109), #being m
PARTITION p2 VALUES LESS THAN (115), #being s
PARTITION p3 VALUES LESS THAN (122) #being z
);

INSERT INTO users (user_id, username)
VALUES (1,'Joe'), (2,'Sam'),(3,'Abe'),(4,'Rich');

EXPLAIN PARTITIONS SELECT * FROM users
where username = 'Abe';
# This simple query doesn't use partition pruning.
# This is to be expected.

EXPLAIN PARTITIONS SELECT * FROM users
where dummy = ASCII('a') and username = 'Abe';
# Here, the partition pruning kicks in, at the price of an extra
# condition in the query.

1 comment:

Anonymous said...

The functionality I'd like to see is renaming without reorganizing it. If you have a range part'd table and your last one is a maxvalue partition, it seems like you should be able to something like this

ALTER TABLE ex RENAME PARTITION ex_maxvalue TO ex_20090915, ADD PARTITION ex_maxvalue VALUES LESS THAN MAXVALUE;

In other words, it should work like table renaming. Doing a REORGANIZE locks the table while the data is being shuffled, which is operationally disruptive.