tag:blogger.com,1999:blog-16959946.post235444998470257614..comments2023-12-09T16:44:47.897+01:00Comments on The Data Charmer: Finding tables without primary keysGiuseppe Maxiahttp://www.blogger.com/profile/15801583338057324813noreply@blogger.comBlogger11125tag:blogger.com,1999:blog-16959946.post-2325930304103670102017-01-24T12:50:43.227+01:002017-01-24T12:50:43.227+01:00SELECT t.table_schema,
t.table_name
FROM ...SELECT t.table_schema, <br /> t.table_name <br />FROM information_schema.tables t <br />WHERE NOT EXISTS (SELECT * <br /> FROM information_schema.columns c <br /> WHERE t.table_schema = c.table_schema <br /> AND t.table_name = c.table_name <br /> AND c.column_key = 'PRI') <br /> AND t.table_schema NOT IN ( 'mysql', 'information_schema', <br /> 'performance_schema', <br /> 'sys', 'common_schema' ) <br /> AND table_type = 'base table'; Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-16959946.post-29716615804223499822016-07-05T12:31:03.602+02:002016-07-05T12:31:03.602+02:00It does not handle views.It does not handle views.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-16959946.post-24162200732220653862011-09-09T08:11:04.575+02:002011-09-09T08:11:04.575+02:00@Anonymous
I compare execution times by running th...@Anonymous<br />I compare execution times by running the query after restarting the database. That will take care of caches.<br />Your query is extremely inefficient. It takes 2 minutes and 38 seconds to run.Giuseppe Maxiahttps://www.blogger.com/profile/15801583338057324813noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-2330276840383794962011-09-09T07:16:51.452+02:002011-09-09T07:16:51.452+02:00Hi Guiseppe,
How you compare execution times? Doe...Hi Guiseppe,<br /><br />How you compare execution times? Doesn't it depend on table/FS cache?<br /><br />Please also check this query (IN should be faster than outer join if result is few tables):<br /><br />SELECT table_schema, table_name<br /> FROM information_schema.tables<br /> WHERE (table_catalog, table_schema, table_name) NOT IN<br /> (SELECT table_catalog, table_schema, table_name<br /> FROM information_schema.table_constraints<br /> WHERE constraint_type in ('PRIMARY KEY', 'UNIQUE'))<br /> AND table_schema NOT IN ('information_schema', 'mysql');Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-16959946.post-79096233470276161712011-09-05T17:44:38.974+02:002011-09-05T17:44:38.974+02:00Roland,
Well done!
Your query is 3 times faster th...Roland,<br />Well done!<br />Your query is 3 times faster than the previous winner (< 3 seconds, against 11) and filters off the views.<br /><br />ThanksGiuseppe Maxiahttps://www.blogger.com/profile/15801583338057324813noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-29593999742531627682011-09-05T17:39:07.930+02:002011-09-05T17:39:07.930+02:00Hi Giuseppe,
here's my proposed solution:
sel...Hi Giuseppe,<br /><br />here's my proposed solution:<br />select tables.table_schema<br />, tables.table_name<br />, tables.engine<br />from information_schema.tables<br />left join (<br /> select table_schema<br /> , table_name<br /> from information_schema.statistics<br /> group by table_schema<br /> , table_name<br /> , index_name<br /> having <br /> sum(<br /> case <br /> when non_unique = 0 <br /> and nullable != 'YES' then 1 <br /> else 0 <br /> end<br /> ) = count(*)<br /> ) puks<br />on tables.table_schema = puks.table_schema<br />and tables.table_name = puks.table_name<br />where puks.table_name is null<br />and tables.table_type = 'BASE TABLE'<br /><br />The heart of the query is the puks subquery in the from clause. This query selects one row for each unique index that has the same number of columns as the number of columns that are not nullable. This is the set of primary keys and unique constraints that have only non-nullable columns.<br /><br />The outer query matches the tables (but only base tables) against this subquery using an outer join in order to pinpoint those tables that do not have a corresponding row in the subquery, that is - tables without a primary key or non-nullable unique constraint.rpboumanhttps://www.blogger.com/profile/13365137747952711328noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-89758966834655694272011-09-05T11:51:56.648+02:002011-09-05T11:51:56.648+02:00Roland,
Point taken on both counts.
The query gets...Roland,<br />Point taken on both counts.<br />The query gets more complex, with probably a bigger hit on the server.<br /><br />I wanted to find a solution that does not require external scripts or stored routines, but it seems that I will have to consider one of these ways.Giuseppe Maxiahttps://www.blogger.com/profile/15801583338057324813noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-77092749132393671052011-09-05T11:41:58.834+02:002011-09-05T11:41:58.834+02:00Hi Giusesppe!
just noticed this:
"I came up...Hi Giusesppe!<br /><br />just noticed this:<br /><br />"I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such sum is zero (i.e. no primary or unique keys):"<br /><br />This query has two flaws: <br /><br />#1<br />You check all columns, regardless of the table type. This means that VIEWs will be flagged as not having a primary key. Strictly speaking this is correct, but this is probably not the intention.<br /><br />#2<br />If you can settle for a UNIQUE constraint instead of a PRIMARY KEY, you should also check that all columns of that constraint are NOT NULL. This complicates the query considerable, because you have to check nullability *per index*, and can't just sum it for all table columns that happen to be in an index.<br /><br />Maybe if i have some time later today, I may send in a solution.rpboumanhttps://www.blogger.com/profile/13365137747952711328noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-21485272087671279492011-09-05T08:29:11.304+02:002011-09-05T08:29:11.304+02:00Shlomi,
I didn't know about common_schema, and...Shlomi,<br />I didn't know about common_schema, and I should have a deeper look at it. It seems to be a useful addition to every DBA's box of tricks.<br /><br />In this particular task of mine, though, no_pk_innodb_tables is not helpful, as I need to find all tables without PK, and my preliminary data suggests that many of them may be MyISAM.<br />Regarding your project of exporting I_S tables to regular ones, I had <a href="http://www.oreillynet.com/databases/blog/2006/04/dumping_mysql_information_sche_1.html" rel="nofollow">the same thought myself</a> a few years ago.Giuseppe Maxiahttps://www.blogger.com/profile/15801583338057324813noreply@blogger.comtag:blogger.com,1999:blog-16959946.post-19754039124020411922011-09-05T06:57:02.924+02:002011-09-05T06:57:02.924+02:00BTW, except for my excellent advertising, my query...BTW, except for my excellent advertising, my query does not run faster than yours. However, I do have a plan to make INFORMATION_SCHEMA tables "clones". That it, create a INFORMATION_SCHEMA_GHOST schema, with TABLES, COLUMNS, STATISTICS etc. (schema-related tables), which are updated by a script using SHOW commands. <br />Such a schema does not need to be evaluated all the time (one's schema does not change all the time), so really once in a while, upon your decision.<br />The tables in that schema would be your standard MyISAM tables, so queries will be very fast.<br /><br />If you're interested to help out, I'll be very happy.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-16959946.post-44268553642699595092011-09-05T06:51:58.729+02:002011-09-05T06:51:58.729+02:00Ah!
I can see you have not used common_schema as y...Ah!<br />I can see you have not used common_schema as yet.<br />Take a look at <a href="http://common-schema.googlecode.com/svn/trunk/common_schema/doc/html/no_pk_innodb_tables.html" rel="nofollow">no_pk_innodb_tables</a>.<br /><br />It strictly checks for InnoDB tables (probably to be removed later on).<br /><br />You should really take a look at common_schema (and spread the word, while at it). There are some interesting queries in there (and by tomorrow, I believe, a new version as well).Anonymousnoreply@blogger.com