Comments on The Data Charmer: Finding tables without primary keys

2017-01-24T12:50:43.227+01:00

SELECT t.table_schema,
       t.table_name
FROM   information_schema.tables t
WHERE  NOT EXISTS (SELECT *
                   FROM   information_schema.columns c
                   WHERE  t.table_schema = c.table_schema
                          AND t.table_name = c.table_name
                          AND c.column_key = 'PRI')
       AND t.table_schema NOT IN ('mysql', 'information_schema',
                                  'performance_schema',
                                  'sys', 'common_schema')
       AND t.table_type = 'BASE TABLE';

2016-07-05T12:31:03.602+02:00

It does not handle views.

2011-09-09T08:11:04.575+02:00

@Anonymous
I compare execution times by running the query after restarting the database. That will take care of caches.
Your query is extremely inefficient. It takes 2 minutes and 38 seconds to run.

2011-09-09T07:16:51.452+02:00

Hi Giuseppe,

How do you compare execution times? Doesn't it depend on the table/FS cache?

Please also check this query (IN should be faster than an outer join if the result is only a few tables):

SELECT table_schema, table_name
FROM information_schema.tables
WHERE (table_catalog, table_schema, table_name) NOT IN
      (SELECT table_catalog, table_schema, table_name
       FROM information_schema.table_constraints
       WHERE constraint_type IN ('PRIMARY KEY', 'UNIQUE'))
  AND table_schema NOT IN ('information_schema', 'mysql');

2011-09-05T17:44:38.974+02:00

Roland,
Well done!
Your query is 3 times faster than the previous winner (under 3 seconds, against 11) and filters out the views.

Thanks

2011-09-05T17:39:07.930+02:00

Hi Giuseppe,

here's my proposed solution:
select tables.table_schema
,      tables.table_name
,      tables.engine
from   information_schema.tables
left join (
    select   table_schema
    ,        table_name
    from     information_schema.statistics
    group by table_schema
    ,        table_name
    ,        index_name
    having   sum(
                 case
                     when non_unique = 0
                      and nullable != 'YES' then 1
                     else 0
                 end
             ) = count(*)
) puks
on     tables.table_schema = puks.table_schema
and    tables.table_name   = puks.table_name
where  puks.table_name is null
and    tables.table_type = 'BASE TABLE';

The heart of the query is the puks subquery in the FROM clause. It returns one row for each index whose columns are all unique (non_unique = 0) and non-nullable, which is what the HAVING clause checks by comparing the count of such columns with the index's total column count. This is the set of primary keys and unique constraints that have only non-nullable columns.

The outer query matches the tables (base tables only) against this subquery with an outer join to pinpoint those that have no corresponding row in the subquery, that is, tables without a primary key or non-nullable unique constraint.
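
To see the logic at work without a MySQL server, here is a minimal sketch that runs the same GROUP BY / HAVING / LEFT JOIN pattern against toy tables in Python's sqlite3. The two toy tables only borrow the relevant column names from information_schema.tables and information_schema.statistics; the function name, the 'shop' schema and all sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample metadata: one row per table ...
SAMPLE_TABLES = [
    ("shop", "orders", "BASE TABLE"),    # has a primary key
    ("shop", "codes", "BASE TABLE"),     # unique index with a nullable column
    ("shop", "log", "BASE TABLE"),       # only a non-unique index
    ("shop", "v_orders", "VIEW"),        # a view, no indexes at all
]

# ... and one row per indexed column, as in information_schema.statistics.
# nullable is 'YES' for nullable columns and '' otherwise.
SAMPLE_INDEX_COLUMNS = [
    ("shop", "orders", "PRIMARY", 0, ""),
    ("shop", "codes", "uk_code", 0, "YES"),
    ("shop", "codes", "uk_code", 0, ""),
    ("shop", "log", "idx_ts", 1, ""),
]


def tables_without_strict_key(tables, index_columns):
    """Return base tables with no index that is unique and fully NOT NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tables (table_schema TEXT, table_name TEXT, table_type TEXT)"
    )
    conn.execute(
        "CREATE TABLE statistics (table_schema TEXT, table_name TEXT, "
        "index_name TEXT, non_unique INTEGER, nullable TEXT)"
    )
    conn.executemany("INSERT INTO tables VALUES (?, ?, ?)", tables)
    conn.executemany("INSERT INTO statistics VALUES (?, ?, ?, ?, ?)", index_columns)
    rows = conn.execute(
        """
        SELECT t.table_schema, t.table_name
        FROM tables t
        LEFT JOIN (
            -- one row per index whose columns are all unique and non-nullable
            SELECT table_schema, table_name
            FROM statistics
            GROUP BY table_schema, table_name, index_name
            HAVING SUM(CASE WHEN non_unique = 0 AND nullable != 'YES'
                            THEN 1 ELSE 0 END) = COUNT(*)
        ) puks
          ON t.table_schema = puks.table_schema
         AND t.table_name = puks.table_name
        WHERE puks.table_name IS NULL
          AND t.table_type = 'BASE TABLE'
        ORDER BY t.table_schema, t.table_name
        """
    ).fetchall()
    conn.close()
    return rows


print(tables_without_strict_key(SAMPLE_TABLES, SAMPLE_INDEX_COLUMNS))
# [('shop', 'codes'), ('shop', 'log')]
```

In this toy data, orders is matched through its PRIMARY index, codes is reported because one column of its unique index is nullable, log is reported because its only index is non-unique, and the view is dropped by the table_type filter.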

2011-09-05T11:51:56.648+02:00

Roland,
Point taken on both counts.
The query gets more complex, probably with a bigger hit on the server.

I wanted to find a solution that does not require external scripts or stored routines, but it seems that I will have to consider one of these ways.

2011-09-05T11:41:58.834+02:00

Hi Giuseppe!

Just noticed this:

"I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such sum is zero (i.e. no primary or unique keys):"

This query has two flaws:

#1
You check all columns, regardless of the table type. This means that VIEWs will be flagged as not having a primary key. Strictly speaking this is correct, but this is probably not the intention.

#2
If you can settle for a UNIQUE constraint instead of a PRIMARY KEY, you should also check that all columns of that constraint are NOT NULL. This complicates the query considerably, because you have to check nullability *per index*, and can't just sum it for all table columns that happen to be in an index.
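
Why the NOT NULL check matters can be shown with a small sketch. It uses Python's sqlite3 as a stand-in, on the assumption (which matches MySQL's documented behaviour for UNIQUE indexes) that a unique index accepts any number of NULLs in a nullable column. The table t, the column code and the function name are made up for illustration.

```python
import sqlite3


def unique_index_null_behaviour():
    """Insert duplicate NULLs and a duplicate value into a UNIQUE column.

    Returns (number of NULL rows stored, whether the duplicate value was rejected).
    """
    conn = sqlite3.connect(":memory:")
    # code is nullable and covered by a UNIQUE constraint, but is not a primary key
    conn.execute("CREATE TABLE t (code INTEGER, UNIQUE (code))")
    # Two NULLs are accepted: the unique index does not treat them as equal
    conn.executemany("INSERT INTO t (code) VALUES (?)", [(None,), (None,), (1,)])
    null_rows = conn.execute(
        "SELECT COUNT(*) FROM t WHERE code IS NULL"
    ).fetchone()[0]
    # A second non-NULL 1 violates the constraint
    try:
        conn.execute("INSERT INTO t (code) VALUES (1)")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    conn.close()
    return null_rows, duplicate_rejected


print(unique_index_null_behaviour())
# (2, True)
```

The two NULL rows cannot be told apart by the unique index, so such an index cannot stand in for a primary key.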

If I have some time later today, I may send in a solution.

2011-09-05T08:29:11.304+02:00

Shlomi,
I didn't know about common_schema, and I should have a deeper look at it. It seems to be a useful addition to every DBA's box of tricks.

In this particular task of mine, though, no_pk_innodb_tables is not helpful, as I need to find all tables without a PK, and my preliminary data suggests that many of them may be MyISAM.
Regarding your project of exporting I_S tables to regular ones, I had the same thought myself a few years ago.

2011-09-05T06:57:02.924+02:00

BTW, except for my excellent advertising, my query does not run faster than yours. However, I do have a plan to make "clones" of the INFORMATION_SCHEMA tables. That is, create an INFORMATION_SCHEMA_GHOST schema, with TABLES, COLUMNS, STATISTICS etc. (the schema-related tables), which are updated by a script using SHOW commands.
Such a schema does not need to be refreshed all the time (one's schema does not change all the time), only once in a while, whenever you decide.
The tables in that schema would be your standard MyISAM tables, so queries will be very fast.

If you're interested in helping out, I'll be very happy.

2011-09-05T06:51:58.729+02:00

Ah!
I can see you have not used common_schema as yet.
Take a look at no_pk_innodb_tables.

It strictly checks for InnoDB tables (probably to be removed later on).

You should really take a look at common_schema (and spread the word, while at it). There are some interesting queries in there (and by tomorrow, I believe, a new version as well).