MYSQL: excluding multiple lines if an attached line matches a condition

I’ve got two tables. One is users, which has fields uid and name. The other table is users_roles, which has fields uid and rid (role id).

I’d like to retrieve a list of users that don’t have any of a set of provided roles.

For example:

uid | name
----------
1   | BOB
2   | DAVE
3   | JOHN

USERS_ROLES:

uid | rid
---------
1   | 1
1   | 2
1   | 3
2   | 1
3   | 1

I want to be able to query for just users that don’t have a certain set of rids. For instance, a query that excludes rids 2 and 3 should return DAVE and JOHN, or a query that excludes rids (1,3) should return nobody.

A query using an anti-join pattern is sometimes the most efficient:

SELECT u.uid
     , u.name
  FROM users u
  LEFT
  JOIN users_roles r
    ON r.uid = u.uid
   AND r.rid IN (2,3)
 WHERE r.uid IS NULL

The anti-join pattern is do a LEFT [outer] JOIN the user_roles
table to pull back all the matching rows, AND to get rows from users
that don’t have a matching row. The “trick” is to exclude all the matching rows with a predicate in the WHERE clause that eliminates all the rows from users that had a match.

An equivalent resultset can be obtained using a NOT EXISTS correlated subquery:

SELECT u.uid
     , u.name
  FROM users u
 WHERE NOT EXISTS
       ( SELECT 1
           FROM users_roles r
          WHERE r.uid = u.uid
            AND r.rid IN (2,3)
       )

Another approach is to use a NOT IN
, although that is sometimes less efficient. Performance depends on a whole host of factors. It’s possible for the optimizer to generate different execution plans for each of these queries.

In any case, for best performance, you’ll need an index … ON users_roles (uid)
or ON users_roles (uid,rid)
.

Followup:

Performance testing on my MySQL 5.1.34 server reveals that an anti-join query is almost twice as fast as equivalent NOT EXISTS and NOT IN queries. (1.091 sec vs. 2.066 sec and 2.020 sec)

-- setup and populate test tables
CREATE TABLE t_users
( uid    INT UNSIGNED NOT NULL PRIMARY KEY
, `name` VARCHAR(50)
) ENGINE=INNODB DEFAULT CHARSET=latin1 ;

CREATE TABLE t_users_roles
( uid INT UNSIGNED NOT NULL
, rid INT UNSIGNED NOT NULL
, PRIMARY KEY (uid,rid)
) ENGINE=INNODB DEFAULT CHARSET=latin1;

ALTER TABLE t_users_roles ADD CONSTRAINT FK_t_users_roles_t_users FOREIGN KEY (uid) REFERENCES t_users (uid);

CREATE INDEX t_users_ix1 ON t_users (uid,`name`);

CREATE INDEX t_users_roles_ix1 ON t_users_roles (rid,uid);

INSERT INTO t_users (uid,`name`)
SELECT d.d*100000+e.d*10000+u.d*1000+h.d*100+t.d*10+o.d+1 AS uid
     , CONCAT('NAME',LPAD(d.d*100000+e.d*10000+u.d*1000+h.d*100+t.d*10+o.d+1,8,'-')) AS `name`
  FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) o
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) h
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) u
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
;

-- 1000000 row(s) affected.

INSERT INTO t_users_roles (uid,rid)
SELECT d.d*100000+e.d*10000+u.d*1000+h.d*100+t.d*10+o.d+1 AS uid
     , r.rid
  FROM (SELECT 1 AS rid UNION ALL SELECT 2 UNION ALL SELECT 3) r
 CROSS
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) o
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) h
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) u
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e
  JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
 WHERE (d.d*100000+e.d*10000+u.d*1000+h.d*100+t.d*10+o.d+1) % 100000  0
;
-- 2999970 row(s) affected

OPTIMIZE TABLE t_users;
OPTIMIZE TABLE t_users_roles;

SHOW STATUS LIKE 'Qcache_hits' ;
-- Variable_name  Value
-- -------------  ---------
-- Qcache_hits    1117342  

SHOW VARIABLES LIKE 'version' ;
-- Variable_name  Value
-- -------------  ------------
-- version        5.1.53-log

-- table size from the file system
$ du -sh DATA/test/t_users*.ibd
72M     DATA/test/t_users.ibd
133M    DATA/test/t_users_roles.ibd

-- anti-join query
SELECT SQL_NO_CACHE u.uid
     , u.name
  FROM t_users u
  LEFT
  JOIN t_users_roles r
    ON r.uid = u.uid
   AND r.rid IN (2,3)
 WHERE r.uid IS NULL ;

-- Exec: 1.095 sec
-- Exec: 1.090 sec
-- Exec: 1.091 sec
-- Exec: 1.087 sec
-- Exec: 1.090 sec
-- avg 5 executions: 1.091 sec    

-- not exists query
SELECT SQL_NO_CACHE u.uid
     , u.name
  FROM t_users u
 WHERE NOT EXISTS
       ( SELECT 1
           FROM t_users_roles r
          WHERE r.uid = u.uid
            AND r.rid IN (2,3)
       ) ;

-- Exec: 2.071 sec
-- Exec: 2.066 sec
-- Exec: 2.059 sec
-- Exec: 2.065 sec
-- Exec: 2.070 sec
-- avg 5 executions: 2.066 sec

-- not in query
SELECT SQL_NO_CACHE u.uid
     , u.name
  FROM t_users u
 WHERE u.uid NOT IN
       ( SELECT r.uid
           FROM t_users_roles r
          WHERE r.uid IS NOT NULL
            AND r.rid IN (2,3)
       ) ;

-- Exec: 2.022 sec
-- Exec: 2.023 sec
-- Exec: 2.014 sec
-- Exec: 2.026 sec
-- Exec: 2.016 sec
-- avg 5 executions: 2.020 sec

SHOW STATUS LIKE 'Qcache_hits' ;
-- Variable_name  Value
-- -------------  ---------
-- Qcache_hits    1117342  

EXPLAIN output for three statements:

-- ANTI JOIN
id  select_type         table   type            possible_keys              key          key_len  ref       rows   filtered  Extra
--  ------------------  ------  --------------  -------------------------  -----------  -------  -----  -------  --------  ------------------------------------
 1  SIMPLE              u       index                                      t_users_ix1  57              1000423    100.00  Using index
 1  SIMPLE              r       ref             PRIMARY,t_users_roles_ix1  PRIMARY      4        u.uid        1    100.00  Using where; Using index; Not exists

-- NOT EXISTS
id  select_type         table   type            possible_keys              key          key_len  ref       rows  filtered  Extra
--  ------------------  ------  --------------  -------------------------  -----------  -------  -----  -------  --------  --------------------------
 1  PRIMARY             u       index                                      t_users_ix1  57              1000423    100.00  Using where; Using index
 2  DEPENDENT SUBQUERY  r       ref             PRIMARY,t_users_roles_ix1  PRIMARY      4        u.uid        1    100.00  Using where; Using index

-- NOT IN
id  select_type         table   type            possible_keys              key          key_len  ref        rows  filtered  Extra
--  ------------------  ------  --------------  -------------------------  -----------  -------  ------  -------  --------  --------------------------
 1  PRIMARY             u       index                                      t_users_ix1  57               1000423    100.00  Using where; Using index
 2  DEPENDENT SUBQUERY  r       index_subquery  PRIMARY,t_users_roles_ix1  PRIMARY      4        func          1    100.00  Using index; Using where
Hello, buddy!稿源:Hello, buddy! (源链) | 关于 | 阅读提示

本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 后端存储 » MYSQL: excluding multiple lines if an attached line matches a condition

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录