高性能MySql学习笔记(一)

来源：互联网发布：淘宝客服推荐的尺码编辑：程序博客网时间：2024/06/03 02:22

高性能MySql学习笔记

1.针对应用建立自己的索引

URL查找例子

select * from tUrl where url='http://www.163.com';

以url（字符串）作行为索引会使得作为索引结构的B-Tree变大，可以移除url列上的索引，并添加一个url_crc索引列，

先建立表：

create tables tUrl(

id int unsigned NOT NULL auto_increment,

url varchar(255) NOT NULL,

url_crc int unsigned NOT NULL DEFAULT 0,

primary key(id)
);

通过以下方式查询：

mysql> select * from queryHash where url_crc=CRC32( 'http://www.163.com' ) and url='http://www.163.com' ;

1.url_crc列的索引性能较高

2.在有冲突的时候，再通过精确比较得到所需要列

3.如果冲突过多可以改由64位的hash, MD5()/SHA1(), 截取前16示例如下：

mysql> select conv( right ( md5('http://www.csdn.net'), 16), 16, 10) as hash64;

select conv( right ( md5('http://www.csdn.net'), 16), 16, 10) as hash64;
+---------------------+
| hash64 |
+---------------------+
| 9270541833456602998 |
+---------------------+
注：Maatkit( http://www.maatkit.org/)包含了一个用户自定义函数(UDF)，实现了Fowler/Noll/Vo64位哈希函数，速度非常快。

上述方法缺点是要手工维护hash值，可以通过使用触发器维护,分别为insert/update添加触发器：

delimiter |

create trigger url_crc_ins before insert on tUrl for each row begin

set NEW.url_crc=crc32(NEW.url);

end;

create trigger url_crc_upd before update on tUrl for each row begin

set NEW.url_crc=crc32(NEW.url);

end;

delimiter ;

验证一下：

mysql> insert into tUrl(url) values( 'http://www.csdn.net' );

mysql> select * from tUrl;
+----+---------------------+------------+
| id | url                 | url_crc    |
+----+---------------------+------------+
| 2 | http://www.csdn.net | 3032599713 |
+----+---------------------+------------+
mysql> update tUrl set url='http://blog.csdn.net' where id=2;
mysql> select * from tUrl;
+----+----------------------+------------+
| id | url                  | url_crc    |
+----+----------------------+------------+
| 2 | http://blog.csdn.net | 1255720221 |
+----+----------------------+------------+

由于我是在虚拟机的测试环境上，测试结果可能不太准确：随机生成了单个400多万的url记录，结果是没有带crc索引的直接比较url列，需要2.xs的时间，而还索引的话下降大概1s。随着记录量的增大，到2kw，查询就变得异常慢了，而且CPU占用率非常高，无论还不还crc优化都需要30s多,对于这么大的数据量，就需要通过其它手段优化查询了，比如水平切割，减少表的大小。

mysql> select count(*) from queryHash;
+----------+
| count(*) |
+----------+
| 27733889 |
+----------+
1 row in set (0.01 sec)

mysql> select * from queryHash where url='http://www.ui.com';
+--------+-------------------+------------+
| id | url | url_crc |
+--------+-------------------+------------+
| 121292 | http://www.ui.com | 3732015786 |
| 132385 | http://www.ui.com | 3732015786 |
| 135030 | http://www.ui.com | 3732015786 |
| 136391 | http://www.ui.com | 3732015786 |
| 138530 | http://www.ui.com | 3732015786 |
| 138555 | http://www.ui.com | 3732015786 |
| 145059 | http://www.ui.com | 3732015786 |
| 147331 | http://www.ui.com | 3732015786 |
| 150356 | http://www.ui.com | 3732015786 |
+--------+-------------------+------------+
9 rows in set (36.58 sec)

mysql> select * from queryHash where url_crc=crc32('http://www.ui.com') and url='http://www.ui.com';
+--------+-------------------+------------+
| id | url | url_crc |
+--------+-------------------+------------+
| 121292 | http://www.ui.com | 3732015786 |
| 132385 | http://www.ui.com | 3732015786 |
| 135030 | http://www.ui.com | 3732015786 |
| 136391 | http://www.ui.com | 3732015786 |
| 138530 | http://www.ui.com | 3732015786 |
| 138555 | http://www.ui.com | 3732015786 |
| 145059 | http://www.ui.com | 3732015786 |
| 147331 | http://www.ui.com | 3732015786 |
| 150356 | http://www.ui.com | 3732015786 |
+--------+-------------------+------------+
9 rows in set (32.43 sec)

2.高性能索引策略

(1) 如果查询中没有隔离索引的列，MySql通常不会使用索引。 “隔离”列意味着它不是表达式的一部分，也没有位于函数中。

下列查询不能使用到actor_id索引

select actor_id from sakila.actaor where actor_id + 1 = 5;

(2)在函数中也不会使用索引

因为使用了TO_DAYS函数

select ... where TO_DAYS( CURRENT_DATE) - TO_DAYS( date_col ) <= 10;

一种比较好的方式：

select ... where date_col >= DATE_SUB( CURRENT_DATE, INTERVAL 10 DAY);

这样就可以使用到索引查询了，但是还可以改进：

select ... where date_col >= DATE_SUB( '2010-5-11', INTERVAL 10 DAY);

因为CURRENT_DATE将会阻止查询缓存，可以用常量代替！