Once a table reaches the 500,000-row level, it is worth looking into how to improve MySQL performance. To be fair, MySQL handles a 500k-row table fine even on default settings: the slowdown is visible to the naked eye, but still within an acceptable range. The notes below are about coping with data on the order of a million rows:
1. Turn on the default MySQL query cache
In most setups this seems to be enabled already; let's verify:
[cce]
# mysql -h localhost -u root -p    # enter password at the prompt
mysql> use database1
[/cce]
[cce]
mysql> show VARIABLES LIKE '%cache%';
+------------------------------+----------------------+
| Variable_name                | Value                |
+------------------------------+----------------------+
| binlog_cache_size            | 32768                |
| have_query_cache             | YES                  |
| key_cache_age_threshold      | 300                  |
| key_cache_block_size         | 1024                 |
| key_cache_division_limit     | 100                  |
| max_binlog_cache_size        | 18446744073709547520 |
| query_cache_limit            | 1048576              |
| query_cache_min_res_unit     | 4096                 |
| query_cache_size             | 16777216             |
| query_cache_type             | ON                   |
| query_cache_wlock_invalidate | OFF                  |
| table_definition_cache       | 256                  |
| table_open_cache             | 64                   |
| thread_cache_size            | 8                    |
+------------------------------+----------------------+
[/cce]
[cce]
mysql> show status LIKE '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 32759656 |
| Qcache_hits             | 48678002 |
| Qcache_inserts          | 11871037 |
| Qcache_lowmem_prunes    | 7482829  |
| Qcache_not_cached       | 178368   |
| Qcache_queries_in_cache | 0        |
| Qcache_total_blocks     | 1        |
+-------------------------+----------+

mysql> SET GLOBAL query_cache_size = 32777216;
[/cce]
query_cache_size is the amount of memory allocated to the cache; query_cache_type = ON means the cache is enabled.
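Beyond those two variables, the Qcache_* counters from SHOW STATUS tell you whether the cache is actually earning its memory. A quick back-of-envelope read of the numbers above (counter values copied verbatim from the output; the hit-rate formula is the usual hits / (hits + inserts + not_cached) rule of thumb):

```python
# Counter values copied from the SHOW STATUS LIKE '%Qcache%' output above
hits = 48678002
inserts = 11871037
not_cached = 178368
lowmem_prunes = 7482829

# Every SELECT either hits the cache, gets inserted into it, or is not cacheable
hit_rate = hits / float(hits + inserts + not_cached)
print("query cache hit rate: %.1f%%" % (hit_rate * 100))  # roughly 80%

# Prunes close to the insert count mean entries are being evicted for lack
# of memory, which is the motivation for raising query_cache_size above.
prune_ratio = lowmem_prunes / float(inserts)
print("lowmem prunes per insert: %.2f" % prune_ratio)
```

An 80% hit rate is healthy, but the high prune ratio suggests the 16 MB cache was too small for this workload, hence the SET GLOBAL above.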
2. Speedup from a "backup-delete-restore" scheme
Note: this method gives efficient access to huge volumes of "inactive" data.
User case: of 1,000,000 rows, only 100,000 belong to users active within the past two weeks, and the remaining 90% are very unlikely to become active again after that.
Clearly, the method fits databases that store huge amounts of data but serve only lightweight access traffic.
- Write a shell script that backs up the data automatically, then deletes the rows that were backed up:
[cce]
#!/bin/bash
# author: Wang
# date: 2013-02-12
# find all the users that need to be backed up

# define the constants
Filename="/pathTo/BackupList.txt"
MediaRoot="/pathTo/media"
User="user"
Passwd="passwd"
Host="localhost"
DB="DBname"

# the current time
ctime="$(date +"%Y-%m-%d %H:%M:%S")"

# the backup cutoff time for Linux
#btime="$(date -d '-2 weeks' +%Y-%m-%d\ %H:%M:%S)"
# the backup cutoff time for Mac
btime="$(date -v-2m +%Y-%m-%d\ %H:%M:%S)"

# the last backup time
# ltime="$(date -v-4m +%Y-%m-%d\ %H:%M:%S)"
# ltime="$(date -d '-4 months' +%Y-%m-%d\ %H:%M:%S)"

# get the list of ids to back up
mysql -u$User -p$Passwd -h$Host -D $DB -e "select id from users where active=True and last_login < \"$btime\"" > $Filename

# for every id, dump its rows to a .sql file and delete them
for id in `cat $Filename | grep -v id`
do
    if [ -e "$MediaRoot/$id" ]; then
        echo "dump for profile: $id"
    else
        echo "mkdir for profile: $id"
        mkdir "$MediaRoot/$id"
    fi

    if [ -e "$MediaRoot/$id/$id.sql" ]; then
        echo "$id.sql already exists"
    else
        mysqldump -u$User -p$Passwd -h$Host $DB paws --where "source_id in (select id from datasources where owner_id=$id)" --no-create-info --lock-all-tables > $MediaRoot/$id/$id.sql
        mysql -u$User -p$Passwd -h$Host -D $DB -e "DELETE from paws where source_id in (select id from datasources where owner_id=$id)" >> /tmp/log.txt
        mysql -u$User -p$Passwd -h$Host -D $DB -e "UPDATE users SET active=False where id=$id" >> /tmp/log.txt
    fi
done
[/cce]
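One gap in the script above: if mysqldump fails (disk full, lock timeout), the DELETE still runs and the rows are gone. A minimal sketch of the same dump-then-delete step with a guard, in Python for illustration — the credentials and paths are the same placeholders as in the script, and the `archive_user_rows` / `paws_where` names are mine; the table and column names come from the script:

```python
import os
import subprocess

def paws_where(user_id):
    """WHERE clause shared by the dump and the delete (as in the script above)."""
    return "source_id in (select id from datasources where owner_id=%d)" % user_id

def archive_user_rows(user_id, media_root="/pathTo/media"):
    """Dump a user's paws rows, and delete them only if the dump looks sane."""
    dump_path = os.path.join(media_root, str(user_id), "%d.sql" % user_id)
    with open(dump_path, "w") as out:
        rc = subprocess.call(
            ["mysqldump", "-uuser", "-ppasswd", "-hlocalhost", "DBname", "paws",
             "--where", paws_where(user_id),
             "--no-create-info", "--lock-all-tables"],
            stdout=out)
    # delete only when mysqldump exited cleanly and actually wrote something
    if rc != 0 or os.path.getsize(dump_path) == 0:
        return False
    subprocess.call(
        ["mysql", "-uuser", "-ppasswd", "-hlocalhost", "-D", "DBname",
         "-e", "DELETE from paws where " + paws_where(user_id)])
    return True
```

Passing the command as a list also avoids any shell-quoting surprises in the WHERE clause.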
- When a user logs in again after the two weeks, have Python invoke the restore logic at login time:
[cce]
import shlex
import subprocess
from datetime import datetime

if p.active == False:
    file_time = datetime.now().strftime("%Y-%m-%d_%H_%M_%S")
    restore_cmd = shlex.split("mysql -uuser -ppasswd -hlocalhost -D DBname -e 'SOURCE "
                              + MediaRoot + "/" + str(p.id) + "/" + str(p.id) + ".sql'")
    # keep the dump under a timestamped name after restoring it
    backup_cmd = shlex.split("mv " + MediaRoot + "/" + str(p.id) + "/" + str(p.id) + ".sql "
                             + MediaRoot + "/" + str(p.id) + "/" + str(p.id) + "_" + file_time + ".sql")
    subprocess.Popen(restore_cmd).wait()
    subprocess.Popen(backup_cmd).wait()
    p.active = True
    p.last_login = now_time
    p.save()
[/cce]
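Note that `Popen(...).wait()` ignores the exit status, so a failed SOURCE would still mark the user active and lose the dump. A hedged variant of the same restore step that only flips the flag on success — `MediaRoot`, the credentials, and the `restore_profile` name are placeholders, assuming the layout produced by the backup script:

```python
import os
import shlex
import subprocess
from datetime import datetime

MediaRoot = "/pathTo/media"  # assumption: same placeholder path as the shell script

def restore_profile(profile_id):
    """Re-import a user's archived rows; return True only if the import succeeded."""
    sql_path = os.path.join(MediaRoot, str(profile_id), "%d.sql" % profile_id)
    if not os.path.exists(sql_path):
        return False
    cmd = shlex.split("mysql -uuser -ppasswd -hlocalhost -D DBname -e 'SOURCE %s'"
                      % sql_path)
    if subprocess.call(cmd) != 0:  # non-zero exit: leave the user inactive
        return False
    # keep the dump around under a timestamped name instead of overwriting it
    stamp = datetime.now().strftime("%Y-%m-%d_%H_%M_%S")
    os.rename(sql_path, sql_path.replace(".sql", "_%s.sql" % stamp))
    return True
```

The login handler would then do `p.active = restore_profile(p.id) or p.active` and save, so a broken restore can simply be retried on the next login.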