This article introduces and benchmarks PostgreSQL's data-insert speed on a mid-to-high-end x86 server, to give enterprise users a sense of PostgreSQL's performance in pure-insert workloads.
(Typical examples are telecom gateway data and financial data: generated in huge volumes, and required to be inserted into a large database quickly for persistent storage.)
Separately, if you need streaming real-time processing, see the PostgreSQL-based streaming solution: "How is real-time streaming of one trillion records per day implemented?"
Test results up front:
1. Average record length: about 360 bytes.
2. Ten rounds, 100 seconds each; every round sustained more than 12 million rows per second, around 12.1 million.
3. About 12 million records ingested per second, i.e. roughly 4.2 GB of data.
4. Extrapolated to a day: about 1.0368 trillion records, roughly 363 TB of data.
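A quick back-of-envelope check of those figures:

12,000,000 rows/s × 360 bytes ≈ 4.3 GB/s (reported above as roughly 4.2 GB)
12,000,000 rows/s × 86,400 s/day ≈ 1.0368 × 10^12 rows/day
4.2 GB/s × 86,400 s/day ≈ 363 TB/day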
Test hardware
1. x86 server.
2. 3? cores.
3. 5?? GB RAM.
4. Several SSDs, 15 TB total capacity.
Software environment
1. CentOS 6.x x64
2. xfs
3. PostgreSQL 9.5
System configuration reference
Database configuration
./configure --prefix=/home/digoal/pgsql9.5.1 --with-blocksize=32 --with-segsize=128 --with-wal-blocksize=32 --with-wal-segsize=64
make && make install
How to enable huge pages for PostgreSQL is covered in a separate reference; the OS-side setup is sketched below.
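A minimal sketch (an illustration, not the referenced article), assuming 2MB huge pages and sized for the 256GB shared_buffers configured below; the page count is illustrative, not a measured value:

# 256GB / 2MB = 131072 pages; reserve extra headroom for other shared memory
echo "vm.nr_hugepages = 135000" >> /etc/sysctl.conf
sysctl -p
# then set huge_pages = on in postgresql.conf and restart PostgreSQL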
Parameters
listen_addresses = '0.0.0.0'            # what IP address(es) to listen on;
fsync = on
port = 1921                             # (change requires restart)
max_connections = 600                   # (change requires restart)
superuser_reserved_connections = 13     # (change requires restart)
unix_socket_directories = '.'           # comma-separated list of directories
unix_socket_permissions = 0700          # begin with 0 to use octal notation
tcp_keepalives_idle = 60                # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 10            # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 10               # TCP_KEEPCNT;
shared_buffers = 256GB                  # min 128kB
huge_pages = on                         # on, off, or try
work_mem = 512MB                        # min 64kB
maintenance_work_mem = 1GB              # min 1MB
autovacuum_work_mem = 1GB               # min 1MB, or -1 to use maintenance_work_mem
dynamic_shared_memory_type = posix      # the default is the first option
bgwriter_delay = 10ms                   # 10-10000ms between rounds
bgwriter_lru_maxpages = 1000            # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 2.0
synchronous_commit = off                # synchronization level;
full_page_writes = on                   # recover from partial page writes
wal_buffers = 2047MB                    # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 10ms                 # 1-10000 milliseconds
checkpoint_timeout = 55min              # range 30s-1h
max_wal_size = 512GB
checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0
effective_cache_size = 40GB
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on                  # Enable capturing of stderr and csvlog
log_directory = 'pg_log'                # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'  # log file name pattern,
log_file_mode = 0600                    # creation mode for log files,
log_truncate_on_rotation = on           # If on, an existing log file with the
log_checkpoints = off
log_connections = off
log_disconnections = off
log_error_verbosity = verbose           # terse, default, or verbose messages
log_timezone = 'PRC'
log_autovacuum_min_duration = 0         # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'
autovacuum = off
Test method: concurrent inserts, each record about 360 bytes.
postgres=# select string_agg(i,'') from (select md5(random()::text) i from generate_series(1,10) t(i)) t(i);
 string_agg
----------------------------------------------------------------------
 75feba6d5ca9ff65d09af35a67fe962a4e3fa5ef279f94df6696bee65f4529a4bbb03ae56c3b5b86c22b447fc48da894740ed1a9d518a9646b3a751a57acaca1142ccfc945b1082b40043e3f83f8b7605b5a55fcd7eb8fc1d0475c7fe465477da47d96957849327731ae76322f440d167725d2e2bbb60313150a4f69d9a8c9e86f9d79a742e7a35bf159f670e54413fb89ff81b8e5e8ab215c3ddfd00bb6aeb4
(1 row)

create unlogged table test(crt_time timestamp, info text default '75feba6d5ca9ff65d09af35a67fe962a4e3fa5ef279f94df6696bee65f4529a4bbb03ae56c3b5b86c22b447fc48da894740ed1a9d518a9646b3a751a57acaca1142ccfc945b1082b40043e3f83f8b7605b5a55fcd7eb8fc1d0475c7fe465477da47d96957849327731ae76322f440d167725d2e2bbb60313150a4f69d9a8c9e86f9d79a742e7a35bf159f670e54413fb89ff81b8e5e8ab215c3ddfd00bb6aeb4');

alter table test alter column info set storage plain;

postgres=# insert into test select now() from generate_series(1,1000);

select ctid from test limit 1000;
With 32KB blocks, each block stores 89 records, i.e. about 360 bytes per record.
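This can also be confirmed with a small sanity-check query against the test table created above (standard PostgreSQL, nothing else assumed):

-- the first component of ctid is the block number; count tuples per 32KB block
select (ctid::text::point)[0]::int as block_no, count(*)
from test
group by 1
order by 1;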
Create three tablespace directories, one on each of three physical block devices, and create the corresponding tablespaces in the database: tbs1, tbs2, tbs3.
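For example (the mount-point paths below are placeholders, not the paths used in the test):

create tablespace tbs1 location '/mnt/ssd1/tbs1';
create tablespace tbs2 location '/mnt/ssd2/tbs2';
create tablespace tbs3 location '/mnt/ssd3/tbs3';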
Create many child tables, to reduce block-extend lock contention during the test.
do language plpgsql $$
declare
  i int;
  sql text;
begin
  for i in 1..128 loop
    sql := 'create unlogged table test'||i||' (like test including all)';
    execute sql;
  end loop;
end;
$$;

do language plpgsql $$
declare
  i int;
  sql text;
begin
  for i in 1..42 loop
    sql := 'alter table test'||i||' set tablespace tbs1';
    execute sql;
  end loop;
  for i in 43..84 loop
    sql := 'alter table test'||i||' set tablespace tbs2';
    execute sql;
  end loop;
  for i in 85..128 loop
    sql := 'alter table test'||i||' set tablespace tbs3';
    execute sql;
  end loop;
end;
$$;
Generate the test scripts:
vi test.sql
insert into test(crt_time) values
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),
(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now()),(now());

for ((i=1;i<=128;i++)) do sed "s/test/test$i/" test.sql > ./test$i.sql; done
Before each round of testing, clear the data:
do language plpgsql $$
declare
  i int;
  sql text;
begin
  for i in 1..128 loop
    sql := 'truncate test'||i;
    execute sql;
  end loop;
end;
$$;
Run the test with 128 parallel clients.
for ((i=1;i<=128;i++))
do
  pgbench -M prepared -n -r -P 1 -f ./test$i.sql -h xxx.xxx.xxx.xxx -p 1921 -U postgres postgres -c 1 -j 1 -T 100 >./$i.log 2>&1 &
done
We ran a dozen or so rounds of 100 seconds each; every round came in above 12 million rows per second, around 12.1 million.
To extract the results: the SUM of the values below is the QPS (transactions per second); since each query inserts 100 records, multiply by 100 to get rows per second.
for ((i=1;i<=128;i++)) do tail -n 4 $i.log|head -n 1|awk '{ print $3}'; done
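To total it in one step, a variant of the same pipeline works (assuming the log layout produced by the pgbench run above):

for ((i=1;i<=128;i++)) do tail -n 4 $i.log | head -n 1 | awk '{print $3}'; done | \
  awk '{sum+=$1} END {printf "qps: %.0f, rows/s: %.0f\n", sum, sum*100}'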
About 12 million records inserted per second, roughly 4.2 GB of data.
Extrapolated to a day: about 1.0368 trillion records, roughly 363 TB of data.
Summary
1. The main application scenario for this case is real-time big-data ingestion: for example IoT workloads, where huge numbers of sensors generate massive data volumes, or traditional telecom gateways, which also must persist very large traffic and business data streams in real time.
Beyond pure ingestion, if you also need streaming real-time processing, see the PostgreSQL-based streaming solution: "How is real-time streaming of one trillion records per day implemented?"
Other notes
This test showed that on this hardware, 12 million rows per second is far from the upper limit; there is still substantial room for improvement:
1. During the test, perf showed a high share of time in kernel spin locks, so there is room for kernel-code optimization.
115092.00  43.0%  mutex_spin_on_owner        [kernel.kallsyms]
 10658.00   4.0%  _spin_lock                 [kernel.kallsyms]
  9161.00   3.4%  __mutex_lock_slowpath      [kernel.kallsyms]
  7304.00   2.7%  __memset_sse2              /lib64/libc-2.12.so
  4994.00   1.9%  put_into_wait_copy_list    [xxxxxxxx]
  4448.00   1.7%  LWLockAcquire              /home/digoal/pgsql9.5.1/bin/postgres
  3861.00   1.4%  clear_page_c_e             [kernel.kallsyms]
  3759.00   1.4%  copy_user_generic_string   [kernel.kallsyms]
2. PostgreSQL used only about 40% of the CPU, the system another 30% or so, leaving roughly 30% idle.
3. Right after the database starts, the shared buffer hash table is not yet initialized, so mutex contention is high; top will show many processes in the D state.
w: S -- Process Status
   The status of the task, which can be one of:
   'D' = uninterruptible sleep
   'R' = running
   'S' = sleeping
   'T' = traced or stopped
   'Z' = zombie
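A quick way to list them (standard ps fields, nothing PostgreSQL-specific):

# show processes currently in uninterruptible sleep (state D)
ps -eo state,pid,comm | awk '$1 ~ /^D/'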
Code-level performance optimization is out of scope here; interested readers can dig into it themselves.
A few tools you may find useful: perf, systemtap, gprof.
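A typical perf session, for reference (standard perf usage; not the exact invocations from this test):

perf top                      # live view of the hottest kernel/user symbols
perf record -a -g sleep 30    # sample the whole system, with call graphs, for 30s
perf report                   # browse the recorded profile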
For a more complete analysis, build with --enable-profiling to aid diagnosis.
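That means rebuilding with the same options as before plus the profiling switch (a sketch; profiled backends then write gmon.out files that gprof can read):

./configure --prefix=/home/digoal/pgsql9.5.1 --with-blocksize=32 --with-segsize=128 --with-wal-blocksize=32 --with-wal-segsize=64 --enable-profiling
make && make install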
Long-duration test (figures filled in later the same day):
1. 24 rounds, 1000 seconds each; every round came in above 7.6 million rows per second, around 7.65 million.
#!/bin/bash

# Round-robin driver for the long test. The middle of this script was truncated
# in the source; the heredoc body and the pgbench loop are reconstructed from
# the truncate and pgbench steps shown earlier.
clean() {
  wait       # reconstructed: let this round's 128 background pgbench clients finish
  sleep 610
  psql -h xxx.xxx.xxx.xxx -p 1921 -U postgres postgres <<'EOF'
do language plpgsql $$
declare
  i int;
  sql text;
begin
  for i in 1..128 loop
    sql := 'truncate test'||i;
    execute sql;
  end loop;
end;
$$;
EOF
}

for ((r=1;r<=24;r++))    # 24 rounds, per the results above
do
  for ((i=1;i<=128;i++))
  do
    pgbench -M prepared -n -r -P 1 -f ./test$i.sql -h xxx.xxx.xxx.xxx -p 1921 -U postgres postgres -c 1 -j 1 -T 1000 >./$i.log 2>&1 &
  done
  clean
done
2. About 7.6 million records ingested per second, roughly 2.7 GB of data.
3. Extrapolated to a day: about 656.6 billion records, roughly 233 TB of data.