Running CentOS 7 on Hong Kong servers: how I brought a highly available PostgreSQL cluster into production for on-chain data storage.
Technical tutorial · 2025-09-18 15:43


At 1:40 that morning I was standing on the 7th floor of a Kwai Chung data center in Hong Kong, the air conditioning flapping my sleeves. On a video call, the cross-border team was pushing me to run the final switchover drill: move the block-indexing jobs from the temporary database onto the new highly available PostgreSQL cluster. With the UPS beeping and the NVMe activity LEDs blinking in the dark, I actually felt calm: I had spent three weeks hammering on this architecture in a sandbox. Now it was time to run it for real.

1. Goals and constraints (the view from the floor)

Business goals

  • Writes: a sustained 3–5k rows/s (blocks, transactions, and logs landing in partitioned tables), with peaks of 10k rows/s tolerated
  • Queries: P95 < 80 ms for lookups over the most recent 24 h of blocks and transactions
  • RPO = 0 (no data loss), RTO < 30 s (automatic primary/standby failover)

On-site constraints

  • The OS has to be CentOS 7 (to stay consistent with the customer's existing estate)
  • The facility provides only two ToR switches; L2 is available across racks
  • Backups must land on object storage inside the campus (our self-hosted MinIO), with an additional off-site copy every night
  • Maintenance windows are short; extended downtime is not an option

2. Topology and the actual hardware (what I used, no hand-waving)

2.1 Logical topology

Layout:
      +------------------------+         +------------------------+
      |        LB01            | <VRRP>  |        LB02            |
      | keepalived + HAProxy   |         | keepalived + HAProxy   |
      | pgbouncer (pool)       |         | pgbouncer (pool)       |
      +-----------+------------+         +-----------+------------+
                  | VIP: 10.10.10.100                |
                  +---------------------+------------+
                                        |
                         +--------------+-------------------+
                          |        PostgreSQL cluster        |
                          | (repmgr primary/standby + sync)  |
                         +--------------+-------------------+
                                        |
      +------------------+------------------+------------------+
      |                  |                  |                  |
  pg01 (primary)     pg02 (sync standby)  pg03 (async standby)  minio (backup)

2.2 Physical specs (gear you can actually buy)

Role: pg01 / pg02 / pg03
  CPU: AMD EPYC 7313P (16C/32T)   RAM: 128GB   NIC: 2×10GbE (bond0, LACP)   OS: CentOS 7.9
  System disks: 2×480GB SATA SSD (RAID1, /)
  Data disks: 2×3.84TB U.2 NVMe (RAID1, /pgdata) + 1×960GB NVMe (dedicated WAL disk, /pgwal)

Role: LB01 / LB02
  CPU: Xeon Silver 4210   RAM: 32GB   System disk: 240GB SSD   NIC: 2×10GbE
  Runs HAProxy + pgbouncer + keepalived

Role: MinIO
  CPU: any x86_64   RAM: 64GB   Data disks: 4×8TB HDD (RAID10)   Metadata: 512GB SSD   NIC: 10GbE
  S3-compatible object storage

Notes: putting WAL on its own NVMe keeps write latency noticeably steadier; RAID1 on the data disks makes read-only recovery and single-disk swaps easy. The network bonds the two 10G ports with LACP so a single flapping link does not hurt.
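For reference, this is how I would build that bond0 with NetworkManager on CentOS 7. A minimal sketch only: the interface names (ens1f0/ens1f1) and the address are placeholders, adjust them to your hosts.

nmcli con add type bond con-name bond0 ifname bond0 \
  bond.options "mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer3+4"
nmcli con add type bond-slave con-name bond0-p1 ifname ens1f0 master bond0
nmcli con add type bond-slave con-name bond0-p2 ifname ens1f1 master bond0
nmcli con mod bond0 ipv4.method manual ipv4.addresses 10.10.20.11/24
nmcli con up bond0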

3. System initialization and kernel/IO tuning (CentOS 7)

3.1 Basics

# base packages & EPEL
yum install -y epel-release
yum install -y htop iotop numactl jq chrony lvm2 mdadm \
               net-tools policycoreutils-python firewalld

systemctl enable --now chronyd firewalld
timedatectl set-timezone Asia/Hong_Kong

3.2 Disable THP (a must for any database)

cat >/etc/systemd/system/disable-thp.service <<'EOF'
[Unit]
Description=Disable Transparent Huge Pages
After=network.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/bash -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
EOF
systemctl enable --now disable-thp

3.3 I/O scheduler (none for NVMe)

echo 'ACTION=="add|change", KERNEL=="nvme*n*", ATTR{queue/scheduler}="none"' \
  > /etc/udev/rules.d/60-nvme-io.rules
udevadm control --reload-rules && udevadm trigger
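A quick check that both knobs actually took effect (expect [never] for THP and [none] for the NVMe scheduler):

cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/block/nvme0n1/queue/scheduler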

 

3.4 Filesystem (XFS; "disabling CoW" is a Btrfs concept, so for XFS just use the defaults plus noatime)

mkfs.xfs -f /dev/md/pgdata   # or /dev/nvme0n1 (depending on whether you built an mdadm RAID1)
mkfs.xfs -f /dev/nvme1n1     # WAL disk
mkdir -p /pgdata /pgwal
echo '/dev/md/pgdata /pgdata xfs noatime,nodiratime,logbufs=8,logbsize=256k 0 0' >> /etc/fstab
echo '/dev/nvme1n1  /pgwal  xfs noatime,nodiratime 0 0' >> /etc/fstab
mount -a
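The /dev/md/pgdata device above assumes an mdadm RAID1 has already been assembled from the two data NVMe drives; a minimal sketch, with the member device names as placeholders:

mdadm --create /dev/md/pgdata --level=1 --raid-devices=2 /dev/nvme2n1 /dev/nvme3n1
mdadm --detail --scan >> /etc/mdadm.conf   # persist the array definition across reboots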

3.5 sysctl & limits

cat >> /etc/sysctl.d/99-pg.conf <<EOF
vm.swappiness=1
vm.dirty_background_ratio=5
vm.dirty_ratio=20
vm.max_map_count=262144
fs.file-max=2000000
net.core.somaxconn=1024
net.core.netdev_max_backlog=250000
EOF
sysctl --system

cat >> /etc/security/limits.d/99-pg.conf <<EOF
postgres soft nofile 1024000
postgres hard nofile 1024000
EOF

4. Install PostgreSQL 14 (PGDG) and move it onto the dedicated volumes

CentOS 7 is EOL, but PGDG still publishes an EL7 repo. I strongly recommend mirroring the repo packages onto your internal network.
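A minimal mirroring sketch with reposync/createrepo; the repo id (pgdg14) and the target path are assumptions on my part, so check `yum repolist` after installing the PGDG repo RPM below:

yum install -y yum-utils createrepo
reposync --repoid=pgdg14 --download_path=/srv/mirror/
createrepo /srv/mirror/pgdg14
# export /srv/mirror over HTTP (or point a local .repo file at it) for the other nodes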

yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install -y postgresql14-server postgresql14-contrib
/usr/pgsql-14/bin/postgresql-14-setup initdb
systemctl stop postgresql-14
mv /var/lib/pgsql/14/data /pgdata/data
chown -R postgres:postgres /pgdata /pgwal
mkdir -p /pgwal/archive && chown -R postgres:postgres /pgwal/archive
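Because the data directory now lives in /pgdata/data, the stock postgresql-14.service still points at /var/lib/pgsql/14/data. I start the cluster with pg_ctl and disable the unit in 4.3 anyway, but if you want to keep systemd in the loop, a drop-in override along these lines should do it (a sketch, not what I run):

mkdir -p /etc/systemd/system/postgresql-14.service.d
cat >/etc/systemd/system/postgresql-14.service.d/override.conf <<'EOF'
[Service]
Environment=PGDATA=/pgdata/data
EOF
systemctl daemon-reload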

4.1 postgresql.conf (the "conservative but fast" baseline I have run in production)

/pgdata/data/postgresql.conf

data_directory = '/pgdata/data'
hba_file = '/pgdata/data/pg_hba.conf'
ident_file = '/pgdata/data/pg_ident.conf'

listen_addresses = '*'
port = 5432
max_connections = 300

shared_buffers = 32GB                 # ~1/4 of RAM
effective_cache_size = 96GB           # ~3/4 of RAM
work_mem = 64MB                       # size this together with connection concurrency
maintenance_work_mem = 2GB

wal_level = replica                   # switch to logical if you need logical decoding
synchronous_commit = remote_write     # latency/safety trade-off acceptable with a sync standby
max_wal_senders = 16
max_replication_slots = 16
wal_compression = on
wal_writer_delay = 20ms
checkpoint_timeout = 15min
max_wal_size = 48GB
min_wal_size = 4GB

hot_standby = on
effective_io_concurrency = 256        # NVMe
random_page_cost = 1.1
seq_page_cost = 1.0

autovacuum = on
autovacuum_naptime = 10s
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02
log_autovacuum_min_duration = 0

log_checkpoints = on
log_lock_waits = on
log_min_duration_statement = 1000
log_line_prefix = '%m [%p] %u@%d %r '

4.2 pg_hba.conf (replication + application subnets)

/pgdata/data/pg_hba.conf

# application traffic (VIP subnet)
host    all             appuser         10.10.10.0/24         md5

# replication
host    replication     replicator      10.10.10.0/24         md5
host    replication     replicator      10.10.20.0/24         md5

4.3 SELinux and firewall

semanage port -a -t postgresql_port_t -p tcp 5432 || true
firewall-cmd --permanent --add-port=5432/tcp
firewall-cmd --reload

systemd: I keep PostgreSQL from starting at boot and let repmgr manage it instead:

systemctl disable postgresql-14

5. Building "automatic primary/standby + synchronous replication + automatic takeover" with repmgr

Why not Patroni? The Python that ships with CentOS 7 is too old; not impossible, but a hassle (I have run Patroni in Docker on another project). This customer wanted as few containers as possible, so I went with repmgr: stable and it just works.

5.1 Installation and in-database users

yum install -y repmgr_14
sudo -iu postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'Repl@123';"
sudo -iu postgres psql -c "CREATE USER repmgr SUPERUSER ENCRYPTED PASSWORD 'Mgr@123';"
sudo -iu postgres psql -c "CREATE DATABASE repmgr OWNER repmgr;"

5.2 Primary (pg01) repmgr.conf

/etc/repmgr/14/repmgr.conf

node_id=1
node_name=pg01
conninfo='host=pg01 dbname=repmgr user=repmgr password=Mgr@123 port=5432'
data_directory='/pgdata/data'
use_replication_slots=yes
failover=automatic
promote_command='repmgr standby promote -f /etc/repmgr/14/repmgr.conf --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr/14/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'

Pin down synchronous replication (we write this into the primary's postgresql.auto.conf):

sudo -iu postgres psql -c "ALTER SYSTEM SET synchronous_standby_names='FIRST 1 (pg02)';"
sudo -iu postgres psql -c "ALTER SYSTEM SET primary_conninfo='user=replicator password=Repl@123 application_name=pg01';"

Register the primary:

sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf primary register

5.3 Standbys (pg02/pg03): clone and register

/etc/repmgr/14/repmgr.conf (note that node_id/node_name differ per host)

node_id=2                 # pg02 -> 2, pg03 -> 3
node_name=pg02
conninfo='host=pg02 dbname=repmgr user=repmgr password=Mgr@123 port=5432'
data_directory='/pgdata/data'
use_replication_slots=yes
failover=automatic
follow_command='repmgr standby follow -f /etc/repmgr/14/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'

Clone (using pg02 as the example):

sudo -iu postgres repmgr -h pg01 -U repmgr -d repmgr -f /etc/repmgr/14/repmgr.conf standby clone --fast-checkpoint
# start
sudo -iu postgres pg_ctl -D /pgdata/data -w start
# register
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf standby register

Do the same on pg03 (it is the async standby, so it does not go into synchronous_standby_names).
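At this point I sanity-check the whole cluster from any node; the output should list one primary and two attached standbys, all running:

sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf cluster show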

5.4 Turn on the repmgrd daemon and self-healing

systemctl enable --now repmgr14
# on some package versions the unit is named repmgrd.service

During a drill I pulled pg01's network cable; within 30 seconds pg02 promoted itself, and the VIP, via HAProxy, pointed at the new primary. The applications needed no configuration changes and noticed nothing.

6. Connection layer: HAProxy + keepalived + pgbouncer

6.1 keepalived (LB01/LB02)

/etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
  script "pidof haproxy"
  interval 2
  weight -20
}

vrrp_instance VI_1 {
  state BACKUP
  interface bond0
  virtual_router_id 51
  priority 110                 # LB01=110, LB02=100
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 7c1b2d
  }
  virtual_ipaddress {
    10.10.10.100/24 dev bond0
  }
  track_script {
    chk_haproxy
  }
}

systemctl enable --now keepalived

6.2 HAProxy (routes to the primary or the read-only pool)

/etc/haproxy/haproxy.cfg

global
  daemon
  maxconn 100000
  log 127.0.0.1 local0
  stats socket /var/run/haproxy.sock mode 600 level admin   # used by the drill script in section 13

defaults
  mode tcp
  timeout client  30s
  timeout server  30s
  timeout connect 5s

# straight to the primary (writes)
frontend ft_pg_write
  bind *:5432
  default_backend bk_pg_master

backend bk_pg_master
  option tcp-check
  tcp-check connect port 5432
  server pg01 pg01:5432 check inter 1000 rise 2 fall 2
  server pg02 pg02:5432 check inter 1000 rise 2 fall 2 backup
  server pg03 pg03:5432 check inter 1000 rise 2 fall 2 backup

# read-only pool (queries)
frontend ft_pg_read
  bind *:15432
  default_backend bk_pg_replicas

backend bk_pg_replicas
  balance roundrobin
  option tcp-check
  server pg02 pg02:5432 check
  server pg03 pg03:5432 check
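One caveat about the plain tcp-check above: after a failover, a recovered pg01 rejoins as a standby but still answers on 5432, so HAProxy could route writes to a read-only node. A common hardening step, shown here only as a sketch (the script name, port 23267 and the xinetd wiring are my assumptions, not part of the config deployed above), is to expose a tiny "am I a writable primary?" HTTP check on each PG node and have HAProxy probe that instead:

yum install -y xinetd

cat >/usr/local/bin/pg_isprimary.sh <<'EOF'
#!/bin/bash
# answer HTTP 200 only when this node is a writable primary
if [ "$(psql -tAc 'SELECT pg_is_in_recovery()')" = "f" ]; then
  printf 'HTTP/1.1 200 OK\r\nContent-Length: 8\r\n\r\nprimary\n'
else
  printf 'HTTP/1.1 503 Service Unavailable\r\nContent-Length: 8\r\n\r\nstandby\n'
fi
EOF
chmod +x /usr/local/bin/pg_isprimary.sh

cat >/etc/xinetd.d/pg_isprimary <<'EOF'
service pg_isprimary
{
    type        = UNLISTED
    port        = 23267
    socket_type = stream
    wait        = no
    user        = postgres
    server      = /usr/local/bin/pg_isprimary.sh
    only_from   = 10.10.10.0/24
    disable     = no
}
EOF
systemctl enable --now xinetd

# then, in backend bk_pg_master:
#   option httpchk
#   http-check expect status 200
#   server pg01 pg01:5432 check port 23267 inter 1000 rise 2 fall 2
#   (and the same "check port 23267" on pg02/pg03)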

6.3 pgbouncer (connection pooling, best run on the LB nodes)

/etc/pgbouncer/pgbouncer.ini

[databases]
chain = host=127.0.0.1 port=5432 dbname=chain_db pool_size=200 reserve_pool=50

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits
max_client_conn = 5000
default_pool_size = 200
reserve_pool_size = 50
server_idle_timeout = 60

userlist.txt:

"appuser" "md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Firewall:

firewall-cmd --permanent --add-port=6432/tcp && firewall-cmd --reload

7. Backup and archiving: pgBackRest + MinIO (the piece that makes RPO = 0 credible)

7.1 Installation and repository configuration

yum install -y pgbackrest
mkdir -p /etc/pgbackrest /var/log/pgbackrest
chown -R postgres:postgres /var/log/pgbackrest

/etc/pgbackrest/pgbackrest.conf (identical on every PG node)

[global]
repo1-type=s3
repo1-s3-endpoint=minio.local
repo1-s3-bucket=pg-backup
repo1-s3-region=us-east-1      # pgBackRest requires a region; MinIO accepts any value
repo1-s3-uri-style=path        # MinIO serves buckets path-style by default
repo1-s3-key=minio-access
repo1-s3-key-secret=minio-secret
repo1-s3-verify-tls=n
repo1-path=/pgbackrest
repo1-retention-full=7
start-fast=y
process-max=8
log-level-console=info
log-level-file=info

[pg14]
pg1-path=/pgdata/data
pg1-port=5432

7.2 Enable WAL archiving

Add to postgresql.conf:

archive_mode = on
archive_command = 'pgbackrest --stanza=pg14 archive-push %p'

初始化 & 首次全备:

sudo -iu postgres pgbackrest --stanza=pg14 --log-level-console=info stanza-create
sudo -iu postgres pgbackrest --type=full --stanza=pg14 backup
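After the first full backup I put the schedule into cron on the current primary (a sketch; the times are arbitrary, and the nightly off-site copy from section 1 runs separately on the MinIO host):

cat >/etc/cron.d/pgbackrest <<'EOF'
# weekly full on Sunday 01:00, differential the other nights (runs as postgres)
0 1 * * 0   postgres  pgbackrest --stanza=pg14 --type=full backup
0 1 * * 1-6 postgres  pgbackrest --stanza=pg14 --type=diff backup
EOF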

In one drill I deliberately deleted the primary's data directory and did a point-in-time recovery (PITR) from the latest full backup plus archived WAL in about 10 minutes; the numbers reconciled, and I could relax.
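For reference, the restore in that drill looked roughly like this (the target timestamp is a placeholder):

# with the instance stopped and /pgdata/data emptied
sudo -iu postgres pgbackrest --stanza=pg14 --delta \
  --type=time "--target=2025-09-18 01:30:00+08" --target-action=promote restore
sudo -iu postgres pg_ctl -D /pgdata/data -w start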

8. The on-chain data schema and partitioning strategy (native PostgreSQL 14 partitioning)

8.1 Base schema

CREATE SCHEMA IF NOT EXISTS chain;

-- partitioned parent table: blocks
CREATE TABLE chain.blocks (
  block_number     BIGINT NOT NULL,
  block_hash       BYTEA  NOT NULL,
  parent_hash      BYTEA  NOT NULL,
  miner            BYTEA  NULL,
  tx_count         INT    NOT NULL,
  ts               TIMESTAMPTZ NOT NULL,
  CONSTRAINT pk_blocks PRIMARY KEY (block_number)
) PARTITION BY RANGE (block_number);

-- transactions
CREATE TABLE chain.txs (
  block_number     BIGINT NOT NULL,
  tx_index         INT    NOT NULL,
  tx_hash          BYTEA  NOT NULL,
  from_addr        BYTEA  NOT NULL,
  to_addr          BYTEA  NULL,
  value_wei        NUMERIC(38,0) NOT NULL,
  gas_price_wei    NUMERIC(38,0) NOT NULL,
  gas_used         INT NOT NULL,
  status           SMALLINT NOT NULL,
  input            BYTEA NULL,
  ts               TIMESTAMPTZ NOT NULL,
  CONSTRAINT pk_txs PRIMARY KEY (block_number, tx_index)
) PARTITION BY RANGE (block_number);

-- logs (events)
CREATE TABLE chain.logs (
  block_number BIGINT NOT NULL,
  tx_index     INT    NOT NULL,
  log_index    INT    NOT NULL,
  address      BYTEA  NOT NULL,
  topic0       BYTEA  NULL,
  topic1       BYTEA  NULL,
  topic2       BYTEA  NULL,
  topic3       BYTEA  NULL,
  data         BYTEA  NULL,
  ts           TIMESTAMPTZ NOT NULL,
  CONSTRAINT pk_logs PRIMARY KEY (block_number, tx_index, log_index)
) PARTITION BY RANGE (block_number);

Type choices: I store hashes and addresses as BYTEA (hex goes in via decode($hex,'hex')), which is both smaller and faster than TEXT. Frequently queried columns get indexes.

8.2 Partitions and indexes

I partition at a granularity of one child table per 1,000,000 blocks (which makes rolling archival easy):

DO $$
DECLARE
  start_block BIGINT := 0;        -- adjust to the current chain height
  end_block   BIGINT := 20000000;
  step        BIGINT := 1000000;
  t           BIGINT;
BEGIN
  t := start_block;
  WHILE t < end_block LOOP
    EXECUTE format('CREATE TABLE IF NOT EXISTS chain.blocks_%s PARTITION OF chain.blocks FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
    EXECUTE format('CREATE TABLE IF NOT EXISTS chain.txs_%s    PARTITION OF chain.txs    FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
    EXECUTE format('CREATE TABLE IF NOT EXISTS chain.logs_%s   PARTITION OF chain.logs   FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
    t := t + step;
  END LOOP;
END $$;
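Section 11 mentions a DEFAULT partition that catches blocks beyond the pre-created ranges and gets re-homed later; it is one statement per parent table:

CREATE TABLE IF NOT EXISTS chain.blocks_default PARTITION OF chain.blocks DEFAULT;
CREATE TABLE IF NOT EXISTS chain.txs_default    PARTITION OF chain.txs    DEFAULT;
CREATE TABLE IF NOT EXISTS chain.logs_default   PARTITION OF chain.logs   DEFAULT;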

-- common indexes (examples)
CREATE INDEX ON chain.txs  (tx_hash);
CREATE INDEX ON chain.txs  (from_addr);
CREATE INDEX ON chain.txs  (to_addr);
CREATE INDEX ON chain.logs (address);
CREATE INDEX ON chain.blocks (ts);
-- on big tables a BRIN index on a monotonically increasing column speeds up range scans at minimal space cost
CREATE INDEX ON chain.blocks USING brin (block_number);

9. Ingest example: Python + web3 + batched streaming writes

In production I run an asynchronous pipeline written in Go; here is the same idea in Python, handy for tests and PoCs.

# ingest_eth.py
import os, time, psycopg2, binascii
from datetime import datetime, timezone
from web3 import Web3
from psycopg2.extras import execute_values

PG_DSN = os.getenv("PG_DSN")  # 'host=10.10.10.100 port=5432 dbname=chain_db user=appuser password=...'
W3 = Web3(Web3.HTTPProvider(os.getenv("ETH_RPC")))  # local geth/erigon or a hosted RPC

def b(val):  # '0x...' hex string, HexBytes, or None -> raw bytes
    if val is None:
        return None
    if isinstance(val, (bytes, bytearray)):   # HexBytes from web3 is a bytes subclass
        return bytes(val)
    return binascii.unhexlify(val[2:] if val.startswith("0x") else val)

def upsert_blocks(blocks):
    with psycopg2.connect(PG_DSN) as conn:
        with conn.cursor() as cur:
            sql = """INSERT INTO chain.blocks (block_number, block_hash, parent_hash, miner, tx_count, ts)
                     VALUES %s ON CONFLICT (block_number) DO NOTHING"""
            # the loop variable must not shadow the b() helper above
            vals = [(blk.number, b(blk.hash), b(blk.parentHash), b(blk.miner),
                     len(blk.transactions), datetime.fromtimestamp(blk.timestamp, tz=timezone.utc))
                    for blk in blocks]
            execute_values(cur, sql, vals, page_size=1000)

def upsert_txs(block, txs):
    with psycopg2.connect(PG_DSN) as conn:
        with conn.cursor() as cur:
            sql = """INSERT INTO chain.txs
            (block_number, tx_index, tx_hash, from_addr, to_addr, value_wei, gas_price_wei, gas_used, status, input, ts)
            VALUES %s ON CONFLICT DO NOTHING"""
            block_ts = datetime.fromtimestamp(block.timestamp, tz=timezone.utc)
            vals = []
            for i, tx in enumerate(txs):
                receipt = W3.eth.get_transaction_receipt(tx.hash)
                vals.append((block.number, i, b(tx.hash),
                             b(tx['from']), b(tx['to']) if tx['to'] else None,
                             tx['value'], tx['gasPrice'], receipt['gasUsed'],
                             1 if receipt['status'] == 1 else 0,
                             b(tx['input']) or None,   # empty calldata ('0x') becomes NULL
                             block_ts))
            execute_values(cur, sql, vals, page_size=1000)

def loop():
    head = W3.eth.block_number
    cur = head - 1000
    while True:
        if cur > head:
            time.sleep(1)
            head = W3.eth.block_number
            continue
        block = W3.eth.get_block(cur, full_transactions=True)
        upsert_blocks([block])
        upsert_txs(block, block.transactions)
        cur += 1

if __name__ == "__main__":
    loop()

For production: write in batches (COPY FROM STDIN or multi-row inserts), keep it idempotent (ON CONFLICT), shard the ingest workers by block range, and pair it with synchronous_commit=remote_write. (Strictly speaking remote_write leaves a tiny loss window if primary and standby crash together; use on or remote_apply if you need hard RPO = 0.) A COPY sketch follows below.
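A minimal sketch of that COPY path with psycopg2's copy_expert, using the chain.blocks columns from section 8. Note that COPY itself cannot skip duplicates, so for idempotency load into a staging table first and INSERT ... ON CONFLICT from there.

import io
import psycopg2

def _fmt(v):
    """One value in text-format COPY: \\N for NULL, bytea as backslash-escaped \\x<hex>."""
    if v is None:
        return "\\N"
    if isinstance(v, (bytes, bytearray)):
        return "\\\\x" + bytes(v).hex()
    return str(v)

def copy_blocks(conn, rows):
    """rows: iterable of (block_number, block_hash, parent_hash, miner, tx_count, ts) tuples."""
    buf = io.StringIO()
    for r in rows:
        buf.write("\t".join(_fmt(v) for v in r) + "\n")
    buf.seek(0)
    with conn.cursor() as cur:
        cur.copy_expert(
            "COPY chain.blocks (block_number, block_hash, parent_hash, miner, tx_count, ts) FROM STDIN",
            buf)
    conn.commit()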

10. Monitoring and routine checks (what I keep an eye on in production)

Replication lag

SELECT now()-pg_last_xact_replay_timestamp() AS standby_lag;
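The query above runs on a standby; on the primary I also watch per-standby lag in bytes:

SELECT application_name, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;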

WAL pressure

pg_stat_wal and pg_stat_bgwriter; watch pg_wal disk usage and how often max_wal_size is being hit

Autovacuum

Check n_dead_tup in pg_stat_all_tables and run manual VACUUM when needed (partitions can be done in parallel)

Index bloat

Periodic REINDEX CONCURRENTLY on large tables (rolling through the partitions)

Backup health

pgbackrest info; plus a daily restore drill (spin up a read-only copy to verify)

11. Pitfalls I hit (and how I fixed them on the data-center floor)

  • Symptom: standby clone failed with "requested WAL segment has already been removed"
    Cause: the primary recycled WAL faster than the standby could fetch it
    Fix: raised max_wal_size to 48GB and added pgBackRest archive-get so missing segments can always be replayed

  • Symptom: write latency jitter after switchover
    Cause: synchronous_commit=on plus transient network jitter
    Fix: switched to remote_write; writes stay steady even when the sync standby's link is saturated

  • Symptom: planning time climbing with too many partitions
    Cause: planner overhead becomes noticeable beyond ~2k child tables
    Fix: raised the granularity to 1M blocks per partition; a DEFAULT partition absorbs "future" blocks and is re-homed daily/weekly

  • Symptom: occasional stalls with THP still enabled
    Cause: perf top showed khugepaged busy
    Fix: disabled THP (the systemd unit above); the stalls disappeared

  • Symptom: pgbouncer connection leaks
    Cause: long-running application transactions on a session pool
    Fix: moved to transaction pooling; services that need long transactions connect straight to 5432

  • Symptom: MinIO certificate/clock problems
    Cause: TLS verification failures
    Fix: disabled TLS verification at first (internal network only), switched to an enterprise CA cert once stable, and synced all nodes with chrony

12. Performance and capacity (real numbers from launch week)

Metric                              | Launch day | Day 3  | Day 7
Ingest rate (rows/s)                | 3.2k       | 4.8k   | 5.1k
Write P95 (ms)                      | 18         | 22     | 20
Query P95 (ms, last 24h)            | 62         | 58     | 55
Replica lag (s, P99)                | 1.2        | 1.5    | 1.3
Backup window (full, min)           | 28         | -      | 26
Object storage used (7-day)         | 1.1 TB     | 1.6 TB | 2.4 TB

Note: with wal_compression=on, pressure on object storage dropped by roughly 18–23%.

13. Switchover drill script (one-shot self-test)

# 1) take the primary off the network (simulate a failure)
ssh pg01 "nmcli conn down bond0" || true

# 2) wait 15-30s; repmgrd promotes pg02 automatically
sleep 30

# 3) check which node is the primary
for h in pg01 pg02 pg03; do
  echo -n "$h -> "
  ssh $h "sudo -iu postgres psql -tAc \"select pg_is_in_recovery()\""
done

# 4) HAProxy backend state (via the stats socket configured in 6.2)
echo "== HAProxy Backends =="
ssh LB01 "echo 'show servers state' | socat stdio /var/run/haproxy.sock" || true

# 5) application write probe
psql 'host=10.10.10.100 port=5432 dbname=chain_db user=appuser password=...' \
  -c "insert into ops.switch_audit(ts, who) values (now(), 'drill');"

14. Cost and capacity planning (what I spelled out for procurement)

Capacity: roughly X TB for N years of core chain data (estimated at ~50–80GB per 10 million transactions; a worked example follows this list), with clean partitioning plus hot/cold tiering (ancient partitions move to cold storage)

Redundancy: 3 nodes + object storage on campus + off-site object storage (another city)

Upgrade path: when a single node reaches 60% CPU or 70% of its write IOPS, add NVMe first, then scale reads (pgpool or application-level routing to more read replicas)
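To make the capacity estimate concrete, a purely hypothetical sizing calculation (the 300 million transactions per year and the 65GB midpoint are illustrative, not our real numbers):

tx_per_year = 300_000_000          # hypothetical chain throughput
gb_per_10m_tx = 65                 # midpoint of the 50-80GB estimate above
tb_per_year = tx_per_year / 10_000_000 * gb_per_10m_tx / 1024
print(f"{tb_per_year:.1f} TB/year before indexes and WAL")   # ~1.9 TB/year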

15. Wrapping up: the moment I walked out

When the drill finished, the logs were full of exactly the lines I wanted to see: standby promoted, archive-push ok, backup info valid. I plugged pg01's network cable back in and repmgr quietly made it follow the new primary. The VIP on the LBs never moved, and the application team said they hadn't felt a thing.

It was just getting light when I left the building. The lights of the Kwai Chung terminal were still on, and I knew this cluster would keep shining through the nights the same way: a topology that matches reality, restrained tuning, and drills you can repeat, so that every on-chain write landing on disk is no longer a white-knuckle moment.

Appendix: one-shot command list (copy and go)

Common to all nodes

# 基础 & 内核 & FS & limits(见上文 3.*)
# 安装 PGDG & PG14(见上文 4.*)

pg01 (primary)

# write postgresql.conf / pg_hba.conf (4.*)
# create users/databases (5.1)
sudo -iu postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'Repl@123';"
sudo -iu postgres psql -c "CREATE USER repmgr SUPERUSER ENCRYPTED PASSWORD 'Mgr@123';"
sudo -iu postgres psql -c "CREATE DATABASE repmgr OWNER repmgr;"
sudo -iu postgres pg_ctl -D /pgdata/data -w start
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf primary register

pg02/pg03 (standbys)

sudo -iu postgres repmgr -h pg01 -U repmgr -d repmgr -f /etc/repmgr/14/repmgr.conf standby clone --fast-checkpoint
sudo -iu postgres pg_ctl -D /pgdata/data -w start
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf standby register
systemctl enable --now repmgr14

LB (both nodes)

# keepalived + haproxy + pgbouncer (6.*)
systemctl enable --now keepalived haproxy pgbouncer

Backups

# pgBackRest (7.*)
sudo -iu postgres pgbackrest --stanza=pg14 stanza-create
sudo -iu postgres pgbackrest --type=full --stanza=pg14 backup