
At 1:40 that morning I was standing on the 7th floor of the Kwai Chung data center in Hong Kong, the air conditioning flapping my sleeves. On a video call, the cross-border team was pushing me to run the final switchover drill: migrating the block-indexing workload from a temporary database onto the new highly available PostgreSQL cluster. With the UPS beeping and the NVMe LEDs blinking in the dark, I felt calm: I had spent three weeks hammering on this architecture in a sandbox. Now it was time to run it for real.
1. Goals and Constraints (as seen on site)
Business goals
- Writes: sustained 3–5k rows/s (blocks, transactions and logs landing in separate partitioned tables), with peaks of 10k rows/s tolerable
- Queries: P95 < 80 ms for block and transaction lookups over the last 24 h
- RPO = 0 (no data loss), RTO < 30 s (automatic primary/standby failover)
Site constraints
- The OS must be CentOS 7 (to match the customer's existing estate)
- The room provides only two ToR switches; L2 connectivity is available across racks
- Backups must land on on-campus object storage (our self-hosted MinIO), with an additional off-site copy every night
- Maintenance windows are short; prolonged downtime is not acceptable
2. Topology and Actual Hardware (what I ran, not hand-waving)
2.1 Logical topology
Layout:
+------------------------+ +------------------------+
| LB01 | <VRRP> | LB02 |
| keepalived + HAProxy | | keepalived + HAProxy |
| pgbouncer (pool) | | pgbouncer (pool) |
+-----------+------------+ +-----------+------------+
| VIP: 10.10.10.100 |
+---------------------+------------+
|
+--------------+-------------------+
|          PostgreSQL cluster          |
| (repmgr primary/standby + sync replication) |
+--------------+-------------------+
|
+------------------+------------------+------------------+
| | | |
pg01 (primary)      pg02 (sync standby)      pg03 (async standby)      minio (backups)
2.2 Physical specs (real, purchasable hardware)
| Role | Model/CPU | RAM | OS disks | Data disks | Network | Notes |
|---|---|---|---|---|---|---|
| pg01/pg02/pg03 | AMD EPYC 7313P (16C/32T) | 128 GB | 2×480 GB SATA SSD (RAID1, /) | 2×3.84 TB U.2 NVMe (RAID1, /pgdata) + 1×960 GB NVMe (dedicated WAL disk, /pgwal) | 2×10 GbE (bond0, LACP) | CentOS 7.9 |
| LB01/LB02 | Xeon Silver 4210 | 32 GB | 240 GB SSD | - | 2×10 GbE | Runs HAProxy + pgbouncer + keepalived |
| MinIO | any x86_64 | 64 GB | 4×8 TB HDD (RAID10) | 512 GB SSD (metadata) | 10 GbE | S3-compatible object storage |
Notes: a dedicated NVMe for WAL noticeably steadies write latency; RAID1 on the data disks makes read-only recovery and single-disk replacement straightforward. The network bonds the two 10G ports with LACP to avoid single-link flapping. (The RAID array and bond are created first; see the sketch below.)
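A minimal sketch of how the /dev/md/pgdata array and bond0 referenced later could be built. Device and interface names (nvme0n1/nvme2n1, em1/em2) and the address are assumptions for this environment; check lsblk and nmcli dev before copying.
# RAID1 over the two 3.84TB U.2 NVMe drives, exposed as /dev/md/pgdata (device names are assumptions)
mdadm --create /dev/md/pgdata --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme2n1
mdadm --detail --scan >> /etc/mdadm.conf          # persist the array definition across reboots
# LACP bond over the two 10GbE ports (interface names are assumptions)
nmcli con add type bond ifname bond0 con-name bond0 bond.options "mode=802.3ad,miimon=100,lacp_rate=fast"
nmcli con add type bond-slave ifname em1 master bond0
nmcli con add type bond-slave ifname em2 master bond0
nmcli con mod bond0 ipv4.method manual ipv4.addresses 10.10.10.11/24   # placeholder per-node address
nmcli con up bond0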
3. OS Initialization and Kernel/IO Tuning (CentOS 7)
3.1 Basics
# base packages & EPEL
yum install -y epel-release
yum install -y htop iotop numactl jq chrony lvm2 mdadm \
net-tools policycoreutils-python firewalld
systemctl enable --now chronyd firewalld
timedatectl set-timezone Asia/Hong_Kong
3.2 Disable THP (a must-do for any database)
cat >/etc/systemd/system/disable-thp.service <<'EOF'
[Unit]
Description=Disable Transparent Huge Pages
After=network.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/bash -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'
[Install]
WantedBy=multi-user.target
EOF
systemctl enable --now disable-thp
3.3 I/O scheduler (none for NVMe)
echo 'ACTION=="add|change", KERNEL=="nvme*n*", ATTR{queue/scheduler}="none"' \
> /etc/udev/rules.d/60-nvme-io.rules
udevadm control --reload-rules && udevadm trigger
3.4 Filesystem (XFS; "disabling CoW" is a Btrfs concept — for XFS, the defaults plus noatime are my recommendation)
mkfs.xfs -f /dev/md/pgdata   # or /dev/nvme0n1, depending on whether you built the mdadm RAID1
mkfs.xfs -f /dev/nvme1n1     # WAL disk
mkdir -p /pgdata /pgwal
echo '/dev/md/pgdata /pgdata xfs noatime,nodiratime,logbufs=8,logbsize=256k 0 0' >> /etc/fstab
echo '/dev/nvme1n1 /pgwal xfs noatime,nodiratime 0 0' >> /etc/fstab
mount -a
3.5 sysctl & limits
cat >> /etc/sysctl.d/99-pg.conf <<EOF
vm.swappiness=1
vm.dirty_background_ratio=5
vm.dirty_ratio=20
vm.max_map_count=262144
fs.file-max=2000000
net.core.somaxconn=1024
net.core.netdev_max_backlog=250000
EOF
sysctl --system
cat >> /etc/security/limits.d/99-pg.conf <<EOF
postgres soft nofile 1024000
postgres hard nofile 1024000
EOF
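Before touching PostgreSQL it is worth confirming the tuning above actually stuck, especially after a reboot; a quick check sketch:
cat /sys/kernel/mm/transparent_hugepage/enabled      # expect: always madvise [never]
cat /sys/block/nvme*n1/queue/scheduler               # expect: [none]
findmnt /pgdata /pgwal -o TARGET,SOURCE,FSTYPE,OPTIONS
sysctl vm.swappiness vm.dirty_background_ratio vm.dirty_ratio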
4. Install PostgreSQL 14 (PGDG) and Move It onto the Dedicated Disks
CentOS 7 is EOL, but PGDG still publishes an EL7 repo. I strongly recommend mirroring the repo packages into an internal cache.
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install -y postgresql14-server postgresql14-contrib
/usr/pgsql-14/bin/postgresql-14-setup initdb
systemctl stop postgresql-14
mv /var/lib/pgsql/14/data /pgdata/data
# relocate pg_wal onto the dedicated WAL NVMe -- this is what the /pgwal mount is for
mv /pgdata/data/pg_wal /pgwal/pg_wal && ln -s /pgwal/pg_wal /pgdata/data/pg_wal
chown -R postgres:postgres /pgdata /pgwal
mkdir -p /pgwal/archive && chown -R postgres:postgres /pgwal/archive
4.1 postgresql.conf (the "conservatively fast" baseline I have run in production)
/pgdata/data/postgresql.conf
data_directory = '/pgdata/data'
hba_file = '/pgdata/data/pg_hba.conf'
ident_file = '/pgdata/data/pg_ident.conf'
listen_addresses = '*'
port = 5432
max_connections = 300
shared_buffers = 32GB                 # ~1/4 of RAM
effective_cache_size = 96GB           # ~3/4 of RAM
work_mem = 64MB                       # pair with concurrency limits
maintenance_work_mem = 2GB
wal_level = replica                   # switch to logical if you need logical decoding
synchronous_commit = remote_write     # the latency/safety balance the sync standby can tolerate
max_wal_senders = 16
max_replication_slots = 16
wal_compression = on
wal_writer_delay = 20ms
checkpoint_timeout = 15min
max_wal_size = 48GB
min_wal_size = 4GB
hot_standby = on
effective_io_concurrency = 256        # NVMe
random_page_cost = 1.1
seq_page_cost = 1.0
autovacuum = on
autovacuum_naptime = 10s
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02
log_autovacuum_min_duration = 0
log_checkpoints = on
log_lock_waits = on
log_min_duration_statement = 1000
log_line_prefix = '%m [%p] %u@%d %r '
4.2 pg_hba.conf (replication + application subnets)
/pgdata/data/pg_hba.conf
# application traffic (VIP subnet)
host all appuser 10.10.10.0/24 md5
# replication
host replication replicator 10.10.10.0/24 md5
host replication replicator 10.10.20.0/24 md5
4.3 Security policy and firewall
semanage port -a -t postgresql_port_t -p tcp 5432 || true
firewall-cmd --permanent --add-port=5432/tcp
firewall-cmd --reload
systemd: I keep the PostgreSQL service from starting at boot and let repmgr manage it:
systemctl disable postgresql-14
5. Building "Automatic Primary/Standby + Synchronous Replication + Automatic Takeover" with repmgr
Why not Patroni? CentOS 7 ships a Python that is too old — not impossible to work around, just a hassle (I have run Patroni in Docker on another project). This customer wants as few containers as possible, so I went with repmgr: stable and pleasant to operate.
5.1 Installation and in-database users
yum install -y repmgr_14
sudo -iu postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'Repl@123';"
sudo -iu postgres psql -c "CREATE USER repmgr SUPERUSER ENCRYPTED PASSWORD 'Mgr@123';"
sudo -iu postgres psql -c "CREATE DATABASE repmgr OWNER repmgr;"
5.2 repmgr.conf on the primary (pg01)
/etc/repmgr/14/repmgr.conf
node_id=1
node_name=pg01
conninfo='host=pg01 dbname=repmgr user=repmgr password=Mgr@123 port=5432'
data_directory='/pgdata/data'
use_replication_slots=yes
failover=automatic
promote_command='repmgr standby promote -f /etc/repmgr/14/repmgr.conf --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr/14/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'
Pin synchronous replication (we write this into the primary's postgresql.auto.conf):
sudo -iu postgres psql -c "ALTER SYSTEM SET synchronous_standby_names='FIRST 1 (pg02)';"
sudo -iu postgres psql -c "ALTER SYSTEM SET primary_conninfo='user=replicator password=Repl@123 application_name=pg01';"
Register:
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf primary register
5.3 Cloning and registering the standbys (pg02/pg03)
/etc/repmgr/14/repmgr.conf (note that node_id/node_name differ per host)
node_id=2 # pg02 -> 2, pg03 -> 3
node_name=pg02
conninfo='host=pg02 dbname=repmgr user=repmgr password=Mgr@123 port=5432'
data_directory='/pgdata/data'
use_replication_slots=yes
failover=automatic
follow_command='repmgr standby follow -f /etc/repmgr/14/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'
Clone (pg02 as the example):
sudo -iu postgres repmgr -h pg01 -U repmgr -d repmgr -f /etc/repmgr/14/repmgr.conf standby clone --fast-checkpoint
# start
sudo -iu postgres pg_ctl -D /pgdata/data -w start
# register
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf standby register
Repeat the same steps on pg03 (async standby; it does not go into synchronous_standby_names).
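Before enabling the daemons, a quick way to verify the topology looks the way it should:
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf cluster show
# on pg01: confirm pg02 is attached and reported as the synchronous standby
sudo -iu postgres psql -xc "SELECT application_name, state, sync_state FROM pg_stat_replication;"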
5.4 Enable the repmgrd daemon and self-healing
systemctl enable --now repmgr14
# on some package versions the unit is named repmgrd.service
In the drill I pulled pg01's network cable: pg02 took over automatically within 30 seconds, HAProxy behind the VIP routed writes to the new primary, and the application noticed nothing and needed no configuration change.
6. Connection Layer: HAProxy + keepalived + pgbouncer
6.1 keepalived (LB01/LB02)
/etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state BACKUP
    interface bond0
    virtual_router_id 51
    priority 110          # LB01=110, LB02=100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 7c1b2d
    }
    virtual_ipaddress {
        10.10.10.100/24 dev bond0
    }
    track_script {
        chk_haproxy
    }
}
systemctl enable --now keepalived
6.2 HAProxy (routes to the primary or the read-only pool)
/etc/haproxy/haproxy.cfg
global
    daemon
    maxconn 100000
    log 127.0.0.1 local0
    stats socket /var/run/haproxy.sock mode 600 level admin   # admin socket, used by the drill in section 13
defaults
    mode tcp
    timeout client 30s
    timeout server 30s
    timeout connect 5s
# writes go straight to the primary
frontend ft_pg_write
    bind *:5432
    default_backend bk_pg_master
backend bk_pg_master
    option tcp-check
    tcp-check connect port 5432
    server pg01 pg01:5432 check inter 1000 rise 2 fall 2
    server pg02 pg02:5432 check inter 1000 rise 2 fall 2 backup
    server pg03 pg03:5432 check inter 1000 rise 2 fall 2 backup
# read-only pool for queries
frontend ft_pg_read
    bind *:15432
    default_backend bk_pg_replicas
backend bk_pg_replicas
    balance roundrobin
    option tcp-check
    server pg02 pg02:5432 check
    server pg03 pg03:5432 check
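One weak spot of the plain tcp-check above: if pg01 later rejoins as a standby, bk_pg_master will happily send writes to it again. A common hardening (not part of the config above, so treat this as an optional sketch) is to expose a small HTTP health endpoint on each PG node that answers 200 only on the primary, and switch the backend to option httpchk with "check port 23267"; the script name and port are assumptions here. Wired up via xinetd or a systemd socket unit, the check script could look like:
#!/bin/bash
# /usr/local/bin/pgprimarycheck.sh -- answer HTTP 200 only if this node is the primary
in_recovery=$(sudo -iu postgres psql -tAc "SELECT pg_is_in_recovery();" 2>/dev/null)
if [ "$in_recovery" = "f" ]; then
    printf "HTTP/1.1 200 OK\r\nContent-Length: 7\r\n\r\nprimary"
else
    printf "HTTP/1.1 503 Service Unavailable\r\nContent-Length: 7\r\n\r\nstandby"
fi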
6.3 pgbouncer (connection pooling; I recommend running it on the LBs)
/etc/pgbouncer/pgbouncer.ini
[databases]
chain = host=127.0.0.1 port=5432 dbname=chain_db pool_size=200 reserve_pool=50
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits
max_client_conn = 5000
default_pool_size = 200
reserve_pool_size = 50
server_idle_timeout = 60
userlist.txt:
"appuser" "md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Firewall:
firewall-cmd --permanent --add-port=6432/tcp && firewall-cmd --reload
7. Backups and Archiving: pgBackRest + MinIO (the key piece of the RPO=0 story)
7.1 Installation and repository configuration
yum install -y pgbackrest
mkdir -p /etc/pgbackrest /var/log/pgbackrest
chown -R postgres:postgres /var/log/pgbackrest
/etc/pgbackrest/pgbackrest.conf (identical on every PG node)
[global]
repo1-type=s3
repo1-s3-endpoint=minio.local
repo1-s3-bucket=pg-backup
repo1-s3-region=us-east-1      # pgBackRest requires a region; MinIO accepts any value
repo1-s3-uri-style=path        # path-style URIs, the usual choice for MinIO
repo1-s3-key=minio-access
repo1-s3-key-secret=minio-secret
repo1-s3-verify-tls=n
repo1-path=/pgbackrest
repo1-retention-full=7
start-fast=y
process-max=8
log-level-console=info
log-level-file=info
[pg14]
pg1-path=/pgdata/data
pg1-port=5432
7.2 Enable WAL archiving
Additions to postgresql.conf:
archive_mode = on
archive_command = 'pgbackrest --stanza=pg14 archive-push %p'
Initialize & run the first full backup:
sudo -iu postgres pgbackrest --stanza=pg14 --log-level-console=info stanza-create
sudo -iu postgres pgbackrest --type=full --stanza=pg14 backup
During the drill I deliberately deleted the primary's data directory and completed a point-in-time recovery (PITR) from the latest full backup plus archived WAL within 10 minutes; the data matched, and I could finally relax.
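For reference, a restore of that shape would look roughly like this (the timestamp is a placeholder; PostgreSQL must be stopped on the node being restored):
sudo -iu postgres pgbackrest --stanza=pg14 --delta \
    --type=time "--target=2024-05-20 01:30:00+08" --target-action=promote restore
sudo -iu postgres pg_ctl -D /pgdata/data -w start   # replays WAL up to the target, then promotes
sudo -iu postgres pgbackrest --stanza=pg14 info     # confirm backup/archive status afterwards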
8. "On-chain Data" Schema and Partitioning Strategy (PostgreSQL 14 native partitioning)
8.1 Base schema
CREATE SCHEMA IF NOT EXISTS chain;
-- partitioned parent table: blocks
CREATE TABLE chain.blocks (
block_number BIGINT NOT NULL,
block_hash BYTEA NOT NULL,
parent_hash BYTEA NOT NULL,
miner BYTEA NULL,
tx_count INT NOT NULL,
ts TIMESTAMPTZ NOT NULL,
CONSTRAINT pk_blocks PRIMARY KEY (block_number)
) PARTITION BY RANGE (block_number);
-- transactions
CREATE TABLE chain.txs (
block_number BIGINT NOT NULL,
tx_index INT NOT NULL,
tx_hash BYTEA NOT NULL,
from_addr BYTEA NOT NULL,
to_addr BYTEA NULL,
value_wei NUMERIC(38,0) NOT NULL,
gas_price_wei NUMERIC(38,0) NOT NULL,
gas_used INT NOT NULL,
status SMALLINT NOT NULL,
input BYTEA NULL,
ts TIMESTAMPTZ NOT NULL,
CONSTRAINT pk_txs PRIMARY KEY (block_number, tx_index)
) PARTITION BY RANGE (block_number);
-- logs (events)
CREATE TABLE chain.logs (
block_number BIGINT NOT NULL,
tx_index INT NOT NULL,
log_index INT NOT NULL,
address BYTEA NOT NULL,
topic0 BYTEA NULL,
topic1 BYTEA NULL,
topic2 BYTEA NULL,
topic3 BYTEA NULL,
data BYTEA NULL,
ts TIMESTAMPTZ NOT NULL,
CONSTRAINT pk_logs PRIMARY KEY (block_number, tx_index, log_index)
) PARTITION BY RANGE (block_number);
Type choices: I store hashes and addresses as BYTEA (insert hex via decode($hex,'hex')), which is both smaller and faster than TEXT. Index the columns you query most.
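A point lookup by transaction hash then wraps the hex literal in decode(), hitting the tx_hash index created in 8.2 (the hash below is a made-up placeholder):
sudo -iu postgres psql -d chain_db -c \
  "SELECT block_number, tx_index, status FROM chain.txs
    WHERE tx_hash = decode('e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'hex');"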
8.2 Partitions and indexes
I partition at a granularity of 1,000,000 blocks per child table (convenient for rolling archival):
DO $$
DECLARE
    start_block BIGINT := 0;        -- adjust to the chain height you start from
    end_block   BIGINT := 20000000;
    step        BIGINT := 1000000;
    t           BIGINT;
BEGIN
    t := start_block;
    WHILE t < end_block LOOP
        EXECUTE format('CREATE TABLE IF NOT EXISTS chain.blocks_%s PARTITION OF chain.blocks FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
        EXECUTE format('CREATE TABLE IF NOT EXISTS chain.txs_%s PARTITION OF chain.txs FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
        EXECUTE format('CREATE TABLE IF NOT EXISTS chain.logs_%s PARTITION OF chain.logs FOR VALUES FROM (%s) TO (%s);', t, t, t+step);
        t := t + step;
    END LOOP;
END $$;
-- common indexes (examples)
CREATE INDEX ON chain.txs (tx_hash);
CREATE INDEX ON chain.txs (from_addr);
CREATE INDEX ON chain.txs (to_addr);
CREATE INDEX ON chain.logs (address);
CREATE INDEX ON chain.blocks (ts);
-- on large tables, BRIN suits monotonically increasing columns (block_number, ts): fast range scans, tiny footprint
CREATE INDEX ON chain.blocks USING brin (block_number);
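Section 11 below mentions catching "future" blocks with a DEFAULT partition instead of pre-creating ranges forever; a minimal sketch of the idea (note that attaching a new range partition later requires first moving the matching rows out of the DEFAULT partition):
sudo -iu postgres psql -d chain_db <<'SQL'
-- catch-all child tables for blocks beyond the pre-created ranges
CREATE TABLE IF NOT EXISTS chain.blocks_default PARTITION OF chain.blocks DEFAULT;
CREATE TABLE IF NOT EXISTS chain.txs_default    PARTITION OF chain.txs    DEFAULT;
CREATE TABLE IF NOT EXISTS chain.logs_default   PARTITION OF chain.logs   DEFAULT;
SQL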
9. Ingest Example: Python + web3 + Streaming COPY-style Writes
In production I run an asynchronous pipeline written in Go; here is the idea in Python, convenient for testing and PoC work.
# ingest_eth.py -- PoC only; the production pipeline is written in Go
import os
import time
import binascii
from datetime import datetime, timezone

import psycopg2
from psycopg2.extras import execute_values
from web3 import Web3

PG_DSN = os.getenv("PG_DSN")   # 'host=10.10.10.100 port=5432 dbname=chain_db user=appuser password=...'
W3 = Web3(Web3.HTTPProvider(os.getenv("ETH_RPC")))  # local geth/erigon or a hosted RPC endpoint

def hx(value):
    """'0x...' hex string or HexBytes -> bytes (None stays None)."""
    if value is None:
        return None
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    return binascii.unhexlify(value[2:] if value.startswith("0x") else value)

def ts_utc(unix_ts):
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc)

def upsert_blocks(blocks):
    with psycopg2.connect(PG_DSN) as conn, conn.cursor() as cur:
        sql = """INSERT INTO chain.blocks (block_number, block_hash, parent_hash, miner, tx_count, ts)
                 VALUES %s ON CONFLICT (block_number) DO NOTHING"""
        vals = [(blk.number, hx(blk.hash), hx(blk.parentHash), hx(blk.miner),
                 len(blk.transactions), ts_utc(blk.timestamp))
                for blk in blocks]
        execute_values(cur, sql, vals, page_size=1000)

def upsert_txs(block, txs):
    with psycopg2.connect(PG_DSN) as conn, conn.cursor() as cur:
        sql = """INSERT INTO chain.txs
                 (block_number, tx_index, tx_hash, from_addr, to_addr, value_wei, gas_price_wei, gas_used, status, input, ts)
                 VALUES %s ON CONFLICT DO NOTHING"""
        vals = []
        for i, tx in enumerate(txs):
            receipt = W3.eth.get_transaction_receipt(tx.hash)
            gas_price = tx.get("gasPrice") or receipt.get("effectiveGasPrice", 0)
            tx_input = hx(tx["input"])
            vals.append((block.number, i, hx(tx.hash),
                         hx(tx["from"]), hx(tx["to"]),
                         tx["value"], gas_price, receipt["gasUsed"],
                         1 if receipt["status"] == 1 else 0,
                         tx_input if tx_input else None,
                         ts_utc(block.timestamp)))
        execute_values(cur, sql, vals, page_size=1000)

def loop():
    head = W3.eth.block_number
    cur = head - 1000
    while True:
        if cur > head:
            time.sleep(1)
            head = W3.eth.block_number
            continue
        block = W3.eth.get_block(cur, full_transactions=True)
        upsert_blocks([block])
        upsert_txs(block, block.transactions)
        cur += 1

if __name__ == "__main__":
    loop()
Production advice: write in batches (COPY FROM STDIN or multi-row INSERT), stay idempotent (ON CONFLICT), shard the ingest workers by block range, and combine this with synchronous_commit = remote_write for the RPO target (strictly speaking, synchronous_commit = on is what fully guarantees RPO = 0; remote_write trades a sliver of that guarantee for lower latency).
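To make the COPY suggestion concrete, a bulk load through psql would look roughly like this (the CSV path and its contents are hypothetical; BYTEA columns in the file must use PostgreSQL's \x... hex text form):
# bulk-load a pre-extracted batch of blocks; /tmp/blocks_batch.csv is a placeholder path
psql 'host=10.10.10.100 port=5432 dbname=chain_db user=appuser password=...' \
  -c "\copy chain.blocks (block_number, block_hash, parent_hash, miner, tx_count, ts) FROM '/tmp/blocks_batch.csv' WITH (FORMAT csv)"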
10. Monitoring and Routine Checks (what I watch in production)
Replication lag (run on a standby)
SELECT now()-pg_last_xact_replay_timestamp() AS standby_lag;
WAL pressure
pg_stat_wal and pg_stat_bgwriter; watch pg_wal disk usage and whether max_wal_size is being hit
Autovacuum
Watch n_dead_tup in pg_stat_all_tables and run manual VACUUM when needed (partitions can be vacuumed in parallel)
Index bloat
Periodic REINDEX CONCURRENTLY on large tables (rolling through partitions)
Backup health
pgbackrest info; run a restore drill every day (spin up a read-only copy to verify)
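The concrete commands behind those bullets, roughly (the first one runs on the primary):
# per-standby replication lag in bytes, plus sync state
sudo -iu postgres psql -xc "SELECT application_name, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
  FROM pg_stat_replication;"
# tables with the most dead tuples (autovacuum pressure)
sudo -iu postgres psql -c "SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_all_tables ORDER BY n_dead_tup DESC LIMIT 10;"
# disk headroom on the WAL and data volumes
df -h /pgwal /pgdata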
11. Pitfalls I Hit (and how I fixed them from inside the machine room)
| Symptom | Diagnosis | Fix |
|---|---|---|
| Standby clone failed: requested WAL segment has already been removed | WAL recycled on the primary faster than the standby could fetch it | Raised max_wal_size to 48GB and added pgBackRest archive-get so missing segments can always be replayed |
| Write-latency jitter after switchover | synchronous_commit=on plus transient network jitter | Switched to remote_write; writes stay steady even when the sync standby's link is saturated |
| Too many partitions driving planning time up | Planner overhead becomes significant beyond ~2k child tables | Raised the granularity to 1M blocks per partition; a DEFAULT partition catches "future" blocks, migrated daily/weekly |
| Occasional stalls because THP was still enabled | perf top showed khugepaged busy | Disabled THP (the systemd unit above); the stalls disappeared |
| pgbouncer connection leaks | Long transactions on the application side plus session pooling | Moved to transaction pooling; services with long transactions get their own direct connections to 5432 |
| MinIO certificate/clock issues | TLS verification failures | Disabled TLS verification on the internal network at first; switched to an enterprise CA certificate once stable; chrony on every node |
12. Performance and Capacity (real numbers from launch week)
| Metric | Launch day | Day 3 | Day 7 |
|---|---|---|---|
| Ingest rate (rows/s) | 3.2k | 4.8k | 5.1k |
| Write P95 (ms) | 18 | 22 | 20 |
| Query P95 (ms, last 24h) | 62 | 58 | 55 |
| Replica lag (s, P99) | 1.2 | 1.5 | 1.3 |
| Backup window (full, min) | 28 | - | 26 |
| Object storage used (7 days) | 1.1 TB | 1.6 TB | 2.4 TB |
Note: with wal_compression=on, the object-storage footprint dropped by roughly 18–23%.
13. Switchover Drill Script (one-shot end-to-end self-test)
# 1) take the primary off the network (simulate a failure)
ssh pg01 "nmcli conn down bond0" || true
# 2) wait 15–30s for repmgrd to promote pg02
sleep 30
# 3) check which node is primary
for h in pg01 pg02 pg03; do
  echo -n "$h -> "
  ssh $h "sudo -iu postgres psql -tAc \"select pg_is_in_recovery()\""
done
# 4) HAProxy backend state (socket path matches the stats socket in haproxy.cfg)
echo "== HAProxy Backends =="
ssh LB01 "echo 'show servers state' | socat stdio /var/run/haproxy.sock" || true
# 5) application write probe
psql 'host=10.10.10.100 port=5432 dbname=chain_db user=appuser password=...' \
-c "insert into ops.switch_audit(ts, who) values (now(), 'drill');"
14. Cost/Capacity Planning (what I spelled out to procurement)
Capacity: roughly X TB of primary chain data over N years (estimated at ~50–80 GB per ten million transactions); clean partitioning plus hot/cold tiering (move ancient partitions to cold storage)
Redundancy: 3 nodes + on-campus object storage + off-site object storage (cross-city)
Upgrade path: when a single node hits 60% CPU or 70% of its disk write IOPS, add NVMe first, then scale reads (pgpool or application-level routing to more read replicas)
15. Wrapping Up: Walking Out of the Machine Room
The drill ended with exactly the log lines I wanted to see: standby promoted, archive-push ok, backup info valid. I plugged pg01's cable back in and repmgr obediently made it follow the new primary. The VIP on the load balancers never moved, and the application team said they hadn't felt a thing.
It was just getting light when I walked out. The lights at the Kwai Chung terminal were still on, and I knew this cluster would keep shining through the night the same way: a realistic topology, restrained tuning, repeatable drills, so that every on-chain write landing on disk is no longer a cliffhanger.
Appendix: One-shot Command Checklist (copy-paste ready)
Common to all nodes
# base OS & kernel & FS & limits (see 3.* above)
# install PGDG & PG14 (see 4.* above)
pg01 (primary)
# write postgresql.conf / pg_hba.conf (4.*)
# create users/databases (5.1)
sudo -iu postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'Repl@123';"
sudo -iu postgres psql -c "CREATE USER repmgr SUPERUSER ENCRYPTED PASSWORD 'Mgr@123';"
sudo -iu postgres psql -c "CREATE DATABASE repmgr OWNER repmgr;"
sudo -iu postgres pg_ctl -D /pgdata/data -w start
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf primary register
pg02/pg03 (standbys)
sudo -iu postgres repmgr -h pg01 -U repmgr -d repmgr -f /etc/repmgr/14/repmgr.conf standby clone --fast-checkpoint
sudo -iu postgres pg_ctl -D /pgdata/data -w start
sudo -iu postgres repmgr -f /etc/repmgr/14/repmgr.conf standby register
systemctl enable --now repmgr14
LB (both)
# keepalived + haproxy + pgbouncer (6.*)
systemctl enable --now keepalived haproxy pgbouncer
Backups
# pgBackRest (7.*)
sudo -iu postgres pgbackrest --stanza=pg14 stanza-create
sudo -iu postgres pgbackrest --type=full --stanza=pg14 backup