Pacemaker cannot bring a postgres-11 node back as a slave

I have 2 nodes (named node03 and node04) in a master/slave hot-standby setup, with Pacemaker managing the cluster. Before a switchover, node04 was the master and node03 the standby. Since the switchover I have been trying to bring node04 back as the slave, but cannot.

During the switchover I noticed that someone had edited the configuration file and set the ignore_system_indexes parameter to true. I had to remove it and restart the PostgreSQL server manually. After that, the cluster started behaving abnormally.

node04 can be brought back as a slave manually, i.e. if I start the postgres instance by hand with a recovery.conf file in place.

Here are the files needed to understand the situation:

sudo crm_mon -A1f
Stack: corosync
Current DC: node03 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum

Node node04: standby
Online: [ node03 ]

Active resources:

 Resource Group: master-group
     vip-repli  (ocf::heartbeat:IPaddr2):       Started node03
     vip-master (ocf::heartbeat:IPaddr2):       Started node03
 Master/Slave Set: pgsql-cluster [pgsqlins]
     Masters: [ node03 ]

Node Attributes:
* Node node03:
    + master-pgsqlins                   : 1000
    + pgsqlins-data-status              : LATEST
    + pgsqlins-master-baseline          : 00008820DC000098
    + pgsqlins-status                   : PRI
* Node node04:
    + master-pgsqlins                   : -INFINITY
    + pgsqlins-data-status              : DISCONNECT
    + pgsqlins-status                   : STOP

Migration Summary:
* Node node03:
* Node node04:
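The attributes that matter in the output above are the promotion scores: master-pgsqlins is 1000 on node03 but -INFINITY on node04, so Pacemaker will never promote node04, and node04's pgsqlins-data-status of DISCONNECT means the resource agent does not consider its data in sync. As a small helper (my own sketch, not part of the cluster tooling), the scores can be pulled out of `crm_mon -A1` text with awk; the heredoc below just reproduces the two relevant lines from the output above, and in practice you would pipe `sudo crm_mon -A1` in instead:

```shell
# Print the per-node promotion scores from crm_mon -A1 text.
# Real usage: sudo crm_mon -A1 | awk -F: '/master-pgsqlins/ { gsub(/ /, "", $2); print $2 }'
awk -F: '/master-pgsqlins/ { gsub(/ /, "", $2); print $2 }' <<'EOF'
    + master-pgsqlins                   : 1000
    + master-pgsqlins                   : -INFINITY
EOF
# prints 1000 and -INFINITY, one per line
```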

recovery.conf

primary_conninfo = 'host=1xx.xx.xx.xx port=5432 user=replica application_name=node04 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = 'rsync -a /Dxxxxx1/wal_archive/%f %p'
recovery_target_timeline = 'latest'
standby_mode = 'on'

Cluster CIB

sudo pcs cluster cib
<cib crm_feature_set="3.0.14" validate-with="pacemaker-2.10" epoch="269" num_updates="4" admin_epoch="0" cib-last-written="Mon Jun 28 15:13:35 2021" update-origin="node04" update-client="crmd" update-user="hacluster" have-quorum="1" dc-uuid="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.23-1.el7_9.1-9acf116022"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="pgcluster"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1624860815"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="node03">
        <instance_attributes id="nodes-1">
          <nvpair id="nodes-1-pgsqlins-data-status" name="pgsqlins-data-status" value="LATEST"/>
        </instance_attributes>
      </node>
      <node id="2" uname="node04">
        <instance_attributes id="nodes-2">
          <nvpair id="nodes-2-pgsqlins-data-status" name="pgsqlins-data-status" value="DISCONNECT"/>
          <nvpair id="nodes-2-standby" name="standby" value="on"/>
        </instance_attributes>
      </node>
    </nodes>
    <resources>
      <group id="master-group">
        <primitive class="ocf" id="vip-repli" provider="heartbeat" type="IPaddr2">
          <instance_attributes id="vip-repli-instance_attributes">
            <nvpair id="vip-repli-instance_attributes-cidr_netmask" name="cidr_netmask" value="24"/>
            <nvpair id="vip-repli-instance_attributes-ip" name="ip" value="1xx.xx.xx.xx"/>
            <nvpair id="vip-repli-instance_attributes-nic" name="nic" value="eth2"/>
          </instance_attributes>
          <operations>
            <op id="vip-repli-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
            <op id="vip-repli-start-interval-0s" interval="0s" name="start" timeout="20s"/>
            <op id="vip-repli-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
          </operations>
        </primitive>
        <primitive class="ocf" id="vip-master" provider="heartbeat" type="IPaddr2">
          <instance_attributes id="vip-master-instance_attributes">
            <nvpair id="vip-master-instance_attributes-cidr_netmask" name="cidr_netmask" value="24"/>
            <nvpair id="vip-master-instance_attributes-ip" name="ip" value="1x.xx.xxx.xxx"/>
            <nvpair id="vip-master-instance_attributes-nic" name="nic" value="eth1"/>
          </instance_attributes>
          <operations>
            <op id="vip-master-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
            <op id="vip-master-start-interval-0s" interval="0s" name="start" timeout="20s"/>
            <op id="vip-master-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
          </operations>
        </primitive>
      </group>
      <master id="pgsql-cluster">
        <primitive class="ocf" id="pgsqlins" provider="heartbeat" type="pgsql11">
          <instance_attributes id="pgsqlins-instance_attributes">
            <nvpair id="pgsqlins-instance_attributes-master_ip" name="master_ip" value="1xx.xx.xx.xx"/>
            <nvpair id="pgsqlins-instance_attributes-node_list" name="node_list" value="node03 node04"/>
            <nvpair id="pgsqlins-instance_attributes-pgctl" name="pgctl" value="/usr/pgsql-11/bin/pg_ctl"/>
            <nvpair id="pgsqlins-instance_attributes-pgdata" name="pgdata" value="/DPxxxx01/datadg/data"/>
            <nvpair id="pgsqlins-instance_attributes-pgport" name="pgport" value="5432"/>
            <nvpair id="pgsqlins-instance_attributes-primary_conninfo_opt" name="primary_conninfo_opt" value="keepalives_idle=60 keepalives_interval=5 keepalives_count=5"/>
            <nvpair id="pgsqlins-instance_attributes-psql" name="psql" value="/usr/pgsql-11/bin/psql"/>
            <nvpair id="pgsqlins-instance_attributes-rep_mode" name="rep_mode" value="sync"/>
            <nvpair id="pgsqlins-instance_attributes-repuser" name="repuser" value="replica"/>
            <nvpair id="pgsqlins-instance_attributes-restart_on_promote" name="restart_on_promote" value="true"/>
            <nvpair id="pgsqlins-instance_attributes-restore_command" name="restore_command" value="rsync -a /Dxxxxx01/wal_archive/%f %p"/>
          </instance_attributes>
          <operations>
            <op id="pgsqlins-demote-interval-0" interval="0" name="demote" on-fail="stop" timeout="60s"/>
            <op id="pgsqlins-methods-interval-0s" interval="0s" name="methods" timeout="5s"/>
            <op id="pgsqlins-monitor-interval-10s" interval="10s" name="monitor" on-fail="restart" timeout="60s"/>
            <op id="pgsqlins-monitor-interval-9s" interval="9s" name="monitor" on-fail="restart" role="Master" timeout="60s"/>
            <op id="pgsqlins-notify-interval-0" interval="0" name="notify" timeout="60s"/>
            <op id="pgsqlins-promote-interval-0" interval="0" name="promote" on-fail="restart" timeout="60s"/>
            <op id="pgsqlins-start-interval-0" interval="0" name="start" on-fail="restart" timeout="60s"/>
            <op id="pgsqlins-stop-interval-0" interval="0" name="stop" on-fail="block" timeout="60s"/>
          </operations>
        </primitive>
        <meta_attributes id="pgsql-cluster-meta_attributes">
          <nvpair id="pgsql-cluster-meta_attributes-master-node-max" name="master-node-max" value="1"/>
          <nvpair id="pgsql-cluster-meta_attributes-clone-max" name="clone-max" value="2"/>
          <nvpair id="pgsql-cluster-meta_attributes-notify" name="notify" value="true"/>
          <nvpair id="pgsql-cluster-meta_attributes-master-max" name="master-max" value="1"/>
          <nvpair id="pgsql-cluster-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
        </meta_attributes>
      </master>
    </resources>
    <constraints>
      <rsc_colocation id="colocation-master-group-pgsql-cluster-INFINITY" rsc="master-group" score="INFINITY" with-rsc="pgsql-cluster" with-rsc-role="Master"/>
      <rsc_order first="pgsql-cluster" first-action="promote" id="order-pgsql-cluster-master-group-INFINITY" score="INFINITY" symmetrical="false" then="master-group" then-action="start"/>
      <rsc_order first="pgsql-cluster" first-action="demote" id="order-pgsql-cluster-master-group-0" score="0" symmetrical="false" then="master-group" then-action="stop"/>
      <rsc_location id="cli-prefer-pgsql-cluster" rsc="pgsql-cluster" role="Started" node="node04" score="INFINITY"/>
    </constraints>
  </configuration>
  <status>
    <node_state id="1" uname="node03" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource" join="member" expected="member">
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-pgsqlins-status" name="pgsqlins-status" value="PRI"/>
          <nvpair id="status-1-master-pgsqlins" name="master-pgsqlins" value="1000"/>
          <nvpair id="status-1-pgsqlins-master-baseline" name="pgsqlins-master-baseline" value="00008820DC000098"/>
        </instance_attributes>
      </transient_attributes>
      <lrm id="1">
        <lrm_resources>
          <lrm_resource id="vip-master" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="vip-master_last_0" operation_key="vip-master_start_0" operation="start" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="3:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:0;3:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="535" rc-code="0" op-status="0" interval="0" last-run="1624859077" last-rc-change="1624859077" exec-time="90" queue-time="0" op-digest="38fc1b2633211138e53cb349a5c147ff"/>
            <lrm_rsc_op id="vip-master_monitor_10000" operation_key="vip-master_monitor_10000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="4:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:0;4:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="536" rc-code="0" op-status="0" interval="10000" last-rc-change="1624859077" exec-time="72" queue-time="0" op-digest="4cbf56ab9e52c6f07a7be8cbb786451c"/>
          </lrm_resource>
          <lrm_resource id="vip-repli" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="vip-repli_last_0" operation_key="vip-repli_start_0" operation="start" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="1:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:0;1:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="532" rc-code="0" op-status="0" interval="0" last-run="1624859077" last-rc-change="1624859077" exec-time="127" queue-time="0" op-digest="dd04ed3322c75b7bab13c5bea56dbe77"/>
            <lrm_rsc_op id="vip-repli_monitor_10000" operation_key="vip-repli_monitor_10000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="2:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:0;2:433:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="534" rc-code="0" op-status="0" interval="10000" last-rc-change="1624859077" exec-time="55" queue-time="0" op-digest="c76770c29a91fb082fdf1fdd8b0469c3"/>
          </lrm_resource>
          <lrm_resource id="pgsqlins" type="pgsql11" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="pgsqlins_last_0" operation_key="pgsqlins_promote_0" operation="promote" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="12:432:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:0;12:432:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="530" rc-code="0" op-status="0" interval="0" last-run="1624859073" last-rc-change="1624859073" exec-time="3307" queue-time="0" op-digest="2f51441ed087061eb68745fd8157ddb6"/>
            <lrm_rsc_op id="pgsqlins_monitor_9000" operation_key="pgsqlins_monitor_9000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="13:433:8:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:8;13:433:8:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node03" call-id="533" rc-code="8" op-status="0" interval="9000" last-rc-change="1624859078" exec-time="497" queue-time="1" op-digest="978aa48a7da35944c793e174dbee9a1d"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
    <node_state id="2" uname="node04" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource" join="member" expected="member">
      <lrm id="2">
        <lrm_resources>
          <lrm_resource id="vip-repli" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="vip-repli_last_0" operation_key="vip-repli_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="4:1:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:7;4:1:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node04" call-id="5" rc-code="7" op-status="0" interval="0" last-run="1624600624" last-rc-change="1624600624" exec-time="65" queue-time="0" op-digest="dd04ed3322c75b7bab13c5bea56dbe77"/>
          </lrm_resource>
          <lrm_resource id="vip-master" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="vip-master_last_0" operation_key="vip-master_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="5:1:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:7;5:1:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node04" call-id="9" rc-code="7" op-status="0" interval="0" last-run="1624600624" last-rc-change="1624600624" exec-time="62" queue-time="0" op-digest="38fc1b2633211138e53cb349a5c147ff"/>
          </lrm_resource>
          <lrm_resource id="pgsqlins" type="pgsql11" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="pgsqlins_last_0" operation_key="pgsqlins_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.14" transition-key="4:436:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" transition-magic="0:7;4:436:7:54755ae3-42a4-477c-ae37-8ae8bfbc1f04" exit-reason="" on_node="node04" call-id="192" rc-code="7" op-status="0" interval="0" last-run="1624860816" last-rc-change="1624860816" exec-time="178" queue-time="0" op-digest="2f51441ed087061eb68745fd8157ddb6"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
      <transient_attributes id="2">
        <instance_attributes id="status-2">
          <nvpair id="status-2-pgsqlins-status" name="pgsqlins-status" value="STOP"/>
          <nvpair id="status-2-master-pgsqlins" name="master-pgsqlins" value="-INFINITY"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>

If I try to take node04 out of standby, Pacemaker first demotes node03 and then tries to promote node04, even though node04 never comes up. I also tried promoting node04 on its own, which failed as well. However, starting node04 manually from that same state works. If I try to clean up the pgsqlins resource, it fails too.

Here is the corosync.log:

Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Forwarding cib_apply_diff operation for section 'all' to all (origin=local/cibadmin/2)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.251.32 2
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.0 b956759712580c1bfdffd25cbf4ab8e9
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       -- /cib/configuration/nodes/node[@id='2']/instance_attributes[@id='nodes-2']/nvpair[@id='nodes-2-standby']
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @epoch=252,@num_updates=0
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_apply_diff operation for section 'all': OK (rc=0,origin=dci2pgs04/cibadmin/2,version=0.252.0)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_file_backup:      Archived previous version as /var/lib/pacemaker/cib/cib-60.raw
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_file_write_with_digest:   Wrote version 0.252.0 of the CIB to disk (digest: 8b99629d323c923de592700bc4398c49)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_file_write_with_digest:   Reading cluster configuration file /var/lib/pacemaker/cib/cib.ZtvQXP (digest: /var/lib/pacemaker/cib/cib.fh4Toy)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.252.0 2
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.1 (null)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @num_updates=1
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='pgsqlins']/lrm_rsc_op[@id='pgsqlins_last_0']:  @operation_key=pgsqlins_demote_0,@operation=demote,@transition-key=10:396:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@transition-magic=-1:193;10:396:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@call-id=-1,@rc-code=193,@op-status=-1,@last-run=1624852894,@last-rc-change=1624852894,@exec-time=0
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node03/crmd/948,version=0.252.1)
Jun 28 13:01:34 [9294] node04.dc.japannext.co.jp      attrd:     info: attrd_peer_update:    Setting master-pgsqlins[node03]: 1000 -> -INFINITY from node03
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.252.1 2
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.2 (null)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @num_updates=2
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-master-pgsqlins']:  @value=-INFINITY
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node03/attrd/211,version=0.252.2)
Jun 28 13:01:34 [9294] node04.dc.japannext.co.jp      attrd:     info: attrd_peer_update:    Setting pgsqlins-master-baseline[node03]: 00008820CC000098 -> (null) from node03
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.252.2 2
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.3 (null)
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       -- /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-pgsqlins-master-baseline']
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @num_updates=3
Jun 28 13:01:34 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node03/attrd/212,version=0.252.3)
Jun 28 13:01:35 [9294] node04.dc.japannext.co.jp      attrd:     info: attrd_peer_update:    Setting pgsqlins-status[node03]: PRI -> STOP from node03
.
.
.
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='pgsqlins']/lrm_rsc_op[@id='pgsqlins_last_0']:  @transition-magic=0:0;9:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@call-id=445,@rc-code=0,@op-status=0,@exec-time=471
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node03/crmd/956,version=0.252.11)
Jun 28 13:01:36 [9296] node04.dc.japannext.co.jp       crmd:     info: do_lrm_rsc_op:        Performing key=10:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04 op=pgsqlins_start_0
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Forwarding cib_modify operation for section status to all (origin=local/crmd/142)
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.252.11 2
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.12 (null)
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @num_updates=12
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='pgsqlins']/lrm_rsc_op[@id='pgsqlins_last_0']:  @operation_key=pgsqlins_start_0,@operation=start,@transition-key=12:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@transition-magic=-1:193;12:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@exec-time=0
Jun 28 13:01:36 [9293] node04.dc.japannext.co.jp       lrmd:     info: log_execute:  executing - rsc:pgsqlins action:start call_id:132
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node03/crmd/957,version=0.252.12)
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: --- 0.252.12 2
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       Diff: +++ 0.252.13 (null)
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib:  @num_updates=13
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='pgsqlins']/lrm_rsc_op[@id='pgsqlins_last_0']:  @operation_key=pgsqlins_start_0,@transition-key=10:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@transition-magic=-1:193;10:397:0:54755ae3-42a4-477c-ae37-8ae8bfbc1f04,@last-run=1624852896,@last-rc-change=1624852896,@exec-time=0
Jun 28 13:01:36 [9291] node04.dc.japannext.co.jp        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0,origin=node04/crmd/142,version=0.252.13)
Jun 28 13:01:37  pgsql11(pgsqlins)[9613]:    INFO: Set all nodes into async mode.
Jun 28 13:01:37  pgsql11(pgsqlins)[9613]:    INFO: PostgreSQL is down
Jun 28 13:01:37  pgsql11(pgsqlins)[9613]:    INFO: server starting
Jun 28 13:01:37  pgsql11(pgsqlins)[9613]:    INFO: PostgreSQL start command sent.
Jun 28 13:01:37  pgsql11(pgsqlins)[9613]:    WARNING: Can't get PostgreSQL recovery status. rc=2

My guess is that Pacemaker is reading pre-switchover state from /var/lib/pacemaker/cib and using it to drive these steps. Any help on how to reset it would be appreciated.
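One common way to test that guess is to stop Pacemaker on the node and move the on-disk CIB cache aside, so it is resynced from the DC on the next start (as the answer below shows, this was not the actual cause here, but it is a standard reset step). A minimal sketch; CIB_DIR is parameterized so the commands can be tried against a scratch copy first, and Pacemaker must be stopped before running it:

```shell
# Archive the cached CIB so Pacemaker rebuilds it from the peer on restart.
# Run only after: systemctl stop pacemaker
CIB_DIR="${CIB_DIR:-/var/lib/pacemaker/cib}"
if [ -d "$CIB_DIR" ]; then
    backup="$CIB_DIR.bak.$(date +%s)"
    mv "$CIB_DIR" "$backup"      # keep the old state around for inspection
    mkdir -p "$CIB_DIR"          # empty dir; the DC resyncs the CIB on rejoin
    echo "archived to $backup"
fi
```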

Solution

  • As described in the question, when taking node04 out of standby, Pacemaker demotes node03 and tries to make node04 the master. It fails at that step and then makes node03 a standalone master again.

  • Since I suspected it was picking up some old configuration from the cib/pengine folders, I even destroyed the cluster on both nodes, removed pacemaker, pcs and corosync, and reinstalled them all.

  • Even then the problem persisted. I then suspected the folder permissions under /var/lib/pgsql/ on node04 might be wrong and started digging there.

  • Only then did I notice an old PGSQL.lock.bak file dated June 11, meaning it was older than the current PGSQL.lock on node03; because of it, Pacemaker kept trying to promote node04 and failing. Pacemaker does not surface this as an error in any log, and there is no hint of it in the crm_mon output either. Once I deleted this file, everything worked like a charm.

TLDR;

  • Before starting Pacemaker again, check /var/lib/pgsql/tmp for a PGSQL.lock.bak file or any other unwanted files and remove them.
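The check in the TLDR can be scripted. A small sketch of my own; the path and file name come from the answer above, and LOCK_DIR is parameterized so the cleanup can be tried against a test directory first:

```shell
# Remove a stale backup lock file that can confuse the pgsql RA's promote logic.
# The live PGSQL.lock (if any) is left untouched; only the .bak copy is deleted.
LOCK_DIR="${LOCK_DIR:-/var/lib/pgsql/tmp}"
if [ -d "$LOCK_DIR" ]; then
    find "$LOCK_DIR" -maxdepth 1 -type f -name 'PGSQL.lock.bak' -print -delete
fi
```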
