DRBD Failure Handling
drbd1 is the primary node, drbd2 is the secondary
1, Status under normal conditions:
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:2144476 nr:0 dw:36468 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
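The three fields that matter in the rest of this walkthrough are cs (connection state), ro (roles, local/peer) and ds (disk states, local/peer). As a minimal sketch, the following helper pulls them out of /proc/drbd-style text. The function name drbd_fields is made up for this example and is not part of DRBD.

```shell
#!/bin/sh
# drbd_fields (hypothetical helper): read /proc/drbd-style text on stdin and
# print "<minor> <cs> <ro> <ds>" for every device status line found.
drbd_fields() {
    awk '/^ *[0-9]+: cs:/ {
        minor = $1
        sub(/:$/, "", minor)              # "0:" -> "0"
        cs = ""; ro = ""; ds = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)   # connection state
            if ($i ~ /^ro:/) ro = substr($i, 4)   # roles: local/peer
            if ($i ~ /^ds:/) ds = substr($i, 4)   # disk states: local/peer
        }
        print minor, cs, ro, ds
    }'
}
```

On a live node it would be called as `drbd_fields < /proc/drbd`. A healthy pair should report Connected and UpToDate/UpToDate on both nodes, with the roles mirrored.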
2, After drbd1 fails
Status on drbd1:
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----
ns:4 nr:102664 dw:102668 dr:157 al:1 bm:8 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Status on drbd2:
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----
ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
3, Recovery procedure:
a, Promote the secondary to the primary role
[root@drbd2 ~]# drbdsetup /dev/drbd0 primary -o
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r----
ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Mount the device:
[root@drbd2 /]# mount /dev/drbd0 /data1
[root@drbd2 data1]# ll
total 10272
-rw-r--r-- 1 root root 10485760 Feb 13 11:26 aa.img
drwx------ 2 root root 16384 Feb 13 11:25 lost+found
At this point drbd2 takes over the service and starts accepting writes.
After drbd1, the original primary, comes back up:
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----
ns:2144476 nr:0 dw:36484 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8
drbd1 is in the StandAlone connection state, so it makes no attempt to connect to drbd2.
Let's check the log:
[root@drbd1 ~]# tailf /var/log/messages
Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( WFReportParams -> Disconnecting )
Feb 13 16:14:27 drbd1 kernel: block drbd0: error receiving ReportState, l: 4!
Feb 13 16:14:27 drbd1 kernel: block drbd0: asender terminated
Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_asender
Feb 13 16:14:27 drbd1 kernel: block drbd0: Connection closed
Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( Disconnecting -> StandAlone )
Feb 13 16:14:27 drbd1 kernel: block drbd0: receiver terminated
Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_receiver
A split brain has occurred!
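The telltale entry in the log above is the kernel invoking the `drbdadm split-brain` helper. As a rough sketch, a check for that entry could look like the following. The function name has_split_brain is invented for this example, and the pattern is taken only from the log lines shown here, so other DRBD versions may word the message differently.

```shell
#!/bin/sh
# has_split_brain (hypothetical helper): succeed (exit 0) if the log text on
# stdin contains the split-brain helper invocation seen in the log above.
has_split_brain() {
    grep -q 'helper command: .*drbdadm split-brain'
}
```

For example, `has_split_brain < /var/log/messages && echo "split brain logged"` would report whether the syslog file used in this walkthrough contains such an entry.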
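Resolution:
1>, Demote drbd1 to the secondary role, then reconnect it while discarding its local changes. If /dev/drbd0 is still mounted on drbd1, unmount it first.
[root@drbd1 ~]# drbdadm secondary r0
[root@drbd1 ~]# drbdadm -- --discard-my-data connect r0 ## Tells DRBD to discard the local changes on this node (drbd1) and resynchronize from the peer, which is now the primary.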
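2>, Then run the following on drbd2 (only needed if drbd2 has also dropped to StandAlone; a node still in WFConnection reconnects on its own):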
[root@drbd2 /]# drbdadm connect r0
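drbd1 can now reconnect to drbd2 and resynchronizes from it. The data written on drbd2 during the outage is preserved, while any changes that existed only on drbd1 are discarded: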
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:20592 dw:20592 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
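The manual recovery steps above can be collected into a small script. This is only a sketch under the assumptions of this walkthrough: the resource is named r0, drbd1 is the node whose changes are thrown away, and drbd2 is the node whose data is kept. The names RES, DRY_RUN, run, recover_victim and recover_survivor are illustrative and not part of DRBD. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the manual split-brain recovery described above.
# RES and DRY_RUN are illustrative variables, not DRBD settings.
RES="${RES:-r0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run on the node whose local changes are discarded (drbd1 in this example).
# The device must not be mounted there, or the demotion will fail.
recover_victim() {
    run drbdadm secondary "$RES" &&
    run drbdadm -- --discard-my-data connect "$RES"
}

# Run on the node whose data is kept (drbd2 in this example).
recover_survivor() {
    run drbdadm connect "$RES"
}
```

Calling `recover_victim` on drbd1 and `recover_survivor` on drbd2 with the default DRY_RUN=1 prints the same three drbdadm commands used above, which gives a chance to double-check which node is about to lose its changes before setting DRY_RUN=0.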