Existing VCS Resource Dependency
VCS resource dependencies are constructs used to link two resources. A dependency consists of a parent resource and a child resource, and dictates the start/stop order of the resources in the dependency. To bring a parent resource online, or to keep it online, all of its child resources must be online first. With the existing dependency, a parent resource can depend on multiple child resources, but every child resource must be online before VCS tries to online the parent resource.
With the growing use of dependencies in data centers, a parent resource sometimes needs to depend on a set of child resources rather than on each child resource individually. In such configurations, the parent resource depends on a set of ‘n’ child resources, of which at least ‘m’ must be online before the parent resource is onlined. While the parent and child resources are online, a user can offline a child resource from the set as long as ‘m’ child resources from the set remain online. Similarly, if a child resource from the set stops abruptly while ‘m’ child resources from the set are still running, the parent resource remains unaffected. The existing dependency does not handle these scenarios. The following use cases illustrate the requirement.
SFRAC: at least 1 OCR must be online for onlining CSSD.
CP Server: at least 1 IP from each subnet must be online for onlining CP Server.
File Store: at least 1 mount point must be online for onlining the virtual IP.
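The ‘m’ out of ‘n’ requirement described above can be sketched as a simple predicate. This is only an illustrative model, not VCS code: the function name minimum_met and the dictionary of resource states are assumptions made for the example.

```python
def minimum_met(child_states, m):
    """Return True when at least m child resources in the set are ONLINE.

    child_states maps a (hypothetical) resource name to its state string.
    """
    online = sum(1 for state in child_states.values() if state == "ONLINE")
    return online >= m


# Example: one of three OCR resources is online.
ocr_states = {"ocr1": "OFFLINE", "ocr2": "FAULTED", "ocr3": "ONLINE"}
print(minimum_met(ocr_states, 1))  # True: the parent may be onlined
print(minimum_met(ocr_states, 2))  # False: the parent must wait
```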
‘atleast’ VCS Resource Dependency
In the VCS 6.2 release, a new dependency called ‘atleast’ was introduced. With an ‘atleast’ dependency, at least ‘m’ child resources from the set of ‘n’ child resources of a parent resource must be onlined before the parent resource is onlined. By the definition of a dependency, child resources must be onlined before the parent resource, and this holds for the ‘atleast’ dependency as well. The ‘atleast’ dependency adds the flexibility of onlining any ‘m’ (minimum) child resources from the total set of ‘n’ child resources. The examples that follow use the ora_grp group, which consists of a cssd resource and 5 ocr resources.
How are resources linked?
CLI:
# hares -link cssd ocr1,ocr2,ocr3,ocr4,ocr5 -min 1
main.cf:
group ora_grp (
.
.
)
.
.
... Resource definitions ...
.
.
cssd requires atleast 1 from ocr1,ocr2,ocr3,ocr4,ocr5
The minimum criteria must be at least 1 (the example above uses 1) and less than the total number of child resources in the set.
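For tooling that needs to read such a dependency back out of main.cf, the line can be parsed with a small regular expression. This is a sketch that assumes the syntax is exactly what the single example line above shows; the function name parse_atleast and the pattern are illustrative, not part of VCS.

```python
import re

# Pattern inferred only from the example line shown above (an assumption).
ATLEAST_RE = re.compile(
    r"^\s*(\S+)\s+requires\s+atleast\s+(\d+)\s+from\s+(.+?)\s*$"
)


def parse_atleast(line):
    """Return (parent, minimum, children) or None if the line does not match."""
    match = ATLEAST_RE.match(line)
    if match is None:
        return None
    parent, minimum, children = match.groups()
    names = [name.strip() for name in children.split(",") if name.strip()]
    return parent, int(minimum), names


print(parse_atleast("cssd requires atleast 1 from ocr1,ocr2,ocr3,ocr4,ocr5"))
```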
How are dependent resources displayed?
To maintain backward compatibility, an ‘atleast’ dependency is flattened and displayed in the existing format by default. With the ‘-atleast’ switch, it is displayed in the new format.
Without -atleast switch:
# hares -dep
#Group Parent Child
ora_grp cssd ocr5
ora_grp cssd ocr4
ora_grp cssd ocr3
ora_grp cssd ocr2
ora_grp cssd ocr1
With -atleast switch:
# hares -dep -atleast
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
In the VCS Java GUI and VOM, an ‘atleast’ dependency is shown as an existing dependency.
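The relationship between the two display formats can be modeled as two views of the same data. The sketch below is an illustration only: the function names are hypothetical, and the row order of the flattened view is not meant to reproduce the order VCS prints.

```python
def flat_rows(group, parent, children):
    """Flattened view: one Group/Parent/Child row per child resource."""
    return [f"{group} {parent} {child}" for child in children]


def atleast_row(group, parent, children, minimum):
    """Grouped view: a single row listing the whole set and its minimum."""
    return f"{group} {parent} {', '.join(children)}. Min = {minimum}"


children = ["ocr1", "ocr2", "ocr3", "ocr4", "ocr5"]
for row in flat_rows("ora_grp", "cssd", children):
    print(row)
print(atleast_row("ora_grp", "cssd", children, 1))
```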
How are resources unlinked?
Partial unlinking isn’t allowed. All child resources must be specified when unlinking an ‘atleast’ dependency. The order of the child resources does not matter.
# hares -dep -atleast
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
# hares -unlink cssd ocr1
VCS WARNING V-16-1-10279 Could not unlink cssd and ocr1. cssd and ocr1 are part of an 'atleast' dependency. Individual/partial unlinking is not allowed in 'atleast' dependency.
# hares -unlink cssd ocr1,ocr2,ocr4,ocr3,ocr5
# echo $?
0
# hares -dep -atleast
VCS WARNING V-16-1-50034 No Resource dependencies are configured
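The unlink rule shown above (all children required, order ignored) amounts to a set comparison. The following is a minimal model of that check, with a hypothetical function name; it does not represent how VCS implements it.

```python
def can_unlink(linked_children, requested_children):
    """Allow the unlink only when the request names every linked child.

    Order is ignored, matching the transcript where ocr4 is listed before ocr3.
    """
    return set(requested_children) == set(linked_children)


linked = ["ocr1", "ocr2", "ocr3", "ocr4", "ocr5"]
print(can_unlink(linked, ["ocr1"]))                                  # False
print(can_unlink(linked, ["ocr1", "ocr2", "ocr4", "ocr3", "ocr5"]))  # True
```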
Service group state computation with ‘atleast’ resource dependency
A fault or offline state of child resources in an ‘atleast’ dependency is tolerated as long as the minimum criteria of the dependency is met. The service group is still reported online.
Online operation of resources linked with ‘atleast’ dependency
The online operation of resources linked with an ‘atleast’ dependency can be described with the following pseudocode:
If (child resource is part of 'atleast' dependency)
Then
    If (Number of child resources online from set < m)
    Then
        Do not initiate online of parent resource.
    Else If (Number of child resources online from set == m)
    Then
        At least m child resources from set of n child resources are ONLINE.
        Initiate online of parent resource.
    Else If (Number of child resources online from set > m)
    Then
        Online of parent resource has already been initiated.
        Do nothing.
    End If
Else
    Existing dependency.
    Initiate online of parent resource if all other child resources have already completed online.
End If
Example: onlining the ora_grp service group, whose resources are linked with an ‘atleast’ dependency.
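The pseudocode above can be expressed as a small decision function that is evaluated each time a child resource finishes going online. This is a sketch under stated assumptions: the function name, its parameters, and the returned action strings are all invented for illustration.

```python
def on_child_online(online_count, m, is_atleast, all_children_online=False):
    """Decide what to do with the parent when a child completes its online.

    online_count: children currently online in the set (including this one).
    Returns one of the illustrative actions "wait", "initiate", "nothing".
    """
    if is_atleast:
        if online_count < m:
            return "wait"        # minimum not met yet
        if online_count == m:
            return "initiate"    # minimum just met: start the parent
        return "nothing"         # parent online was already initiated
    # Existing dependency: every child must be online first.
    return "initiate" if all_children_online else "wait"


# With Min = 1, the first child to come online triggers the parent.
print(on_child_online(1, 1, True))  # initiate
print(on_child_online(2, 1, True))  # nothing
```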
#Group Attribute System Value
ora_grp State vcslx545-vm1 |OFFLINE|
ora_grp State vcslx545-vm2 |OFFLINE|
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
# hagrp -online ora_grp -any
VCS NOTICE V-16-1-50735 Attempting to online group on system vcslx545-vm1
All child resources are started concurrently.
#Group Attribute System Value
ora_grp State vcslx545-vm1 |OFFLINE|STARTING|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd IState localclus:vcslx545-vm1 waiting for children online
cssd IState localclus:vcslx545-vm2 not waiting
cssd State localclus:vcslx545-vm1 OFFLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 IState localclus:vcslx545-vm1 waiting to go online
ocr1 IState localclus:vcslx545-vm2 not waiting
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 IState localclus:vcslx545-vm1 waiting to go online
ocr2 IState localclus:vcslx545-vm2 not waiting
ocr2 State localclus:vcslx545-vm1 OFFLINE
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 IState localclus:vcslx545-vm1 waiting to go online
ocr3 IState localclus:vcslx545-vm2 not waiting
ocr3 State localclus:vcslx545-vm1 OFFLINE
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 IState localclus:vcslx545-vm1 waiting to go online
ocr4 IState localclus:vcslx545-vm2 not waiting
ocr4 State localclus:vcslx545-vm1 OFFLINE
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 IState localclus:vcslx545-vm1 waiting to go online
ocr5 IState localclus:vcslx545-vm2 not waiting
ocr5 State localclus:vcslx545-vm1 OFFLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
As soon as the minimum criteria is met, the parent resource is started, even if some child resources are still in the process of onlining.
#Group Attribute System Value
ora_grp State vcslx545-vm1 |PARTIAL|STARTING|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd IState localclus:vcslx545-vm1 waiting to go online
cssd IState localclus:vcslx545-vm2 not waiting
cssd State localclus:vcslx545-vm1 OFFLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 IState localclus:vcslx545-vm1 waiting to go online
ocr1 IState localclus:vcslx545-vm2 not waiting
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 IState localclus:vcslx545-vm1 not waiting
ocr2 IState localclus:vcslx545-vm2 not waiting
ocr2 State localclus:vcslx545-vm1 ONLINE
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 IState localclus:vcslx545-vm1 waiting to go online
ocr3 IState localclus:vcslx545-vm2 not waiting
ocr3 State localclus:vcslx545-vm1 OFFLINE
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 IState localclus:vcslx545-vm1 waiting to go online
ocr4 IState localclus:vcslx545-vm2 not waiting
ocr4 State localclus:vcslx545-vm1 OFFLINE
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 IState localclus:vcslx545-vm1 waiting to go online
ocr5 IState localclus:vcslx545-vm2 not waiting
ocr5 State localclus:vcslx545-vm1 OFFLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
When the parent resource completes its online with the minimum criteria met, the service group is reported ONLINE, even though some child resources are still in the process of onlining.
#Group Attribute System Value
ora_grp State vcslx545-vm1 |ONLINE|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd IState localclus:vcslx545-vm1 not waiting
cssd IState localclus:vcslx545-vm2 not waiting
cssd State localclus:vcslx545-vm1 ONLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 IState localclus:vcslx545-vm1 waiting to go online
ocr1 IState localclus:vcslx545-vm2 not waiting
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 IState localclus:vcslx545-vm1 not waiting
ocr2 IState localclus:vcslx545-vm2 not waiting
ocr2 State localclus:vcslx545-vm1 ONLINE
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 IState localclus:vcslx545-vm1 not waiting
ocr3 IState localclus:vcslx545-vm2 not waiting
ocr3 State localclus:vcslx545-vm1 ONLINE
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 IState localclus:vcslx545-vm1 waiting to go online
ocr4 IState localclus:vcslx545-vm2 not waiting
ocr4 State localclus:vcslx545-vm1 OFFLINE
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 IState localclus:vcslx545-vm1 not waiting
ocr5 IState localclus:vcslx545-vm2 not waiting
ocr5 State localclus:vcslx545-vm1 ONLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
Offline operation/Fault of resources linked with ‘atleast’ dependency
The offline operation or fault of resources linked with an ‘atleast’ dependency can be described with the following pseudocode:
If (parent resource is ONLINE)
Then
    If (dependency type == 'atleast')
    Then
        If still at least 'm' child resources are ONLINE from set of 'n' child resources
        Then
            Parent resource can continue to be ONLINE.
            No action required.
        Else
            Parent resource cannot continue to be ONLINE.
            Take action according to *rigidity of dependency.
        End If
    Else
        Existing dependency.
        Take action according to *rigidity of dependency.
    End If
End If
Example: offlining child resources while the parent resource is still online.
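The same decision can be written as a short function that runs after a child resource goes offline or faults. It is only a model of the pseudocode above: the names and the action strings "none" and "apply_rigidity" are hypothetical, and what the rigidity of the dependency actually does is left out.

```python
def on_child_down(parent_online, is_atleast, online_count, m):
    """Decide the action after a child resource goes offline or faults.

    online_count: children of the set still ONLINE after the event.
    """
    if not parent_online:
        return "none"            # nothing depends on the child right now
    if is_atleast and online_count >= m:
        return "none"            # minimum still met: parent stays ONLINE
    # Minimum violated, or an existing (non-'atleast') dependency.
    return "apply_rigidity"


# Min = 1 with five children: four can go down without affecting the parent.
print(on_child_down(True, True, 1, 1))  # none
print(on_child_down(True, True, 0, 1))  # apply_rigidity
```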
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
#Group Attribute System Value
ora_grp State vcslx545-vm1 |ONLINE|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd Critical localclus 1
cssd State localclus:vcslx545-vm1 ONLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 Critical localclus 1
ocr1 State localclus:vcslx545-vm1 ONLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 Critical localclus 1
ocr2 State localclus:vcslx545-vm1 ONLINE
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 Critical localclus 1
ocr3 State localclus:vcslx545-vm1 ONLINE
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 Critical localclus 1
ocr4 State localclus:vcslx545-vm1 ONLINE
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 Critical localclus 1
ocr5 State localclus:vcslx545-vm1 ONLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
# hares -offline ocr1 -sys vcslx545-vm1
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
#Group Attribute System Value
ora_grp State vcslx545-vm1 |ONLINE|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd Critical localclus 1
cssd State localclus:vcslx545-vm1 ONLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 Critical localclus 1
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 Critical localclus 1
ocr2 State localclus:vcslx545-vm1 ONLINE
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 Critical localclus 1
ocr3 State localclus:vcslx545-vm1 ONLINE
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 Critical localclus 1
ocr4 State localclus:vcslx545-vm1 ONLINE
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 Critical localclus 1
ocr5 State localclus:vcslx545-vm1 ONLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
As shown in the snippet above, the offline of a critical resource was allowed while the parent was online, and the service group is still reported ONLINE. Similarly, a fault of critical resources is tolerated as long as the minimum criteria is met. In the output below, 3 critical resources (ocr2, ocr3, and ocr4) have faulted, but the parent stays online and the service group is still reported ONLINE.
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
#Group Attribute System Value
ora_grp State vcslx545-vm1 |ONLINE|
ora_grp State vcslx545-vm2 |OFFLINE|
#Resource Attribute System Value
cssd Critical localclus 1
cssd State localclus:vcslx545-vm1 ONLINE
cssd State localclus:vcslx545-vm2 OFFLINE
#
ocr1 Critical localclus 1
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 OFFLINE
#
ocr2 Critical localclus 1
ocr2 State localclus:vcslx545-vm1 FAULTED
ocr2 State localclus:vcslx545-vm2 OFFLINE
#
ocr3 Critical localclus 1
ocr3 State localclus:vcslx545-vm1 FAULTED
ocr3 State localclus:vcslx545-vm2 OFFLINE
#
ocr4 Critical localclus 1
ocr4 State localclus:vcslx545-vm1 FAULTED
ocr4 State localclus:vcslx545-vm2 OFFLINE
#
ocr5 Critical localclus 1
ocr5 State localclus:vcslx545-vm1 ONLINE
ocr5 State localclus:vcslx545-vm2 OFFLINE
If offlining a child resource would violate the minimum criteria, VCS rejects the command.
# hares -offline ocr5 -sys vcslx545-vm1
VCS WARNING V-16-1-10287 Online resources depend on resource ocr5. Take them offline first
# echo $?
1
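The rejection above can be modeled as a check performed before the offline is carried out. This sketch assumes a hypothetical offline_allowed helper and a plain dictionary of child states; it mirrors the behavior in the transcripts rather than any actual VCS code.

```python
def offline_allowed(child, child_states, m, parent_online):
    """Return True if offlining `child` keeps at least m children ONLINE."""
    if not parent_online or child_states.get(child) != "ONLINE":
        return True  # no online parent to protect, or child is not online
    remaining = sum(
        1 for name, state in child_states.items()
        if state == "ONLINE" and name != child
    )
    return remaining >= m


# State taken from the example: only ocr5 is still online on vcslx545-vm1.
states = {"ocr1": "OFFLINE", "ocr2": "FAULTED", "ocr3": "FAULTED",
          "ocr4": "FAULTED", "ocr5": "ONLINE"}
print(offline_allowed("ocr5", states, 1, True))  # False: command is rejected
```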
If the fault of a child resource violates the minimum criteria, VCS takes corrective action and fails over the service group. When the last online resource, ocr5, faults, ora_grp is failed over to the peer system.
#Group Parent Child
ora_grp cssd ocr1, ocr2, ocr3, ocr4, ocr5. Min = 1
#Group Attribute System Value
ora_grp State vcslx545-vm1 |OFFLINE|FAULTED|
ora_grp State vcslx545-vm2 |ONLINE|
#Resource Attribute System Value
cssd Critical localclus 1
cssd State localclus:vcslx545-vm1 OFFLINE
cssd State localclus:vcslx545-vm2 ONLINE
#
ocr1 Critical localclus 1
ocr1 State localclus:vcslx545-vm1 OFFLINE
ocr1 State localclus:vcslx545-vm2 ONLINE
#
ocr2 Critical localclus 1
ocr2 State localclus:vcslx545-vm1 FAULTED
ocr2 State localclus:vcslx545-vm2 ONLINE
#
ocr3 Critical localclus 1
ocr3 State localclus:vcslx545-vm1 FAULTED
ocr3 State localclus:vcslx545-vm2 ONLINE
#
ocr4 Critical localclus 1
ocr4 State localclus:vcslx545-vm1 FAULTED
ocr4 State localclus:vcslx545-vm2 ONLINE
#
ocr5 Critical localclus 1
ocr5 State localclus:vcslx545-vm1 FAULTED
ocr5 State localclus:vcslx545-vm2 ONLINE