Customizing the ISCAgent User/Group on UNIX®/Linux Systems

4.1 Configuring Mirroring

4.1.12 Customizing the ISCAgent User/Group on UNIX®/Linux Systems

When ISCAgent is installed on UNIX®/Linux, an OS user and group named iscagent are created to serve as the default user and group for the agent process (similar to the way cacheusr serves as the user and group name for Caché processes). The main difference is that iscagent is explicitly a “nobody” user/group, with no specific permissions. Access to the protected resources that iscagent needs is either established when the agent is started by root, before it drops privileges, or granted by the Caché instances it uses via cuxagent.

To change the user/group that the agent runs as on UNIX®/Linux, do the following:

1. Create (or edit) the file named /etc/iscagent/iscagent.conf.

42 Caché High Availability Guide Mirroring

2. Add (or edit) the following lines, replacing <username> with a valid username and <groupname> with a valid group name:

privileges.user=<username>

privileges.group=<groupname>[,<groupname>[,...]]

Note: You can specify multiple, comma-separated <groupname>s in the privileges.group parameter. This is useful, for example, for sites where an instance requires multiple group permissions to execute cuxagent.
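As an illustration of the file format described above, the following Python sketch renders the two privileges lines for a given user and list of groups. The helper name render_privileges and the names mirroragent and mirrorgrp are hypothetical and used only for this example; the requirement that at least one group be supplied is an assumption of the sketch, not a documented rule.

```python
from typing import Sequence


def render_privileges(user: str, groups: Sequence[str]) -> str:
    """Return the privileges.* lines for /etc/iscagent/iscagent.conf.

    Hypothetical helper for illustration; it only formats text and does
    not check that the user or groups exist on the system.
    """
    if not user:
        raise ValueError("a user name is required")
    if not groups or any(not g for g in groups):
        raise ValueError("at least one non-empty group name is required")
    # Multiple groups are written as a single comma-separated value.
    return f"privileges.user={user}\nprivileges.group={','.join(groups)}\n"


# Example with placeholder names (not defaults shipped with ISCAgent).
print(render_privileges("mirroragent", ["mirrorgrp", "cacheusr"]), end="")
```

The output is the text to place in iscagent.conf; the second group in the example stands in for the case in which executing cuxagent requires an additional group permission.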

4.1.13 Mirror Tunable Parameters

In addition to the typical configuration options, such as mirror name, IP address, etc., mirroring lets you modify the system tunable parameters via the Advanced Settings section of the [System] > [Configuration] > [Edit Mirror] page and/or the ^MIRROR routine. The mirror system tunable parameters are listed in the following table:

Table 4–1: Mirror System Tunable Options

Tunable Parameter                      Default Value
Quality of Service Timeout             2000 milliseconds
Acknowledgment mode                    Ack (Received)
Agent Contact Required for Failover    Yes
Trouble Timeout Limit *                The larger of:
                                       • 5000 milliseconds
                                       • 3 * QoS Timeout

* This parameter is adjusted via the Adjust Trouble Timeout parameter option from the Mirror Configuration main menu list of the ^MIRROR routine on the running primary failover member.
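The default Trouble Timeout Limit in the table can be worked out as follows. This is a minimal Python sketch of the arithmetic only; the function name is hypothetical and is not part of Caché or the ^MIRROR routine.

```python
def default_trouble_timeout_ms(qos_timeout_ms: int) -> int:
    """Default Trouble Timeout Limit: the larger of 5000 ms and 3 * QoS Timeout."""
    return max(5000, 3 * qos_timeout_ms)


# With the default QoS Timeout of 2000 ms, 3 * 2000 = 6000 ms exceeds 5000 ms.
print(default_trouble_timeout_ms(2000))
# With a QoS Timeout of 1000 ms, 3 * 1000 = 3000 ms, so the 5000 ms floor applies.
print(default_trouble_timeout_ms(1000))
```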

For more information, see the following subsections:

• Quality of Service (QoS) Timeout

• Acknowledgment (Ack) Mode

• Agent Contact Required for Failover

• Trouble Timeout Limit

4.1.13.1 Quality of Service (QoS) Timeout

The QoS Timeout parameter indicates the maximum time, in milliseconds, that processes on the primary failover member wait for data to be acknowledged by the backup failover member. If the backup does not respond within the QoS Timeout, it is demoted from active status to catchup mode, which indicates the backup must perform additional work to catch up

(across town, across the state, across the country, etc.), then the QoS Timeout should be adjusted based on the average round-trip latency between the two locations.

In addition, if the acknowledgment mode is set to Committed, the QoS Timeout parameter should include the average amount of time required by the dejournal process on the backup to write the data to disk.

Note: If the QoS Timeout period expires, the primary failover member enters a “trouble state,” where it remains until it can determine that the backup failover member knows that it is no longer active, or the Trouble Timeout Limit expires (see Trouble Timeout Limit in this section). Therefore, the maximum interruption to service on the primary failover member is the sum of the QoS Timeout and the Trouble Timeout Limit.
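As a worked example of the sum described in the note, the following Python sketch adds the two timeouts. The function name is hypothetical; the 6000 ms value assumes the default Trouble Timeout Limit that corresponds to a 2000 ms QoS Timeout.

```python
def max_primary_interruption_ms(qos_timeout_ms: int, trouble_timeout_limit_ms: int) -> int:
    """Worst-case interruption on the primary: QoS Timeout plus Trouble Timeout Limit."""
    return qos_timeout_ms + trouble_timeout_limit_ms


# Defaults: 2000 ms QoS Timeout + 6000 ms Trouble Timeout Limit = 8000 ms.
print(max_primary_interruption_ms(2000, 6000))
```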

4.1.13.2 Acknowledgment (Ack) Mode

While the backup failover member (and each connected async member) always acknowledges receipt of data (journal updates or keep-alive messages) from the primary, the Acknowledgment mode tunable parameter controls the behavior of the primary during a “Mirror Synchronization” process (see the Mirror Synchronization section in this chapter). The acceptable values are:

• Received (default): An acknowledgment is expected by the primary failover member; the acknowledgment should be generated by the backup failover member upon receipt of data.

• Committed: An acknowledgment is expected by the primary failover member; the acknowledgment should be generated by the backup failover member upon writing the update to the mirror journal file on disk. This is analogous to a synchronous commit across the mirror.

Note: If the acknowledgment mode is set to Committed, the QoS Timeout parameter should be adjusted as described in Quality of Service (QoS) Timeout in this section.

4.1.13.3 Agent Contact Required for Failover

The Agent Contact Required for Failover tunable parameter controls the behavior of the failover process on the backup failover member in relation to the availability of the ISCAgent on the primary failover member. Possible values include:

• Yes (default) — The backup failover member does not attempt to take over as primary if it is unable to communicate with the ISCAgent process on the failed primary system.

• No — The backup failover member continues the takeover process even if the ISCAgent process on the primary system is unavailable, subject to the following conditions:

– The ^ZMIRROR user-defined routine (see ^ZMIRROR User-defined Routine in this chapter) must be run in the %SYS namespace.

– The $$IsOtherNodeDown^ZMIRROR() procedure must exist. If the procedure returns 0 (False) or if the procedure does not exist, the backup failover member aborts the takeover process.

– The backup failover member must verify that the primary failover member is down within a specified period of time (see the “Backup Failover Member” bullet item in the Trouble Timeout Limit section of this chapter).
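The interaction between this setting, the ISCAgent, and $$IsOtherNodeDown^ZMIRROR() can be summarized as a small decision function. The following Python sketch is illustrative only and is not how Caché implements failover; the function and parameter names are hypothetical, and the branch for a reachable agent follows the description of IsOtherNodeDown in the ^ZMIRROR discussion in this chapter.

```python
from typing import Callable, Optional


def backup_may_take_over(
    agent_says_primary_down: Optional[bool],
    agent_contact_required: bool,
    is_other_node_down: Callable[[], bool],
) -> bool:
    """Decide whether the backup may continue the takeover process.

    agent_says_primary_down is None when the ISCAgent on the primary
    cannot be contacted; otherwise it carries the agent's answer.
    """
    if agent_says_primary_down is not None:
        # Agent reachable: its answer is used and IsOtherNodeDown is ignored.
        return agent_says_primary_down
    if agent_contact_required:
        # Agent Contact Required for Failover = Yes: failover is aborted.
        return False
    # Agent Contact Required for Failover = No: rely on the user-defined check.
    return bool(is_other_node_down())


# Agent unreachable, setting is No, and the user-defined check reports "down".
print(backup_may_take_over(None, False, lambda: True))
```

Only the last branch depends on the user-defined routine, which is why an unreliable IsOtherNodeDown implementation risks dual primaries when the setting is No.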

4.1.13.4 Trouble Timeout Limit

The Trouble Timeout Limit parameter is used by both failover members, as follows:

• Primary Failover Member: The maximum time, in milliseconds, the primary failover member waits for the backup failover member to recognize that it has entered an inactive state. During this time, the primary failover member enters a “trouble state” and no data is written to the journal, thus ensuring that the failover members remain synchronized if the primary goes down. The primary remains in this state until it can determine that the backup failover member knows that it is no longer active, or the Trouble Timeout Limit expires.

Note: If the backup failover member is stopped gracefully, it notifies the primary that it is exiting and the primary does not enter the trouble state. However, if the backup crashes or loses connection with the primary, the primary failover member enters the trouble state, where it remains until the trouble timeout limit expires or the connection is re-established; the maximum interruption to service on the primary failover member is the sum of the QoS Timeout and the Trouble Timeout Limit (see Quality of Service (QoS) Timeout in this section).

• Backup Failover Member: The maximum time, in milliseconds, the backup failover member has to determine that the primary failover member is not active; that is, the time that elapses between receipt of the last message from the primary failover member and the point at which the backup can determine that the primary failover member is down.

Note: The backup failover member cannot become the primary until it verifies that the primary failover member is down.

The value specified for the Trouble Timeout Limit should allow enough time for the backup failover member to determine whether or not the primary is running:

• For systems configured with Agent Contact Required for Failover = Yes, this is the time to contact the ISCAgent on the other (primary) failover member (typically two or three seconds).

• For systems configured with Agent Contact Required for Failover = No, the time should reflect the time it takes to obtain the result, including the time to execute the code in $$IsOtherNodeDown^ZMIRROR() (see Agent Contact Required for Failover in this section).

4.1.14 ^ZMIRROR User-defined Routine

As previously noted, setting the Agent Contact Required for Failover tunable parameter to No causes the backup failover member to take over even if the ISCAgent process on the primary system cannot be contacted. This enables continuous database operation even when the node hosting the primary failover member is down, the operating system has crashed, or some other such factor has taken the primary system entirely out of operation.

When this option is in use, however, it is necessary to verify that the primary node is actually down to avoid dual primaries. Therefore, failover cannot occur when Agent Contact Required for Failover=No unless

• the ^ZMIRROR user-defined routine exists and can be run in the %SYS namespace

• the $$IsOtherNodeDown^ZMIRROR() procedure exists and returns 1 (True) if the primary system is unavailable

Note: When using Agent Contact Required for Failover=No, InterSystems recommends that the ^ZMIRROR routine be implemented on both failover members before mirroring is enabled. Otherwise, a restart may be required after ^ZMIRROR is implemented.

4.1.14.1 ^ZMIRROR Entry Points

The ^ZMIRROR user-defined routine contains the following entry points. All provide appropriate defaults if they are

Note: InterSystems does not provide technology to determine whether or not the other node is actually down; contact the InterSystems Worldwide Response Center (WRC) for assistance regarding third-party technology that can be used to make this determination.

$$IsOtherNodeDown^ZMIRROR() can be called multiple times during a single “event” and in different circumstances (for example, when the backup loses its connection to the primary, when an instance is starting up, as part of a retry loop when something has gone wrong, and so on). If called when the agent on the other node can be contacted, the result from IsOtherNodeDown is ignored in favor of the answer returned from the ISCAgent; if called when Agent Contact Required for Failover=Yes and the agent on the other node cannot be contacted, the result is ignored because failover is aborted. Therefore, IsOtherNodeDown is significant only in cases in which the node is actually down (powered off) and the agent cannot be contacted.

The commented sample ^ZMIRROR routine provided later in this chapter shows one possible implementation of IsOtherNodeDown using a "ping strategy" to determine whether the primary system is down. There are a number of possible implementation strategies based on elements such as SCSI reservations, shared quorum disk-based file-locking, and so on.

Note: In cases in which the node is up but the agent cannot respond (for example, the agent process was killed), the result returned by IsOtherNodeDown prevents failover and manual intervention is necessary for the backup member to become the primary.
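The decision logic of the ping strategy, as outlined in the comments of the sample routine, can be sketched in Python as follows. This is a simplified illustration rather than a translation of the ObjectScript sample: the function name, the injected ping callable, and the example addresses (taken from a documentation address range) are assumptions made for this sketch.

```python
from typing import Callable, Sequence


def is_other_node_down(
    known_host: str,
    other_node_addresses: Sequence[str],
    ping: Callable[[str], bool],
) -> bool:
    """Return True only when the other node can be treated as down."""
    # If a trusted host is unreachable, this node may be isolated, so the
    # other node is assumed to be up (avoids split brain).
    if not ping(known_host):
        return False
    # With fewer than two addresses, a single NIC failure could make the
    # other node look down, so it is treated as up.
    if len(other_node_addresses) < 2:
        return False
    # Any address that answers means the other node is up.
    for address in other_node_addresses:
        if ping(address):
            return False
    return True


# Stub ping for demonstration: only the trusted host answers.
reachable = {"192.0.2.1"}
print(is_other_node_down("192.0.2.1", ["192.0.2.10", "192.0.2.11"], lambda h: h in reachable))
```

Passing the ping function as a parameter keeps the sketch testable without network access; a real implementation must use a check that is reliable for the site's network.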

• $$CanNodeStartToBecomePrimary^ZMIRROR(): This procedure is called when an instance is about to begin the process of becoming the primary member. The instance has determined that the other instance is not currently the primary and that it is eligible to become the primary.

As a general rule, this entry point is not used in a ^ZMIRROR routine. However, sites that wish to block failover members from automatically becoming the primary (either at startup or when connected as the backup) can include logic here to do so. If this entry point returns 0 (False), the instance enters a retry loop in which it continues to call $$CanNodeStartToBecomePrimary^ZMIRROR() every 30 seconds until it either returns 1 (True) or detects that the other node has become the primary (at which point the local node becomes the backup).

• $$CheckBecomePrimaryOK^ZMIRROR(): This procedure is called immediately before a system becomes the primary failover member, but before any work/updating is done on that system. If this procedure exists and returns 0 (False), the startup sequence is aborted and this node does not become the primary failover member. Acceptable return values: 1 (True); 0 (False)

$$CheckBecomePrimaryOK^ZMIRROR() is called after the instance is fully initialized as the primary failover member, all mirrored databases are read/write, ECP sessions have been recovered or rolled back, and local transactions (if any) from the former primary have been rolled back. No new work has been done because users are not allowed to log in, superserver connections are blocked, and ECP is still in a recovery state.

This is where you can start any local processes or do any initialization required to prepare the application environment for users. If CheckBecomePrimaryOK returns False (0), the instance aborts the process of becoming the primary member and returns to an idle state.

Note: If CheckBecomePrimaryOK returns False, ECP sessions are reset. When a node succeeds in becoming the primary, the ECP client reconnects and ECP transactions are rolled back (rather than preserved). Client jobs receive <NETWORK> errors until a TRollback command is explicitly executed (see the ECP Rollback Only Guarantee section in the “ECP Recovery Guarantees and Limitations” appendix of the Caché Distributed Data Management Guide).

In general, CheckBecomePrimaryOK is successful; however, if there are “common cases” in which a node does not become the primary member, they should be handled in CanNodeStartToBecomePrimary rather than CheckBecomePrimaryOK.


• NotifyBecomePrimary^ZMIRROR(): This procedure is executed for informational purposes after a system has successfully assumed the role of primary failover member. Acceptable return values: N/A

NotifyBecomePrimary^ZMIRROR() is called at the very end of the process of becoming the primary failover member (that is, after users have been allowed on and ECP sessions, if any, have become active). This entry point does not return a value. You can include code to generate any notifications or enable application logins if desired.

• NotifyBecomePrimaryFailed^ZMIRROR(): This procedure is executed for informational purposes when a system fails to assume the role of primary failover member. Acceptable return values: N/A

NotifyBecomePrimaryFailed^ZMIRROR() is called:

– When a failover member starts up and fails to become the primary or backup member.

– When the backup detects that the primary has failed and the backup fails to take over for the primary.

This entry point is called only once per incident. If the node becomes the backup again (that is, connects to the primary), then NotifyBecomePrimaryFailed is called again if the backup fails to take over as the primary; however, once it is called, it is not called again until the node either becomes the primary or the primary failover member is detected.
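The retry behavior described above for $$CanNodeStartToBecomePrimary^ZMIRROR() can be pictured with the following Python sketch. It models only the control flow stated in the text (retry every 30 seconds until the entry point returns true or the other node is seen to be primary); the function names, return strings, and injected callables are hypothetical and do not correspond to Caché APIs.

```python
import time
from typing import Callable

RETRY_INTERVAL_SECONDS = 30  # retry interval stated in the text


def wait_to_start_becoming_primary(
    can_node_start: Callable[[], bool],
    other_node_is_primary: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Model the retry loop around CanNodeStartToBecomePrimary."""
    while not can_node_start():
        sleep(RETRY_INTERVAL_SECONDS)
        if other_node_is_primary():
            # The other node took over first: the local node becomes the backup.
            return "become backup"
    return "continue becoming primary"


# Demonstration with stubs: the entry point refuses twice, then agrees.
answers = iter([False, False, True])
waits = []
result = wait_to_start_becoming_primary(lambda: next(answers), lambda: False, waits.append)
print(result, waits)
```

Injecting the sleep function lets the loop be exercised without actually waiting 30 seconds between attempts.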

4.1.14.2 Sample ^ZMIRROR Routine

A commented sample implementation of ^ZMIRROR is provided in the following:

ZMIRROR ;
    quit ;don't enter at the top
#include %occStatus
#include %syMirror
#ifndef FailoverMemberType
#define FailoverMemberType 0
#endif
/*
  THIS ROUTINE IS PROVIDED AS-IS, WITHOUT WARRANTIES. IT IS MEANT TO BE AN
  EXAMPLE/SAMPLE ZMIRROR ROUTINE. YOU SHOULD TAILOR THE ZMIRROR ROUTINE BASED
  ON YOUR INFRASTRUCTURE AND CONFIGURATION.

  FOR EXAMPLE, IF THE 2 FAILOVER MEMBERS AREN'T CONNECTED VIA A RELIABLE,
  REDUNDANT NETWORK, THEN YOU MUST *NOT* SET AGENTCONTACTREQUIRED TO FALSE
  (I.E., ALWAYS RUN WITH THE DEFAULT AGENTCONTACTREQUIRED=TRUE) - THIS IS
  BECAUSE YOU CANNOT DEFINITIVELY TELL WHETHER THE NETWORK BETWEEN THE 2
  SYSTEMS IS DOWN, OR WHETHER THE OTHER SYSTEM ITSELF IS DOWN.

  IF, HOWEVER, YOU HAVE A RESILIENT, RELIABLE, REDUNDANT NETWORK BETWEEN THE
  2 SYSTEMS, THEN YOU MAY EXPLORE THE POSSIBILITY OF SETTING
  AGENTCONTACTREQUIRED TO FALSE. IN THIS CASE, YOU MUST PROVIDE AN ADEQUATE
  IMPLEMENTATION IN THE ^ZMIRROR ROUTINE. NAMELY, YOU MUST IMPLEMENT AN
  APPROPRIATE IsOtherNodeDown^ZMIRROR() FUNCTION WHICH WILL DEFINITIVELY BE
  ABLE TO TELL WHETHER OR NOT THE OTHER NODE IS DOWN (IF YOU CANNOT
  PROGRAMMATICALLY DEFINITIVELY TELL WHETHER THE OTHER NODE IS DOWN, YOU
  *MUST* ASSUME IT IS UP AND SUBSEQUENTLY PERFORM A MANUAL FAILOVER).

  ALSO, YOUR IMPLEMENTATION OF ^ZMIRROR (SHOULD YOU CHOOSE TO IMPLEMENT ONE)
  SHOULD HAVE ADEQUATE ERROR TRAPPING.

  Here's the general algorithm for the code in the example:

  1. Try to determine if this node (the one trying to become primary) is
     isolated (i.e., if it can or cannot access other machines).
     Do this by:
     a. If an IP address or FQDN is set in the ^ZMIRROR("CONFIG","KNOWN-IP")
        global, try to ping it. This could be set to a machine that's highly
        available inside the network - i.e., a machine that is trusted to
        be up.
     b. If no IP is set in ^ZMIRROR("CONFIG","KNOWN-IP"), try to ping
        www.google.com. This assumes that the system can talk to the
        internet and PING is allowed through the firewall.

     These mechanisms are trying to determine whether the local system has
     become isolated from the network or not. If the local machine is no
     longer "on the network" then it cannot use a ping to determine whether
     the other node is up or down so it is very important to set the
     ^ZMIRROR("CONFIG","KNOWN-IP") node to an appropriate internal machine's
     FQDN (or IP).

  3. The sample then accesses the Mirror configuration via the Config.Mirror*
     classes, and extracts all of the IP addresses which belong to the node
     being tested.
     These are the:
        Mirror Private address
        ECP address
        Public (SuperServer) address
     It is best if there are multiple addresses which are assigned to
     different NICs so that the failure of a single network card does not
     make the node appear to be down. If fewer than 2 are configured, we
     pretend that the other system is up.

  4. Each IP address is then PINGed and if the ping returns, the other node
     is considered to be UP.

  5. If all the IPs have been exhausted (and the system wasn't reachable on
     any of the configured IPs)
     a) If only 1 IP was configured, return "Other node is UP"
     b) If more than 1 IP was configured, return "OTHER NODE IS DOWN"

  The sample logs the results of various stages in the ^ZMIRRORINFO global.
  This can be viewed using the Global Explorer in the System Management
  Portal or by issuing 'zw ^ZMIRRORINFO' in the %SYS namespace.
*/

IsOtherNodeDown() PUBLIC {
    quit 0 ;Remove this after you have tailored the routine to the specific configuration.
    set $zt="IsOtherNodeDownErr"
    do logMsg("IsOtherNodeDown^ZMIRROR() invoked")
    set isdown=0
    #;Check if the ^ZMIRROR("CONFIG","KNOWN-IP") node contains an
    #;address of a node on this network
    #;if none specified, try google.com. This will only work if this
    #;system can talk to the internet
    #;set anotherSystemIP to an IP of a node in the network:
    set anotherSystemIP=$get(^ZMIRROR("CONFIG","KNOWN-IP"),"www.google.com")
    #;Check if anotherSystemIP is reachable - if not, we're probably
    #;isolated, so we should assume that the other node is up
    #;(to prevent split brain) and ABORT
    if '##class(%SYSTEM.INetInfo).CheckAddressExist(anotherSystemIP) {
        set msg="IsOtherNodeDown^ZMIRROR() could not reach external IP "_anotherSystemIP_" - we *may* be ISOLATED. Abort takeover"
        do logMsg(msg)
        goto IsOtherNodeDownDone
    }
    set msg="IsOtherNodeDown^ZMIRROR() was able to reach to external IP "_anotherSystemIP_" - we're not isolated, so continuing"
    do logMsg(msg)
    #;Next, pull out all the info from the various configuration pieces that
    #;are needed for us to talk to the other node.
    #;This is version specific (since the implementation for storage of
    #;mirror information in the CPF file changed in 2012.2).
    set majorVers=##class(%SYSTEM.Version).GetMajor()
    set mirMemberConfig=##class(Config.MirrorMember).Open()
    if '$IsObject(mirMemberConfig) {
        set msg="IsOtherNodeDown^ZMIRROR() couldn't open Mirror Member config."_"Abort takeover"
        do logMsg(msg)
        goto IsOtherNodeDownDone
    }
    set ourMirName=mirMemberConfig.SystemName
    if majorVers<2012 {
        set rs=##class(%Library.ResultSet).%New("Config.MirrorSetMembers:List")
        set rc=rs.Execute()
    } else {
        set rs=##class(%Library.ResultSet).%New("Config.MapMirrors:List")
        set rc=rs.Execute($System.Mirror.MirrorName())
    }
    if $$$ISERR(rc) {
        set msg="IsOtherNodeDown^ZMIRROR() couldn't open Mirror Set Member config."_"Abort takeover"
        do logMsg(msg)
        goto IsOtherNodeDownDone
    }
    s found=0
    while (rs.Next()) {
        set name=rs.Data("Name")
        quit:name="" ;out of mirror members
        if (name'=ourMirName) {
            #; Prior to 2012 only the failover members were listed in the MirrorSetMember
            #; list. Starting in 2012 all mirror members are listed and there is a member
            #; type field. In both cases there can only be two failover members and we've
            #; already filtered ourself out above so the next failover member we find
            #; is the system we're looking for.
            if majorVers<2012 {
                set MemberType=$$$FailoverMemberType ;see %syMirror.inc
            } else {
                set MemberType=rs.Data("MemberType") ;type values are in %syMirror.inc
            }
            if MemberType=$$$FailoverMemberType {
                set found=1
                set agentip=rs.Data("AgentAddress")
