XI3.0/PI 7.0: Runtime Load Balancing Control in HA Environment
Description
This blog describes the load balancing mechanisms of XI3.0/PI7.0 in a High Availability environment. It provides a holistic overview of runtime load distribution for the XI/PI ABAP application servers, Java server nodes, Adapter Engine and SAP Web Dispatcher. It covers not only RFC load balancing but also message flow and mapping requests.
Prerequisite
You are using XI3.0 or PI7.0 and have activated different load balancing methods for different resources in the dual-stack system. Now you want to check whether these methods actually take effect at runtime and whether the implemented load distribution strategy is correct. You may also want on-demand load balancing by tuning some parameters at runtime.
System Overall Capacity
Each application server has a capacity value for each of the services it provides (ABAP, Java). The capacity is an estimate of the actual "power" of an application server. Since it is safe to assume that more dialog work processes and server nodes are configured on more powerful machines, this number can be treated as an approximate benchmark. The SAP Web Dispatcher needs the capacity of each server in order to balance the workload.
A list of all application servers with their capacity values can be retrieved from the URL:
http://
(The message server port is defined by parameter 'ms/server_port_
Here the 'LB=xx' following 'DIAG' indicates the ABAP capacity value and the 'LB=xx' following 'J2EE' indicates the Java capacity value. The numeric value after 'LB' on the 'DIAG' line equals the number of dialog work processes on the application server, and the numeric value after 'LB' on the 'J2EE' line equals the number of server nodes. The Web Dispatcher takes the larger of the two values as the standard capacity setting for this application server.
Although these values exactly reflect the pre-configured system resources (dialog work processes or server nodes), they can be changed on demand while the system is running (for example, when a customer does not want the central instance to take too much load). This is discussed in the 'Web Dispatcher' part.
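As a rough illustration of this rule, the following Python sketch picks the larger of the two LB values for one application server. The line layout of the sample entries (service name, host, port, then LB=xx) and the host name are assumptions made for the example, not the documented message server output format.

```python
import re

# Matches a line that starts with DIAG or J2EE and carries an LB=<number> token.
# The exact line layout is an assumption for this sketch.
LB_PATTERN = re.compile(r"^(DIAG|J2EE)\b.*?\bLB=(\d+)")


def standard_capacity(server_lines):
    """Return the larger of the DIAG and J2EE LB values for one server."""
    values = {}
    for line in server_lines:
        match = LB_PATTERN.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    # A server without any LB entry gets capacity 0 in this sketch.
    return max(values.values(), default=0)


# Hypothetical entries for one application server.
sample = [
    "DIAG host1.example.com 3200 LB=20",
    "J2EE host1.example.com 50000 LB=4",
]
print(standard_capacity(sample))  # prints 20
```

With 20 dialog work processes and 4 server nodes, the sketch reports 20, which is the value the Web Dispatcher would use as the standard capacity according to the description above.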
Web Dispatcher
In order to perform load balancing, the SAP Web Dispatcher periodically fetches a list of all active application servers of an SAP system. This list includes the host names as well as a static value indicating the capacity of each server (see the 'System Overall Capacity' part). Two main factors therefore determine how the SAP Web Dispatcher balances load: the load balancing strategy and the server capacity.
1. Load Balancing Strategy
You can configure the details of the load balancing strategy in the profile parameter wdisp/load_balancing_strategy. Two alternative strategies can be defined:
a. Simple Weighted Round Robin
Parameter value: simple_weighted_round_robin
Each server with capacity k receives precisely k requests in succession, before the next server takes over. This process is simple and deterministic since it contains no dynamic elements.
(This process, especially with end-to-end SSL, can lead to unexpected results if many individual servers each have to process too many successive requests. For more details refer to the SAP link: http://help.sap.com/saphelp_nw70/helpdata/en/5f/7a343cd46acc68e10000000a114084/frameset.htm)
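The simple weighted round robin rule can be sketched in a few lines of Python. This is only a simulation of the rule as described here, not the Web Dispatcher implementation; the server names and capacities are made up for the example.

```python
from itertools import islice


def simple_weighted_round_robin(capacities):
    """Yield server names: a server with capacity k gets k requests in a row."""
    if not any(capacity > 0 for capacity in capacities.values()):
        return  # nothing can be scheduled
    while True:
        for server, capacity in capacities.items():
            for _ in range(capacity):
                yield server


# Hypothetical servers: app1 has capacity 3, app2 has capacity 1.
capacities = {"app1": 3, "app2": 1}
print(list(islice(simple_weighted_round_robin(capacities), 8)))
# prints ['app1', 'app1', 'app1', 'app2', 'app1', 'app1', 'app1', 'app2']
```

The output shows why the strategy is deterministic, and also why a high-capacity server receives long bursts of successive requests.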
b. Dynamic Weighted Round Robin
Parameter value: weighted_round_robin (default)
The load is balanced using a load factor. The server with the lowest load factor receives the next request. When a server is assigned a request, its load factor is increased in proportion to the reciprocal of the server capacity. The load factor is visible in the Web Administration Interface. As the load factor changes constantly, the information about the "next preferred server" is only a snapshot.
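A minimal Python sketch of this load factor rule follows. It assumes the increment per request is exactly 1/capacity and that ties go to the first server in the list; the real Web Dispatcher may weigh further factors, and the server names are hypothetical.

```python
from fractions import Fraction


def dynamic_weighted_round_robin(capacities, requests):
    """Assign each request to the server with the lowest load factor."""
    load = {server: Fraction(0) for server in capacities}
    assigned = []
    for _ in range(requests):
        # min() returns the first server with the lowest load factor.
        server = min(load, key=load.get)
        assigned.append(server)
        # Raise the load factor by the reciprocal of the capacity.
        load[server] += Fraction(1, capacities[server])
    return assigned


# Hypothetical servers: app1 has twice the capacity of app2.
order = dynamic_weighted_round_robin({"app1": 2, "app2": 1}, 6)
print(order)
# prints ['app1', 'app2', 'app1', 'app1', 'app2', 'app1']
```

Compared with the simple strategy, requests are interleaved between the servers instead of arriving in bursts, while the overall split still follows the capacity ratio (4 to 2 in this run).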
2. Server Capacity
The runtime system capacity is shown in the SAP Web Dispatcher Admin page from URL: http://
Here the server capacity can be seen in the Capacity column under Monitor Server Groups. It is fetched via HTTP from the SAP Message Server. Since in XI/PI both ABAP and Java have a Message Server, the parameters rdisp/mshost and ms/http_port (for the message server, this port must be configured as an HTTP port, by setting parameter ms/server_port_
What matters is the ratio between the capacity values of the servers rather than the values themselves. For example, if application server 1 has twice the capacity value of application server 2, it also processes approximately twice as many messages as application server 2. As mentioned before, this ratio can be overwritten on demand.
**Be careful: only overwrite the capacity if you have already determined, while the system was running, that it needs to be changed. In other words, the standard setting that the Web Dispatcher gets from the message server is normally suitable.
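The effect of the ratio can be estimated with simple arithmetic: each server's expected share is its capacity divided by the sum of all capacities. The Python sketch below uses made-up server names and numbers and gives only an approximation, since real traffic is never split perfectly.

```python
def expected_messages(capacities, total_messages):
    """Estimate how many messages each server handles from the capacity ratio."""
    total_capacity = sum(capacities.values())
    return {
        server: total_messages * capacity / total_capacity
        for server, capacity in capacities.items()
    }


# Hypothetical example: server 1 has twice the capacity of server 2.
print(expected_messages({"app1": 20, "app2": 10}, 3000))
# prints {'app1': 2000.0, 'app2': 1000.0}
```

Setting the capacities to 2 and 1 would give the same split, which is why only the ratio is relevant.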
There are generally three ways to overwrite the capacity value:
a. Change directly from the Web Dispatcher Admin page.
Simply right-click the Capacity label. The change takes effect immediately but is lost after a Web Dispatcher restart.
b. Set Web Dispatcher profile parameter.
You can overwrite the capacity value permanently by setting the profile parameter wdisp/server_
wdisp/server_
whereby
This change requires a restart of Web Dispatcher.
c. Change the location where the server capacity is stored.
The URL used to retrieve the list is determined by the SAP Web Dispatcher profile parameter wdisp/server_info_location, which defaults to /msgserver/text/logon.
You can create a file 'info.icr' in the same folder as the Web Dispatcher and set wdisp/server_info_location to the path of the 'info.icr' file, e.g. file://info.icr/.
The 'info.icr' file has the same structure as the capacity list of all application servers. For more details refer to SAP note 645130 section 3.
RFC Logon Group
This part discusses how to check RFC load balancing at runtime. The configuration of RFC load distribution by logon group is not the main concern here, but the load balancing strategy must be taken into account in order to understand the runtime statistics. At the end of this part, the test tool "lgtst" is introduced for testing the load balancing.
The logon group concept was introduced to control incoming RFC requests. Normally these RFC calls are sent to a system via an RFC destination that provides a user and password, and the receiving system treats them like online users logging on to the system.
In an XI/PI system, thousands of RFC calls can occur in a very short period of time. Because the status of a logon group is only checked every 5 minutes, one application server may be overloaded immediately while other servers receive no RFC calls. Therefore dynamic logon load distribution was introduced with two load balancing strategies: Best Quality and Simple Round-robin (SAP Note 593058 - New RFC load balancing procedure).
Since release 6.20, a random number is also used during simple round-robin load balancing. As a result, different RFC programs have differently sorted lists of application servers. This ensures that not all RFC programs connect to the application servers in the same sequence.
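The idea of the randomized server list can be illustrated with a short Python sketch. It assumes that each RFC program shuffles its own copy of the server list once and then walks through it in round-robin order; this is a simplification for illustration, not the kernel's actual algorithm, and the server names and seeds are hypothetical.

```python
import random
from itertools import cycle, islice


def client_server_order(servers, seed):
    """Return this client's own randomly sorted copy of the server list."""
    order = list(servers)
    random.Random(seed).shuffle(order)  # per-client random number source
    return order


def first_targets(servers, seed, calls):
    """Return the servers hit by the first `calls` RFC calls of one client."""
    return list(islice(cycle(client_server_order(servers, seed)), calls))


servers = ["app1", "app2", "app3"]
# Clients with different seeds usually start at different servers.
for client_seed in (1, 2, 3):
    print(client_seed, first_targets(servers, client_seed, 6))
```

Each client still visits every server equally often, but because the starting points differ, a burst of calls from many programs is spread over all servers instead of hitting the same one first.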
As of Support Package 15 for Release 7.00 or Support Package 5 for Release 7.10, a new strategy named the Weighted Round-robin procedure can be used to enhance the simple round-robin (SAP Note 1112104 - Weighted round robin procedure).
Go to SMLG and choose Msg-Srv Status Area. You can see the quality of each application server. A server with better quality always has a higher value in the quality column. Refresh the view and check how the quality changes over time. If you discover that a server has a high quality value but always has a low load, verify whether the correct load balancing strategy is chosen.
Double-click the required logon group in the Logon Favorite Storage list. Version=1 means dynamic logon load distribution is not activated. In that case, go to SE16 and change table RZLLICLASS accordingly. Details are available at the SAP link:
http://help.sap.com/saphelp_nw70/helpdata/en/28/1c623c44696069e10000000a11405a/frameset.htm
The test program "lgtst" is also available from SAP Service Marketplace and can be used to check the load balancing procedure (SAP Note 64015 - Description of test program lgtst).
With this program you can check, at OS level, which application server is used for each RFC call in sequence:
You can also check the available information about an RFC logon group:
**If the load balancing does not meet your expectations, e.g. too many RFC calls land on one application server, review your load balancing strategy and choose the most suitable one.
Java Server Node
You can check whether load is distributed across all server nodes for a certain adapter type (FTP, JDBC, JMS…) or for all adapter types via link: http://
You can choose to check sent or received messages. Click Configure Table Columns and select Node ID. The Node ID is then shown in the search result. Sort the result by Node ID to get an overview of the load distributed across the server nodes.
If you already know your node IDs, a better way to understand the message load distribution during peak hours is the Extended Filtering function. Set the timeframe to the peak hour and specify the node ID. You only need to look at the total number of messages shown at the top, without displaying all the messages. This saves you from counting thousands of messages yourself and also avoids the memory issues caused by displaying too many messages.
You can drill down into specific adapters by selecting Connection Name on the same page.
**If you have several server nodes and find that some of them processed significantly fewer messages, or none at all, during a certain period, you can start your troubleshooting there.
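If you note down the Node ID of each message from the monitor, the comparison can be automated with a small Python sketch like the one below. The node IDs, the message counts and the 50% threshold are assumptions chosen for the example; pick a threshold that matches your own notion of "significantly fewer".

```python
from collections import Counter


def underused_nodes(message_node_ids, all_node_ids, threshold=0.5):
    """Return nodes that handled fewer than threshold * average messages."""
    if not all_node_ids:
        return []
    counts = Counter(message_node_ids)  # nodes without messages count as 0
    average = len(message_node_ids) / len(all_node_ids)
    return sorted(n for n in all_node_ids if counts[n] < threshold * average)


# Hypothetical peak-hour sample: node_c processed almost nothing.
messages = ["node_a"] * 40 + ["node_b"] * 38 + ["node_c"] * 2
print(underused_nodes(messages, ["node_a", "node_b", "node_c"]))
# prints ['node_c']
```

Passing the full list of node IDs separately ensures that a node with no messages at all is still reported.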
Mapping Request
It is not easy to identify the distribution of mapping requests at runtime. Keep in mind that mapping requests are sent through the Gateway to the Java server processes using a simple round-robin strategy. So the number of registered processes for destination AI_RUNTIME_
Complementary Documents
This blog does not discuss the implementation of load balancing. However, the documents below can provide a better understanding of load balancing in XI3.0/PI7.0: