An SOA odyssey

Thursday, December 15, 2005

Using NLB

One of the design requirements for our service-oriented infrastructure was to support a scale-out architecture. To meet it we chose Microsoft's Network Load Balancing (NLB).

Early on we created a reference architecture in order to test the integration between our various components including EDAF services, BizTalk 2004, legacy ASP code, and Ultimus BPM 7.1. In our test environment we set up two clusters, one for the EDAF services and one for the Ultimus services.

In the reference architecture we ran simple tests against both the EDAF services cluster and the Ultimus cluster. For the former, we sent 200 requests through one of our orchestrations, which in turn invoked our constituent service via the service agent framework we developed. For the latter, a test harness spawned 50 requests at a time, each creating an Ultimus incident through our facade service. In both cases the tests (monitored with Performance Monitor) revealed that all requests from a particular client were being serviced by a single machine in the cluster. The machines servicing those requests were the servers that had the higher priority set on the Host Parameters tab of the NLB configuration.

We then reconfigured the port rules from within the NLB UI, changing the affinity from single to "no affinity" and balancing load at 50%. Re-running the tests made no difference to the results.

Next we consulted Microsoft's documentation on the NLB algorithm. The important paragraph of that document is:

"When inspecting an arriving packet, all hosts simultaneously perform a mapping to quickly determine which host should handle the packet. The mapping uses a randomization function that calculates a host priority based on their IP address, port, and other information. The corresponding host forwards the packet up the network stack to TCP/IP, and the other cluster hosts discard it. The mapping remains unchanged unless the membership of cluster hosts changes, ensuring that a given client's IP address and port will always map to the same cluster host. However, the particular cluster host to which the client's IP address and port map cannot be predetermined since the randomization function takes into account the current and past cluster's membership to minimize remappings."

In other words, when “no affinity” is set, the host that services an incoming request is a deterministic function of the client's IP address and port number, computed by a randomization algorithm that also depends on the membership of the cluster. A client that keeps sending requests from the same IP address and port will therefore always reach the same host.
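To make that concrete, here is a toy sketch of a deterministic mapping from client IP address and port to a host. It is not NLB's algorithm, which the paragraph above does not spell out; the class name ToyNlbMapping, the PickHost method, and the simple hash are all assumptions made up for illustration.

```csharp
using System;
using System.Net;

// Toy illustration only: a simple hash stands in for NLB's real
// (undocumented here) randomization function.
public static class ToyNlbMapping
{
    // Deterministically maps a client endpoint to one of hostCount hosts.
    public static int PickHost(IPAddress clientIp, int clientPort, int hostCount)
    {
        unchecked
        {
            int hash = 17;
            foreach (byte b in clientIp.GetAddressBytes())
            {
                hash = hash * 31 + b;
            }
            hash = hash * 31 + clientPort;
            // Treat the hash as unsigned so the result is never negative.
            return (int)((uint)hash % (uint)hostCount);
        }
    }

    public static void Main()
    {
        IPAddress client = IPAddress.Parse("10.0.0.25");

        // Same IP and same port: the same host every time.
        Console.WriteLine(PickHost(client, 4001, 2));
        Console.WriteLine(PickHost(client, 4001, 2));

        // Same IP, different ports: requests can land on different hosts.
        for (int port = 4002; port < 4006; port++)
        {
            Console.WriteLine("port {0} -> host {1}", port, PickHost(client, port, 2));
        }
    }
}
```

The point of the sketch is only that the port is an input to the mapping: if the port never changes, neither does the host.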

Given this mapping, we looked into how port numbers are assigned to outgoing requests on the client machines.

It turned out that within our Service Agent framework we have a ServiceAgentBase class from which all service agents are derived. Within this class, requests are executed using the SoapHttpWebClient class, whose constructor contains the following code.

    public SoapHttpWebClient(string requestUri)
    {
        _requestUri = requestUri;
        _httpWebRequest = (HttpWebRequest)HttpWebRequest.Create(requestUri);
        // This line is the fix discussed below; without it KeepAlive defaults to true.
        _httpWebRequest.KeepAlive = false;
        _httpWebRequest.Method = "POST";
        _httpWebRequest.ContentType = "text/xml;charset=\"utf-8\"";
        _httpWebRequest.Accept = "text/xml";
    }

Tests using a throwaway application and a network monitor utility revealed that requests were load balanced when System.Net.Sockets.TcpClient was used to generate the requests, but not when HttpWebRequest was used with KeepAlive left at its default of true. Each TcpClient opens a new connection and therefore gets a new port number for each request, whereas HttpWebRequest with keep-alive reuses an existing connection, and with it the same client port.
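The TcpClient behavior can be observed with a small program along these lines. This is a sketch, not the throwaway application we used: the class name EphemeralPortDemo and the ConnectOnce helper are made up for illustration, and it connects to a listener on the loopback address rather than to an NLB cluster.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class EphemeralPortDemo
{
    // Opens one new TCP connection to serverPort on loopback and
    // returns the local (client-side) port the TCP stack assigned.
    public static int ConnectOnce(int serverPort)
    {
        using (TcpClient client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, serverPort);
            return ((IPEndPoint)client.Client.LocalEndPoint).Port;
        }
    }

    public static void Main()
    {
        // Port 0 asks the OS to pick any free port for the listener.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int serverPort = ((IPEndPoint)listener.LocalEndpoint).Port;

        // Each iteration is a fresh connection, so the client port
        // printed will typically differ from one line to the next.
        for (int i = 0; i < 3; i++)
        {
            Console.WriteLine("client port: {0}", ConnectOnce(serverPort));
        }

        listener.Stop();
    }
}
```

Which port numbers appear depends on the operating system; the only thing the sketch relies on is that a new connection gets its own client port.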

However, once the following line is present in the constructor (as it is in the listing above):

    _httpWebRequest.KeepAlive = false;

the HttpWebRequest class creates a new connection for each request, which allows the TCP stack on the machine to assign a unique port number to each one. Making this change to SoapHttpWebClient entails a slight performance penalty on the client machine, since connections cannot be reused. As a result, we added a configuration setting, <KeepAlives>, to the configuration section for the service agent so that clients can choose whether to use keep-alives (the default is false).
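As a rough sketch of how such a setting might be applied, the helper below turns the text of a <KeepAlives> value into the KeepAlive flag on the request. It is not our framework's code: the ServiceAgentRequestFactory class, its method names, and the way the setting's text reaches the helper are assumptions, and reading the configuration section itself is left out.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the Service Agent framework.
public static class ServiceAgentRequestFactory
{
    // Interprets the raw text of a <KeepAlives> setting.
    // Missing, empty, or unparseable values fall back to false.
    public static bool ParseKeepAlives(string keepAlivesSetting)
    {
        bool value;
        return bool.TryParse(keepAlivesSetting, out value) && value;
    }

    // Builds a request configured like the constructor above, except
    // that KeepAlive comes from the setting instead of being hard-coded.
    public static HttpWebRequest Create(string requestUri, string keepAlivesSetting)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(requestUri);
        request.KeepAlive = ParseKeepAlives(keepAlivesSetting);
        request.Method = "POST";
        request.ContentType = "text/xml;charset=\"utf-8\"";
        request.Accept = "text/xml";
        return request;
    }
}
```

With this shape, a client that says nothing gets a new connection per request and is load balanced across the cluster, while a client that sets the value to true trades that for connection reuse.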


 
