I’ve recently performed a CRM 2015 installation on Azure. It was a relatively complex installation, involving multiple load-balanced web servers, multiple app servers, and multiple SQL servers configured for High Availability.

    We installed four web servers in total and load balanced them. This means another server sits in front of those four, and that is the one users actually access. The load balancer then forwards each user to one of the available CRM servers, usually based on a simple round-robin distribution strategy.
We’re using an Azure Application Gateway as the load balancer. The Application Gateway is a powerful tool with features that make it significantly better than the Windows Server load balancing that I’ve typically used in the past. In addition to basic load balancing, we’re also taking advantage of its SSL Offloading capability and its Affinity Cookie feature.
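
To make the round-robin idea concrete, here is a minimal Python sketch of how requests get handed out in rotation. It is only an illustration of the strategy, not how the Application Gateway is implemented, and the server names are hypothetical placeholders for the four CRM web servers.

```python
from itertools import cycle


def make_round_robin(servers):
    """Return a function that hands out servers in a fixed, repeating order."""
    rotation = cycle(servers)
    return lambda: next(rotation)


# Hypothetical names standing in for the four CRM web servers.
pick_server = make_round_robin(["crmweb1", "crmweb2", "crmweb3", "crmweb4"])

# Five incoming requests: the fifth wraps around to the first server.
first_five = [pick_server() for _ in range(5)]
print(first_five)
```

Each new request simply goes to the next server in the list, which is why the load ends up roughly even when sessions are short and similar in cost.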

    The Azure Application Gateway is a Layer 7 load balancer, sometimes referred to as a “reverse proxy”. This means it is a “smart” load balancer: it can make routing decisions based on the content of the HTTP messages. A Layer 4 load balancer, by contrast, distributes traffic with no regard for the content. In our case we’re not really taking advantage of its full potential, but it’s good to know that the gateway supports much more advanced usage scenarios, should the need ever arise.
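
As a rough illustration of what “routing on content” means, the sketch below picks a backend pool from the URL path of a request. A Layer 4 balancer never parses the request, so it could not make this kind of decision. The pool names and rules are made up for the example and are not part of our deployment.

```python
def pick_pool_layer7(request_path, rules, default_pool):
    """Choose a backend pool by inspecting the URL path of the HTTP request."""
    for prefix, pool in rules:
        if request_path.startswith(prefix):
            return pool
    return default_pool


# Hypothetical rules: send web service calls to one pool, everything else to another.
rules = [("/XRMServices/", "services-pool")]

service_pool = pick_pool_layer7("/XRMServices/2011/Discovery.svc", rules, "web-pool")
page_pool = pick_pool_layer7("/main.aspx", rules, "web-pool")
print(service_pool, page_pool)
```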

    The first gotcha that I ran into: an Application Gateway created by following the Azure documentation will NOT work with CRM out of the box. After successfully configuring the gateway, I got nothing but 502 errors when browsing to CRM through the load balancer, even though CRM worked when I hit the servers directly.

This problem happens because the sample configuration in the documentation uses the default Health Probe. The probe is how the Gateway automatically detects whether a server is running. If the probe fails to receive an HTTP success response, the gateway removes that server from the pool. This was difficult to troubleshoot initially, because the gateway does not provide any meaningful error message. After much investigation, I made an educated guess: if the probe is hitting the root URL, and obviously has no user credentials, then CRM must be returning a “Not Authenticated” response to the probe. The probe interprets that as a failure, and so all the servers get removed from the pool.
    To resolve this we must specify a custom probe in the gateway configuration. The probe requires a URL that allows anonymous access, and at the same time we want it to be a valid test of whether CRM is alive on each server. The best choice is the CRM Discovery endpoint, since it satisfies both criteria: http://crmfrontendserver/XRMServices/2011/Discovery.svc
After I made that change, CRM immediately started working through the App Gateway.
Sample config XML for the probe to support CRM:
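<Probe>
  <Name>Probe01</Name>
  <Protocol>Http</Protocol>
  <!-- Host name and path of an anonymous CRM endpoint to probe -->
  <Host>frontendservername</Host>
  <Path>/XRMServices/2011/Discovery.svc</Path>
  <!-- Interval and Timeout are in seconds -->
  <Interval>15</Interval>
  <Timeout>45</Timeout>
  <!-- Consecutive failed probes before the server is marked unhealthy -->
  <UnhealthyThreshold>5</UnhealthyThreshold>
</Probe>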
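
To show why the default probe emptied the pool while the custom one keeps it full, here is a small Python simulation of the probe logic. It is a sketch under assumptions, not the gateway's actual code: it treats any status from 200 to 399 as a success, counts consecutive failures against the UnhealthyThreshold, and assumes (as I guessed above) that the root URL answers an anonymous probe with a 401.

```python
def probe_is_success(status_code):
    """Assumption: the probe treats any 200-399 status as healthy."""
    return 200 <= status_code <= 399


def server_stays_in_pool(probe_status_codes, unhealthy_threshold=5):
    """Replay probe results in order.

    Returns False as soon as the number of consecutive failed probes
    reaches the threshold, otherwise True.
    """
    consecutive_failures = 0
    for code in probe_status_codes:
        if probe_is_success(code):
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= unhealthy_threshold:
                return False
    return True


# Assumed responses: the root URL rejects the anonymous probe, Discovery.svc answers it.
root_url_ok = server_stays_in_pool([401] * 5)
discovery_ok = server_stays_in_pool([200] * 5)
print(root_url_ok, discovery_ok)
```

With every server failing the root URL probe in the same way, the pool ends up empty, which matches the 502 errors I was seeing through the gateway.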

    Another thing to watch for: if you are using the App Gateway for SSL Offloading, you will need to set the SSL Header in your CRM deployment configuration. In the CRM Deployment Manager, right-click on Microsoft Dynamics CRM and click Properties. Then go to the Web Address tab and click Advanced. Make sure that “This deployment uses an NLB” is enabled, and in the SSL Header field enter: “FRONT-END-HTTPS:on”. That is a standard header used by most Microsoft tools. Other third-party SSL offloading tools would require a different header.
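
The reason the header matters: with SSL Offloading the gateway talks plain HTTP to the web servers, so the backend needs some other signal that the user's original request was HTTPS. The sketch below shows the general idea in Python. It is not CRM's actual implementation, and the function and parameter names are made up for illustration.

```python
def original_scheme(headers, ssl_header_name="FRONT-END-HTTPS", ssl_header_value="on"):
    """Guess the scheme the user originally used, based on the offload header.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if normalized.get(ssl_header_name.lower()) == ssl_header_value.lower():
        return "https"
    return "http"


# Request forwarded by an SSL-offloading front end vs. a plain HTTP request.
offloaded = original_scheme({"Host": "crm.example.com", "Front-End-Https": "on"})
plain = original_scheme({"Host": "crm.example.com"})
print(offloaded, plain)
```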

    The last cool feature of the Application Gateway is the Affinity Cookie. This feature sets a cookie on the user’s local machine that tracks which web server the user was directed to at the beginning of their session. The Gateway then continues to direct that user to the same web server until the cookie expires. This is good for users because they get consistent performance. It is also more resilient to problems caused by apps or customizations that don’t properly support load balancing. The minor downside is that one web server might, by chance, end up heavily loaded while the others are used less. The really good news is that no additional changes are required to make this work with CRM. You simply enable it on the Gateway, and it “just works”.
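
For anyone curious how cookie-based affinity works in principle, here is a minimal Python sketch. It is only a model of the idea: the cookie name and server names are hypothetical, expiry is left out, and a real gateway would store an opaque value in the cookie rather than the server name.

```python
from itertools import cycle

COOKIE_NAME = "AffinityCookie"  # hypothetical name, for illustration only


class StickyBalancer:
    """Round-robin for new users, same server again for users with a cookie."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def route(self, cookies):
        """Return (server, cookies_to_set) for a request carrying `cookies`."""
        pinned = cookies.get(COOKIE_NAME)
        if pinned in self._servers:
            return pinned, {}  # returning user: keep them where they were
        server = next(self._rotation)  # new user: take the next server
        return server, {COOKIE_NAME: server}


balancer = StickyBalancer(["crmweb1", "crmweb2", "crmweb3", "crmweb4"])

alice_cookies = {}
alice_first, to_set = balancer.route(alice_cookies)
alice_cookies.update(to_set)

bob_first, _ = balancer.route({})  # a different user with no cookie yet
alice_second, _ = balancer.route(alice_cookies)
print(alice_first, bob_first, alice_second)
```

The sketch also shows where the uneven load can come from: once a user is pinned, the rotation no longer applies to their requests, so a few heavy users on one server can tip the balance.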