Deploy OCS 2007 R2 and OCS Edge 2007 R2: With Full Redundancy and High Availability (Load Balanced)
One of the big challenges in deploying Office Communications Server 2007 R2 with OCS 2007 R2 Edge Services was reading Microsoft’s OCS documentation, digesting it, and then deploying a High Availability solution. My goal was to create redundancy in every layer; however, Microsoft assumes many things, including that customers use NAT, stand-alone servers and split DNS within their networking environment. In my case, we don’t use split DNS, have only one Hardware Load Balancer, support multiple domains and use publicly routable IP addresses for each server. This goes against most of Microsoft’s canned documentation, which makes HA environments tougher to design.
I will first go into the conceptual view of what is needed to make this deployment scenario work. Hopefully this article will help a few of you better understand OCS. Sometime soon, I will post in-depth install instructions for each component and server role. My aim in writing this is to help others fundamentally understand all of the OCS 2007 R2 moving parts before installing any servers.
Here is a conceptual Visio diagram of an OCS 2007 R2 working environment with Edge Services deployed. There are many ways to deploy this type of environment and many variables that can change the design, such as virtualization and network architecture.
- 2 – Office Communications Server R2 Front-End Servers
- 2 – SQL 2008 Back-End Servers (Active\Passive Cluster)
- 2 – Office Communications Server R2 Edge Servers
- 2 – ISA 2006 Servers (Virtual Machines)
8 Servers in Total.
- Each physical server is running Windows 2008 Standard – 64-bit – Xeon Quad Core 2.66 GHz Procs – 4 GB RAM – 73 GB Drives
- Each ISA Server (VMware) is Windows 2003 Standard – 32-bit – Quad Core 2.66 GHz – 2 GB RAM – 40 GB Drive
We are using an MD3000 storage appliance for the OCS SQL databases. You will need 8 LUNs or Cluster Resource Drives to properly create the cluster for OCS. I allocated roughly 450 GB in total to this project.
- The storage is five 300 GB SAS drives carved up as RAID 10, assigned as 4 data drives and 1 hot spare.
- We are keeping 60 days’ worth of archived data.
- The Passive SQL Node is also the Archiving server.
- The MSMQ Service is running on the OCS Front-End and Back-End servers for Archiving Purposes.
- We use NIC teaming on most of our physical machines for redundancy and better performance.
- The OCS Front-End Pool Servers have 2 physical NICs that were used to create a single Virtual teamed NIC.
- The SQL Back-Ends have 3 physical NICs, one teamed pair and a standalone NIC for the cluster Heartbeat.
- The Edge servers have 4 physical NICs to create 2 virtual teamed NICs. The Internal NICs have no gateway defined.
- The ISA servers have virtual NICs, so NIC teaming is not required.
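As a quick sanity check on the storage figures above, the usable capacity of a RAID 10 set can be estimated as half of the member drives times the drive size. The sketch below is only an illustration: the function name is hypothetical, it assumes decimal gigabytes, and it ignores controller and formatting overhead.

```python
def raid10_usable_gb(total_drives, hot_spares, drive_gb):
    """Estimate usable RAID 10 capacity in GB (hypothetical helper).

    RAID 10 mirrors drives in pairs and stripes across the pairs, so only
    half of the member drives contribute usable space. Hot spares hold no
    data until a member fails.
    """
    members = total_drives - hot_spares
    if members < 4 or members % 2 != 0:
        raise ValueError("RAID 10 needs an even number of member drives, at least 4")
    return (members // 2) * drive_gb


# Five 300 GB drives with one hot spare, as described in the list above.
usable = raid10_usable_gb(total_drives=5, hot_spares=1, drive_gb=300)
print(f"Usable capacity: {usable} GB")
```

With the figures from this deployment, the estimate leaves headroom above the roughly 450 GB allocated to the project.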
Subnet Breakdown
You will need at least 5 different VLANs, or subnets. VLANs 1, 2, and 4 will all be behind the Load Balancer. These subnets MUST be behind the Load Balancer to route effectively between the OCS Servers.
* Since the OCS Edge Servers are dual-homed, and since we only have 1 Hardware Load Balancer, we decided to place the Internal and External OCS Edge NICs on the same VLAN. The External NICs are the only NICs with a default gateway. Because of this, you must specify persistent, static routes on the Internal OCS Edge server NICs to reach the OCS Front-End Pool servers (which are also behind the Load Balancer, but on a different VLAN, hence the mess of routing issues). The other option is to place the OCS Edge NICs on the same VLAN. Microsoft recommends placing these NICs on different VLANs; however, we have locked down the servers and ports and ensured they are shielded by the external firewall and the local OS-based firewall.
I would like to add that Windows 2003 handles dual-homed configurations more gracefully, and routing behavior changes dramatically with Windows 2008. In our testing, adding persistent routes was not effective, as packets were still trying to go out the default gateway. Jeff Schertz has written a great article going into more detail about the Windows 2008 Strong Host model in regard to deploying OCS 2007 R2 Edge servers: http://blogs.pointbridge.com/Blogs/schertz_jeff/Pages/Post.aspx?_ID=78
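For reference, the persistent static routes discussed above are normally added with the Windows `route -p add` command. The following is a minimal sketch that only builds the command text for a given destination subnet and next hop. The function name is hypothetical, and the /24 mask for the Front-End subnet and the gateway address 100.100.229.1 are assumptions made for illustration, not values from this deployment.

```python
import ipaddress


def persistent_route_command(destination_cidr, gateway):
    """Build a Windows 'route -p add' command string (hypothetical helper).

    destination_cidr: the subnet to reach, e.g. the Front-End Pool VLAN.
    gateway: the next hop reachable from the Edge server's internal NIC.
    """
    # strict=True rejects input such as 100.100.230.21/24, where host bits are set.
    network = ipaddress.ip_network(destination_cidr, strict=True)
    next_hop = ipaddress.ip_address(gateway)
    return f"route -p add {network.network_address} mask {network.netmask} {next_hop}"


# Assumed values: Front-End subnet 100.100.230.0/24, next hop 100.100.229.1.
print(persistent_route_command("100.100.230.0/24", "100.100.229.1"))
```

Generating the commands from one list of subnets keeps both Edge servers consistent, though as noted above, the Strong Host model on Windows 2008 may still prevent such routes from behaving as expected.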
VIP and Load Balancing Breakdown
- You will need 6 VIPs, or Virtual IP addresses, that are all on the same VLAN and are Hardware Load Balanced.
- The VIP names are also the External DNS names that OCS clients talk to. This simplifies the certificate name requirements (SANs). Some admins even name the OCS-Access VIP SIP.domain.com for even more simplicity.
- * The OCS-AV.Contoso.com VIP and the Load Balanced IPs assigned to the physical servers' External OCS-AV NICs must have port 443 open to the world. If they don’t, Live Meeting will not function properly. (So in this example, IPs 100.100.231.69, 100.100.229.143 and 100.100.229.144 need port 443 open to the Net.)
- The port exceptions also apply to the OCS Internal Pool Front-End servers. The Firewall exceptions listed above for the OCS-Pool.contoso.com VIP must also be opened on the Pool Front-End Server IPs (100.100.230.21 and 100.100.230.22), on the Internal Network only. This is due to the A\V requirement that the client talk directly to the OCS Pool Servers.
Once you get to the point where you are installing certificates on the OCS Edge Servers, do not install internal PKI certificates on your Access Edge, A\V Edge or Reverse Proxy and expect to achieve any valid testing. Basic IM and A\V will work, assuming the internal Certificate Authority certificate is installed and trusted on each testing machine; however, Live Meeting will not function properly. The Live Meeting client performs a CRL (Certificate Revocation List) check upon the initial connection. It retrieves the CRL URL from the certificate and attempts to download the list over port 80. In most cases, your internal Certificate Authority or Firewall will block port 80 requests from external IPs. If you are accessing Live Meeting from a public ISP, the client will fail while looking up your internal CA, producing misleading errors about authentication, permissions and being unable to connect. It wasn’t until I performed packet captures that I learned of this behavior.
Also be sure to install your internal Certificate Authority's root certificate into the Trusted Root Certification Authorities store on your OCS Edge Servers. If you are using an intermediate certificate, place it in the Intermediate Certification Authorities store (Local Computer).
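The CRL behavior described above can be checked without a packet capture. Here is a rough sketch that takes a CRL URL, as read from a certificate's CRL Distribution Points field, and tests whether a TCP connection to its host and port can be opened from the machine running the script. The function names and the example URL are hypothetical, and a successful connection only shows that the port is reachable, not that the CRL itself downloads.

```python
import socket
from urllib.parse import urlsplit


def crl_endpoint(crl_url):
    """Return the (host, port) a client would contact for this CRL URL."""
    parts = urlsplit(crl_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"Not an HTTP(S) CRL URL: {crl_url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def crl_reachable(crl_url, timeout=5.0):
    """Try a TCP connection to the CRL host; True if the port answers."""
    host, port = crl_endpoint(crl_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


# Hypothetical internal CA URL; run this from a public ISP connection.
url = "http://ca01.corp.example/CertEnroll/corp-ca.crl"
print(crl_endpoint(url))
```

Running `crl_reachable(url)` from outside the network with the URL from your own certificate should return `False` for an internal CA that is blocked at the firewall, which is the condition that makes Live Meeting fail.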
DNS and SRV Records:
DNS and SRV records must be created for every domain you plan on supporting. You must also point the SRV records at a host in the correct DNS zone. As an example, if you have Vlab.com and NWtraders.com, you must create the appropriate DNS and SRV records for each domain. If you have multiple domains pointing at a single OCS Edge array, instead of pointing the SRV records at OCS-Access.vlab.com, point them at the Host (A) record of Sip.vlab.com for domain 1, Sip.NWtraders.com for domain 2, etc. The sip.Vlab.com and sip.NWtraders.com records will point at the same IP as the OCS-Access.vlab.com VIP. This way you can support multiple domains and use a single IP. The certificate must contain the sip.domain.com names as Subject Alternative Names. If you attempt to point the _sipfederationtls._tcp.Contoso.com SRV record at Ocs-Access.vlab.com, OCS federation will break because of the DNS zone mismatch. To fix this, point _sipfederationtls._tcp.Contoso.com at SIP.contoso.com.
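To make the per-domain pattern concrete, the sketch below generates the SRV records for each supported SIP domain and checks that an SRV target sits in the same zone as the record itself. The function names are hypothetical, and the ports (443 for `_sip._tls` and 5061 for `_sipfederationtls._tcp`) are the conventional OCS values, assumed here rather than taken from this deployment.

```python
def srv_records(sip_domains, host_label="sip"):
    """List (srv_name, port, target) tuples for each SIP domain.

    Every target is the sip.<domain> A record in the same zone, which in
    turn points at the shared Access Edge VIP.
    """
    records = []
    for domain in sip_domains:
        zone = domain.lower().rstrip(".")
        target = f"{host_label}.{zone}"
        records.append((f"_sip._tls.{zone}", 443, target))                 # client sign-in
        records.append((f"_sipfederationtls._tcp.{zone}", 5061, target))   # federation
    return records


def target_in_zone(srv_name, target):
    """True if the SRV target belongs to the zone of the SRV record."""
    # Drop the leading _service and _proto labels to get the zone name.
    zone = srv_name.lower().rstrip(".").split(".", 2)[2]
    host = target.lower().rstrip(".")
    return host == zone or host.endswith("." + zone)


for name, port, target in srv_records(["Vlab.com", "NWtraders.com"]):
    print(f"{name} SRV 0 0 {port} {target}")

# The mismatch described above: a Contoso.com record aimed at a vlab.com host.
print(target_in_zone("_sipfederationtls._tcp.Contoso.com", "Ocs-Access.vlab.com"))
```

The list of targets doubles as the list of names the Edge certificate needs to carry as Subject Alternative Names.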
Other Issues and Gotchas:
- Client Migration: If you are running LCS 2005 and are migrating users to OCS 2007 R2, you have several options on the migration path. I would recommend you migrate your users to OCS 2007 R2 first, change DNS\SRV records, and then upgrade your clients at your convenience. When the user is migrated, you must enable Enhanced Presence. The user can continue to use the Communicator 2005 client until they log into Communicator 2007 for the first time. Once they log in with the new client, they cannot go back and use the 2005 client. Along with Enhanced Presence, a flag is set in the OCS 2007 R2 database disabling the ability to use the older client once a new 2007 client is used.
- Use the OCS 2007 Capacity and OCS 2007 Edge Planning Tools Diligently. These are great when planning and designing your individual needs. When you get to the options in the Edge Planning Tool where it asks you for the External and Internal Firewall IPs, if you are running only one Firewall, these IPs (VIP IPs) are the same. This is true for the OCS-WebCon and OCS-AV VIPs as well.
- Install Forefront for Office Communications Server 2007 R2 on both the Internal OCS Pool and OCS Edge Servers.
- OCS 2007 Planning Tool: http://www.microsoft.com/downloads/details.aspx?familyid=FDF32585-C131-4832-8E27-67E70636C1E8&displaylang=en
- OCS Edge Planning Tool: http://www.microsoft.com/downloads/details.aspx?familyid=EC4B960C-3FE2-41BD-ABDF-AE89CFCB8C6C&displaylang=en
- OCS 2007 R2 Documentation: http://www.microsoft.com/downloads/details.aspx?familyid=E9F86F96-AA09-4DCA-9088-F64B4F01C703&displaylang=en
- Strong Host Model and OCS 2007 R2: http://blogs.pointbridge.com/Blogs/schertz_jeff/Pages/Post.aspx?_ID=78
I hope this article helped articulate some of the nuances of OCS 2007 R2. This is one way I was able to get OCS 2007 R2 deployed and working in a production environment. There may be better ways to do it, especially with varying needs and environments. As always, if you have any questions, let me know.
Best of luck!