02-08-2010 10:02 AM
Hi guys,
I'm looking for some help from all of you to figure out how GSLB works; any help would be appreciated:
Question 1: How does a GSLB controller configured as a DNS proxy behave when a request comes in for the first time? Does it assign the IP address for the requested site based on GSLB policy metrics like health check, RTT, FlashBack, etc., or does it assign the IP address to the query based only on the hash index that it uses for hash-based persistence?
Question 2: I am unable to use SSL health checks in the DNS zone (host-info www ssl) configuration. I guess I am missing something, or I have no idea how to configure SSL health checks in the DNS zone configuration, so it would be great if you could post an example of it.
Question 3: GSLB controller redundancy. I have two SIs in a pair. I configured one of them as a GSLB controller and gave the virtual DNS server a sym-priority of 200, then applied the same configuration on the paired SI but changed the sym-priority to 100. It seems that even though the paired SI with sym-priority 100 is a backup, it still maintains all the site information and keeps a record of everything even while it is not being used, which looks like a waste of resources to me. Please let me know if there is a better way of getting GSLB controller redundancy.
Thank you so much
02-17-2010 01:12 AM
GSLB uses the GSLB policy to determine which metric to use. Below is the default order it follows when evaluating a server's IP addresses in a DNS reply. You can change the order, or disable some metrics, in a GSLB policy; use the command "show gslb policy" to see the current ordering.
• The server’s health
• The Weighted IP value assigned to an IP address
• The Weighted Site value assigned to a site
• The site ServerIron ADX’s session capacity threshold
• The IP address with the highest number of active bindings
• The round-trip time between the remote ServerIron ADX and the DNS client's subnet
• The geographic location of the server
• The connection load
• The site ServerIron ADX’s available session capacity
• The site ServerIron ADX’s FlashBack speed (how quickly the GSLB receives the health check results)
• The site ServerIron ADX’s administrative preference (a numeric preference value you assign to influence the GSLB policy if other policy metrics are equal)
• The Least Response selection (the site ServerIron ADX that has been selected less often than others)
• Round robin selection (an alternative to the Least Response metric)
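As a rough conceptual sketch (plain Python, not ServerIron code), the ordered evaluation above behaves like a chain of elimination filters: each enabled metric narrows the candidate IP list, and evaluation stops as soon as a single IP remains. The metric functions and IPs below are invented stand-ins for illustration only.

```python
def select_ip(candidates, metrics):
    """candidates: list of IP records; metrics: ordered list of filter functions.
    Each filter returns the subset of candidates it considers 'best'."""
    for metric in metrics:
        survivors = metric(candidates)
        if len(survivors) == 1:        # a single winner ends evaluation early
            return survivors[0]
        if survivors:                  # otherwise keep narrowing with the rest
            candidates = survivors
    return candidates[0]               # final tie-breaker (e.g. round robin)

# Example: health check eliminates down IPs, then RTT picks the closest.
healthy = lambda c: [ip for ip in c if ip["up"]]
best_rtt = lambda c: [min(c, key=lambda ip: ip["rtt"])]

ips = [{"addr": "1.1.1.1", "up": True, "rtt": 40},
       {"addr": "2.2.2.2", "up": True, "rtt": 10},
       {"addr": "3.3.3.3", "up": False, "rtt": 5}]
print(select_ip(ips, [healthy, best_rtt])["addr"])  # -> 2.2.2.2
```

This also makes the fall-through behavior visible: when a metric cannot separate the candidates, the next metric in the configured order decides.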
More info on this topic is in the admin manual here: http://www.brocade.com/support/Product_Manuals/ServerIron_ADXGlobalServer_LoadBalancingGuide/gslb.2.2.html#85119
Also check the section "Changing GSLB policy metrics": http://www.brocade.com/support/Product_Manuals/ServerIron_ADXGlobalServer_LoadBalancingGuide/gslb.2.7.html
If your sites have ServerIron(s), you should leverage the GSLB protocol for distributed health checking, which offloads health checks from the GSLB controller and gives you more flexibility in the range of health checks (including SSL) at the local ServerIron level. The GSLB controller does not support HTTPS health checks directly as part of the DNS zone. Check the "Distributed health checks for GSLB" section in the admin manual for further info: http://www.brocade.com/support/Product_Manuals/ServerIron_ADXGlobalServer_LoadBalancingGuide/gslb.2.17.html#85572. If this does not suffice, can you elaborate on your environment?
GSLB can leverage the same HA deployments as regular server load balancing (i.e., active-active, symmetric-active, and hot standby), or you can have independent GSLB controllers. What you have configured is fine. What impact are you referring to with "waste of resources"? Information needs to be synced between the units for them to be in HA.
02-17-2010 09:07 AM
Thank you, Mr. Zanager; your answers to questions 2 and 3 are perfect and very helpful. But according to Brocade (I ended up opening a case with them), GSLB controllers configured for hash persistence do not use any of the metrics described in the GSLB policy; they only check the health state of the VIPs. The reason: if hash-based persistence is enabled, every DNS query is mapped to a hash index value ranging from 0 to 255. These values correspond to the different IPs for a domain, so even if one IP would be preferable for the requesting client, if the hash index maps the client to a different IP's bucket, it will get that less preferable IP. Those are the words of the Brocade TAC team.
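The bucket-to-IP mapping TAC described can be sketched conceptually like this (plain Python, not ServerIron code; the hash function, table layout, and addresses are made-up stand-ins, since the real ServerIron hash is not documented here):

```python
import zlib

def bucket_for(client_ip: str) -> int:
    # Reduce the client source IP to a hash index 0..255.
    return zlib.crc32(client_ip.encode()) % 256

def build_table(domain_ips):
    # Buckets are spread over the domain's IPs deterministically, so two
    # controllers given the same IP list build identical tables.
    return {b: domain_ips[b % len(domain_ips)] for b in range(256)}

table = build_table(["192.0.2.10", "198.51.100.20"])
answer = table[bucket_for("203.0.113.7")]
# The bucket alone picks the answer: the same client always hashes to the
# same bucket, hence the same IP, regardless of which IP would be "closer"
# by RTT or preferable under any other policy metric.
```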
There is another problem I saw with deploying multiple GSLB controllers with hash persistence and persistence-rehash-disable enabled, for which TAC ended up blaming my deployment, or treating it as their limitation. Here's a description of the problem:
I have two pairs of SIs in two different datacenters; one SI in each pair is configured as a GSLB controller for all the SIs. The DNS proxy VIPs of these two GSLB controllers are registered as my DNS servers at the registrar. I had hash persistence enabled and rehash disabled, to avoid rehashing during flapping. For testing, I brought one of the datacenters down, not at the Foundry level but at the network edge router, meaning the SIs were up but the internet connection was gone. As a result, each GSLB controller thought the other site was down, and each marked as active only the IP reachable to it (local to its site, behind the edge router). Since one controller was unreachable, it didn't matter; the other one was issuing an IP that was actually reachable, and everything looked beautiful and worked as expected.
Then the down site was brought back up, and guess what? The previously down GSLB controller was still giving out the IP local to its site, and the other site's controller was giving out the IP local to its own site, because rehash was disabled. That totally kills the idea of persistence, since the reply to a DNS query from the same requesting IP differed depending on which GSLB controller received the query.
I don't know if I have managed to explain the problem completely, but Brocade TAC has acknowledged that it is a limitation of the feature. As a workaround I have rehash enabled for the hash table, but I know that if one datacenter flaps, I will still have problems with persistence for client requests.
If you guys have a better idea to solve this problem, or a better workaround, please let me know.
Again, thank you so much, Mr. Zanager, for your help.
03-03-2010 02:12 PM
You brought up some interesting points regarding persistence.
The main reason for using hash persistence is when you have two controllers servicing the same domain and you want persistence. The two controllers never talk to each other, and you want the hash tables on them to be identical, so that whichever controller the client request goes to, you get back the same IP.
For each domain there will be a hash table; for example, www.foo.com gets its own table.
Client X comes in. We hash client IP X, and the domain IP corresponding to this hash is IP2. Controller 1 and controller 2 will return the same IP, i.e. IP2, to client X, since they have identical hash tables.
If any of these IPs goes down, then all the buckets assigned to that IP are reassigned to other IPs in a deterministic manner. If the IP comes back up, reassignment happens again. But some customers don't want the reassignment to happen, since it disrupts persistence, so they configure rehash-disable.
In your case below, there are two controllers and two domain IPs, one configured on each controller. Then you disable connectivity between the two controllers, so for each controller only its local IP is up and the remote IP is down. Controller 1's hash table will therefore contain only ip1, and controller 2's only ip2. You also have rehash disabled. When you restore connectivity, both ip1 and ip2 are up on both controllers; however, rehash is disabled, so these IPs won't be re-introduced into the hash tables, and controller 1 will continue to return ip1 while controller 2 returns ip2. If you use rehash-disable, the whole idea is that you rehash when it is convenient, e.g. when your traffic is very low, to minimize the persistence breakage caused by rehashing.
In general, whenever there is a domain IP up/down event, persistence will break in some way. So either you allow rehash, or, if you have rehash-disable, you should manually rehash at the earliest convenient time to ensure that both controllers return the same IP. This will be true for any hashing mechanism.
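A toy model of this split-brain scenario (plain Python, not ServerIron code; it assumes a simple deterministic bucket reassignment, since the real reassignment algorithm is not documented here):

```python
def reassign(table, dead_ip, live_ips):
    # Buckets belonging to the dead IP move to surviving IPs deterministically;
    # buckets of live IPs are untouched.
    return {b: (live_ips[b % len(live_ips)] if ip == dead_ip else ip)
            for b, ip in table.items()}

ips = ["ip1", "ip2"]
base = {b: ips[b % 2] for b in range(256)}   # identical starting tables

# Controller 1 loses sight of ip2; controller 2 loses sight of ip1:
c1 = reassign(base, "ip2", ["ip1"])          # every bucket -> ip1
c2 = reassign(base, "ip1", ["ip2"])          # every bucket -> ip2

# Connectivity returns, but with rehash-disable neither table is rebuilt:
# the same client now gets a different answer from each controller, and
# persistence stays broken until a (manual) rehash rebuilds both tables.
```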
Also, could you separate out your site and controller functionality so that the sites are remote to both controllers? You could still run into this issue if each controller loses connectivity to one site and not the other, and those are not the same sites, but it will probably be a rarer occurrence if the controllers reach the sites remotely via the internet instead of having one site local and the other remote, because for each controller the local site's connectivity will always be up as long as that controller is up.
Another question I would like to ask: what specific persistence goals are you trying to achieve? For example, if you have a specific client subnet in mind for persistence, you could use affinity. Alternately, in your setup, when connectivity between the controllers is restored, what would you want the hashing behavior to be?
03-03-2010 05:58 PM
Thank you for replying to my query, Mr. Prajakta, but I think I have not managed to explain the problem, so I will try again. When I say the controllers are at different sites and lose connectivity, at that point one of the sites has lost its internet connection entirely; it is not just the connection between the GSLB SIs. So even DNS queries coming in for that site's controller are not reaching it, because it simply has no internet connectivity. But when that connectivity is restored, there isn't any sync between the multiple GSLB controllers, which there should have been. I have enabled rehashing in the configuration, as my business needs can afford it, unless my datacenter starts flapping, which would have worse effects than rehashing.
Anyway, I cannot understand why GSLB uses the 10.10.10.x/23 range. Do you have any idea about the use of that range? If you have GSLB enabled, doing "sh server dynamic bind" will show this range, but I do not understand the functionality behind it.
03-05-2010 10:11 AM
If you sync between controllers, in general you are going to break persistence anyway, and you will hit race conditions. Example: if both controllers have assigned the hash buckets, who gets precedence? So re-hashing is the better solution in this case.
The 10.10.x.x space is used by the GSLB controller for internal operations, such as health checking.