02-03-2016 02:58 AM
Multi-location long distance fabrics
Three data centres
2 primary sites (A & B) each with an IBM SAN768B-2 (DCX 8510-8)
1 secondary site (C) with an IBM SAN40B-4 (5100)
DWDM connectivity between all sites
LISL configured between A & B (AtoB=74km)
Can the LISL also be connected via site C (AtoC=60km BtoC=13km) to provide a diverse path for fault tolerance?
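For a rough sense of how the two paths compare, the arithmetic on the distances above can be sketched as follows. This is only an estimate: it assumes about 5 microseconds of one-way propagation delay per km of fibre, takes the quoted distances as the real fibre lengths, and ignores any latency added by the DWDM equipment or the switches. The names `US_PER_KM`, `path_km` and `round_trip_us` are made up for this illustration.

```python
# Assumption: light in fibre covers 1 km in roughly 5 microseconds (one way).
US_PER_KM = 5.0

# Site-to-site distances quoted in the post, in km.
DISTANCE_KM = {("A", "B"): 74, ("A", "C"): 60, ("B", "C"): 13}


def link_km(x, y):
    """Return the distance of the link between two sites, in either direction."""
    key = (x, y) if (x, y) in DISTANCE_KM else (y, x)
    return DISTANCE_KM[key]


def path_km(path):
    """Sum the link distances along a path such as ["A", "C", "B"]."""
    return sum(link_km(a, b) for a, b in zip(path, path[1:]))


def round_trip_us(path):
    """Estimated round-trip propagation delay for the path, in microseconds."""
    return 2 * path_km(path) * US_PER_KM


direct = ["A", "B"]
via_c = ["A", "C", "B"]

for name, path in (("direct", direct), ("via C", via_c)):
    print(f"{name}: {path_km(path)} km, ~{round_trip_us(path):.0f} us round trip")
```

On these figures the path via C (60 km + 13 km) is about the same length as the direct A to B link, so distance alone would not make it a much slower alternative.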
02-04-2016 03:32 PM
02-08-2016 05:56 AM
Hi Alexey, thank you for the response.
I think I oversimplified the solution in my original post.
Currently the base switches in the two DCX chassis are connected by an XISL. The virtual switches at both sites are enabled for XISL use, and Logical ISL (LISL) connections exist between virtual switches with the same FID at each site.
I have attached a drawing to try to illustrate the solution I am trying to achieve.
If we created a base switch on the 5100 and connected it to the base switches at A and B, would traffic still fail over from A to B via C if the A-to-B link failed?
02-08-2016 02:03 PM
02-10-2016 06:15 AM
Thanks Alexey, that is looking promising. The final piece of our puzzle is location of resources.
Resources at A and B need to communicate with one another, and resources at A and B each also need to access a resource at C (to support a distributed cluster environment with the quorum at site C).
Based on what you have told me so far, link cost means that resources at A and B will communicate directly with one another, and for the same reason A and B will each access C directly.
If the line between A and C is lost, will access from A to the resource at C automatically route via B?
Would the selection of the alternate path be instantaneous?
I have updated my drawing (attached) to hopefully illustrate clearly.
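The routing behaviour I am expecting can be sketched as a lowest-cost path calculation over the three sites. This is a toy model only, not how the switches actually compute routes: the link cost of 500 per link is a placeholder assumption rather than a value read from the fabric, and `shortest_path` and `LINKS` are names invented for this illustration. It says nothing about how long the switches take to reconverge.

```python
import heapq


def shortest_path(links, src, dst):
    """Return (total_cost, path) for the cheapest route, or None if unreachable."""
    # Build an undirected adjacency list from the link table.
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    # Dijkstra's algorithm with a priority queue of (cost so far, node, path).
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None


# Hypothetical equal costs on all three inter-site links.
LINKS = {("A", "B"): 500, ("A", "C"): 500, ("B", "C"): 500}

healthy = shortest_path(LINKS, "A", "C")

# Simulate losing the line between A and C.
without_ac = {link: cost for link, cost in LINKS.items() if link != ("A", "C")}
degraded = shortest_path(without_ac, "A", "C")

print("all links up:", healthy)
print("A-C link down:", degraded)
```

With all links up, A reaches C directly. With the A to C link removed, the only remaining route is A to B to C at twice the cost, which is the failover I am hoping to confirm.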
02-10-2016 02:46 PM
02-11-2016 08:00 AM
This is an IBM SVC Stretched Cluster implementation supporting a VMware Metro Storage Cluster.
In the event of a path loss in the above environment, we would need to be certain that the fabric rebuild will happen quickly. Does a fabric rebuild mean that all fabrics will be affected, and that node-to-quorum as well as node-to-node access will be disrupted for a short while? How long might this process take?
Any significant delay in the rebuild would leave the cluster nodes without access to each other and without access to the third site for quorum. They would go offline to protect against split brain, and all hosts would be disconnected until the links were re-established, which is the complete opposite of what the HA VMware solution requires.
02-12-2016 01:37 PM