11-04-2013 02:46 PM
I am struggling to become acquainted with my new ADXen. I can't say this has been a straightforward experience. I have run into two types of difficulties with VLANs on the ADX.
1) Adding a VLAN to a dynamic LAG (link aggregation group) causes the LAG to reset. Adding multiple VLANs to a LAG causes it to hang in a seemingly indeterminate state, with the LAGs having to be reset by hand.
2) At some point, all attempts to add tagged ports to a VLAN result in the ports being added as untagged, even though the system confirms that the port has been added as tagged.
Working with TAC has been frustrating.
In the first case, this seems to be dysfunctional behavior without cause or rationale. Resetting a trunk just because a new VLAN has been added wreaks havoc with operations. On the face of it I can see no justification for this behavior -- but I might be missing something. TAC says it is because adding a new VLAN forces a new root bridge election. To which I replied: a) on which VLAN does LACP run? b) are new bridge elections required across all STP instances of a pvSTP environment?
TAC finally resorted to "because the manual tells us so". To which I replied that it does not explain the why of the behavior -- behavior that, on the face of it, seems to be dysfunctional -- disrupting all communication across a LAG each time a new VLAN is added and this on a production system? To which TAC simply stated that is the way it is.
The decision was made by designers unknown for reasons unknown and is now the way the product works.
In the second case, this seems to be a bug triggered by some sequence of events. TAC seemed uninterested in exploring this issue. (The only way to clear the system's insistence on adding a tagged port as untagged was to reboot the box. Again, this is not a viable solution in a production environment.)
Can anyone shed some light on the background for this behavior (enlighten me, as TAC is unwilling to make the effort)?
11-07-2013 02:47 PM
For #1, if I am understanding you correctly, this is expected behavior. I do agree that this is not a good process. I would encourage you to file an RFE (Request for Enhancement) to get this changed. These sorts of things are always better when they come directly from customers themselves.
For #2, your description gives the impression that some sort of buffer is filling up. Do you have some additional details you could share on that one? I'm thinking of things like the time frame or the number of successful attempts before it starts failing. Also, the version you are running would be good to have as well.