04-20-2015 01:52 PM
We are getting link reset and link timeout messages in our environment. How do I pick out the slot number and port number from such a message? Below is the message:
Event: S8,P-1(125): Link Timeout on internal port ftx=1729 tov=2000 (>1000) vc_no=2 crd(s)lost=2 complete_loss:0.
Does this mean it is on slot 8, port 1 (which works out to user port 417)? I am not sure what the 125 in brackets indicates. How do I identify the correct slot number and port number / port index?
04-26-2015 11:54 PM
It means Slot 8 internal Port 125. In order to check which port it connects to, you have to execute:
04-28-2015 11:15 AM
We do not have command-line access. Is there a way to find this from the GUI?
Does "slot 8, internal port" refer to the port index or the user port?
Please suggest and clarify.
Thanks for your time.
04-29-2015 03:27 AM
Internal (backend) ports cannot be matched against the front-end ports. The bladeportmap output shows the mapping of internal ports, not front-end ports. When we see link timeouts on internal ports, first enable bottleneckmon. Then check which core blade the affected port is mapped to and reset that core blade, which is the less disruptive step, followed by reseating the port blade. If the issue still persists, reset the port blade and then reseat it. In most cases the situation is resolved at this point; otherwise you have to replace the core blade and check for the messages again. If they repeat, replace the port blade, i.e. slot 8 (where the timeout occurred in this case).
In the example S8,P-1(125), 125 is the internal port, and we have to check which core blade it is associated with. A link timeout occurs during heavy traffic, typically at peak business hours, when a frame gets stuck in the link or VC (virtual channel). This condition is also called buffer starvation or credit depletion. When you enable bottleneckmon, for every C2-1012 message you will receive a C2-1014 link reset message in the RAS log. This means the link will borrow credits from the buffer pool to clear the link; however, if there is a hardware issue, the link will not clear and the messages keep repeating.
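To make the field layout of the message prefix concrete, here is a minimal Python sketch that pulls the slot and the internal port out of a line like the one posted above. The function name and the regular expression are hypothetical, written only against this one sample message. Treating P-1 as "no user port" is an assumption based on the explanation in this thread, not on vendor documentation.

```python
import re
from typing import NamedTuple, Optional


class LinkEvent(NamedTuple):
    slot: int
    user_port: Optional[int]  # None when the message shows P-1 (assumed: no user port)
    internal_port: int


# Matches the "S8,P-1(125)" style prefix seen in the sample message.
PREFIX = re.compile(r"S(?P<slot>\d+),P(?P<port>-?\d+)\((?P<internal>\d+)\)")


def parse_link_event(message: str) -> LinkEvent:
    """Extract slot, user port and internal port from a link timeout message."""
    match = PREFIX.search(message)
    if match is None:
        raise ValueError("no S<slot>,P<port>(<internal>) prefix found")
    port = int(match.group("port"))
    return LinkEvent(
        slot=int(match.group("slot")),
        user_port=None if port < 0 else port,
        internal_port=int(match.group("internal")),
    )


sample = (
    "Event: S8,P-1(125): Link Timeout on internal port ftx=1729 tov=2000 "
    "(>1000) vc_no=2 crd(s)lost=2 complete_loss:0."
)
event = parse_link_event(sample)
print(event.slot, event.user_port, event.internal_port)
```

The sketch only identifies the slot and the internal port; which core blade port that internal port connects to still has to come from the switch itself.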
04-29-2015 03:29 PM
May I ask where you got the information about what happens to resolve an internal link timeout? In my eyes, the recovery action described in the documentation, a link reset, makes much more sense, as it sets the credits for the link back to their initial value. From your statement it looks like the message just says link reset but does not actually do one. That makes no sense to me. I mean, the port lacks buffer credits. Why should adding buffers for this port help here? That would only let it accept some more frames FROM the other side of the internal link.
04-29-2015 10:19 PM - edited 04-29-2015 10:30 PM
The documentation says it will detect the link timeout and reset the link when you run the command. This will repeat until the actual issue is resolved; my explanation is about how to stop seeing these messages. I am also talking about credits, not buffers. The port will borrow 1 or 2 credits, depending on the hardware. If the problem is only related to high bandwidth utilization, a link reset will help to clear the link. However, in many scenarios you will have seen these messages keep repeating, i.e. link timeout and reset (C2-1012 and C2-1014). That can also be related to hardware issues, in which case a simple link reset will not be enough to resolve the issue.
04-30-2015 01:48 AM
If you have a physical problem destroying VC_RDYs (or corrupting frames in the other direction so badly that they aren't recognized as frames anymore and no VC_RDY is sent), you'll see it again, sure. I just asked myself how borrowing credits could possibly help. I have to admit, I also don't understand the concept of borrowing credits here. Wouldn't that mean that we just overrule the other port's number of buffers and send it something in the hope that its buffers are not really full? Seems a bit risky, doesn't it?
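To illustrate the credit argument in this exchange, here is a deliberately simplified Python model of one credited virtual channel. It is not how the switch firmware works. The class and method names are hypothetical, and the initial credit count of 2 is chosen only to mirror the crd(s)lost=2 value in the sample message. It shows why lost VC_RDYs stall the sender and why a link reset, which returns the credits to their initial value, lets frames flow again.

```python
class CreditedLink:
    """Toy model of credit-based flow control on one virtual channel."""

    def __init__(self, initial_credits: int) -> None:
        self.initial_credits = initial_credits
        self.credits = initial_credits

    def send_frame(self) -> bool:
        """Send one frame if a credit is available; return False when stalled."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def receive_vc_rdy(self) -> None:
        """The receiver freed a buffer and the VC_RDY actually arrived."""
        if self.credits < self.initial_credits:
            self.credits += 1

    def link_reset(self) -> None:
        """Set the credits back to their initial value."""
        self.credits = self.initial_credits


link = CreditedLink(initial_credits=2)
link.send_frame()
link.send_frame()
# Assume both VC_RDYs are destroyed on the wire: receive_vc_rdy() is never called.
stalled = not link.send_frame()   # no credits left, the frame waits until timeout
link.link_reset()
recovered = link.send_frame()     # credits restored, the frame goes out
print(stalled, recovered, link.credits)
```

If the physical fault keeps destroying VC_RDYs, the same stall recurs after each reset, which matches the repeating timeout and reset messages described earlier in the thread.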