01-19-2018 01:45 AM
Any ideas what I should do to reduce these errors?
I am most worried (rightly or wrongly) about port 9, as it has 103.8 million errors in 16 hours.
01-19-2018 01:59 AM
01-19-2018 02:09 AM
Did any maintenance or network outage occur during the period?
01-19-2018 02:39 AM
The period is overnight, from 5pm until around 9am. Overnight we have some SQL processes and Altaro backups running. No maintenance happened, there were no outages that I was aware of, and all the backups completed. Now and again we get a Cluster Shared Volume dropping from cluster manager, but this did not happen last night during the period shown in the screenshot.
01-19-2018 02:56 AM
Ports 19 and 23 need to be investigated because they have reported link failures.
01-19-2018 03:05 AM
What would you do about the link failures? Should I change the fibre cable first, or could it be firmware/software related?
With the CRC errors, what would be the most likely cause?
Thanks for all your help so far.
01-19-2018 04:33 PM
There are a couple of potential problems. You have a lot of class 3 discards, which may indicate performance problems. I've seen this in my environment, where it caused SCSI aborts on Linux servers and occasional path flapping. In your case the discards come together with C3 timeouts, which should help with finding the culprit. Backups, so... maybe the tape library? I would not be so worried about the CRC errors as long as they stay at this level: if there were a hardware problem (SFP or cable), the number of errors would increase quickly. For ports 19 and 23, there were signal losses along with timeouts, but no errors indicating hardware problems. Hmm... link resets for credit recovery? If so, they should be logged, though I no longer remember whether in the fabric log or the RASlog.
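To judge whether a counter "stays at this level" or is climbing, one option is to take two snapshots of the port counters some time apart and compare the growth rate. The following is only a minimal sketch: the counter names (`crc_err`, `c3_discard`), the sample values and the per-hour threshold are all made up for illustration, and it does not parse any real switch output.

```python
from typing import Dict


def counter_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return how much each counter grew between two snapshots.

    A negative value means the counter was cleared or wrapped in between,
    so that delta should not be trusted.
    """
    return {name: after[name] - before.get(name, 0) for name in after}


def rate_per_second(delta: int, interval_seconds: float) -> float:
    """Average events per second over the sampling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return delta / interval_seconds


def growing_fast(delta: int, interval_seconds: float, threshold_per_hour: float) -> bool:
    """True if the counter grows faster than an (arbitrary) per-hour threshold."""
    return rate_per_second(delta, interval_seconds) * 3600 > threshold_per_hour


if __name__ == "__main__":
    # Hypothetical snapshots of one port, taken one hour apart.
    before = {"crc_err": 120, "c3_discard": 5_000}
    after = {"crc_err": 123, "c3_discard": 905_000}
    interval = 3600.0

    for name, delta in counter_deltas(before, after).items():
        # The threshold of 100 per hour is an assumption, not a vendor recommendation.
        flag = "INVESTIGATE" if growing_fast(delta, interval, 100) else "stable"
        print(f"{name}: +{delta} in {interval / 3600:.0f} h -> {flag}")

    # The figure from the first post: 103.8 million errors in 16 hours.
    print(f"port 9: {rate_per_second(103_800_000, 16 * 3600):.0f} errors/s on average")
```

A counter that barely moves between snapshots fits the "stays at this level" case, while one that keeps climbing at a high rate is the one worth chasing first.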