09-24-2009 08:32 AM
I configured a fabric with two B300 switches, linked to each other by one ISL. I was testing by copying a large file. Everything started fine, but for no apparent reason the copy from local disk to remote storage over fiber stopped progressing.
I checked the port statistics and I see errors like these:
er_enc_out 102 Encoding error outside of frames
er_bad_os 81392 Invalid ordered set
The statistics were cleared yesterday, and because this is a test fabric I know I am its only user, so these errors occurred during this copy.
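For anyone who wants to track these counters over time, here is a minimal Python sketch that turns lines like the two above into numbers. It assumes each line has the shape "name value description", as in the excerpt quoted in this post; real portstatsshow output may use different spacing or extra lines, and the function name `parse_counters` is made up for this example.

```python
import re

# Matches lines such as "er_bad_os 81392 Invalid ordered set".
# Assumption: counter name, integer value, then free-text description.
COUNTER_LINE = re.compile(r"^\s*(\w+)\s+(\d+)\s+(.*\S)\s*$")


def parse_counters(text):
    """Return {counter_name: (value, description)} for every matching line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            name, value, description = match.groups()
            counters[name] = (int(value), description)
    return counters


# The two lines quoted in the post above.
sample = """\
er_enc_out 102 Encoding error outside of frames
er_bad_os 81392 Invalid ordered set
"""

counters = parse_counters(sample)
for name, (value, description) in counters.items():
    print(f"{name}: {value} ({description})")
```

Lines that do not fit the assumed shape are skipped silently, so pasting a full capture should not raise an error.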
I don't know how to change my configuration to avoid this kind of problem. Any help would be appreciated.
09-25-2009 12:28 AM
No other kinds of errors. The portstatsshow command shows only er_enc_out and er_bad_os. I ran it on the port connecting the host to the switch and on the port connecting the switch to the storage. Both ports show the same kinds of errors.
09-25-2009 02:49 AM
If you have two HBAs, try connecting and zoning the second one on the same fabric. If there are no problems, it could be the first HBA that is flaky.
If you have two fabrics, you could try zoning the other HBA on the other fabric to different ports on the storage array. If the switch ports on both paths are showing the same symptoms it could be a firmware or driver problem on the HBAs.
09-25-2009 03:09 AM
If the problem were physical, shouldn't it occur on every copy?
But I copied the same large file several times without any problem, and only sometimes does the copy hang.
I'll check for hardware problems. I have one HBA with two ports, not two HBAs.
I already use multipathing with this card.
09-25-2009 04:29 AM
OK. So if I understand correctly, you have two ports on an HBA, both connected to one switch and one connection to the storage. You are seeing these same errors on all three ports.
Since ordered sets do not contain data, the errors have nothing to do with the file that is being copied. I would now suspect the HBA driver/firmware/hardware.
Can you do the same tests with another server/HBA?
Do the error counts go up when the copy is successful also?
Remember, ordered sets are purely within the SAN; the OS will never see them.
09-25-2009 05:03 AM
I'll receive some other servers in a few days and will check with them.
But with this one, when the copy reaches the end, I don't get any errors.
My configuration is as follows:
My host has one HBA with two ports. The two ports are connected to the two switches of the same fabric, i.e. one port to switch 1 and one port to switch 2.
The two RAID controllers of my storage are also connected to the fabric. Each RAID controller has an HBA with two ports, so each controller has one fiber connected to switch 1 and another to switch 2.
09-25-2009 05:51 AM
Sorry I wasn't very clear re the error stats. I would note the ordered set error count before copying the file. After each copy, successful or not, re-note the error count. Does it go up each time, or only when the copy fails?
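To keep that bookkeeping straight over several copies, a rough Python sketch along these lines might help. It records the er_bad_os count noted before and after each copy and reports whether the count grew on every copy or only on the failed ones. The class and function names and the sample numbers are hypothetical, purely for illustration; the counts themselves would be read by hand from portstatsshow.

```python
from dataclasses import dataclass


@dataclass
class CopyRun:
    succeeded: bool  # did the copy reach the end?
    before: int      # er_bad_os noted before the copy
    after: int       # er_bad_os noted after the copy

    @property
    def delta(self):
        # Growth of the counter during this copy.
        return self.after - self.before


def summarize(runs):
    """Say whether the error count grew on successful and/or failed copies."""
    grew_ok = any(r.delta > 0 for r in runs if r.succeeded)
    grew_failed = any(r.delta > 0 for r in runs if not r.succeeded)
    if grew_ok and grew_failed:
        return "errors rise on successful and failed copies"
    if grew_failed:
        return "errors rise only on failed copies"
    if grew_ok:
        return "errors rise only on successful copies"
    return "no error growth recorded"


# Hypothetical readings, not taken from the fabric in this thread.
runs = [
    CopyRun(succeeded=True, before=100, after=100),
    CopyRun(succeeded=False, before=100, after=250),
]
print(summarize(runs))
```

A negative delta would suggest the statistics were cleared between readings, in which case that run should be discarded.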
If you are using a Unix server, are there messages (e.g. timeout errors) in the syslog?
Have you checked the compatibility matrices between your server/HBA and the switch/FOS level, and between the server/HBA and the storage?