Wednesday 30 September 2009

Error E000FE30 every day on one server...

...for months. For months I've struggled with a problem on ONE server, that happens to be at a remote site on a different subnet, connected via a WAN VPN Link.

Every day, one or more jobs would fail with Backup Exec errors, mainly E000FE30 - with the useful and generic messages about "communications failure has occurred" and sometimes the "connection lost to the remote agent".

Needless to say, I've spent some time working on this, and tried all sorts. Reconfiguring the system to use a different WAN link to ensure the fault isn't with the WAN. Nothing. Checking to ensure the issue isn't with the server, reinstalling agents, trying all sorts.

I've updated network drivers, checked all sorts of patches etc - but nothing. Still this error - consistently failing jobs.

I even got a colleague to look at it with a fresh pair of eyes, and he too tried all sorts. Given the error, we suspected "something" to do with comms, but never found any issue, and in hundreds of tests could never replicate it - transferring large files to and from the server worked fine, etc.

Today I found the answer. The "Large TCP Offload" feature on the Network Card. While I've seen plenty of issues with this feature before, you normally see it with terrible throughput on the system in general and so on - but this machine is solid as a rock for everything else.

Still, with the setting turned off, I've had the first complete, full backups in a few weeks... voila!

Top tip for anyone else facing this problem - don't just check the network drivers, but try turning off these features, even if you cannot see this issue at any other time on the machine.

Is this a Backup Exec issue? I'm not sure, but I'm happy to blame it since everything else works just fine.

4 comments:

Anonymous said...

I have the same problem! I don't have the "Large TCP Offload" option, but I have a short list of other Offload options on both my built-in NIC and my SAN connection.

Offload Receive IP Checksum
Offload Receive TCP Checksum
Offload TCP Segmentation
Offload Transmit IP Checksum
Offload Transmit TCP Checksum

All of them are turned on. Do you know what might happen if I were to turn all of them off?

The Backup Exec Goat said...

Hi,

The worst that would happen is that you'd see worse performance and/or higher CPU usage on the server pretty much.

It does sometimes cause a brief drop in traffic on the server, so one to try out of hours.

The various settings can be tried in various combinations to see what works best - there isn't a hard/fast rule in my experience - depends on the network cards, drivers, direction of the moon etc.
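
If you want to work through the combinations methodically rather than at random, something like the rough Python sketch below can print a test plan. It only lists which options to switch off for each run; it does not touch the network card, and the option names are simply the ones quoted in the comment above (your driver may label them differently). The function name `offload_test_plan` is made up for illustration.

```python
from itertools import product

# Option labels copied from the comment above; they vary by NIC and driver.
OFFLOAD_OPTIONS = [
    "Offload Receive IP Checksum",
    "Offload Receive TCP Checksum",
    "Offload TCP Segmentation",
    "Offload Transmit IP Checksum",
    "Offload Transmit TCP Checksum",
]


def offload_test_plan(options):
    """Return every on/off combination, fewest disabled options first."""
    combos = [
        dict(zip(options, states))
        for states in product([True, False], repeat=len(options))
    ]
    # Start with the baseline (everything on) and end with everything off.
    combos.sort(key=lambda combo: sum(not enabled for enabled in combo.values()))
    return combos


if __name__ == "__main__":
    for run, combo in enumerate(offload_test_plan(OFFLOAD_OPTIONS), start=1):
        disabled = [name for name, enabled in combo.items() if not enabled]
        print(f"Run {run:2d}: disable {', '.join(disabled) or 'nothing (baseline)'}")
```

With five options that is 32 runs, so in practice you would probably stop at the first combination that gives a clean backup.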

Anonymous said...

Interesting site, tis a shame it's still not kept going. But did you ever see this?
http://seer.entsupport.symantec.com/docs/290098.htm

It was posted before your blog about TCP Offload.

PeteLong said...

Here's some more information that might be helpful:

Backup Exec Job Fails With an E000FE30 Error

Pete
PeteNetLive