Showing posts with label error messages. Show all posts

Monday, 22 May 2023

DPM Rollup Update Installation Fails - Failed to execute SQL string, error: BACKUP DATABASE is terminating abnormally

Recently we decided to deploy Update Rollup 5 for Data Protection Manager 2019, but were running into issues getting the update to install.

In the installation log we spotted lots of:

MSI (s) (BC!10) [14:34:02:258]: Transforming table Error.

But most importantly we saw:

ExecuteSqlStrings:  Error 0x80040e14: failed to execute SQL string, error: BACKUP DATABASE is terminating abnormally., SQL key: KB5024231_BackupDpmDb.sql.97DDF5B3_3770_4C3E_8673_52BD081E1EFD SQL string: BACKUP DATABASE DPMDB_SERVERNAME080fe81c_748b_403a_b6ca_fa1d3a042da9 TO DISK = 'QFEDPMDB_SERVERNAME080fe81c_748b_403a_b6ca_fa1d3a042da9.bak' WITH COPY_ONLY, INIT

Well why would that be an issue - it usually works...

The answer is annoyingly simple:

If you delete the file:

QFEDPMDB_SERVERNAME080fe81c_748b_403a_b6ca_fa1d3a042da9.bak

(which will be located in your normal backup directory for SQL Databases and named with the guid and machine/database name you actually have rather than the above)

It will work. But why does it care? The file usually exists from the last update rollup.

The answer is in the command - if you run that in SSMS you'll see this error appear:

System.Data.SqlClient.SqlError: Cannot use the backup file because it was originally formatted with sector size 4096 and is now on a device with sector size 512. (Microsoft.SqlServer.Smo)

Ah ha! When we installed the last update rollup we were using a disk with a 4096-byte sector size, and the new disks are back to 512 for the system volume (long story for another day!)

Deleting the old backup is the easiest fix.
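
If you'd rather not pick the file name out of the log by eye, here's a minimal sketch that pulls it out of the ExecuteSqlStrings line quoted above. The function name and the regular expression are our own illustration, not anything the installer provides, and it only reports the name - finding the file in your SQL backup directory and deleting it is still up to you.

```python
import re

# Hypothetical helper: matches the TO DISK = '....bak' part of the
# BACKUP DATABASE statement that appears in the installer log.
BACKUP_TARGET = re.compile(r"TO DISK\s*=\s*'([^']+\.bak)'", re.IGNORECASE)


def stale_backup_name(log_line):
    """Return the .bak file name named in the log line, or None if there isn't one."""
    match = BACKUP_TARGET.search(log_line)
    return match.group(1) if match else None
```

Feed it the failing line from the MSI log and it hands back the QFEDPMDB_....bak name to look for.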

Monday, 30 May 2022

DPM 2019 - Upgrade leaves Secondary Protection Servers Inaccessible

It is now 2022. We're still using DPM (Data Protection Manager) as our primary backup solution. It continues to be the hidden gem of the Microsoft System Center suite and a product I really hope they never screw around with because overall it is fantastic. It might have a simple interface, it might not have all the bells and whistles, but at the core, it does backup, and most importantly, restore workloads.

We've recently upgraded our DPM 2016 suite to DPM 2019 and after the upgrades almost all was well. It took a few jumps, as we first had to update SQL Server on each of the DPM servers: DPM 2019 didn't support the older version we used, but equally DPM 2016 didn't support a new enough version for us to have bothered until now.

I strongly recommend you take a "Copy Only" backup of the DPM database via SQL Enterprise Manager before you begin. Do this on every DPM Server. Then find the SQL Server version that is supported by both the DPM version you currently use and the one you intend to use (in our case one supported by both DPM 2016 UR10 and 2019 UR4).

The only problem that remained was that Secondary Protection didn't work - the Primary DPM Server insisted that there was a problem with the agent version of the Secondary DPM Server, but we knew that to be bogus.

DPM itself would say:

"Data Protection Manager ID: 296"

"The protection agent is incompatible with the version of DPM that is installed on this computer. All subsequent protection and recovery tasks will fail on this computer until you update the protection agent"

This is a bogus error - and indeed if you refresh the status of the secondary server you'll briefly see another error that comes/goes - indicating something else is amiss.

The "Recommended action" is a bit misleading in the case of secondary protection servers, as it ultimately advises you to reinstall or update the agent, which isn't something you can do on a Secondary DPM Server. Assuming your Secondary DPM Server already runs the same release of DPM as your Primary (eg if the Primary is DPM 2019 then so too must the Secondary be, and they should also be on the same Update Rollup), the agent version isn't the problem.

More likely, the Primary server has not got permissions set up properly to allow the Secondary to talk to it. To fix this:

(1)

On the PRIMARY DPM Server, go into the Local Users and Groups section, and find the group named "DPMRADmTrustedMachines" - make sure the Secondary Server name is listed in this group - and if not, add it. 

In our case, adding the machine back to this group (it was clearly lost during the upgrade), then refreshing the agent status twice on the Secondary server corrected the issue and DPM now thinks all is well and has started providing secondary protection.

(2)

On the PRIMARY DPM Server, go to the "DPMDRTrustedMachines" group and ensure the Secondary DPM Server is listed there. If you don't do this, your Agent Status is likely to say "OK" but you'll get error 33119 and have issues enumerating the contents of the Primary DPM Server and/or of a Protected Server it already protects when you try to set up or modify Secondary protection for it.
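
Since both groups have to contain the Secondary, a tiny checklist helper can save a second round of head scratching. This is only a sketch: the two group names are the ones from the steps above, but the function and the idea of typing the members in by hand (as you read them off Local Users and Groups) are our own assumptions. It also assumes machine accounts may show up as DOMAIN\NAME$, so it ignores the domain prefix, the trailing "$" and case.

```python
# The two groups named in the steps above; everything else here is hypothetical.
REQUIRED_GROUPS = ("DPMRADmTrustedMachines", "DPMDRTrustedMachines")


def groups_missing_server(memberships, secondary_name):
    """Return the required groups that do not list the secondary server.

    memberships maps a group name to the member names you copied from
    Local Users and Groups on the PRIMARY DPM Server.
    """
    def normalise(name):
        # Drop any DOMAIN\ prefix and trailing "$", and compare case-insensitively.
        return name.split("\\")[-1].rstrip("$").lower()

    wanted = normalise(secondary_name)
    return [
        group
        for group in REQUIRED_GROUPS
        if wanted not in {normalise(member) for member in memberships.get(group, [])}
    ]
```

An empty list back means the Secondary is in both groups; anything else is the group you still need to add it to.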


Saturday, 24 December 2016

DPM: Replica is inconsistent, Error 3106, "system cannot find the path specified" - despite restarting both DPM + Protected Server systems

It's been a while, but we've recently come across a ridiculous error situation with DPM that caused a random subset of Protected Servers to just stop backing up.

Each Volume or resource would be in a "Replica is inconsistent" state. You'd play the usual game of running consistency checks, or consistency & synchronisation, and the job would go to "OK" for a while - but the ability to make a new recovery point was missing (eg the "Create a recovery point after synchronizing" option was unavailable), and a short while later the protected volume would return to the "Replica is inconsistent" state.

After much head scratching and monitoring, we realised it was very simple... the installation path for DPM includes a temp folder, for example:

"C:\Program Files\Microsoft System Center 2012\DPM\DPM\Temp"

The folder "MTA" inside it had been removed as part of a clear up of old temporary resources, and despite this folder not being actively used during a backup, it seems that not having it breaks DPM.

Simply recreate a folder called "MTA" there and you'll find everything is working just fine again - re-run those consistency checks, then make a recovery point with synchronisation and all will be well.
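
If you have a few servers to check, recreating the folder is easy to script. A minimal sketch, assuming "MTA" sits directly inside the DPM Temp folder (that's how it was for us; the function name is ours and your install path will differ):

```python
from pathlib import Path


def ensure_mta_folder(dpm_temp_dir):
    """Create the "MTA" subfolder if it is missing and return its path.

    parents=False is deliberate: if the Temp folder itself does not exist,
    the path is probably wrong, so fail rather than create a stray tree.
    """
    mta = Path(dpm_temp_dir) / "MTA"
    mta.mkdir(parents=False, exist_ok=True)
    return mta
```

Call it with your own Temp path (for example the one quoted above) and then re-run the consistency checks as described.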

Hopefully this will help someone else with a similar issue!


Friday, 22 May 2015

Error 33119 trying to establish Secondary Protection

It has been a long time since we posted here. That's mostly because we rarely have major issues anymore since we decided to use Data Protection Manager instead of Backup Exec.

The reality is that in our experience... it just works.

However, we recently had a problem where our primary DPM Server at a site failed. Having rebuilt it, and gotten things going again, we had issues getting Secondary Protection to work again, with DPM throwing error 33119.

After wondering what was happening, we found the answer pretty simple...

On the PRIMARY DPM Server...

Make sure the "DPM Writer" and "DPM Access Manager" services are running - for some reason the Primary server wasn't running those services (although it was otherwise working) - a quick service start and they've been reunited and now work.

Saturday, 1 June 2013

DPM and Manual Replicas. Not quite as intuitive as the rest of Data Protection Manager!

We don't tend to post here very often anymore, mostly because it's rare there's anything much to say since we moved away from Backup Exec (which seems to be just as bad as ever when I recently evaluated it to see if it had improved).

Data Protection Manager does, on the whole, do a damn good job of making backups work reliably. There are some annoying quirks: when it does have a problem it tends to be an issue communicating with a server, and the error numbers, messages and explanations are pretty hopeless in many cases, which does take the shine off an otherwise decent product.

But one area that we have recently been exploring in more depth is the "Manual Replica" feature. Traditionally we've not really had any need to get initial replicas via anything but the LAN/WAN links already in place, but more recently we had a few scenarios where it sounded like it could be a winner.

The process is pretty much undocumented as far as TechNet goes - there is a vague explanation but it doesn't cover several obvious and common scenarios whatsoever.

The basic principle is that you literally "copy" the source drive's files to other media - like a removable hard disc - then copy them onto the DPM server.

Sounds simple right?

The first stage is straightforward - assuming you've got a couple of good tools that let you grab "in use" files (or get them via some VSS handiness) and can add an NTFS formatted drive etc. In practice it is the second part - getting it into DPM - that is far more convoluted, as you have to know the replica path, mount it so you can access it, and only then can you copy the files.

The hassle lies in having to get the replica path from DPM's admin console (easy), copy that to the clipboard and into Notepad or similar (easy), extract the volume ID (easy as long as you know which bit is the volume ID), and then find the actual volume ID Windows has via mountvol (not that easy, and time consuming when you have a lot of volumes). Finally you need to mount that volume so you can access it, browse to the "Full" folder and copy the files (only easy once you know how).

All of this would be relatively simple if the documentation was decent, but it isn't.

What makes it more bizarre is that getting the IDs and mounting the volumes is something that could pretty easily be done within the UI if they wanted, or at least via some DPM PowerShell.

Since this wasn't the case (and isn't as of the latest build), one of my colleagues and I knocked up a little script to make the last part a bit neater - essentially we feed the script the DPM volume string and the drive letter we want to mount it as, and the script queries the system volumes, finds the matching one and mounts it. Voilà!
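
To give a flavour of the matching step, here's a rough Python sketch of the idea - it is not our original script. It assumes that one of the GUIDs in the replica path string is the same GUID that mountvol lists in a \\?\Volume{...}\ name, so rather than guessing which bit is the volume ID it tries them all. The function names are hypothetical; the output is just the mountvol command line for you to review and run yourself.

```python
import re

GUID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def find_volume_name(replica_path, mountvol_output):
    """Return the mountvol volume name whose GUID appears in replica_path, or None.

    mountvol_output is the text printed by running "mountvol" with no arguments.
    """
    wanted = {guid.lower() for guid in GUID.findall(replica_path)}
    for line in mountvol_output.splitlines():
        line = line.strip()
        if line.startswith("\\\\?\\Volume{"):
            found = GUID.search(line)
            if found and found.group(0).lower() in wanted:
                return line
    return None


def mount_command(volume_name, drive_letter):
    """Build the mountvol command that mounts volume_name at drive_letter."""
    return f"mountvol {drive_letter}: {volume_name}"
```

Paste in the replica path from the DPM console and the output of mountvol, pick a free drive letter, and check the command it builds before running it in an elevated prompt.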

Copy the files, then run a consistency check on the manual replica, which will then complete the job and let ongoing backups happen.

So we're done now right? Well not quite.

All of the above assumes you're creating a manual replica of a file storage volume (eg a typical "drive"). There doesn't appear to be any documentation on the TechNet site for any release of DPM that explains how you create manual replicas for resources like SQL Server, Exchange - it suggests you can, and you can make those resources go into "manual replica creation pending" but it does not make it at all clear how you actually achieve this.

So my simple request to Microsoft is PLEASE provide better documentation in this area, and perhaps throw in a couple scripts to simplify the process so we can just "get on with it" - as a general rule the Server 2012 and DPM 2012 releases do exactly that so it'd be good to see that last bit work too.

Thursday, 23 August 2012

DPM 2012 - no reports on backup jobs for you!

This morning we came across an interesting issue with DPM 2012, something which is really trivial, but really annoying if you don't know what is wrong!

Recently I moved our DPM servers over to DPM 2012 - and was amazed by how smoothly it went (more because I'm used to upgrading Backup Exec and watching the earth cave in), and all seemed to be working well.

However, I'm VERY paranoid when it comes to backups - and so my colleague has responsibility for making sure backups happen - reviewing reports and so on. We'd been using the reporting in DPM 2010 quite happily; he received and reviewed the reports on a regular basis.

Since 2012 he'd not been getting them... it seems that the reporting gets broken on an upgrade, and the SMTP Server settings were not right anymore - bizarrely each server we had seemed to have different states - one had no SMTP Server details anymore, one had them but complained they weren't right, and the other had half of the settings. All very strange.

In theory this wasn't an issue - back into SMTP Settings, repopulate and reconfigure the reporting part.

Or not...

Despite having all the details in the "SMTP Server" setting, and having those details set correctly (validated by the send test option and the receipt of a test e-mail etc), we couldn't set up any of the reports.

Why?

Error 3010
"DPM cannot setup an e-mail subscription for this report"
(and then advises you to go and set up your SMTP Server!)

It turns out that the issue is in fact that SQL Reporting Services doesn't actually have the details you entered - from what I can see, the system still looks in the DPM 2010 instance of SQL (the upgrade doesn't remove it, or the old instance of SQL it made), so it updates that instead. D'oh!

Whilst that's clearly a bug and should be fixed, the good news is that there is a quick fix.

Go into SQL Reporting Services Configuration, log into the DPM2012 instance, choose the "E-Mail Settings" and fill in the Sender Address and SMTP Server. Save that and you're golden.



Wednesday, 8 August 2012

DPM: Error 10048 0x02740

So it's been a while since we posted, and generally that's because DPM is a godsend. It almost works without attention, and apart from a few quirks we've discovered, and the odd thing that happens but isn't ACTUALLY documented anywhere sane, things are good.

Compared to Backup Exec (we stopped using it on BEWS 10d), DPM 2010 seems to "just get it" when it comes to backups. Once a backup is set up, it knows how to back up, it knows when to back up and it actually backs up.

If there's an issue it attempts to fix minor issues via consistency checks etc, and if there's really an issue, it is glaringly obvious for you to fix it. That's pretty good news when you're used to Backup Exec being hell on earth, randomly dropping jobs to "hold" status because of an issue and so on.

However yesterday a random issue cropped up which I hadn't seen before.

The Agent Status of one of our Protected Servers in the DPM console was "unavailable" - and the error logged was "10048 0x02740". The protected server in question is running Exchange Server 2007.

This appears to have happened because the IIS process on the protected server was suddenly using TCP Ports 5718 and 5719. This prevents the DPM Remote Agent from starting.

To fix this, you can simply:

(a) Stop IIS (iisreset /stop)
(b) Run the DPM Agent
(c) Start IIS (iisreset /start)

...or in our case, do nothing - by this morning it had cleared itself (we hadn't restarted IIS as we didn't want to drop the live users connected via OWA and OutlookAnywhere on this box).
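
For what it's worth, 10048 is the Winsock "address already in use" error (WSAEADDRINUSE) and 0x2740 is simply the same number in hex, which fits the port clash described above. If you want to check whether something is still sitting on those ports before you start the agent, a minimal sketch like this will do it. The function name is ours, and it treats any failure to bind as "not free".

```python
import socket

# The two TCP ports named above.
DPM_AGENT_PORTS = (5718, 5719)


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Any bind failure (typically "address already in use") counts as busy.
            return False
    return True


if __name__ == "__main__":
    for port in DPM_AGENT_PORTS:
        print(port, "free" if port_is_free(port) else "in use")
```

Run it on the protected server while the DPM agent is stopped - if the agent is already running happily it will be holding the ports itself, so "in use" is then expected.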

Friday, 6 May 2011

Common DPM Errors...

Since we've now got most of the Data Protection Manager 2010 installations done, I thought I'd share a few common issues we've come across, and the fixes. Maybe this'll save you a LOT of hassle...

"Access Denied (0x80070005)"
Common causes are listed all over the place, suggesting Firewalls and DCOM Permissions as the issues. All entirely possible. One other thing to consider, especially if you've set up Forest Trusts etc: make sure the AD Network holding your DPM Server(s) is fully accessible - and that this traffic isn't restricted either! In our case, we had a Cluster with 2 servers, one in a Subnet (we'll call this Subnet A), another in a different subnet (Subnet B) and our DPM Servers (and the DPM AD Network) in another (Subnet C).

While Subnet A and B could talk without restriction, and A could quite happily talk to C, for historical reasons, B and C weren't completely open for communication. So my tip - make sure you've considered Active Directory Authentication and not just "DPM to Protected Server" issues!

Agents are "unavailable" and "VssError: Invalid value for registry"

This is a bit of an odd one and just "happened" on a previously perfectly happy server. We resolved it by removing and re-adding the account used to push out the agents in the DCOM Config: run "dcomcnfg.exe", find "DPM RA" in the list and remove/re-add the user. No idea what caused that, mind!

Replica is inconsistent with System State and repeatedly so...

Especially if you're on a Windows 2003 SP-2 32-bit system? Yep, thought so. You've probably just not got enough space on the system drive (normally C:\). You should move the normally hidden "DPM_SYSTEM_STATE" folder to another drive, ideally with +10GB free, and then update the data source...

\Microsoft Data Protection Manager\DPM\datasources\PSdataSourceConfig.xml

change:

%SystemDrive%\DPM_SYSTEM_STATE\*

so it points to wherever you put it... easily sorted.
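
If you have a lot of servers to touch, the edit itself is just a text substitution. Here's a minimal sketch that works on the file's text once you've read it in. It assumes the default target appears in the file exactly as shown above and that you want to keep the trailing \* on the new location; the function name is hypothetical, and you should keep a copy of the original file before writing anything back.

```python
# The default target exactly as it appears above.
OLD_TARGET = r"%SystemDrive%\DPM_SYSTEM_STATE\*"


def retarget_system_state(config_text, new_folder):
    """Return config_text with the default system state target pointed at new_folder."""
    if OLD_TARGET not in config_text:
        # Refuse to guess if the file does not contain the default entry.
        raise ValueError("default DPM_SYSTEM_STATE target not found; text left unchanged")
    new_target = new_folder.rstrip("\\") + "\\*"
    return config_text.replace(OLD_TARGET, new_target)
```

For example, passing the folder you moved DPM_SYSTEM_STATE to on another drive gives you back the text with that folder followed by \* in place of the default.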


Hopefully they'll help you for now, more tips later!

Wednesday, 30 September 2009

Error E000FE30 every day on one server...

...for months. For months I've struggled with a problem on ONE server, that happens to be at a remote site on a different subnet, connected via a WAN VPN Link.

Every day, one or more jobs would fail with Backup Exec Errors, mainly E000FE30 - with the useful and generic messages about "communications failure has occurred" and sometimes the "connection lost to the remote agent".

Needless to say, I've spent some time working on this, and tried all sorts. Reconfiguring the system to use a different WAN link to ensure the fault isn't with the WAN. Nothing. Checking to ensure the issue isn't with the server, reinstalling agents, trying all sorts.

I've updated network drivers, checked all sorts of patches etc - but nothing. Still this error - consistently failing jobs.

I even got a colleague to look at it for a fresh pair of eyes and he too tried all sorts. Given the error, we suspected "something" to do with comms, but never found any issue, and in hundreds of tests conducted could never replicate the issue - transferring large files to/fro the server worked fine etc.

Today I found the answer. The "Large TCP Offload" feature on the Network Card. While I've seen plenty of issues with this feature before, you normally see it with terrible throughput on the system in general and so on - but this machine is solid as a rock for everything else.

Still, the setting is now off, and we've had our first complete, full backups in a few weeks... voila!

Top tip for anyone else facing this problem - don't just check the network drivers, but try turning off these features, even if you cannot see this issue at any other time on the machine.

Is this a Backup Exec issue? I'm not sure, but I'm happy to blame it since everything else works just fine.

Friday, 31 October 2008

Why are error messages not unique?

That's what I want to know.

Why is it that you get an error message in Backup Exec, and it spits out an error for you, and so you click on it, which takes you to a web page with a description of that error, right?

Wrong. With Symantec Backup Exec, it just takes you to a list of issues which may or may not be remotely close to what you have issues with, and rarely has any useful answer.

Here I am today again trying to find out why certain jobs keep failing without any sort of sane error reason. Another day fighting Backup Exec.