Currently we have a reporting database that is created from a full backup of our production database and restored to a separte SQL instance. Now the business is wanting to archive data out of production but maintain the archived data in the reporting database. So I am looking for recommendations on how to perform this task.
Does anyone maintain a separate reporting database? Does CA have any tools/applications to do this?
This is an interesting sceanrio. Most folks do not have a need to report on archived historical data, but rather would just want to use it for research purposes. Reporting is typically used for current data metrics, so your scenario is different than most customers. I think you will have to take a multi-step approach here to accomplish this (to some extent), but even with that, I dont think it will be exactly what you want.
The "archive" piece of the archive-purge functionality to create archive files, which can be loaded into another system that can hold historical data, however you would need an SDM instance attached to this separate database because the archive files need to be loaded using pdm_load. So that is the first challenge.
The second challenge here is that you are replicating the current database by doing a SQL backup and SQL restore, which encapsulates the entire current prod database and not just the data that has changed since the last backup, which means that it would overwrite any archived data that you loaded into the separate reporting database, so that is another challenge.
There are two (maybe 3) possible scenarios that I can think of for this - neither of which are perfect, but here goes.
- this one uses only one reporting database, but will require a bit more work every time its refreshed
- create an SDM instance and configure it against the reporting databases
- when you run archive-purge, have it create archive files and store those somewhere
- when you do a SQL backup/restore to refresh the reporting database, you will have to then fix the orphaned SQL user and then reconfigure this reporting instance of SDM against it
- then you can use pdm load to load in ALL of the archive files that you have
- the challenge is that you would have to reload ALL archive files every time you do this, so this approach may not be super practical
- split things up into two databases, one for current data, and another just for archived data
- Use the current reporting database that you have to continue to do the sql backup/restore procedures as you are doing now with the current dataset
- Set up a second SDM instance with its own database which will be used ONLY for archive data - you would basically first replicate your production system to this "archive reporting" SDM environment
- then going forward you would use pdm_load to load ONLY the archive data into this second environment
- the challenges are that you would have to maintain the second SDM instance such that if you added any custom tables or columns to your production SDM instance, you would need to do the same on this archive reporting instance so that when you run the archive-purge job and then load that archive file in, the archive reporting instance has the same custom schema and can load the data properly
Third Scenario (not 100% sure on this one - would need to research this more):
The only other thing that comes to mind is if you can find out if there is a way to do a SQL "delta" backup of some sort which would only grab the data that was updated SINCE the last backup, in which you can then restore that on top of the original database that you restored to the reporting instance, thus keeping the existing records there. I am not sure if this is possible, you would have to talk to some SQL DBA folks about that to see if it can be done, and then you would have to thoroughly test it out before implementing it in production.
Those are the only scenarios I can think of for this.
Hope this gives you some ideas at least!