2006-05-16

Firebird Backups - Some notes

Since I back up my Firebird database on a daily basis, and I burn a CD with a week's worth of data, size is something that is beginning to matter to me: in a few months the weekly backup will no longer fit on a CD. If I can find a way to shrink it by 3 to 4 MB per day, that will save me 20 to 35 MB per week, and could make a difference before I have to switch to DVD.

The working pattern here is simple:

- The office works from 9 am to 6 pm, Monday to Friday
- Sometimes a few users work on Saturdays and Sundays
- I have some automatic jobs that run at night and change the data every day.

This is my backup process:

1) I run gbak every hour, between 8 am and 8 pm
Some users like the company so much that they start working before 9 am, and others stay here after 6 pm.
The gbak run takes 3 to 3.5 minutes. No garbage collection - I'll do that at night.

2) Gzip the backup file
The gzip run takes 1.5 to 2 minutes.

3) Copy the gzip'ed file to a backup server
The copy takes about 30 seconds.
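
Put together, steps 1 to 3 amount to something like this - a minimal sketch, where the paths, the SYSDBA password and the backup host are assumptions, not my real setup:

#!/bin/sh
# Hourly backup sketch - file names, credentials and host are hypothetical.
STAMP=`date +%Y%m%d-%H%M`

# 1) backup, skipping garbage collection (-g); GC is done at night
gbak -b -g -user SYSDBA -password masterkey /data/erp.fdb /backup/erp-$STAMP.fbk

# 2) compress the backup file
gzip -9 /backup/erp-$STAMP.fbk

# 3) copy the gzip'ed file to the backup server
scp /backup/erp-$STAMP.fbk.gz backupsrv:/backup/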

Some cleaning is needed here: I keep hourly backups just to be safe, but I don't really need to keep them. So, every day at 8 pm, I delete all of yesterday's files and keep only the last backup - the one from 8 pm. I do the same thing on the backup server.
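
The cleanup can be a one-liner in the same spirit, assuming the stamped file names from the sketch above (where the 8 pm backup ends in "-2000"):

find /backup -name "erp-*.fbk.gz" ! -name "*-2000.fbk.gz" -mtime +0 -print0 | xargs -0 rm -f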

4) Every Sunday, at 11 pm, I burn a CD with the whole week's data (the 8 pm backups).
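
The burning can be scripted too with the classic cdrtools pair - a sketch, where the staging directory and the device are assumptions (run cdrecord -scanbus to find the right device syntax for your machine):

mkisofs -r -o /tmp/week.iso /backup/week/
cdrecord -v dev=/dev/cdrom /tmp/week.iso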

5) Since I keep the daily files, I burn a DVD every month as well.
After that, I delete all daily files from both servers: two backups are enough.

There is a bit more to this process: for example, every day at 11 pm I do a restore, just to make sure that I'm getting a workable backup; and I do some garbage collection every day at 2 am.
(I also have another hourly process running on the live database, gathering some statistics, but I will talk about that in another post.)
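
For reference, the nightly restore test and the garbage collection look roughly like this (database and file names are assumptions):

# restore the last backup into a scratch database, to prove it is usable
gunzip -c /backup/erp-latest.fbk.gz > /tmp/erp-test.fbk
gbak -c -user SYSDBA -password masterkey /tmp/erp-test.fbk /tmp/erp-test.fdb

# garbage collection on the live database
gfix -sweep -user SYSDBA -password masterkey /data/erp.fdb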

Back to my problem: is there a way to shrink these backup files?

The obvious move would be to use bzip2 instead of gzip, but bzip2 is really slow and eats a lot of CPU power, which could affect the total running time of the operation (I'm already spending 5 minutes on this every hour).
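
To see the trade-off on your own data, timing both compressors against an existing backup file makes it concrete (the file name is an assumption):

time gzip -9 -c erp.fbk > /dev/null
time bzip2 -9 -c erp.fbk > /dev/null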

I also had some doubts about how an uncompressed (expanded) backup would behave under gzip compared with gbak's default compressed one. Sometimes the shape of the data being compressed can make a big difference, so...

Let's test it. :-)

I backed up my database with different parameters:

-T [ transportable ]
-NT [ non-transportable ]
-E [ expand, no compression ]

I used gzip and bzip2 with the maximum compression setting.
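
The whole test matrix fits in a handful of commands (paths are hypothetical, and the -user/-password switches are omitted for brevity):

gbak -b -t /data/erp.fdb /backup/t.fbk
gbak -b -t -e /data/erp.fdb /backup/t-e.fbk
gbak -b -nt /data/erp.fdb /backup/nt.fbk
gbak -b -nt -e /data/erp.fdb /backup/nt-e.fbk

# compress each result both ways, keeping the original for comparison
gzip -9 -c /backup/nt.fbk > /backup/nt.fbk.gz
bzip2 -9 -c /backup/nt.fbk > /backup/nt.fbk.bz2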

Keep in mind that my database doesn't store binary blobs (well, it does, but only small items, under 200 KB). There is a "small" difference between a 650 MB database with lots of tables and records, and a 650 MB database that stores a 600 MB zip file inside a blob. :-)

Here are the test results:

Original database size: 648.024.064 bytes

Parameters   Backup size      GZip'ed          BZip2'ed
NT           447.889.920      98.492.992       68.837.394
NT,E         922.023.936      105.368.832      66.865.451
T            468.917.760      99.344.400       67.464.070
T,E          1.099.931.136    108.921.026      65.739.309

(all sizes in bytes)

It was interesting to see that an expanded backup compresses better with bzip2 than a compressed one does. It would have been really nice if the same had held for gzip with these different chunks of data, which would have saved me some time changing and testing the scripts.

It's time to dig further into the gzip manual and its parameters. I would really like to change just a few parameters in my script, which would make testing much simpler.

It seems that I will be using a "non-transportable" backup (gbak defaults to transportable), compressing the hourly files with gzip and compressing the last file of the day - the one that gets archived - with bzip2. BZip2 is really slow compared to gzip, so I will use it only when I really need it.
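
The archive step then only needs to recompress the existing 8 pm file, rather than running gbak again (file names assumed as in the earlier sketch):

DAY=`date +%Y%m%d`
gunzip -c /backup/erp-$DAY-2000.fbk.gz | bzip2 -9 > /archive/erp-$DAY.fbk.bz2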

Notice that "transportable" isn't important to me. A transportable backup should be used when you use plataforms with a different byte-order: this is not the case, since I'm using Intel cpu's all over (in linux servers and in windows workstation).

Some other remarks:

1. This database went live on January 1, 2002
Average users connected simultaneously: 35
Average transactions per working day (8 hours): 7500
Current users: 70

2. This is an ERP system developed in Delphi.
I use IBO to connect to Firebird.

3. I never had a problem, so I never had to use a backup. :-)

4. I did a backup/restore operation just once - on January 1, 2006
No special reason. :-)
I was here at the time, no one was working, and I said to myself:
"Well, I think that 4 years is enough"

5. I'm not going to mention the version of Firebird that I'm using.
You can guess it - I have never upgraded it at this site.

Firebird is here: http://www.firebirdsql.org
If you are using it, remember to join the Firebird Foundation!

2006-05-02

Batch file to change the default printer:

Daniel Schneller's Weblog
Recursive rm and files with names that include spaces...

I spend my life forgetting this! Here's an example that deletes all the AVI files:

find . -name "*.AVI" -type f -print0 | xargs -0 rm -f
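
(The -print0/-0 pair passes the file names separated by NUL characters, so names with spaces reach rm intact as single arguments.)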