Tuesday, 22 January 2013

Terabyte Migrations, Chapter 2

The tenders were called, the bid went in. My plan was to use Enterprise Replication to shift the bulk of their data, leaving the old Solaris and new Linux boxes tracking for a few days until we could do the cut-over with a nominal outage during a quiet period for the apps. The client must have thought this sounded great, because we got the job.

Just because I had this page in draft as a guide for the continuing story, I'll put it up. Needless to say, a lot of reasons, explanations and solutions need to be written to flesh it out. Some of the items would have taken their own pages.

Where shall I go with this chapter? I'll start with a shopping list of points...

Have a script to quickly show the PK on tables.

Put explicit primary key definitions on almost all tables. "Almost" is because you might find that things like audit tables, log tables, etc have no indexes whatsoever. The trade-off is having to migrate the data in these tables during the actual cut-over session, and that will take some time. Also, the table might be the target of a foreign key, which means that the dependent table can't be replicated if this table is not replicated.

Setup the replication to be one-way (Primary/Target) on the production systems.

Have a script available to diff the count of rows on both sides. Add other scripts for further integrity checking if you so desire.

Older IDS versions such as 10's or 9's cannot participate in the cdr validation or cdr repair commands.

Disable triggers on a the ER target - the new machine, that is.

Prepare a topologically sorted list of the tables, sorted for PK/FK dependency so you can migrate tables from parents down to child tables. Beware circular references in the PK/FK's

The alternative is to disable all FK on the new database until the migrations are all finished. NOTE: you cannot disable the PK because ER needs them.

Since ER requires primary keys, you can't disable or delete the PK from the target machine in the hope of a faster load.

Have a script to calculate decent extent sizes for the new machine. The script can analyse the output of oncheck -pt from the old machine. NOTE: going from 32 to 64 bit machines will probably make the data grow. You should also aim to make the extents have a bit of growth factor anyway, so take that into account.

Calculate your first and next extent sizes thoughtfully. If you're lucky, all the tables have explicitly named indexes so that you can set the extent sizes directly. Any "secret" index backing a unique or other constraint will have to rely on the NEXT SIZE setting of the table to allocate it's extent; if these hidden indexes take significantly different amounts of space, you'll not be able to pick a perfect growth factor.

Unloading tables with a massive amount of blobs are worth unloading and loading with a few parallel processes. Unfortunately, the HPL kinda sucks for blobs...


Beware raw tables with HDR. They cannot be promoted to STANDARD under HDR. One part of the middle-sized system would build a bunch of data into a raw table (allegedly for speed) and then convert it to a standard table and shove some indexes on it. With HDR running, this process was failing. Sadly, the process was building a very large list of permissions a person had concerning every document in the system, and it had to be prepared before they could complete login!

Beware the old trick of building an unlogged database then converting it to logged. This cannot be used with ER; nor will HDR allow this.

No comments:

Post a comment