We recently had one of our groups report that merging data was resulting in data loss. Specifically, when they merged two records, they noticed that the contribution records on the record that was deleted were not carried over to the record that remained.
I investigated and found the culprit: we were missing a foreign key constraint between the contribution table and the contact table. In fact, we were missing a lot of foreign key constraints in this database.
The reason the mising foreign keys caused data loss is because CiviCRM quite cleverly uses the existance of foreign keys to determine what data should be merged. With this method, the merge code doesn't have to be re-written every time a schema change is introduced. This approach strikes me as smart even though it had bad side effects in our case.
So - how did our schema get out of sync? This problem only affects our four oldest databases and I suspect these databases were ones I had to convert from myisam to innodb and I suspect I did not manage that properly. Or I may have powered through an upgrade without properly reviewing errors.
As I started researching how to fix this problem, I was quite pleased to find the mysqldbcompare utility which promises to identify differences between two databases and presents SQL statements to alter the second database so it will match the first. Perfect tool for the job. I created a fresh CiviCRM installation to use as my reference and got to work.
Then I wasted nearly two full days. Problems with this tool included:
I actually worked through most of these problems with ugly hacks before I discovered that if you muck with your table definitions too much you can cause the innodb engine to crash by simply executing an ALTER TABE statement. At this point, two full days in, I decided to change course dramatically.
There are a few proprietary tools that promise this functionality, but I wanted a free solution. I ended up writing a rather dramatic sledge hammer to do the job. It's a little nerve racking to run, but it does the trick.
Once we got our schemas in line, we werent' done. We already had a database full of orphaned records, however, now the orphaned records were officially recognized by the database. This condition led to SQL errors left and right (for example, the smart group that pulls in all records with a valid email address in civicrm_email blew up because it tried to populate civicrm_group_contact_cache with a contact_id that is not present in civicrm_contact). Immediately after fixing our schema, we had to run code that identified and deleted (or set to NULL) orphaned records.
That script is also included in our git repo.
The moral of the story is: if you vaguely remember converting your database from myisam to innodb many years ago, you may want to check to make sure you have your foreign keys in place. It's not longer just a good idea: missing keys will result in database loss.