Utilising Custom searches for data cleanup and contact integrity checking

Published
2015-08-01 21:04
Written by
seamuslee - member of the CiviCRM community and Core Team member

The Australian Greens developed two new custom searches to help us clean up our data and monitor for spam records coming into our system, mainly through the Drupal user registration form.

The first search (addresses needing fixing) offers a number of options for finding different problems with addresses. It lets us find addresses where a state is set but does not match the set country, or where the postcode is not valid for Australia. All Australian postcodes are numeric and are 3 or 4 digits in length.
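As a rough illustration of these two checks, here is a minimal Python sketch. It is not the extension's code: the function name `address_issues`, the dict keys, and the small `STATES_BY_COUNTRY` lookup are all hypothetical, and a real system would read states and countries from its own tables.

```python
import re

# Hypothetical, illustrative lookup of which state codes belong to which
# country; a real system would use its own state/province table.
STATES_BY_COUNTRY = {
    "AU": {"ACT", "NSW", "NT", "QLD", "SA", "TAS", "VIC", "WA"},
    "NZ": {"AUK", "WGN", "CAN"},
}

# Rule from the text above: Australian postcodes are numeric, 3 or 4 digits.
AU_POSTCODE = re.compile(r"[0-9]{3,4}")


def address_issues(address):
    """Return a list of problems found in one address dict."""
    issues = []
    country = address.get("country")
    state = address.get("state")
    postcode = (address.get("postcode") or "").strip()

    # Only judge the state when we know the country's valid states.
    if state and country in STATES_BY_COUNTRY:
        if state not in STATES_BY_COUNTRY[country]:
            issues.append("state does not match country")

    # Postcode rule applies to Australian addresses only.
    if country == "AU" and postcode and not AU_POSTCODE.fullmatch(postcode):
        issues.append("invalid Australian postcode")

    return issues
```

For example, `address_issues({"country": "AU", "state": "VIC", "postcode": "3OOO"})` would report an invalid postcode, because the letter O has been typed in place of zeros.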

This search also has an option to find addresses where we have entered "NCA" in the street address field (this stands for No Current Address, as there was no easy way to put addresses on hold). We do this so we can review these records and see whether a more up-to-date address has since been added to the system.

We also have an option to check that the city field (which we call suburb) is filled out if state and country are set. This reduces the number of errors when we send a postal mailing with Australia Post.

There is also a search to find addresses that don't have a number in the street address.
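The review options described above (the "NCA" placeholder, a missing suburb, and a street address without a number) could be sketched along these lines. This is an illustration only: the function name `review_flags` and the dict keys are hypothetical, and treating "NCA" as an exact, case-insensitive match on the whole field is an assumption.

```python
def review_flags(address):
    """Return the review flags raised by one address dict."""
    street = (address.get("street_address") or "").strip()
    city = (address.get("city") or "").strip()
    flags = []

    # Assumption: the placeholder is the whole field value, any letter case.
    if street.upper() == "NCA":
        flags.append("no current address")
    elif street and not any(ch.isdigit() for ch in street):
        flags.append("no number in street address")

    # Suburb (city) should be present whenever state and country are set.
    if address.get("state") and address.get("country") and not city:
        flags.append("missing suburb")

    return flags
```

Keeping each check as a separate flag mirrors the way the search exposes them as separate options, so a record can be reviewed for one problem at a time.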

The second search (spam) also has a number of options, all aimed at finding potential spam records and removing them from the database.

The most common identifiers are a first name that matches the last name, a name that contains numbers, a mixed-case last name, and a non-numeric postcode on an Australian address (as mentioned above, Australia has no non-numeric postcodes).

Another identifier is a phone number or postcode that is longer or shorter than a specified length. The defaults are set for Australian conditions (a minimum and maximum of 4 for postcodes, and a minimum of 10 and a maximum of 14 for phone numbers).
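To make these indicators concrete, here is a minimal Python sketch. It is not the extension's implementation: the function name `spam_flags` and the dict keys are hypothetical. Two interpretations are assumptions: "mixed case" is read as an uppercase letter directly after a lowercase one, and phone length is counted on the value as stored, spaces included.

```python
import re


def spam_flags(contact, postcode_len=(4, 4), phone_len=(10, 14)):
    """Return the spam indicators that one contact dict triggers."""
    first = (contact.get("first_name") or "").strip()
    last = (contact.get("last_name") or "").strip()
    postcode = (contact.get("postcode") or "").strip()
    phone = (contact.get("phone") or "").strip()
    flags = []

    if first and first.lower() == last.lower():
        flags.append("first name matches last name")
    if any(ch.isdigit() for ch in first + last):
        flags.append("name contains numbers")
    # Assumption: "mixed case" means an uppercase letter right after a
    # lowercase one. This also matches genuine names such as "McNaughton".
    if re.search(r"[a-z][A-Z]", last):
        flags.append("mixed-case last name")
    if (contact.get("country") == "AU" and postcode
            and not re.fullmatch(r"[0-9]+", postcode)):
        flags.append("non-numeric Australian postcode")
    # Length checks use (min, max) pairs; defaults follow the text above.
    if postcode and not (postcode_len[0] <= len(postcode) <= postcode_len[1]):
        flags.append("postcode length out of range")
    # Assumption: length is counted on the stored string, spaces included.
    if phone and not (phone_len[0] <= len(phone) <= phone_len[1]):
        flags.append("phone length out of range")

    return flags
```

Because a check like the mixed-case one can match legitimate names, results of this kind are better treated as a list for human review than as grounds for automatic deletion.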

There is also a search option to find punctuation in a name where it shouldn't be. These two searches have helped us keep our data as clean as possible and eliminate as many spam records as possible, though constant vigilance is still needed.

This work was commissioned by the Australian Greens and was built by Andrew McNaughton. You can find the extension in the extension list https://civicrm.org/extensions/aug-searches.