Civi-migrate - proof of Concept

June 5, 2010 - 01:31 — Eileen

So, amongst all the discussion of import methods lately, I just wanted to flag another possible approach: creating a CiviCRM hook module for the Drupal migrate module.

There are a bunch of great blogs out there on how to use the table wizard module with the migrate module to import data from various MySQL tables or views into Drupal nodes / users / taxonomies / content types - for example:

http://www.lullabot.com/articles/drupal-data-imports-migrate-and-table-wizard

The migrate module has a bunch of hooks to allow you to use it for other forms of migrations. I rattled up the module / code pasted at the end of this blog in a couple of hours as a proof of concept for using this approach for CiviCRM imports. The code I threw together just offers up the civicrm_contact table fields, but I think there must be some clever ways to use existing import tools rather than this rudimentary approach.

The really nice thing about this approach is that it constructs an array (in this example, $params) based on the mappings configured in the front end, and additional hooks then get an opportunity to re-factor $params (e.g. re-parsing address fields) before it is passed to the contact create API. I think there is lots of potential here, especially since the migrate module:

- already interacts with drush

- allows you to specify how many contacts to import at a time

- non-developers can make changes without needing the code to change (this can be a problem with a scripted solution if you have both developers & non-developers involved)

- allows migrations to be reversed or updated as you tweak it

- provides error reporting

- potentially allows you to use related tables as your source data - ie. it should be possible from what I understand to create an import that imports contacts with associated contributions from multiple tables - http://drupal.org/node/591776#comment-2107050 - I haven't worked through this yet
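As a minimal sketch of the kind of hook described above, here is how another module might re-parse an address before the contact is created. Everything here is illustrative: the module name `mymodule`, the helper function, and the `street_address` / `city` keys are hypothetical, and the hook name assumes that the migrate module builds it by prefixing `migrate_` to the 'prepare_contact' string passed to migrate_destination_invoke_all().

```php
<?php

/**
 * Hypothetical helper: split "12 Main St, Wellington" into street and city.
 * Rows with no comma, or with a city already set, are returned untouched.
 */
function mymodule_split_address(array $params) {
  if (empty($params['street_address']) || !empty($params['city'])) {
    return $params;
  }
  $pos = strrpos($params['street_address'], ',');
  if ($pos === FALSE) {
    return $params;
  }
  $city = trim(substr($params['street_address'], $pos + 1));
  $params['street_address'] = trim(substr($params['street_address'], 0, $pos));
  if ($city !== '') {
    $params['city'] = $city;
  }
  return $params;
}

/**
 * Assumed hook name (see the note above). $params is taken by reference so
 * the import function sees the massaged values.
 */
function mymodule_migrate_prepare_contact(&$params, $tblinfo, $row) {
  $params = mymodule_split_address($params);
  // An empty array means "no errors", so the import carries on.
  return array();
}
```

Keeping the parsing in a plain helper means it can be tried out on sample rows without running a full migration.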

If we wrote a really good migrate hook module then all we would need to do to customise our imports is write hooks to massage aspects of the data. Obviously this is a Drupal-centric approach, but I think most of the people looking at big scripted migrations are doing it in Drupal.

To get this code working you will need the modules:

- table wizard (tw)
- schema
- migrate
- views
- views ui (recommended)

You will also need a source MySQL table with a primary key. You need to add this table in table wizard and analyse it before you can go to the migrate module and create a content set. The blog by lullabot or the documentation on migrate should help here.

NB - I struggled to find a useful place to post this as a zip, as I didn't seem to be able to add a file to the wiki page.

Code for module civicrm_migrate

migrate.info

****************************
; $Id$
name = Fuzion Migrate CiviCRM
description = Add on for Migrate module to migrate into CiviCRM
core = 6.x
package = AA Fuzion
version = 0.0
dependencies[] = "migrate"
dependencies[] = "civicrm"

************************************
migrate.module
**************************************

<?php

/**
 * Tell the migrate module which destination types this module offers.
 */
function civicrm_migrate_migrate_types() {
  $types = array('contact' => t('CiviCRM Contact'), 'contribution' => t('CiviCRM Contribution'));
  return $types;
}

/**
 * Import a single source row as a CiviCRM contact.
 */
function civicrm_migrate_migrate_import_contact($tblinfo, $row) {
  civicrm_initialize();
  require_once 'api/v2/Contact.php';

  // section copied from example
  $params = array();
  // Initially populate the new object according to the mappings
  // this is a standard bit of code from the example
  foreach ($tblinfo->fields as $destfield => $values) {
    // Braces keep the dynamic property lookup unambiguous across PHP versions.
    if ($values['srcfield'] && $row->{$values['srcfield']}) {
      $params[$destfield] = $row->{$values['srcfield']};
    }
    else {
      $params[$destfield] = $values['default_value'];
    }
  }

  // Give other modules a shot at manipulating the object
  $errors = migrate_destination_invoke_all('prepare_contact', $params, $tblinfo, $row);

  // Anything above an informational message stops this row from importing.
  $success = TRUE;
  foreach ($errors as $error) {
    if ($error['level'] != MIGRATE_MESSAGE_INFORMATIONAL) {
      $success = FALSE;
      break;
    }
  }

  if ($success) {
    $result = giantrobot_civicrm_contact_add($params);

    // Call completion hooks, for any processing which needs to be done after the contact is saved
    $errors = migrate_destination_invoke_all('complete_contact', $params, $tblinfo, $row);

    // Record the source key => contact ID mapping only when an ID came back.
    // Assumes the v2 API result array carries the new ID in 'contact_id'.
    if (is_array($result) && !empty($result['contact_id'])) {
      $sourcekey = $tblinfo->sourcekey;
      migrate_add_mapping($tblinfo->mcsid, $row->$sourcekey, $result['contact_id']);
    }
  }
  return $errors;
}

/**
 * Add a contact through the v2 API with duplicate checking switched on.
 * Returns the API result array on success, or NULL on any error.
 */
function giantrobot_civicrm_contact_add($params) {
  $params['dupe_check'] = TRUE;

  $contact = civicrm_contact_add($params);

  if (!civicrm_error($contact)) {
    // for clarity
    return $contact;
  }
  else {
    // NOTE: this early return switches off the duplicate handling below, so
    // any API error (including a duplicate match) skips the row. Remove it
    // to update the lowest-ID matching contact instead.
    return NULL;
    // let's see if we have multiple matches
    if (stristr($contact['error_message'], 'Found matching contacts')) {
      // if so, we'll get the lowest contact ID and update them
      $contact_ids = explode(',', $contact['error_data']);
      sort($contact_ids);
      $contact_id = array_shift($contact_ids);
      if ((int) $contact_id > 0) {
        $params['contact_id'] = $contact_id;
        $params['dupe_check'] = FALSE;
        $contact = civicrm_contact_add($params);
        // for clarity
        return $contact;
      }
      else {
        // some unlikely civicrm_error which gave us a non-numeric
        // contact_id
      }
    }
    else {
      // not multiple duplicates - some other civicrm_error
    }
  }
  // we didn't handle update to first dupe; this is either a
  // successful add of a non-dupe, or a civicrm_error

  return $contact;
}





/**
 * Offer every column of civicrm_contact as a mappable destination field.
 */
function civicrm_migrate_migrate_fields_contact($type) {
  $fields = array();

  $sql = "SHOW COLUMNS FROM civicrm_contact";
  $contactFields = db_query($sql);

  while ($field = db_fetch_array($contactFields)) {
    $fields[$field['Field']] = $field['Field'];
  }

  return $fields;
}


Comments

Scalability

Thanks for this useful post, Eileen.

Until the migrate functionality pushes the import implementation down from PHP processing of each row into SQL that operates on all the records to be imported, it won't be able to handle large data volumes with any kind of reasonable performance. I can imagine using either the civicrm_mapping table or something in Drupal's schema to dynamically create the appropriate query/queries.
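To make the idea of dynamically creating a set-based query concrete, here is a rough sketch that turns a field mapping into a single INSERT ... SELECT statement. The function name, the `legacy_people` source table and its columns are all hypothetical, and a statement like this writes straight to the table, so none of the CiviCRM API logic or hooks would run for the imported rows.

```php
<?php

/**
 * Hypothetical helper: build one set-based import statement from a
 * destination-column => source-column mapping.
 */
function example_build_bulk_insert_sql($sourceTable, array $mapping, $destTable = 'civicrm_contact') {
  if (empty($mapping)) {
    throw new InvalidArgumentException('Mapping must not be empty');
  }
  // Table and column names cannot be bound as query parameters, so
  // whitelist their shape before interpolating them into the SQL.
  $identifiers = array_merge(array($sourceTable, $destTable), array_keys($mapping), array_values($mapping));
  foreach ($identifiers as $name) {
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', (string) $name)) {
      throw new InvalidArgumentException("Unsafe identifier: $name");
    }
  }
  return sprintf(
    'INSERT INTO %s (%s) SELECT %s FROM %s',
    $destTable,
    implode(', ', array_keys($mapping)),
    implode(', ', array_values($mapping)),
    $sourceTable
  );
}

// Example: two mapped columns from a hypothetical source table.
echo example_build_bulk_insert_sql('legacy_people', array(
  'first_name' => 'fname',
  'last_name' => 'lname',
)), "\n";
```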

I really like Dalin's reiteration of the DRY principle. As we move to 4.0 I think that it should be kept front and centre as we consider frameworks and architectures. It's a useful additional way of looking at making the code more comprehensible and modular.

On another note, I believe the wiki does allow attaching documents, but the tiny paperclip on the top left of the page that is used to access them is non-intuitive and should be reworked. Almost everyone looks around at the bottom of the page for attachments, since that is the common standard, and the paperclip is so small it is hard to find even when you are looking for it.

What do you mean by reasonable performance ?

I mostly develop custom code using the API and run it from the shell. It processes enough contacts per second that I'm pretty sure it's over by the end of the coffee break. Even if you had 100,000 contacts, it would be ready by the next morning if you launched it late at night, as long as you get through 3 contacts per second (and to be that slow, you need a lot of tests and data massaging, and are probably doing a lot of redundant lookups for every contact).

Definitely not fast, but good enough for imports that you run mostly one time, isn't it ?
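As a quick check of those figures, a small helper (hypothetical name) works out the run time for a given row-by-row throughput: 100,000 contacts at 3 per second is about 33,333 seconds, a little over nine hours.

```php
<?php

/**
 * Hypothetical helper: hours needed to import $contacts rows at a
 * steady rate of $contactsPerSecond.
 */
function example_import_hours($contacts, $contactsPerSecond) {
  if ($contactsPerSecond <= 0) {
    throw new InvalidArgumentException('Rate must be positive');
  }
  return $contacts / $contactsPerSecond / 3600;
}

// 100,000 contacts at 3 per second.
printf("%.1f hours\n", example_import_hours(100000, 3)); // prints "9.3 hours"
```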

Yes, I don't think

Yes, I don't think scalability is a problem for us. I ran through a very simple contact import for 30,000 contacts fairly easily. I did find that when running the import through a browser it stopped every 6000 or so. This is a configurable setting, but it means that you don't run into the problems you do with the Civi GUI import, where you lose sight of what the outcome is when you over-do it.

The main thing I'm working on is importing pledges & associated contributions, so most of my effort has gone into a (partial) api for pledges. I've got it working but I need to play with the relational side of it - i.e. I'm using table wizard relationships / migrate to feed the contact id created in the contact import into the pledge import, and the pledge ID into the contribution import (made harder by the need to use a combined key for the relationship between pledges & contributions).

I do note that the point someone made about the Civi import logging is a bit of a red herring, as the API does logging too.
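The ID chaining described here can be sketched with plain arrays standing in for the source-key to CiviCRM-ID maps built up by the earlier imports. This is only an illustration of the combined-key lookup: the function names and the `donor_ref`, `pledge_no` and `amount` source columns are hypothetical, and the sketch does not use table wizard relationships.

```php
<?php

/**
 * Hypothetical: build the combined key that links a pledge to its
 * contributions in the source data.
 */
function example_pledge_key($donorRef, $pledgeNo) {
  return $donorRef . '|' . $pledgeNo;
}

/**
 * Hypothetical: resolve a source contribution row to the IDs created by
 * the earlier contact and pledge imports. Returns NULL when either
 * parent record has not been imported yet.
 */
function example_resolve_contribution(array $row, array $contactMap, array $pledgeMap) {
  if (!isset($contactMap[$row['donor_ref']])) {
    return NULL;
  }
  $key = example_pledge_key($row['donor_ref'], $row['pledge_no']);
  if (!isset($pledgeMap[$key])) {
    return NULL;
  }
  return array(
    'contact_id' => $contactMap[$row['donor_ref']],
    'pledge_id' => $pledgeMap[$key],
    'total_amount' => $row['amount'],
  );
}

// Maps as they might look after the contact and pledge imports have run.
$contactMap = array('D001' => 101);
$pledgeMap = array(example_pledge_key('D001', 1) => 55);
$row = array('donor_ref' => 'D001', 'pledge_no' => 1, 'amount' => 25.00);
$resolved = example_resolve_contribution($row, $contactMap, $pledgeMap);
```

Returning NULL for an unresolved row gives the caller a simple way to skip it or report an error rather than create an orphaned contribution.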

Hi, I'll keep playing with

Hi,

I'll keep playing with this for a bit longer but I guess I'm looking at this for the same reason that Lobo wrote his own line by line import parsing script a couple of blogs earlier - ie to manipulate the data before importing it. Because otherwise we wind up exporting, grooming in Excel & re-importing which is OK if you only have to do it once. Or else we have to write a script to do it - in which case Migrate seems to reduce the amount of script required.

NB - I should note I'm not

NB - I should note I'm not suggesting this approach is a substitute for the CiviCRM import tools, but rather a tool in the toolkit for complicated migrations - i.e. it does part of the job that you might otherwise write a script to do & gives a good interface (a hook) for intercepting & modifying the import.

NB - I should note I'm not

"NB - I should note I'm not suggesting this approach is substitute for the CiviCRM import tools"

I would. Well, almost. Great integration with Migrate/Table Wizard would really kick ass and be far better than any of the existing import options. However, there are two challenges that I see:

You'd have to break the DRY principle.
http://en.wikipedia.org/wiki/Don%27t_repeat_yourself
You'd need to duplicate the logic of the internal CiviCRM API within your table wizard hooks to account for things like adding entries to the log tables, firing CiviCRM hooks, and performing any special logic that normally happens when you create a contact/contribution/etc.

Last time I used Table Wizard/Migrate modules (which admittedly was a while back) there was a big limitation where all relationships had to be one-to-one (ex. if you have a row in your import that has multiple taxonomy terms you can only import one of them. You need to come up with another solution (probably some custom hooks) for the rest). Not sure if this limitation has been overcome since.

I'm kind of hoping that some

I'm kind of hoping that some of the internal Civi functions could be called from the migrate hook rather than re-writing it all from scratch. Maybe the API would benefit from having a 'log' option when you action things using it?

Re the relationships - I got the impression that you could do many-to-one from what I read but I haven't got it working myself yet. The link I posted in the body seemed to say you can but it is a bit confusing and I was planning to sit down & work through it soon.

I think I'm going to have to import pledges & I suspect the migrate hook will be the quickest way to get that up & running