Published
Friday, September 6, 2019 - 06:59
Written by

This blog post explains how you could insert data from a CSV file into CiviCRM. We use Pentaho Data Integration to read the CSV file and to call a Form Processor in CiviCRM.

This blog post is an example and when you follow the steps described in this post you can run the same import as me.

Pentaho is a tool to extract and transform data.

The form processor is an extension to create end points for forms in CiviCRM. Those end points can then be called through the api.

Prerequisites:

Create the form processor

First we create a form processor. In CiviCRM go to Administer --> Automation --> Form Processor.  And add a new Form Processor

We give our form processor the name Migrate Person and set permission to Access CiviCRM Backend and API and make sure your API user has this permission.

We add four input fields of type short text:

  • First name
  • Last name
  • E-mail
  • Tag

We then add an action Create/Update Individual and link the fields First name, Last name and E-mail.

We then add an action Add tag to contact. We link the contact ID. To the contact ID of the action Create/Update iindividual. We link the field Tag to tag.

Create a Pentaho Transformation

In  Penthao create a new transformation. See image above for the final transformation.

Add an CSV File Input. Right click on the CSV File Input and click Edit. Add your file and after that press Get Fields.

Add a Modified Javascript Value transformation. With this transformation we create the JSON data which will be send to CiviCRM. Add the following script:

var jsonData = {
  first_name: first_name, // key: field_from_pentaho.
  last_name: last_name,
  email: email,
  tag: 'Migration for blogpost'
};

var jsonRequest = JSON.stringify(jsonData);
var postData = 'json='+encodeURIComponent(jsonRequest);

jsonData is a JSON object with a the name of the property (e.g. first_name) and a value (data in the first_name column in the csv file). In our example the name in the form processor fields and the column names in the csv file are the same. But you can off course have a different field name in your form processor and a different name in your source file.

Add the postData field to the Fields Section.

Add a REST client transformation. This step sends the actual data to CiviCRM.  The url is http://your-civicrm/sites/all/modules/extern/rest.php?key=sitekey&api_ke... (The action is the name of the form processor).

Set the method to POST

Set the bodyField to postData (this is the data we created in the pervious step)

And set application Type to FORM URLENCODED.

Now we are ready to go. Go to menu action and press Run. This will run and improt the file.

In real life

In reallity your source files are not as neat formatted as the example of above. Also you usually have multiple source files and you also want to do some clean up of the data. So your transformation becomes a little more complicated than this example one.

See for example this one I have used recently to migrate tens of thousands of contacts from one system to CiviCRM:

Improving performance

When you have a really large data set you can also process records in parallel. This will speed up the run time of your migration. This is only something advanced users should do.

To do so right click in Pentaho on the REST Client and click on Change number of copies to start... And set the number to 5 or so.