Writing a custom import script ..

2010-05-28 10:01
Written by
Anonymous (not verified)

The next phase of the CiviSchool project is collecting and maintaining all parent / student information online. This avoids the annual filling out forms work by the parents, and also the stuffing of envelopes by school staff during the summer and reentering all the information in the SIS when school starts.

To make things easier, we wanted to export and import as much data as possible from PowerSchool. It has a pretty good export and we got all the information in a nice familiar CSV format. However the system is not very "relational". Parents and addresses have multiple records (one for each student), hence there is some amount of inconsistency in the system with regard to names and addresses. Also we have the CiviCRM/Drupal based system in operation for a year and this has also resulted in slight divergence between the two systems.

Due to various family structures, the school maintained the family information as a max of 2 households (1 address per househould), max of 4 individuals (2 per household), 2 email addresses and 6 phone numbers (1 set per household: home, mobile, work). We had most of the parent name and email in CiviSchool.

the basic mapping of the csv file was:

0 - "Student_Number" - external_identifier
1 - "First_Name" - first_name
2 - "Middle_Name" - middle_name
3 - "Last_Name" - last_name
4 - "Dob" - birth_date
5 - "Ethnicity" - race
6 - "Lgbt" - family_structure
7 - "Adopted" - family_structure
8 - "Singleparent" - family_structure
9 - "Father" - parent_1
10 - "Father2" - parent_3 (if parent_3 is there, this is parent_2)
11 - "Street" - parent_1 street_address
12 - "City" - parent_1 city
13 - "State" - parent_1 state
14 - "Zip" - parent_1 zip
15 - "Home_Phone" - parent_1 - home phone
16 - "Fatherdayphone" - parent_1 - work phone
17 - "Fathercellphone" - parent_1 - cell phone
18 - "Fatheremail" - parent_1 - email
19 - "Mother" - parent_X (if street address present, parent_3 else parent_2)
20 - "Mother2" - parent_4
21 - "Mailing_Street" - parent_3 street
22 - "Mailing_City" - parent_3 city
23 - "Mailing_State" - parent_3 state
24 - "Mailing_Zip" - parent_3 zip
25 - "Mother_Home_Phone" - parent_X home phone
26 - "Motherdayphone" - parent_X work phone
27 - "Mothercellphone" - parent_X cell phone
28 - "Motheremail" - parent_X email

I decided against using CiviCRM import for the following reasons:

  • The relationship structure was a bit different and data dependent
  • Some relationship data fields had to be filled in during import
  • The duplicate matching rules were more effective if they were restricted (since we already had a fair amount of data in the system)
  • There was a lot of script -> test -> tweak -> repeat all over again happening, and this was easier to do via scripting

My current import script is in the sfschool repository. Its very specific to what i'm doing, but a good basis for someone attempting to do a large import with various data rules etc

I'm still working on improving the script and finishing the import. I'll probably write a follow on blog post of what the script does and how in the next week or so

Filed under