Writing a custom import script ..

Opublikowane
2010-05-28 10:01
Written by
Anonim (niezweryfikowany)
The next phase of the CiviSchool project is collecting and maintaining all parent / student information online. This avoids the annual filling out forms work by the parents, and also the stuffing of envelopes by school staff during the summer and reentering all the information in the SIS when school starts. To make things easier, we wanted to export and import as much data as possible from PowerSchool. It has a pretty good export and we got all the information in a nice familiar CSV format. However the system is not very "relational". Parents and addresses have multiple records (one for each student), hence there is some amount of inconsistency in the system with regard to names and addresses. Also we have the CiviCRM/Drupal based system in operation for a year and this has also resulted in slight divergence between the two systems. Due to various family structures, the school maintained the family information as a max of 2 households (1 address per househould), max of 4 individuals (2 per household), 2 email addresses and 6 phone numbers (1 set per household: home, mobile, work). We had most of the parent name and email in CiviSchool. the basic mapping of the csv file was: 0 - "Student_Number" - external_identifier 1 - "First_Name" - first_name 2 - "Middle_Name" - middle_name 3 - "Last_Name" - last_name 4 - "Dob" - birth_date 5 - "Ethnicity" - race 6 - "Lgbt" - family_structure 7 - "Adopted" - family_structure 8 - "Singleparent" - family_structure 9 - "Father" - parent_1 10 - "Father2" - parent_3 (if parent_3 is there, this is parent_2) 11 - "Street" - parent_1 street_address 12 - "City" - parent_1 city 13 - "State" - parent_1 state 14 - "Zip" - parent_1 zip 15 - "Home_Phone" - parent_1 - home phone 16 - "Fatherdayphone" - parent_1 - work phone 17 - "Fathercellphone" - parent_1 - cell phone 18 - "Fatheremail" - parent_1 - email 19 - "Mother" - parent_X (if street address present, parent_3 else parent_2) 20 - "Mother2" - parent_4 21 - "Mailing_Street" - parent_3 street 22 - "Mailing_City" - parent_3 city 23 - "Mailing_State" - parent_3 state 24 - "Mailing_Zip" - parent_3 zip 25 - "Mother_Home_Phone" - parent_X home phone 26 - "Motherdayphone" - parent_X work phone 27 - "Mothercellphone" - parent_X cell phone 28 - "Motheremail" - parent_X email I decided against using CiviCRM import for the following reasons:
  • The relationship structure was a bit different and data dependent
  • Some relationship data fields had to be filled in during import
  • The duplicate matching rules were more effective if they were restricted (since we already had a fair amount of data in the system)
  • There was a lot of script -> test -> tweak -> repeat all over again happening, and this was easier to do via scripting
My current import script is in the sfschool repository. Its very specific to what i'm doing, but a good basis for someone attempting to do a large import with various data rules etc I'm still working on improving the script and finishing the import. I'll probably write a follow on blog post of what the script does and how in the next week or so
Filed under
(logged-in users can click thumbs up if they thought this blog post was useful) (login to vote or to comment)

Comments

Hi,

Is admin lobo?

I have been using the migrate module for drupal content import and my feeling is that there is potential for a CiviCRM hook for it to bridge the gap between Civi import processes & interacting with the raw data.

The Migrate module, works in conjunction with table wizard which gives you a front end view into the mysql tables from which you are importing data. You can add comments and view them with the end users to clarify them. You can then create a view based of this with all the flexibility offered by views and migrate wizards allows you to extract data from that view and tell it which fields to put it into. By default it has fields for nodes, users and such like but you can expose other fields with hooks - e.g. civi fields.

Migrate gives you a front-end screen to run the migration and you can specify how many items to import - great for testing - and roll-back your imports, adjust and re-do. You can also run migrate from Drush.

At the receiving end you would add a hook that receives an array of fields from the migrate module and you need to tell it what to do from there. This is of course where the work comes in but I think it has huge potential. The roll-back functionality would have to be written too.

http://drupal.org/project/migrate
http://civicactions.com/blog/2009/jul/25/migrating_your_website_drupal