Wednesday, February 14, 2007 - 15:39
Written by

An interesting discussion spawned on the civicrm-dev list recently regarding our implementation of custom groups and fields. We have been super cautious about this and have advised people not to create more than 20 custom fields per object (contact, activity, group, relationship etc). However creating a custom field is relatively easy and people have been doing it at a fast and furious pace. We've heard of implementations with more than 100 custom fields. This has worked for some people, but a large number of custom fields has the following problems:

  • Custom values are stored in a thin table, one per value. This table will grow significantly for large number of custom values
  • Primary Export is not possible since it will exceed the mysql JOIN limit of 61 tables
  • You cannot make a large number of these custom fields searchable, else search will exceed the mysql JOIN limit
  • There is potentially a slow down with operations involving large number of custom fields. I suspect we can easily optimize these queries once we are aware of them

Seems like this is a good time to take a step back and examine the model again and how well it will suit us in the future. Everyone does agree that our current model is not the best way going forward. Drupal went through a similar transition from flexinode to CCK and seems to have addressed the issue relatively nicely. We should learn and absorb the lessons from the Drupal CCK experience.

CCK creates a new table for a custom group and then adds a column for each custom field. Thus you have have entry in the table for each contact, rather than n entries in the custom value table. This reduces the number of JOIN's from n to 1. However if the user creates lots of custom groups, the custom groups will also hit the mysql JOIN limit. I suspect it will be easer for folks to group their custom fields under a small number of groups

David Strauss suggested on IRC that we use one extension table per domain and object type and alter that table as the definition. This allows the system to have a fixed number of tables and avoid the JOIN problem. This could potentially create extremely large tables for large number of custom fields. However this is much more efficient than the previous problem since it reduces the number of JOIN's even more. i.e. a max of one per object type

We also need to be able to handle multiple values per custom field. CCK does not handle this currently. We will need to allow multiple rows per object id for custom groups that permit multiple values. Thus there will be no unique constraint on the object id in these tables while single value tables will have a unique constraint on the object id. Thus it might make more sense to deal with multiple values with respect to a group, rather than a single field.

Depending on the size of the database the table alter process will need to be done offline, since mysql will take a long time to alter a large table.

While we are busy trying to get 1.7 under control, we'll ruminate on this and continue the discussion on the mailing list ...

Filed under


I have been Googling, and haven't found clear guidance about practical impacts of the number of custom fields, in total, or per object.

Looking at what I think is a current ERD, it looks like there is still one custom fields table, which means that these warnings/considerations are still relevant?

Thanks for any clarification.

In 4.0.6, I added a field group called Academic, and I see a table civicrm_value_academic_1 in my database, so it seems that, at some point the "CCK" approach was chosen.

To finish my example:

, `civicrm_contact`.`last_name`
, `civicrm_value_academic_1`.`institution_1`
, `civicrm_value_academic_1`.`year_2`
INNER JOIN `acendedu_civi1`.`civicrm_custom_group`
ON (`civicrm_contact`.`contact_type` = `civicrm_custom_group`.`extends`)
INNER JOIN `acendedu_civi1`.`civicrm_value_academic_1`
ON (`civicrm_value_academic_1`.`entity_id` = `civicrm_contact`.`id`)
WHERE (`civicrm_contact`.`id` 2);

That is to say...contact_id matches the entity_id in the custom field value records, and the custom (field) group extends Individual, which happens to be the contact_type of the Contact. Clear as mud?

since this is a really really old blog post :)