Three weeks ago I wrote about our quest for performance at the Socialist party. This week we had a follow-up sprint, and I want to thank you for all the comments on that post.
During this sprint we looked at the number of groups (roughly 2.700) and whether that number was slowing down the system. We developed a script for deleting a set of groups from all database tables, deleted around 2.400 groups from the system, and saw that this had a positive impact on performance.
Before deleting the groups, adding a new group took around 14 seconds. After removing 2.400 groups, adding a new group took around 3 seconds. That gave us a direction in which to look for a solution.
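To give an idea of what "deleting a set of groups from all database tables" involves, here is a minimal sketch in Python against an in-memory SQLite database. It is not the script we used. The table names follow CiviCRM's group tables, but the columns are heavily simplified, the list of referencing tables is an assumption and probably incomplete for a real installation, and `delete_groups` and `demo` are hypothetical names.

```python
import sqlite3

# Simplified stand-ins for CiviCRM's group tables (assumption: the real tables
# have many more columns, and other tables may also reference a group id).
SCHEMA = """
CREATE TABLE civicrm_group (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE civicrm_group_contact (
    id INTEGER PRIMARY KEY, group_id INTEGER, contact_id INTEGER);
CREATE TABLE civicrm_group_contact_cache (
    id INTEGER PRIMARY KEY, group_id INTEGER, contact_id INTEGER);
CREATE TABLE civicrm_group_nesting (
    id INTEGER PRIMARY KEY, child_group_id INTEGER, parent_group_id INTEGER);
"""

# (table, column) pairs that point at a group id; dependent rows go first,
# the group rows themselves go last.
GROUP_REFERENCES = [
    ("civicrm_group_contact_cache", "group_id"),
    ("civicrm_group_contact", "group_id"),
    ("civicrm_group_nesting", "child_group_id"),
    ("civicrm_group_nesting", "parent_group_id"),
    ("civicrm_group", "id"),
]


def delete_groups(conn, group_ids):
    """Delete the given groups and all rows referencing them; return rows removed."""
    ids = list(group_ids)
    if not ids:
        return 0
    placeholders = ",".join("?" for _ in ids)
    deleted = 0
    with conn:  # one transaction: commit on success, roll back on error
        for table, column in GROUP_REFERENCES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE {column} IN ({placeholders})", ids
            )
            deleted += cur.rowcount
    return deleted


def demo():
    """Build a tiny data set, delete groups 2 and 3, report what is left."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO civicrm_group (id, title) VALUES (?, ?)",
        [(1, "Keep"), (2, "Old campaign"), (3, "Old subgroup")],
    )
    conn.executemany(
        "INSERT INTO civicrm_group_contact (group_id, contact_id) VALUES (?, ?)",
        [(1, 10), (2, 10), (2, 11), (3, 12)],
    )
    conn.executemany(
        "INSERT INTO civicrm_group_contact_cache (group_id, contact_id) VALUES (?, ?)",
        [(2, 10), (3, 12)],
    )
    conn.execute(
        "INSERT INTO civicrm_group_nesting (child_group_id, parent_group_id) VALUES (3, 2)"
    )
    removed = delete_groups(conn, [2, 3])
    remaining = [row[0] for row in conn.execute("SELECT id FROM civicrm_group ORDER BY id")]
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

For a few thousand group ids it would probably be safer to call `delete_groups` in batches, since databases limit the number of bound parameters per statement.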
We also looked at what would happen if we deleted all contacts without a membership from the database. That also had a positive impact, but not as large as reducing the number of groups. The reason we looked into this is that around 200.000 contacts in the system are not members but sympathizers for a specific campaign.
We also had an experienced database guy (who mainly knows Postgres) look into database tuning; at the moment we don't know the outcome of his inspection.
From what we have discovered by reducing the groups, we have two paths to follow:
- Actually reducing the number of groups in the system
- Developing an extension which functionally does the same thing as groups but with a better structure underneath and developed with performance in mind (no civicrm_group_contact_cache; no need for nesting with multiple parents; no need for smart groups).
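To make the second path a bit more concrete, here is a rough sketch of what "a better structure underneath" could look like: one flat membership table, with no cache table, no nesting and no smart groups. Nothing has been designed yet, so everything here is an assumption for illustration. The table names `simple_group` and `simple_group_member` and all function names are hypothetical, and the sketch uses Python with SQLite rather than a real CiviCRM extension.

```python
import sqlite3

# Hypothetical schema: a group is just a title, membership is a flat
# (group_id, contact_id) pair. No cache table, no parent/child groups.
SCHEMA = """
CREATE TABLE simple_group (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE simple_group_member (
    group_id INTEGER NOT NULL REFERENCES simple_group(id),
    contact_id INTEGER NOT NULL,
    PRIMARY KEY (group_id, contact_id)
);
CREATE INDEX simple_group_member_contact ON simple_group_member (contact_id);
"""


def open_db():
    """Open an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_group(conn, title):
    """Create a group and return its id; nothing else needs to be rebuilt."""
    with conn:
        return conn.execute(
            "INSERT INTO simple_group (title) VALUES (?)", (title,)
        ).lastrowid


def add_member(conn, group_id, contact_id):
    """Add a contact to a group; adding the same contact twice is a no-op."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO simple_group_member (group_id, contact_id) VALUES (?, ?)",
            (group_id, contact_id),
        )


def members(conn, group_id):
    """Contacts in a group, read straight from the membership table."""
    rows = conn.execute(
        "SELECT contact_id FROM simple_group_member WHERE group_id = ? ORDER BY contact_id",
        (group_id,),
    )
    return [row[0] for row in rows]


def groups_of(conn, contact_id):
    """Groups a contact belongs to, served by the contact_id index."""
    rows = conn.execute(
        "SELECT group_id FROM simple_group_member WHERE contact_id = ? ORDER BY group_id",
        (contact_id,),
    )
    return [row[0] for row in rows]
```

The idea in this sketch is that both lookups are plain indexed queries on one table, so creating a group or changing membership does not trigger any cache rebuild.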
Both paths are going to be discussed at the Socialist party, and in two weeks we have another sprint in which we hope to continue the performance improvements.
Read more
- https://civicrm.org/blog/jaapjansma/the-quest-for-performance-improvements
- https://civicrm.org/blog/jaapjansma/the-quest-for-performance-improvements-3rd-sprint
- https://civicrm.org/blog/jaapjansma/the-quest-for-performance-improvements-4th-sprint
Comments
Reducing the number of groups always makes sense in my opinion. It removes unnecessary bloat and most sites have large numbers of unused groups.
Nested groups tend to be particularly problematic for performance.
I do think that an extension for performance might work as a test run - but I think if you go down this path you should have a plan for how to re-integrate it back into core or you will run into trouble later. Personally I would investigate how to do it in core against 4.7 and then from there look at how to spin that work into a back-port or extension if you need to use it on 4.6.
I would also note that sometimes turning off smart group caching improves performance, as does turning off opportunistic smart group updates.
Turning off opportunistic caching is a must if you have a lot of smart groups.
With that many contacts in an active database, civi would be trying to rebuild the smart group cache too frequently.
In terms of slow searches because of the number of contacts, I would also suggest creating a custom search which only returns the fields you need. It's helped greatly with our database of around 275,000 contacts.