The quest for performance improvements - 6th sprint

Gepubliceerd
2017-02-24 05:59
Written by
jaapjansma - member of the CiviCRM community - view blog guidelines

In this blog I want to explain the round up we have done around the refactoring of the acl_contact_cache. In the previous sprints we discovered that a lot of the performance was slowed down by the way the acl_contact_cache was used (or rather not used at all). See also the previous blog post: https://civicrm.org/blog/jaapjansma/the-quest-for-performance-improvements-5th-sprint

At the socialist party they have 350.000 contacts and around 300 users who can access civicrm. Most of the users are only allowed to see only the members in their local chapter.

In the previous blog we explained the proof of concept. We now have implemented this proof of concept and the average performance increase was 60%.

We created a table which holds which user has access to which contacts. We then fill this table once in a few hours. See also issue CRM-19934 for the technical implementation of this proof of concept.

Performance increase in the search query

In the next examples we are logged in as a local member who can only see members in the chapter Amersfoort. We then search for persons with the name 'Jan'. And we measure how long the query for searching takes.

The query for presenting the list with letters in the search result looks like

SELECT count(DISTINCT contact_a.id) as rowCount  
FROM civicrm_contact contact_a 
LEFT JOIN civicrm_value_geostelsel geostelsel ON contact_a.id = geostelsel.entity_id  
LEFT JOIN civicrm_membership membership_access ON contact_a.id = membership_access.contact_id  
WHERE  ((((contact_a.sort_name LIKE '%jan%'))))  
AND (contact_a.id = 803832 
  OR ((((
    ( geostelsel.`afdeling` = 806816 OR geostelsel.`regio` = 806816 OR geostelsel.`provincie` = 806816 )
    AND (
      membership_access.membership_type_id IN (1, 2, 3) 
      AND (
        membership_access.status_id IN (1, 2, 3)
        OR (membership_access.status_id = '7' AND (membership_access.end_date >= NOW() - INTERVAL 3 MONTH))
      )
    )
  ) 
  OR contact_a.id = 806816
 )) 
 AND (contact_a.is_deleted = 0)
))
ORDER BY UPPER(LEFT(contact_a.sort_name, 1)) asc;



As you can see that is quite a complicated query and includes details about which members the user is allowed to see.  Only executing this query takes around 0.435 seconds and the reason is that mysql has to check each record in civicrm_contact (which in this case is around 350.000 and growing)

After refactoring the acl cache functionality in CiviCRM Core the query looks different:

SELECT DISTINCT UPPER(LEFT(contact_a.sort_name, 1)) as sort_name  
FROM civicrm_contact contact_a 
INNER JOIN `civicrm_acl_contacts` `civicrm_acl_contacts` ON `civicrm_acl_contacts`.`contact_id` = `contact_a`.`id`  
WHERE  (((( contact_a.sort_name LIKE '%jan%' ))))  
AND  `civicrm_acl_contacts`.`operation_type` = '2' 
AND `civicrm_acl_contacts`.`user_id` = '803832' 
AND `civicrm_acl_contacts`.`domain_id` = '1' 
AND (contact_a.is_deleted = 0)    
ORDER BY UPPER(LEFT(contact_a.sort_name, 1)) asc

The query now takes around 0,022 seconds to run (20 times faster).

Explanation

How does this new functionality works:

1. Every time an ACL restriction is needed in a query civicrm core only does an inner join on the civicrm_acl_contacts table and that is all

2. The inner join is generated in the service 'acl_contact_cache'  that service also checks whether the the civicrm_acl_contacts table need to be updated or not.

3. When an update of civicrm_acl_contacts table is needed depends on the settings under administer --> System Settings --> Misc --> ACL Contact Cache Validity (in minutes)

So how does this look like in code?

Below an example of how you could use the acl_contact_cache service to inject acl logic into your query:

// First get the service from the Civi Container
$aclContactCache = \Civi::service('acl_contact_cache'); // The $aclContactCache is a class based on \Civi\ACL\ContactCacheInterface
// Now get the aclWhere and aclFrom part for our query
$aclWhere = $aclContactCache->getAclWhereClause(CRM_Core_Permission::VIEW, 'contact_a');
$aclFrom = $aclContactCache->getAclJoin(CRM_Core_Permission::VIEW, 'contact_a');

// Now build our query
$sql = "SELECT contact_a.* FROM civicrm_contact contact_a ".$aclFrom." WHERE 1 AND ".$aclWhere;
// That is it now execute our query and handle the output...

The reason we use a service in the Civi Container class is that it is now also quite easy to override this part of core in your own extension.

The \Civi\ACL\ContactCache class has all the logic to for building the ACL queries. Meaning that this class contains the logic to interact with the ACL settings in CiviCRM, with the permissioned relationship etc.. All those settings are taken into account when filling civicrm_acl_contacts table which is per user and per operation once in the three hours.

 

Filed under

Comments

At Future First we created a similar "contact can view" table to get around problems in the ACL. What I would recommend is to not just have the ACLs populated on a time basis, but driven by particular events. For instance, if the ACLs are viewable by relationships, deceased, membership status, then have changes to these populate the table. Then you aren't redoing the same work unnecessarily for so many contacts.