4.7.8 Group Contact Cache deadlocks improvement

Published
2016-05-22 20:20
Written by

Busy sites have often encountered deadlocks on the group contact cache. No fewer than 3 different code contributions to mitigate this problem were put up for 4.7.8, and a number of other discussions have been going on in JIRA.

Merged into 4.7.8 are some improvements which we hope will mitigate this problem for those sites that experience it. JIRA is the primary source of information on this, but I wanted to share a brief overview.

The focus of the fix so far has been very much on deadlocks when clearing the contact cache, because these have been causing the most havoc. Analysing the problem showed that when someone edits pretty much anything, the code looks for stale groups and tries to flush the cache. Depending on the size of the cache, that can take a few seconds, and the cache's last-flushed date is not updated until afterwards. So if the query takes 2 seconds but things are being edited several times a second, those later edits also try to flush the cache and hit deadlocks.

The improvements in 4.7.8 have a few aspects:

  1. Changing the tables to timestamp and eliminating complex date manipulations. This was mostly to eliminate complexity and code that was highly prone to error, but it also sped up a couple of queries by removing calculated time functions.
  2. Adding a MySQL lock so that edits can detect that another process is clearing the cache and not attempt to clear it at the same time. This is a best-effort approach: prior to MySQL 5.7 we can only support this when CiviMail is not running, and the CiviCRM code does not yet have a MySQL 5.7 mode to do it properly. Despite this, it should reduce the number of deadlocks.
  3. Making it possible, from 4.7.8, to turn off 'opportunistic' flushes of the group contact cache table (explaining this is the reason I wrote this post). This means you need to replace the 'poor man's cron' with a real cron, and you want to run it pretty often. If your smart group cache timeout is 5 minutes and your cron only runs every 10 minutes, your cache can be up to 15 minutes old. I would expect you would want to run the cron every minute (and possibly reduce your timeout by a minute).
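
The cron arithmetic in point 3 can be checked with a few lines of shell. This is only a rough sketch under a simplifying assumption: a cache entry becomes stale once the timeout has passed and is then removed by the next cron run, so its worst-case age is roughly the timeout plus the cron interval. The function name worst_case_cache_age is made up for illustration and is not part of CiviCRM.

```shell
# Rough worst-case age, in minutes, of a smart group cache entry when
# only cron flushes the cache (simplified model, see assumption above).
worst_case_cache_age() {
  timeout_min=$1   # smart group cache timeout, in minutes
  cron_min=$2      # interval between cron runs, in minutes
  # An entry can go stale just after a cron run and then has to wait
  # one full interval before the next run flushes it.
  echo $(( timeout_min + cron_min ))
}

worst_case_cache_age 5 10   # prints 15: the example from point 3
worst_case_cache_age 5 1    # prints 6: cron every minute
worst_case_cache_age 4 1    # prints 5: timeout reduced by a minute
```

Under this model, running cron every minute and trimming the timeout by a minute keeps the effective cache age at about the original 5 minutes.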

 

To turn off the 'opportunistic' flushing you can use drush:

drush cvapi setting.create smart_group_cache_refresh_mode=deterministic

Or edit "civicrm.settings.php":

global $civicrm_setting;
$civicrm_setting['CiviCRM Preferences']['smart_group_cache_refresh_mode'] = 'deterministic';

Or set it via the API explorer.

You will need to add a cron job calling the API "job.group_cache_flush".
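
One possible way to wire this up on a Drupal site is a small wrapper script called from the system crontab. This is a sketch, not the official setup: it assumes drush is installed and reuses the "drush cvapi" form shown above, while the script path, the site root /var/www/example and the names flush_group_cache, DRUSH and CIVI_ROOT are all placeholders to adapt.

```shell
#!/bin/sh
# Hypothetical crontab entry running the flush every minute
# (the script path is an assumption):
# * * * * * /usr/local/bin/civicrm-group-cache-flush.sh --run

DRUSH="${DRUSH:-drush}"                     # drush binary; override if not on PATH
CIVI_ROOT="${CIVI_ROOT:-/var/www/example}"  # placeholder Drupal root, adjust to your site

# Call the job.group_cache_flush API through drush.
flush_group_cache() {
  "$DRUSH" --root="$CIVI_ROOT" cvapi job.group_cache_flush
}

# Only flush when invoked with --run, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
  flush_group_cache
fi
```

Running the script by hand with --run before adding the crontab line is an easy way to confirm that the API call succeeds on your site.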

Comments

Anonymous (not verified)
2016-06-16 - 20:53

Hi Eileen - Smart Groups in my instance of CiviCRM (Drupal 7.43, CiviCRM 4.6.14) do not update when new individuals meet the smart group criteria, which is possibly caused by the issue described in your blog above. I have a number of cron jobs running (mainly to deliver emailed reports) and the main cron job runs hourly. You suggest that cron ought to run every minute rather than every hour, which mine does, and that civicrm.settings should be changed as shown above. My questions are: firstly, do I need to add a cron job calling the API above as well, and if so how frequently should it run; secondly, how do I know if poor man's cron is running or not?

Regards
Martin Fuggle

The discussion above focusses on the flushing rather than the filling of the caches. The job you need to schedule to clear the caches is "job.group_cache_flush".

Note that I haven't tackled the filling, only the flushing, as they turn out to be distinct problems and the latter had the most impact on the sites I dealt with.

Anonymous (not verified)
2016-06-20 - 16:09

I have tried to implement the job to clear group caches but it fails as shown in the attached screenshot. I am clearly going wrong somewhere. Would it be possible to provide a little more detail to assist me?

Cheers
Martin Fuggle

A year or more late but for reference...

Eileen's post refers to changes in Civi 4.7, whereas your comment indicates that you were using 4.6.
Re smart groups not updating, check the value of your smart cache timeout at /civicrm/admin/setting/search?reset=1 .