Published
2010-09-15 13:17
As hinted in the blog post on upcoming features, CiviCRM 3.3 will ship with the first cut of database-level logging.
This feature was first discussed alongside contact undelete (introduced in CiviCRM 3.2) and specified further on the wiki. After running some tests, the current implementation plan follows quite closely what was discussed and, especially, what was specced there, so do check the wiki for details.
The general idea is that every CiviCRM table that we’ll want to log will acquire a counterpart log table (e.g., civicrm_contact_log will be introduced to track civicrm_contact changes) that will be append-only (and so use the ARCHIVE engine). The log tables will mirror their counterpart ‘source’ tables, with additional columns for storing the log time, the MySQL connection id, the contact id of the current CiviCRM user and the action that is happening to the ‘source’ table (insert/update/delete).
The log tables will be populated using triggers on the relevant ‘source’ tables, with the connection id and the current CiviCRM user coming from the current session. This approach allows us to offload the logging to the database, which should be much more efficient, less error-prone and automated.
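As a rough illustration of the trigger-based approach, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, so that it runs anywhere. Everything specific in it is an assumption made for the example, not the actual CiviCRM schema: the log column names (log_date, log_conn_id, log_user_id, log_action), the trigger names, and the session_info table, which merely imitates what MySQL would supply through CONNECTION_ID() and a session variable. SQLite also has no ARCHIVE engine, so the append-only property is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE civicrm_contact (id INTEGER PRIMARY KEY, display_name TEXT);

-- Hypothetical log table: the source columns plus the four logging columns.
CREATE TABLE civicrm_contact_log (
    id INTEGER,
    display_name TEXT,
    log_date TEXT DEFAULT CURRENT_TIMESTAMP,
    log_conn_id INTEGER,
    log_user_id INTEGER,
    log_action TEXT
);

-- Stand-in for per-session state (connection id and current CiviCRM user).
CREATE TABLE session_info (conn_id INTEGER, user_id INTEGER);

CREATE TRIGGER civicrm_contact_after_insert AFTER INSERT ON civicrm_contact
BEGIN
    INSERT INTO civicrm_contact_log (id, display_name, log_conn_id, log_user_id, log_action)
    SELECT NEW.id, NEW.display_name, conn_id, user_id, 'insert' FROM session_info;
END;

CREATE TRIGGER civicrm_contact_after_update AFTER UPDATE ON civicrm_contact
BEGIN
    INSERT INTO civicrm_contact_log (id, display_name, log_conn_id, log_user_id, log_action)
    SELECT NEW.id, NEW.display_name, conn_id, user_id, 'update' FROM session_info;
END;

-- On delete only the OLD row exists, so that is what gets logged.
CREATE TRIGGER civicrm_contact_after_delete AFTER DELETE ON civicrm_contact
BEGIN
    INSERT INTO civicrm_contact_log (id, display_name, log_conn_id, log_user_id, log_action)
    SELECT OLD.id, OLD.display_name, conn_id, user_id, 'delete' FROM session_info;
END;
""")

# Pretend this session is connection 101, used by the contact with id 7.
conn.execute("INSERT INTO session_info VALUES (101, 7)")

# Ordinary writes to the source table; the application never touches the log.
conn.execute("INSERT INTO civicrm_contact (id, display_name) VALUES (1, 'Ada')")
conn.execute("UPDATE civicrm_contact SET display_name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM civicrm_contact WHERE id = 1")

log_rows = conn.execute(
    "SELECT id, display_name, log_conn_id, log_user_id, log_action "
    "FROM civicrm_contact_log ORDER BY rowid"
).fetchall()
for row in log_rows:
    print(row)
```

The application code only ever writes to civicrm_contact; the three log rows appear as a side effect of the triggers, which is the offloading described above.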
As triggers are often not available on shared hosting, logging will be an optional feature (off by default), available for enabling after a check that all the requirements are met.
We plan to leverage the CiviReport stack to do log-based reporting. In the future (CiviCRM 3.3 if possible for a first cut) we also plan to introduce undo/revert functionality, based on the connection id tracking in the log tables (so an undo/revert atom will be whatever happened to the database during a single connection).
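To make the idea of an undo/revert atom a bit more concrete, the following sketch groups log rows by connection id and lists each atom's changes newest-first, which is the order a revert would have to walk them in. The row layout (connection id, action, contact id) and both function names are hypothetical, chosen only for this example; the real revert logic is not specified here.

```python
from collections import defaultdict

def group_into_undo_atoms(rows):
    """Group (conn_id, action, contact_id) log rows by connection id.

    Rows are assumed to arrive in the order they were logged.
    """
    atoms = defaultdict(list)
    for conn_id, action, contact_id in rows:
        atoms[conn_id].append((action, contact_id))
    return dict(atoms)

def revert_order(atom):
    """Return the changes of one atom newest-first, as a revert would apply them."""
    return list(reversed(atom))

# Hypothetical log rows: two changes on connection 101, one on connection 102.
rows = [
    (101, "insert", 1),
    (101, "update", 1),
    (102, "delete", 2),
]

atoms = group_into_undo_atoms(rows)
print(atoms[101])                # everything that happened on connection 101
print(revert_order(atoms[101]))  # the same changes, latest first
```

Reverting connection 101 would then mean undoing the update before undoing the insert, while connection 102 stays untouched.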
Comments
it might be useful to have the option of handling the log tables in a separate database. the logs have potential to significantly bloat the db -- especially if there's no mechanism for limiting history (e.g. 5 latest revisions are kept). that will impact backups and general performance -- and will make navigating the db more cumbersome.
it would also be great to have some built-in diffing utilities. as i understand it, the entire row would be duplicated on change -- not just the delta data. that's great from an archival standpoint, but most of the time people just want to know the piece that changed.
The existing log, besides being more lightweight, still has the benefit of being able to store logs not only about changes to the contact record itself, but potentially about the join tables too, e.g. contact_group, contact_tag, contact_case...
In short, please keep the existing log, I think it serves a different and complementary function.
X+
Interesting - I've never found the existing log very useful and had (before I saw the logging specs) hoped that it would be done away with & replaced by an enhanced activity log. I had thought it didn't really tell us much at all, while some activities (contribution received on, bulk mail sent) are little more than a log and make it hard to see the 'real' activities (those that are not reflected elsewhere or applicable to all in the DB). My imagined solution would have seen the log done away with, some extra activity types added (the things that are currently in the log), and then some user configurability as to which types of activity would show on the 'activity tab' and which would show on the 'log' tab.
But when I read the spec I realised it was addressing a completely different problem.
On the logging spec - what information will be added to the log table in terms of:
- date row was changed
- user that changed it (CMS userid)
- contact that changed it (civi contact id)
- reason / action that caused it to change
Hi,
Imagine that in this log you get, besides "something changed on the contact", also "something changed with a group and this contact", "something changed with a tag"...
It would make it possible to run plenty of asynchronous tasks, e.g.:
- export all the contacts that have been added/removed in group X to your mailman
- export these to your outlook
And in general, plenty of these actions are done synchronously (e.g. sync with og group), meaning that you get a less responsive interface for no reason, as they could be done in the background if we had a proper tool to trigger them.
X+