Reply to comment
- Not Just a Contact Database
-
These optional components give you more power to connect and engage your supporters.

civiCASE
Case management for clients and constituents.

civiCONTRIBUTE
Online fundraising and donor management.

civiEVENT
Online event registration and participant tracking.

civiMEMBER
Online signup and membership management.

civiMAIL
Personalized email blasts and newsletters.

civiREPORT
Report generation and template management.



Suggestion for revision/versioning
I support option (a) for the undelete flag, with one modification that will be a step towards full logging support without adding hurdles getting the basics in to 3.2
Proposal:
Instead of having an is_deleted column, use an integer column called "setID". SetID stores the following values:
- If record is added, setID defaults to 0
- if a record is updated, create a copy first and assign setID to the primary key of original record. Then update the original row
- If record is deleted assign setID = -1
example (omitting the modifiedBy/timestamp columns for clarity)
id, name, setID
1 David -1
2 dave 1
3 david 1
4 David 1
This indicates that a record was edited 4 times. First time it was added the name was "dave", then changed to "david", then to "David", then deleted.
As with the is_deleted flag, returning the HEAD version is simple: WHERE setID=0. If the record is deleted it will not show up.
If we require to undelete, simply look for setID=-1 and set it back to 0.
Notes:
- a little counterintuitive is that the HEAD row has an id that is always less than all of it's revisions. But that's not too hard to get used to, especially since this only concerns logging.
- this scheme assumes that all objects requiring logging have integer primary keys. If not, a CHAR key could be used. VARCHAR is no good for performance reasons.
- Timestamp has to have higher resolution (per millisecond or better) or we'll get in to trouble running scripts that may update the same records at the same time. In mysql long is the only option.
If time is of the essence for 3.2, we don't need to concern ourselves with storing the log row. What we do accomplish is having the right column type, and introducing the idea of a HEAD revision in all of our views/queries and the scheme will be ready for logging.
A further enhancement is to add a second column - "revision" - which is akin to revisions in SVN but requires more application logic - i.e. if I go through and tag several contacts and do a bulk update, they will all get their setIDs updated, and will also get a common revision number. This allows rollback - that's a much bigger project for the application layer.
Regarding multiple tables:
- I lean towards not creating separate log tables - harder to maintain changes, more tables to manage, harder to build queries that span both HEAD and log data.
- if performance is of the essence logging can be turned off entirely or kept to a certain period, as described in the original post. Indexing is obviously key.
- if someone needs logging in perpetuity and the data volume is huge, they will run into similar issues of manageability with the log tables, i.e. they will be useless unless properly optimized.
- if there is a use-case for large installations with perpetual logging, it would be easy to extend the "truncate after X days" function with an option to move to archive.
- And there is always log replication.
All that to say that single table seems easier to manage, and there are workarounds for large-scale installations.
Other 3.x features:
- switches for non-logged operations (for certain tasks such as bulk import/update)
- roll-back functionality (requires "revision")
- audit log interface. Needs to be designed to be useful - on a table/row basis changes are easy to show for a db admin, however providing a list of only the changes in human readable form is not that easy. Alternative is to populate civi's existing edit log feature with additional information.