Simultaneous Editing

Published
2012-07-22 18:01
Written by

Brian Shaughnessy (lcdweb), who has been working with the New York State Senate's CiviCRM project, recently raised the issue of simultaneous editing: What happens when two users simultaneously make changes to same contact record?  We've held a few discussions on IRC to examine the issue and draft some solutions.  We would appreciate further feedback and ideas on how to address the issue.

The Problem

As described by Brian:

The issue is that if you and I get into the same record around the same time, and you save the record first, when I save, it will overwrite your changes.  So let’s say you add a middle name, and I get in and add a birthdate.  The data is not in conflict, and both changes should be preserved, but they aren’t -- when you save the birthdate (where the edit form was loaded before I saved the middle name), the empty middle name overwrites the value I had saved. 

The Solution

The most common solution (used by Drupal, Joomla, and many other systems) is called record-locking.  In this approach, the first user will be allowed to make her changes -- and the second user will see an error indicating that a 'lock' on the record prevents his changes.  To implement record-locking, we must make a few decisions.

Decision: Entities

In some systems (such as the Java Persistence API), locking can be easily configured for any entity or table stored in the database.  This can work well if all of the "Edit" screens are designed with an understanding of the locks, but fine-grained locking can create more corner-cases: what happens when one user locks on a phone number record (eg using inline "Edit" blocks) while a second user locks an email (eg using "Batch Update") and a third user tries to obtain locks on the street and email and phone (eg using "Edit Contact")?

For an initial implementation, we propose to enable coarse-grained locking for the primary elements of a contact -- the first name, last name, demographics, phone numbers, email addresses, and street addresses.  In future releases, locking can be enabled for events or other high-level entities other high-level entities.

Decision: Optimsitic or Pessimistic

Drupal and Joomla demonstrate slightly different variations on locking:

  • Optimistic Locking: In Drupal, two simultaneous users who edit the same web page will both be allowed to see the "Edit" screen and begin making changes -- under the optimistic assumption that only one of them will actually <em>save</em> the changes.  The first user who hits "Save" will succeed -- the second user to hit "Save" will get the error (and feel a pang of disappoinment because his changes cannot be saved).
  • Pessimistic Locking: In Joomla, a web page can be edited by only one user a time -- if a second user tries, he will immediately see an error.  This spares the second user from the disappointment of unsavable changes, but it comes with a different downside: if anything goes wrong for the first user (eg he walks away or suffers a computer crash), then the record will be stuck -- locked and uneditable by others. (The record won't be stuck permanently -- after a few minutes, the lock can timeout.)

Decision: Legacy Code

Any locking solution requires extra logic to detect locks, display errors, etc.  We can add this logic to several important screens (like "Edit Contact"), but practical considerations will require leaving other code unchanged. (For example, any scripts that call the API are currently written with the assumption that there are no locks.) So what happens when legacy code tries to modify an otherwise locked record?

For optimistic locking, there is basically one option: let the edit go through. For pessimistic locking, three options:

  • Ignore the lock: Any change made by the legacy code will go through immediately, regardless of whether someone else has a lock.
  • Steal the lock: Any change made by the legacy code will go through immediately. If someone else has a lock, they will lose it and be unable to save changes.
  • Fail if locked: The legacy code will receive an error (eg API error or PHP exception) telling it that the change could not be saved.

Decision: Activation

Lock functionality will be a new addition to CiviCRM -- which means that some administrators may not be familiar with it, subtle bugs or performance issues may arise in the initial implementation, etc.

For the first release, we may wish to treat "locking" as an optional feature which can be enabled or disabled by administrators. Alternatively, we could go all-in and activate locking on all installations.

See Also

An issue with more specific ideas and steps has been filed as CRM-10553.

 

Comments

I agree that it would be quite complex to define great locking behavior for every single form -- which is why that's left out of scope for the proposed project. :) If I could restate the proposal, it's this: "The Edit Contact screen [and View Contact inline edit blocks] will claim and respect preventive locks. All other existing ('legacy') UIs will treat concurrency as they do today -- with the extra side-effect of interrupting anyone using the Edit Contact screen." To show this, I'll try to answer your points individually.

With edit in place (including at the single field level) and the new setvalue action on all the entities and civimobile that comes on the top of the not so edge case problems you already described, not sure the lock solution in the right answer.

There are a few options for those screens: they can do nothing and use the default lock-stealing behavior which will basically allow the edits to go through (as they do today) but which will kick-out anyone concurrently using the "Edit Contact" screen. Alternatively, the client could request a lock while rendering the screen. (For CiviMobile, this might mean chaining 'contact.get' with 'contact.lock'.) The subsequent AJAX calls ('contact.setvalue', 'email.create', etc) would not need any significant changes. Why? Because the contact-lock would be held on behalf of the user's session -- there's no need for callers to pass around or keep track of lock IDs, revision numbers, last-modified-times, etc.

The problem becomes even more complex with profiles that are used by the end user directly. What happens when a end user register to an event/make a donation/renew their membership/register to the newsletter...?

The lock has to take into account profiles (and weforms) as well and it can become explosively more complex (an event registration with a profile containing a name,+phone number+email+country needs to lock all these entities?

If the registration/renewal only deals with the participant/membership record, then locking is irrelevant because it's not within the scope of the lock system (which only covers core contact+address information). If there is a profile form which touches the contact, then the default policy (lock-stealing) is probably best -- it lets the constituent's changes go through and breaks any concurrent edits on the staff's "Edit Contact" screen.

And then what about creations? if I add a primary phone using a profile, what happens in the backoffice when editing existing phones? should the creation be locked, or should changing the is_primary be uncovered by the locks?

If the phone is part of a contact record that is locked, then the choices are the same: use the default lock-stealing behavior or block the creation/edit.

It is safe to assume you can't update all the edit forms/chanels, both within the core civicrm and even more with alternate interfaces (eg. the mobile interface, the xmpp...), so "legacy code" will exist and ignore locks.

Yes, definitely. Perhaps "legacy code" was too pejorative -- "default behavior" or "lock-stealing behavior" would be better descriptions.

Suggestion:

Wouldn't a monitoring framework answer better the need?

When I'm starting to edit/view a contact, it flags that I'm monitoring it. When someone else modify it, I get a notification (new phone number xxx for the contact yyy done by user zzz).

From a UI perspective, real-time monitoring is much better than locking. A good example might be Google Docs -- multiple users can edit the same document, and they don't have to do locks because everyone sees the same changes in real time. The problem is more with implementation: PHP kind'a sucks at letting servers push data to clients. That's not saying it's impossible, but the best techniques (eg "Comet") break the traditional HTTP request cycle, and many tools common to CiviCRM installations (i.e. PHP interpreters, firewalls, reverse-proxies) cater to the traditional HTTP request cycle. The technique is more commonly used with different stacks (like node.js).

I'm personally down with a "monitoring" approach if other folks find these consequences acceptable:

  • The "Edit Contact" (and "View Contact-Inplace Edit") will make more AJAX requests and run more concurrent PHP processes -- possibly affecting performance with large userbases or slow network links. (Alternatively: This could be mitigated somewhat by introducing a major new dependency like node.js -- but that won't be available in many commodity hosting environments.)
  • Firewalls and reverse-proxies need to be tested and tuned for compatibility. (In commodity hosting environments, the firewall policy is usually "take-it-or-leave-it".)
  • If the hosting environment cannot support monitoring, then the feature must be disabled. (I'm not sure if it should be turned on or off by default, but it needs to be toggleable.)
  • For legacy code (screens which don't incorporate monitoring -- ie most screens), there are no options for tweaking behavior. (By contrast, with a pessimistic lock, the default behavior of "ignore/steal/fail" could be a *policy* question which could change on a per-site/per-page/per-user basis using config-options or hooks.)
  • The solution will mitigate problems when users are working at roughly the same time (ie an overlapping time frame of a minute); but if two users actually click "Save" within a second of each other, then the changes may unknowingly overwrite each other. (Unless, of course, we implemented optimistic locking on top of monitoring -- but then our project's total-cost goes up.)
  • We need to draft a new plan/estimate.

Hi,

Thanks for the clarifications, can we go through a scenario? I'm not sure I understand how the conflict is presented

I got an event registration for the october meetup with a profile that contains the email+first name+last name+phone

I'm editing (using the normal back office contact edit form) The details from john doe, because I want to update his email (john@doe.com instead of john.doe@alo.com) add a phone (123456789) and write in the source field that I met him at the last meetup (I forgot when I created the contact)

While I'm editing it, John comes online and registers to the next meetup. He doesn't change his email, add a phone (123456789, the same I'm trying to add) and update his first name to Johnny.

What is happening when I save? I got a warning saying that I lost the lock (in a human friendly way), but how do I review what John did vs. my modifications?

are some fields not in conflict (ie. phone+source) automatically updated anyway?

How do I know the first name and emails are in conflict ? How can I decide which version to keep?

My expectations: When the staffer submits the edited contact screen, he gets a warning. The warning is generated as part of the form validation, so nothing is saved to the database (even if the staffer focuses on different, non-conflicting fields). The warning would be coarse and would not highlight individual fields.

 

It's not a coincidence that the proposal involved neither automatic merging nor highlighting of conflicts. Either of those features would increase complexity by roughly the same degree -- i.e. they require retaining a copies of the old revision of the contact (and closely related entities), then performing field-by-field comparison across the entities, then communicating those details back to the user. If we did provide such fine-grained feedback, then we could reasonably say that "We have the best damned concurrency support on the market!" (Have you seen another product in this space with this functionality?) OTOH, the increased complexity would increase technical risks and probably increase costs. Is this a feature that justifies the risk/cost?

 

FWIW, in a "monitoring" approach, there's a similar trade-off between granularity and complexity/risk -- there, the choice is whether the monitoring system reports a coarse message ("This contact is being edited by another user") or granular message ("Field X has been changed by another user").

 

Last thought: If our goal is "to provide the best damned concurrency mechanism for web CRMs", then a fine-grained monitoring approach is a good design. If our goal is "to mitigate a rare issue with minimal cost/risk", then a coarse-grained lock is a good design.

Hi,

Definitely node.js is made for that.

As it's likely that you need big org and lots of users to get a "high enough" risk of collisions, it might not be such a big constraint?

For smaller orgs, it might be put it place as a saas on a different server than their civicrm hosting server?

 

As for the implementation, websocket is probably the right solution in the future, not sure if it is the present or not. Anyway, as for comet, likely needs a separate running server I think (that might be in php, found examples) otherwise will use all the threads on apache.

Otherwise, polling is a "not pretty but works solution".  Every X seconds, ajax request "someone else modified the contact?" To  avoid DOS , might increase X after each pooling, and plan a return status "stop asking, overloading the server"

http://socket.io/ seems to do a good job at hiding the complexity and workaround the various different implementations, but no php server (pretty much every language but php it seems ;)

It's pretty obvious but didn't mention it: monitoring can notify before any change too, eg. "John doe has started registering for event X", "Your colleague Tim has started editing this contact too"...

I am wondering (no matter the implementation choice) if this feature shouldn't be implemented as an extension, with the existing hooks (buildForm, validateForm, pre/post, we should have enough without having to modify the core)

X+

 

One user case to bear in mind with the locking approach is there needs to be a mecahnism to release locks. For example if a user clicks edit on a contact record and then loses internet access or closes the browser they will have an oprhon lock on the contact record. Effectively this lock will stop others editing it for a period of time. I've seen this in other client server based systems but I suspect will be much more of a problem on CiviCRM where contacts can login and amend their own details.

That's a good point. FWIW, I think the issues from allowing constituents to edit themselves would be mitigated by a policy that "Staff obey locks, but constituents steal locks" -- i.e. it's usually better to let edits go through for constituents (who aren't as trained, committed, motivated as staff) and shift the cost of the conflict onto the staff.

Just thinking about this a little more, also taking into account what I've seen done in other systems.

Assuming we're using optimistic locking at the moment i.e. we're assuming nothing is locked.

Another approach is to only check at the time of saving i.e. check the log tables to see if the data had changed from the data that was presented. I'm not particularly fond of this approach but just something to throw out and probably fits the definition of optimistic locking. The approach defined above is more along the pessimistic locking approach where locks are positively seeked out prior to allowing edits.