Releases: We are what we repeatedly do

Published
2016-10-25 15:49

At the CiviCons and developer meetings this year, we've had several conversations about release strategy. The topic is a bit abstract -- touching on a web of interrelated issues of technology and scheduling and business-process. I've been searching for a way to explain this topic to people who don't eat and breathe code in CiviCRM's git repos -- an analysis which is a bit simpler and more transcendent.

The best analysis predates us by a few years -- Will Durant attributed the idea to Aristotle's Nicomachean Ethics, paraphrasing:

We are what we repeatedly do. Excellence, then, is not an act, but a habit.

All of us -- users, developers, administrators, core, contributors -- want excellence in CiviCRM. Does that mean we need one grand release to prove our excellence? No, excellence merely means that we review every patch thoughtfully. Excellence means that we have mastery over our use-cases, and that we affirmatively and repeatedly demonstrate the quality of all our systems. Excellence means that compatibility and usability are upfront requirements. Excellence means that every release is production-ready -- because our thinking and our tools and our processes and our values habitually make them so.

As examples, here are some of the habits which help make us excellent:

  • Study problems or code before acting on them.
  • Read and comment on other people's work. This makes their work better -- and it makes you smarter.
  • Define clear tests (either manual or automated) -- regardless of whether you work on patches or extensions or deployments.
  • Do your testing before the release -- not after the release.
  • Prioritize your testing on the things which seem most important or most risky for you.
  • Treat compatibility and usability and migration as upfront requirements.
  • If something requires a break, then make a new space where the breakage is safe. If you absolutely must have breakage in a common space, then coordinate and communicate and schedule proactively.
  • Don't be surprised by mistakes. Instead, correct them.
  • Take care of yourself. Don't try to fix every problem at once -- focus on a few topics which you can manage well over time.

Leap by Extension. Iterate by Month. (LExIM)

Aristotle and Durant are well-and-good, but they offer high-minded ideals which leave room for a lot of interpretation. More concretely, we've been working on a rule of thumb. When planning an improvement for CiviCRM, a developer should ask a simple question: does this idea represent a major leap forward or an iterative improvement?

  • If a change represents a major leap forward in functionality, it should be developed as an extension. Taking a leap is risky. There can be bugs and edge-cases and small discrepancies between the old way and the new way. Only a few people will be ready to take the leap at first -- but the folks who are ready to take a leap will want to take it quickly. Leaps should generally be prepared as CiviCRM extensions which can be downloaded and enabled separately. The extensions can stabilize on their own schedule -- without posing any risk to existing users. Leaping in this way also encourages modularization.
  • If a change represents an incremental fix or improvement, it should develop iteratively -- as a carefully-reviewed patch for the existing code. We should take concrete, upfront steps to check the quality of the change and ensure that it still meets the original requirements. This review process should follow a clear schedule so that multiple stakeholders have the opportunity to participate in a timely way.

Consider a few examples:

  • Example 1: Generating PDF and Word Documents
    • Discussion: CiviCRM has long included support for using mail-merge tokens to generate PDF documents. But what if you want to generate Word documents instead? The screens and fields and validations are pretty much the same in either case -- we just need a small option to toggle between pdf and doc output.
    • Solution: Iterate by month. This is a small change to an existing feature. Converting the feature to an extension would be much harder than patching it directly (and the conversion would actually create additional risks). Since we patch the code directly and release the new code to all users, we have strong responsibility to review it carefully from multiple angles.
  • Example 2: Mosaico
    • Discussion: CiviMail uses the "CKEditor" library for writing emails with headings, paragraphs, and basic text styling. However, it does not support drag/drop layout of blocks or columns -- and it lacks optimizations for robust display across email clients. The "Mosaico" library does provide these features -- but the user-experience and data-structures are very different from "CKEditor".
    • Solution: Integration between CiviCRM and Mosaico should be developed as an extension. Users who are most excited about this functionality should install it on their own. After a large number of users have been working with it successfully, we should enable the extension by default.

These two examples are relatively clean. Some situations don't cleanly break down one way or another, but we can still resolve them if we stay focused on the goals of continuity, compatibility, and reasonable transitions. Here are a few more challenging examples:

  • Example 3: CSS Theming
    • Discussion: CiviCRM comes with one built-in "look and feel". This determines font-sizes, button colors, visual spacing, etc. However, this "look and feel" has grown outdated. The look-and-feel can be changed by replacing the file "civicrm.css", but doing so would be a very visible disruption (firstly, disrupting users who were trained in the old "look and feel"; secondly, disrupting customizations which tuned-in to the old "look and feel"), and it can take time for a new "look and feel" to reach maturity.
    • Solution: This involves both a leap and an iteration: the new "look and feel" is a leap which should be delivered as an extension. However, there's little precedent for "theme extensions" -- because this requires some new APIs. These new APIs must be added incrementally to the main application (without disrupting the old "look and feel").
  • Example 4: PHP 5, PHP 7, and mysqli
    • Discussion: PHP 7 made a change to the PHP standard library which required converting CiviCRM from php-mysql to php-mysqli. This posed a risk of breaking existing installations, but there was no reasonable way to package this as an extension.
    • Solution: Iterate by (three) month(s). The patches were prepared and submitted as part of the monthly release cycle, but they were held over for extra time. In that period, we did extra research to assess impacts on existing users, tested more thoroughly, revised the patches, held discussions on the mailing list and blog, and created an in-app advisory to help admins get ahead of the issue.
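
As a rough illustration, the rule of thumb and the four examples above could be sketched as a small decision function. This is only a sketch in Python, not anything that exists in CiviCRM: the names `Change`, `Path`, and `recommend`, and the four yes/no questions, are hypothetical simplifications. Real decisions involve judgment that a handful of booleans cannot capture.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    """Hypothetical labels for the outcomes discussed in the examples."""
    EXTENSION = "leap by extension"
    ITERATE = "iterate by month"
    BOTH = "extension, plus incremental API patches in core"
    EXTENDED_ITERATION = "iterate, but hold over for extra review and communication"


@dataclass(frozen=True)
class Change:
    """A proposed change, reduced to four yes/no questions (an assumption)."""
    name: str
    is_major_leap: bool                 # very different UX or data structures?
    can_be_extension: bool              # can it be downloaded and enabled separately?
    needs_new_core_apis: bool = False   # does the extension need APIs core lacks?
    risks_breaking_sites: bool = False  # could existing installations break?


def recommend(change: Change) -> Path:
    """Apply the LExIM rule of thumb to a proposed change."""
    if change.is_major_leap and change.can_be_extension:
        # A leap goes in an extension; any supporting APIs are added
        # to core incrementally, without disrupting existing behavior.
        return Path.BOTH if change.needs_new_core_apis else Path.EXTENSION
    if change.is_major_leap or change.risks_breaking_sites:
        # No safe separate space exists, so slow down, test, and communicate.
        return Path.EXTENDED_ITERATION
    return Path.ITERATE


# The four examples from this post, encoded under the assumptions above.
EXAMPLES = [
    Change("PDF and Word documents", is_major_leap=False, can_be_extension=False),
    Change("Mosaico", is_major_leap=True, can_be_extension=True),
    Change("CSS theming", is_major_leap=True, can_be_extension=True,
           needs_new_core_apis=True),
    Change("PHP 7 and mysqli", is_major_leap=False, can_be_extension=False,
           risks_breaking_sites=True),
]

if __name__ == "__main__":
    for change in EXAMPLES:
        print(f"{change.name}: {recommend(change).value}")
```

Under these assumptions, the PDF/Word toggle maps to a monthly iteration, Mosaico to an extension, CSS theming to an extension plus incremental core patches, and the mysqli conversion to an iteration with extra review time. The point of the sketch is the order of the questions, not the particular answers: first ask whether a leap can live in its own space, and only then decide how much review a direct patch needs.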

LExIM is only a rule of thumb. If you can frame your project as a leap/extension or as a small iteration, then you're on the right track. If it's not possible, then break it down and focus on these guidelines:

  • The next release should be compatible with existing installations and customizations. Users who understand the old release should be effortlessly comfortable in the new release. Take affirmative/positive measures to control this.
  • If it is unavoidable that an important change causes a break, then work to understand it thoroughly, communicate it, and prepare affected users or systems.

Areas of Improvement

Adopting habits which make for an excellent product requires work. To be sure, we've made a lot of progress in recent years, but there are still more areas where we can improve:

  • We need to strengthen our community-of-practice (skills, documentation, tools, etc) around testing. Tests aren't just for core -- they're for extensions and sites, too.
  • We need to get better about directing information manageably. As the scope of our systems and community has grown, the firehoses of JIRA, GitHub, et al. have grown too. Developing thematic working-groups and thematic release-cycles enables us to meet with like-minded people and deliver more impact.
  • Regressions can and will happen -- when they do, we need better feedback processes to manage them more effectively, to act more quickly, to prevent recurrences, and to reduce harms.

If you have other thoughts on how to improve -- how to ensure that every change and every release is something reliable which we can all be proud of -- please feel free to share.

Comments

I wonder if there is a reasonable way for us to track the quality of our fixes and enhancements so that we could start to quantitatively and statistically evaluate the quality of the changes made to the core codebase. This is related to moving up to Level 4 of a Capability Maturity Model (https://en.m.wikipedia.org/wiki/Capability_Maturity_Model).

Good question. It would be great to get a better measure on regressions -- Eileen and I have been talking about doing regular post-mortems after the release for a chance to talk about them. I'm not anxious to jump right into a hyper-structured approach since we need to build more culture around this. We'll try to do one a week after 4.7.13 comes out ( https://github.com/civicrm/release-management/ ).

For CMM, I wonder if there's some incentive for us to pursue a more formalized model -- e.g. maybe it helps in contracting?

Anonymous (not verified)
2016-11-02 - 03:56

Thank you for this blog post, I felt very happy reading it as it resonates with my views of some of what's important for the success of the CiviCRM project.

Your asking for people's suggestions encouraged me to share some of what I have to share. So here it is.

There are a number of books that I think can greatly help the people who are working on CiviCRM. The information in these books (I think) will lead to a change in approaches and consequently to greater quality and quantity of the work produced. I think the desired change in culture and habits will come this way.

The books I recommend:

Scrum and XP from the Trenches
Pragmatic Thinking and Learning
Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems
Science and Sanity

Also, talking about testing, I myself have not been able to get into test-driven development, but there are many resources out there. I think Henrik Kniberg has written on the topic. Unfortunately, as with agile and many other great things, most books and courses on the topic likely provide very little value. So you might want to seek contact with people whose expertise in test-driven development you trust, develop partnerships with them and learn from their experience.

Getting better at time management can also lead to a huge increase in productivity. Here is an article just to open the subject up: http://blog.trello.com/how-to-pomodoro-your-way-to-productivity

As you go through the books above, you might start asking more and more questions along the lines of: How is the team organizing itself? What platforms is it using for internal communication? How is it prioritizing? How much time is it spending on which things?

One thing that immediately comes up for me is Jira. Why not use GitHub for issue tracking? Are there Jira features that warrant spending time on maintaining the installation? Can those things be accomplished through GitHub?

I'm noticing the team schedules release dates for bugs and then keeps pushing the release date back when the bug has not been solved and there is a new release coming. Why spend time on pushing back the release date? Can this be done in a different way (e.g. default bug status of Unscheduled)?

And another thing: how are other open-source projects organizing themselves? I have experience with Joomla and WordPress and I can tell you that when there is a new Joomla release, quite often it breaks things. Then another release follows, but often it breaks other things. So very often there are a few quick minor releases after a larger Joomla release. But with WordPress, which I have been using for over a year now, this has not happened even once during that time. The difference in code reliability seems tremendous between the two CMSs. How is this so? What is WordPress doing to achieve this quality? Are their releases focused on optimizing and further optimizing the existing features and codebase? I'm saying - it seems worthwhile studying the approaches and strategies of successful open-source projects. Probably there are many things they have discovered that can be adapted to CiviCRM.

This seems enough for a single post, please let me know if you find it useful. My main message is: we work on educating ourselves => our ability to impact the world improves proportionally.