The 4.7.24 release is scheduled for the first Wednesday of September. Ordinarily, there would be an announcement about the release-candidate (RC) in mid-August, but we're doing something a bit different this time around -- extending the RC to a full month, which means the RC is available now at http://download.civicrm.org/latest. Testing out the RC is a great way to ensure that your systems will continue to work in the next release. Let me talk about how this change helps.
Quality-control for a new release is a team sport. It relies on reporters and developers performing thoughtful analysis when they identify a problem or improvement. It relies on developers writing code which anticipates errors, improves test-coverage, and follows recognizable conventions. It relies on infrastructure running automated tests. It relies on reviewers understanding the problem-domain and questioning assumptions. It relies on testers trying crazy scenarios with a messy mix of customizations and add-ons. It relies on clear communications about what's changed and what's at risk. And it relies on all of these things happening in a reasonable timeframe.
We've made several improvements to our regular habits during the 4.7.x period -- increasing the size of the test suite by 25%, increasing the number of test authors by 50%, improving the communications in the release-notes, improving feedback from code review, and so on. Perfection? No, these can improve more. But this post isn't about any of that. It's only about one part of the process: release-candidates.
Release-candidates originally appealed to people with a certain mindset -- people who are passionate about creating "stability." The ones who write check-lists of what the system needs to do. The ones who set up three identical servers because "sandboxes" and "staging" allow "realistic" experimentation. They talk about getting sign-off on the demo before going "live." Right next to the office's "No Smoking" sign, they scribble graffiti like: "No Live Edits".
Why would release-candidates appeal to someone passionate about stability? Because they solve a conundrum:
- If you test a major subsystem (like CiviCRM) during its development, then you test a moving target. It will continue to develop after you've done your test, rendering results irrelevant. The testing wasn't representative.
- If you test a major subsystem (like CiviCRM) after it's been released, then you'll uncover bugs after it's been released. But that's too late! The buggy release is published. Getting someone to fix it will be a lot harder.
The RC is a compromise which freezes the code -- it's close to the final release, but there's still time to fix serious problems. Changes are only accepted for very recent regressions and for release-notes -- which means you're testing something pretty realistic.
What does all this mean quantitatively for CiviCRM? Looking at the monthly releases of 4.7.x over the past twelve months, each release has typically had 100-130 changes. The RC period has typically lasted 1-2 weeks, during which 4-8 changes were accepted. Only ~5% of changes occur during the RC.
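The ~5% figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the 4-8 RC-period changes are counted within the 100-130 total for a release; the function name `rc_share` is made up for illustration.

```python
def rc_share(rc_changes: int, total_changes: int) -> float:
    """Percentage of a release's changes that were accepted during the RC period."""
    return 100.0 * rc_changes / total_changes


# Bounds from the figures quoted above (assumption: RC changes are part of the total).
low = rc_share(4, 130)    # fewest RC changes in the largest release
high = rc_share(8, 100)   # most RC changes in the smallest release
mid = rc_share(6, 115)    # midpoints of both ranges

print(f"RC share: {low:.1f}% to {high:.1f}%, about {mid:.1f}% at the midpoint")
```

The midpoint lands close to the ~5% quoted above, and the range shows how much that share can move from one month to the next.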
It's fairly common for a new release to ship with a new, frustrating problem -- a "frustrating" problem is one that seems obvious in retrospect, which could have been identified with some more attention on the RC.
I don't have hard numbers, but based on GitHub activity and informal discussion on the dev-post-release chat, my sense is that:
- Every month, the RC process prevents 2 - 5 new problems that would have been embarrassing. (Yay! This is helping!)
- Every month, the RC process misses 1 - 3 new problems that it should have easily identified. (Boo! Not doing enough!)
Should we count that as a success or a failure? With a set of 100-130 changes, it's a small percentage, and getting that far takes a lot of work. We should be thankful for the efforts of our PR reviewers -- like Eileen McNaughton, Seamus Lee, Monish Deb, Chris Burgess, Brian Shaughnessy, Jitendra Purohit, Coleman Watts, and so many others -- who strive to keep that low. We should also be thankful for the folks who regularly test RCs. Even though there aren't as many, folks like Karin Gerritsen and Dave Jenkins have prevented several real problems from getting into the release.
So there's a lot to appreciate. But still: 1-3 frustrating problems is frustrating, and we want to cut those down.
One way or another, we need more human attention to catch these problems. If you've got a pile of money to hire some more humans to concentrate on it, great -- we'd love to have them. I don't, so I'd like to share a simpler idea raised by my friend Eileen:
Let's get the RCs more deeply ingrained in our community's culture. If you're doing any customization/development, then your process should incorporate RCs. This suggestion could play out in a few ways, depending on what kind of project you're pursuing:
- If you're evaluating a feature or configuration option, evaluate it on the RC. Within a couple weeks, that'll be the official version.
- If you're writing an extension, write it on the RC. By the time your extension is ready, that's the version you'll be running.
- If you're doing a test in preparation for an upgrade, test the RC. When the new release goes stable, then do the upgrade for real.
- If you're setting up a staging server, deploy it with the RC. When the release goes stable, then move it to production.
- If you train people, train them on the RC. By the time they're up-to-speed, they'll be trained on the current version.
(This approach reminds me of the old yarn: don't focus on where the ball is; focus on where it's going.)
These ideas sound a little scary. After all, the RC could have a bug. But if we all think that way, then we'll avoid the RC, and our fates will be worse: the bug won't be seen until after the release, when people start deploying to live sites!
The Month-Long Freeze
That's a lot of background! If you're still reading, thanks for sticking around. Let's close this loop and link our goal with this new policy.
Changing the culture around RCs will require energy from many people. Within my power, there's a simple change that can help: extending the RC period to a full month. This helps in two ways:
- More time means... more time: Testing takes a while! More time makes it easier to schedule testing, and it gives more time to react to issues.
- Constant availability: Extending the RC to a full month has a neat side-effect: the new stable release (e.g. 4.7.23) and the new RC (e.g. 4.7.24-rc) are prepared on the same day of August. In September, the next stable (4.7.24) and the next RC (4.7.25-rc) will also be prepared on the same day. You don't have to check a calendar to see when it will be available. At any given moment, there's always a choice: download the stable or download the RC. That choice can be based on important questions (like "What kind of work are you doing?") and avoids annoying questions (like "When will the RC be ready?").
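If you want to put that date on your own calendar, it can be computed. This is a rough sketch, assuming the shared stable/RC day is the first Wednesday of each month (this post only states that for the September release); the function name and the year in the example are illustrative, not part of any CiviCRM tooling.

```python
import calendar
from datetime import date


def first_wednesday(year: int, month: int) -> date:
    """Return the date of the first Wednesday in the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday is 0, Sunday is 6
    # Days to move forward from the 1st to reach a Wednesday (0 if the 1st is one).
    offset = (calendar.WEDNESDAY - first_weekday) % 7
    return date(year, month, 1 + offset)


# Example year chosen arbitrarily; substitute the year you care about.
print(first_wednesday(2017, 9))
```

Under that assumption, the same date is both when the RC you have been testing goes stable and when the next RC becomes available.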
More to come
We're looking to make more improvements to the release-cycle. Some of them will require discussion, infrastructure, etc., and I don't have time or space to talk about them in this post. But stay tuned on the blog and mailing list.
Thank you Tim for not only coming up with what seems like a great approach to improve our testing coverage, but also for taking the time to fully communicate the reasoning behind the strategy. It makes sense to me - and that's saying something. :-)
I'll talk to my team to see what we can do to work RC testing more fully into our SOP. This approach definitely would appear to make it easier to do so. Having a month instead of 1-2 weeks is a HUGE difference. I'm really curious to see how this plays out.
Thanks as well to the PR reviewers you mentioned. I also am very grateful for the time and effort they put in. Thank you all!
Thanks Tim, very thoughtful decision: more time to test the RC can only be beneficial, and aligning the RC to the release cycle makes a lot of sense.
Makes sense, looking forward to how this pans out. Thanks Guys.
@guyiac - thanks, I'd like to echo and endorse your comments. Tim, this change is so right! More incentive for me to get a test environment set up and learn enough to be able to contribute to the effort. Thanks. DVH