CPython workflow changes
After more than two years, our new GitHub workflow is ready to accept changes (you can look back to my first “” on changing our workflow to see how things have changed since I started working on this)! I hope you are all excited to see this finished; I know my wife is very excited as she’s tired of listening to me talk about it for a third of our marriage. ;)
First and foremost, I want to thank everyone who helped with this. Thanks to Donald and Barry for writing the initial PEPs proposing and and Nick . Thanks to everyone on core-workflow for helping out with various discussions (core-workflow will also continue to host discussions on future improvements to our workflow). Thanks to Ezio, Maciej, and Anish Shah for helping with the changes required to bugs.python.org in order to keep the issue tracker around. Thanks to the infrastructure team for helping deal with the migration of the and repos (especially Donald and Ernest). Thanks to Senthil for doing the conversion of the repo itself. Thanks to Benjamin for helping with hg.python.org stuff. Thanks to Zach for helping with the buildbots (and the devguide). Thanks to Mariatta, Carol Willing, Berker, Oleg, and Stéphane Wirtel for helping with the devguide. There are also plenty of other people who have provided feedback over the past two years on mailing lists and in person.
What has changed
The documentation in the devguide should be up-to-date, so don’t worry about keeping this around as a reference. Consider this more of an announcement letter to get people quickly up to speed and excited about the new workflow.
The location of the code repository
CPython’s code now lives at https://github.com/python/cpython. hg.python.org/cpython is and will stay read-only (no determination has been made as to how long the Mercurial repository will be kept running). Note that we are doing squash commits and not rebased commits or merge commits as some projects on GitHub do. This means a single contribution will continue to lead to a single commit, keeping our history linear and compact.
To up the bus factor on the new repository, I have set up a team for release managers and made them administrators on the repository. I don’t necessarily expect RMs to use this power, but it’s there in case any of them need to change a setting in order to get a release out (to be upfront about it, I’m also in the team as its creator, but I already have administrative privileges for the Python organization on GitHub so it doesn’t change what I’m able to do).
Specifying issue numbers
Traditionally we have specified issues as “Issue #NNNN”. The problem with this format is that GitHub automatically links text in this format to GitHub issues and pull requests. While our repository will initially have no issues or PRs to automatically link to, this might not be true long-term. (GitHub does the automatic linking eagerly at push time, so there’s no worry for older commit messages. We actually almost mutated the history to match the new format, but in the end I decided not to as I hadn’t consulted with python-committers prior to the migration to make sure such a change was acceptable to everyone.)
To avoid this issue we are going to start specifying issues as “bpo-NNNN”. This clearly delineates that issue numbers directly relate to bugs.python.org since “”. This is also a format that GitHub supports — “GH-NNNN” — as well as JIRA who , so there’s precedent for choosing it. This change applies both to and in . Mentioning an issue this way in a pull request title or comment will connect the PR to the corresponding issue on bugs.python.org. Mentioning an issue this way in a commit message will cause a comment to show up in the issue relating to the commit.
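As a rough illustration of the two spellings, here is a minimal Python sketch that rewrites the old “Issue #NNNN” form into “bpo-NNNN” and pulls the issue numbers out of a message. The function names and the exact regular expressions are my own assumptions for the example; they are not what bugs.python.org or GitHub actually run.

```python
import re

# Assumed pattern for the traditional "Issue #NNNN" spelling (case-insensitive).
OLD_STYLE = re.compile(r"\bissue\s*#(\d+)", re.IGNORECASE)
# Assumed pattern for the new "bpo-NNNN" spelling.
NEW_STYLE = re.compile(r"\bbpo-(\d+)\b")


def to_bpo_style(message):
    """Rewrite every "Issue #NNNN" reference in `message` as "bpo-NNNN"."""
    return OLD_STYLE.sub(r"bpo-\1", message)


def referenced_issues(message):
    """Return the bugs.python.org issue numbers mentioned as "bpo-NNNN"."""
    return [int(number) for number in NEW_STYLE.findall(message)]
```

For example, `to_bpo_style("Issue #12345: Fix a crash")` gives `"bpo-12345: Fix a crash"`, which no longer contains the `#NNNN` text that GitHub would try to link.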
Cherry-picking instead of merging
When a patch has spanned more than one version/branch we have always done a forward merge. The common issue with this, though, is that it leads to racing with other committers from the time you make your initial commit in the oldest version/branch to the time you push to hg.python.org on the newest version/branch. There was also the problem of having to remember that Python 2.7 is a special branch which was never merged forward.
To deal with these issues we will use cherry-picking going forward. This allows changes to be pulled into other branches as independent commits, which prevents the commit races with other core developers that we have traditionally needed to deal with when doing forward merges that span e.g. three branches. It also allows using CI to easily verify a change works if each cherry-pick is done as a separate pull request. The Python 2.7 branch also stops being a special case when backporting. Cherry-picking also prevents potential issues stemming from contributors submitting pull requests against the master branch by default and not the oldest branch a change should be applied to. Finally, it means the discussion of whether a change should be backported no longer blocks the commit into master to begin with.
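To make the cherry-picking flow a bit more concrete, here is a minimal sketch that only builds the git commands for backporting one commit to several branches; it does not run them. The remote name `origin`, the `backport-...` working-branch naming, and the function name are all assumptions made for illustration, so adjust them to match your own clone and fork setup.

```python
def cherry_pick_commands(commit_sha, branches, remote="origin"):
    """Return git commands (as argument lists) to backport one commit.

    One working branch is created per target branch so that each
    cherry-pick can be submitted as its own pull request.
    """
    # Fetch once so the remote branches are current.
    commands = [["git", "fetch", remote]]
    for branch in branches:
        # Hypothetical naming scheme for the working branch.
        work_branch = f"backport-{commit_sha[:7]}-{branch}"
        commands.append(
            ["git", "checkout", "-b", work_branch, f"{remote}/{branch}"]
        )
        # -x records the original commit hash in the new commit message.
        commands.append(["git", "cherry-pick", "-x", commit_sha])
        commands.append(["git", "push", remote, work_branch])
    return commands
```

Each returned list could be passed to `subprocess.run(..., check=True)` from inside a clone. Because every target branch gets an independent commit, there is no ordering between branches to race on.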
Labels will be provided for people to use to help track any cherry-picking that needs to occur for a pull request (e.g. “backport to 3.6”). I also left the “bug” and “enhancement” labels to help classify PRs (adding more labels is easy so we can do that as our experience and workflow organically converge towards common practices).
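If you wanted to script against those labels, a sketch like the following could turn them into a list of target branches. It assumes that backport labels always look like “backport to X.Y” and that the branch is named after the version; the helper name is hypothetical.

```python
import re

# Assumed label format, based on the "backport to 3.6" example.
BACKPORT_LABEL = re.compile(r"^backport to (\d+\.\d+)$")


def backport_branches(labels):
    """Return the versions named by "backport to X.Y" labels, oldest first."""
    versions = []
    for label in labels:
        match = BACKPORT_LABEL.match(label)
        if match:
            versions.append(match.group(1))
    # Sort numerically so that e.g. 3.10 would sort after 3.6.
    return sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
```

Labels such as “bug” and “enhancement” are simply ignored by this helper.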
All feature branches have been marked as protected. This means that feature branches cannot be deleted nor can they be pushed to directly. The latter will be the biggest change as it means all changes must go through a pull request. This helps make sure that there is no accidental breakage of code (I know I have done this multiple times when I was in a rush and didn’t take my time when preparing a commit). This also means that all core developers follow the same development workflow as any other contributor. This not only allows all core developers to help other contributors with our workflow, but it also helps make sure we are aware of any sticking points in the contributor process so that we can all work towards resolving them for everyone’s benefit. (If experience shows that this is too much overhead we can turn this off.)
All feature branches that are in security-only mode are locked down so that only release managers can approve pull requests to them. For all branches that have reached EOL, no one is able to push to them. I expect that RMs will also use this feature when they are ready to gate all commits to a branch on their approval (e.g. when a release reaches RC, maybe even beta if they choose to go that far).
What has improved
Accepting PRs through GitHub’s web UI
While using hg.python.org, all commits had to be done through Mercurial’s CLI. With the move to GitHub we gain the ability to accept pull requests through a web UI. While this will only accept the change into the branch it was submitted against (which can be changed in the web UI), it makes accepting a change easier in situations where the change does not need to be backported. (When a change does need to be backported, you need to cherry-pick, and that requires using the git CLI.) If a change does need to be cherry-picked into an older branch you can either wait to accept the PR until you have a clone to work with, or accept the change into master now and then cherry-pick later when you have a clone available.
Previously changes required running the test suite manually along with verifying various other things like the documentation building. Moving to GitHub allows us to leverage the Travis continuous integration service to test several things in parallel automatically for each pull request:
- Debug build under gcc
- Debug build under clang
- Documentation is valid and has no stale links
- Python.h C++ compatibility
While this doesn’t solve all testing scenarios (e.g. this doesn’t test a macOS or Windows-related change due to the added hours it takes for a PR to be “green” when run on Travis for macOS or AppVeyor for Windows), it does help with the common case of a cross-platform change. (There is an to add some code so that these tests only run when appropriate files have changed so that e.g. fixing a spelling mistake doesn’t run the test suite.)
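To give a feel for what “only run when appropriate files have changed” might look like, here is a minimal sketch of such a check. The rule it uses (anything under a top-level `Doc` directory, or any file ending in `.rst`, counts as documentation-only) is purely an assumption for the example and is not the logic the project has settled on.

```python
from pathlib import PurePosixPath

# Assumed set of suffixes that never need the test suite.
DOC_SUFFIXES = {".rst"}


def needs_test_suite(changed_files):
    """Return True if any changed file is something other than documentation."""
    for name in changed_files:
        path = PurePosixPath(name)
        in_doc_dir = path.parts[:1] == ("Doc",)
        if not (in_doc_dir or path.suffix in DOC_SUFFIXES):
            return True
    # Every file looked like documentation (or nothing changed at all).
    return False
```

With this rule, a pull request touching only `Doc/library/os.rst` would skip the test suite, while one that also touches `Lib/os.py` would not.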
It should be mentioned that status checks on issues are not required prior to committing a pull request. While this may be a good idea long-term, until we know that our test suite is stable enough to not have regular flaky tests this would be more trouble than it’s worth (GitHub does visibly show, though, when not all status checks have passed so you won’t easily ignore this situation either).
Traditionally the code coverage of our tests was only known when someone ran the test suite manually under something like coverage.py. Even when someone did generate a coverage report it was generally not shared with other developers, and so it wasn’t widely known if a change increased or lowered test coverage.
With the move to GitHub we are able to use to calculate code coverage for each pull request. This also implicitly tests a non-debug build as that’s used to make the coverage results run faster. It should be noted, though, that some tests are skipped due to them holding up the coverage run from completing. (There is an to use coverage.py’s fullcoverage hack so that the coverage report can even be accurate for modules imported during interpreter startup.)
Previously, to tell if someone had signed the PSF contributor license agreement you had to look to see if they had an asterisk by their name on the issue tracker. Unfortunately this is a passive thing to need to check for and is easily forgotten. Thanks to GitHub’s webhook events and developer API we now have a bot which checks if the contributor(s) to a pull request have signed the CLA, adding an appropriate label to the PR to signal the CLA signing status (the bot is named ). If the contributor(s) have not signed the CLA then a message is left on the PR explaining how to rectify the issue (either they need to connect their GitHub account to their bugs.python.org account or they need to sign the CLA; there’s also an easter egg that occasionally appears in the message).
If a contributor does end up fixing the issue that led the bot to think the contributor had not signed the CLA, you can remove the “CLA not signed” label and the bot will recheck the PR and add the appropriate label (this also happens automatically if any code changes are made to the PR). If for some reason the bot has a hiccup then no label will be applied (this acts as a safeguard against false negatives and makes it easy to spot when something has gone wrong due to the absence of either a “CLA signed” or “CLA not signed” label). To trigger the bot again you can simply apply the “CLA not signed” label and then remove it.
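The labelling behaviour described above boils down to a small three-way decision, sketched below. This is not the bot’s actual code: the function name and the convention of using `None` for “the lookup failed” are assumptions of mine, and only the two label names come from the workflow itself.

```python
CLA_SIGNED = "CLA signed"
CLA_NOT_SIGNED = "CLA not signed"


def cla_label(signing_status):
    """Pick the label for a PR, or None if no label should be applied.

    `signing_status` maps each contributor's username to True (signed),
    False (not signed), or None (the lookup failed for some reason).
    """
    statuses = list(signing_status.values())
    # On any hiccup, apply no label at all so the problem is easy to spot.
    if not statuses or any(status is None for status in statuses):
        return None
    return CLA_SIGNED if all(statuses) else CLA_NOT_SIGNED
```

Returning no label on failure, rather than defaulting to “CLA not signed”, is what lets a missing label stand out as a sign that something went wrong.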
There is now a file which gives general info on contributing. Primarily it has all the various build status badges and links, a link to the devguide, an overview of how we differ from most GitHub projects, and a mention of the CoC. For core devs I suspect the badge list for all active branches will be useful to know when something is broken (I am only listing the test coverage for the master branch to avoid encouraging people to spend time trying to increase test coverage for bugfix-only branches).
This isn’t here for discussion per se, but to let people know what I am thinking should change next. If you want to help or discuss anything in this section, please subscribe to core-workflow and participate there. Ideas are also tracked on the (where there are other ideas for improvements beyond the two listed below).