Right away, I will admit: CircleCI is not my favorite CI/CD tool. This product space has seen an explosion of new-generation tooling, not all of it hype, that offers developers better ergonomics, functionality, and pricing than CircleCI.
However, sometimes, the correct choice isn’t what we want but what we have.
Given the client’s longstanding familiarity with CircleCI as a platform and the task at hand, a monorepo orchestrated with CircleCI seemed a suitable choice for encouraging code sharing and enforcing a consistent set of practices across business units.
And so, dear reader, I have identified and navigated all the foot-guns and false-starts so that you may learn from my begrudging, grumbling hours spent accomplishing this task using my not-favorite CI/CD tool.
The project structure
To begin, let’s imagine a repository with the following structure:
$ tree -a myproject
myproject
├── .python-version
├── __init__.py
├── common
│   ├── common
│   │   └── __init__.py
│   ├── poetry.lock
│   └── pyproject.toml
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── subproject_one
│   ├── Dockerfile
│   ├── poetry.lock
│   ├── pyproject.toml
│   └── subproject_one
│       └── __init__.py
└── subproject_two
    ├── Dockerfile
    ├── poetry.lock
    ├── pyproject.toml
    └── subproject_two
        └── __init__.py
Two projects (subproject_one, subproject_two) are independently deployable services, both of which consume a common package of library-level code. I wanted to orchestrate a CI/CD pipeline such that merging changes to our main branch would automatically deploy to staging and that merging changes to a prod branch would automatically deploy to production. Further, I had build and validation steps that were common to all three directories and build, validation, and deployment steps that were unique to each directory. Nothing exotic here - for example, I might run a linter across all files and I might build and push an image for subproject_one to a particular Docker repository that is different from the one used for subproject_two.
Don’t fight their APIs
My initial inclination was to create three configuration files. One for tasks that might be common across all projects, for example, running the tests across a project or validating that the code is properly formatted. And two others, each corresponding to our subprojects, where we could place logic specific to those projects.
I recognized that this was not the official recommendation but attempted it anyway. Orbs (CircleCI’s word for packages) bundle functionality to 1) filter based on paths and 2) invoke a “continuation” of a pipeline in order to run another file. These can be stitched together to create separate files for each project, and I did so for a time. However, in retrospect, I would not recommend it, and I migrated away from this approach. It was finicky, error-prone, and difficult to maintain. You live and learn, right?
CircleCI’s dynamic configuration prescribes creating two files: a config.yml, where we can author jobs common to all projects and invoke our project-based workflows, and a continue_config.yml, where we can author our project-based jobs and workflows.
You may be wondering: won’t that become a huge mess of a file? Particularly, if many subprojects are present in our monorepo, one file containing many mixed concerns would make most software engineers eager to refactor.
Well, you’re right. It could become a huge mess of a file.
But! I have identified a few techniques we can use to keep it modular, DRY (don’t-repeat-yourself), and maintainable.
So where does that leave us?
First, we need a directory for our CircleCI configuration files and some proprietary setup:
├── .circleci
│   ├── config.yml
│   └── continue_config.yml
Your config.yml file must include a setup: true block alongside some CircleCI-specific configuration.
From there, we can move on to the aforementioned techniques.
Use the path-filtering orb, Luke
CircleCI’s path filtering orb provides functionality to continue a pipeline based on the paths of changed files. The mapping parameter allows us to pass variables to our continuation configuration for use in when clauses of our workflow. This provides a mechanism to trigger particular workflow branches. In practice, this will look like:
# .circleci/config.yml
version: 2.1
setup: true
orbs:
  path-filtering: circleci/path-filtering@1.0.0
jobs:
  validate-source-code:
    steps:
      ...
workflows:
  always-run:
    jobs:
      - validate-source-code
      - path-filtering/filter:
          name: check-updated-files
          mapping: |
            common/.* run-common-workflow true
            subproject_one/.* run-subproject-one-workflow true
            subproject_two/.* run-subproject-two-workflow true
          base-revision: main
          config-path: .circleci/continue_config.yml
...
# .circleci/continue_config.yml
parameters:
  run-common-workflow:
    type: boolean
    default: false
  run-subproject-one-workflow:
    type: boolean
    default: false
  run-subproject-two-workflow:
    type: boolean
    default: false
...
workflows:
  subproject-one:
    when:
      or:
        - equal: [true, << pipeline.parameters.run-subproject-one-workflow >>]
        - equal: [true, << pipeline.parameters.run-common-workflow >>]
    jobs:
      ...
  subproject-two:
    when:
      or:
        - equal: [true, << pipeline.parameters.run-subproject-two-workflow >>]
        - equal: [true, << pipeline.parameters.run-common-workflow >>]
    jobs:
      ...
Notably, this provides the flexibility to run all workflows when a change occurs in common and only run a particular workflow when changes occur in its subdirectory.
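To make that behavior concrete, here is a small Python model of the mapping and the when clauses. It is a simplified sketch, not the orb’s actual implementation: it assumes each mapping line holds a regular expression, a parameter name, and a value, and that the regular expression must match the whole changed path. The helper names (parse_mapping, parameters_for, workflows_to_run) are hypothetical.

```python
import re

# The same mapping that is passed to the path-filtering orb above.
MAPPING = """\
common/.* run-common-workflow true
subproject_one/.* run-subproject-one-workflow true
subproject_two/.* run-subproject-two-workflow true
"""


def parse_mapping(text):
    """Split each non-empty mapping line into (regex, parameter, value)."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            pattern, name, value = line.split()
            rows.append((re.compile(pattern), name, value == "true"))
    return rows


def parameters_for(changed_files, mapping=MAPPING):
    """Return the parameters set by any mapping line matching a changed file."""
    params = {}
    for pattern, name, value in parse_mapping(mapping):
        # Assumption: the regex must match the entire path.
        if any(pattern.fullmatch(path) for path in changed_files):
            params[name] = value
    return params


def workflows_to_run(params):
    """Mirror the `when: or:` clauses from continue_config.yml."""
    common = params.get("run-common-workflow", False)
    selected = []
    if params.get("run-subproject-one-workflow", False) or common:
        selected.append("subproject-one")
    if params.get("run-subproject-two-workflow", False) or common:
        selected.append("subproject-two")
    return selected


if __name__ == "__main__":
    print(workflows_to_run(parameters_for(["subproject_one/Dockerfile"])))
    print(workflows_to_run(parameters_for(["common/common/__init__.py"])))
```

A change under subproject_one selects only that workflow, while a change under common selects both, which is the behavior the or conditions are meant to encode.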
Keep things DRY with the tooling available
YAML isn’t a programming language, but it is a declarative configuration language with seldom-explored advanced features. Some of my favorite features to use are anchors, aliases, and merge keys. Combined, they allow us to author reusable snippets in our CircleCI template (and most YAML documents in general):
common_settings: &common_settings
  executor:
    name: python/default
    tag: 3.10.8
subproject_one_common_settings: &subproject_one_common_settings
  working_directory: ~/myproject/subproject_one
  <<: *common_settings
...
jobs:
  subproject-one-validate:
    <<: *subproject_one_common_settings
    steps:
      - myproject-checkout
      - install-acme-cli
      - validate
So, if you have repeated snippets of orchestration (and you likely do, given you’re working in a monorepo), create a common block of configuration, anchor it, and then reuse that anchor via aliases and merge keys. This lets you write it once and run it everywhere, DRYing up your configuration file.
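If merge keys are new to you, it can help to see what the snippet above resolves to once the aliases are expanded. The following Python sketch models that with plain dictionaries rather than parsing YAML. The merge helper is hypothetical and only imitates the rule that keys written explicitly in a mapping take precedence over keys pulled in through <<.

```python
def merge(base, **explicit):
    """Imitate a YAML merge key: start from base, let explicit keys win."""
    return {**base, **explicit}


# Stands in for the &common_settings anchor.
common_settings = {"executor": {"name": "python/default", "tag": "3.10.8"}}

# Stands in for &subproject_one_common_settings, which merges *common_settings.
subproject_one_common_settings = merge(
    common_settings,
    working_directory="~/myproject/subproject_one",
)

# Stands in for the subproject-one-validate job after expansion.
subproject_one_validate = merge(
    subproject_one_common_settings,
    steps=["myproject-checkout", "install-acme-cli", "validate"],
)

if __name__ == "__main__":
    for key, value in subproject_one_validate.items():
        print(f"{key}: {value}")
```

The expanded job carries the executor, the working directory, and its own steps, even though only the steps were written out in the job itself.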
Use filters for branch-based logic
I am more familiar with the GitHub Actions style workflow triggers to invoke particular workflows based on branch conditions. CircleCI offers similar functionality via filters.
For our example project, I wanted to create three different workflows based on branching:
- First, for every pull request and merge, run some common tasks (such as validating the change has no syntax errors).
- Second, when a change is merged to main and has no git tag, deploy it to a staging environment.
- Third, when a change is merged to prod and has a tag of the form v$.$.$ (such as v1.0.0), deploy it to the production environment.
In practice, this looks like:
stg-filters: &stg-filters
  filters:
    branches:
      only: main
    tags:
      ignore: /.*/
prod-filters: &prod-filters
  filters:
    branches:
      only: prod
    tags:
      only: /^v.*/
...
workflows:
  subproject-one:
    jobs:
      - subproject-one-validate
      - subproject-one-deploy-stg:
          requires:
            - subproject-one-validate
          <<: *stg-filters
      - subproject-one-deploy-prod:
          requires:
            - subproject-one-validate
          <<: *prod-filters
Combined with the aforementioned anchoring, aliasing, and merge keys, we can compose a common set of branch-based rules to use in our workflows for each subproject included in our monorepo.
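One detail worth checking is how strict the tag pattern is. The /^v.*/ filter above accepts any tag that starts with v, which is broader than the v$.$.$ form described earlier. The Python sketch below compares it with a stricter pattern. The stricter regular expression is a suggested alternative, not part of the configuration above, and Python’s re module is used here as a stand-in for CircleCI’s own regex matching, which may differ in details.

```python
import re

LOOSE = re.compile(r"^v.*")               # the pattern used in prod-filters
STRICT = re.compile(r"^v\d+\.\d+\.\d+$")  # hypothetical stricter alternative


def accepted(pattern, tag):
    """Return True when the whole tag matches the pattern."""
    return pattern.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ["v1.0.0", "v12.4.7", "vnext", "v1.0", "1.0.0"]:
        loose = accepted(LOOSE, tag)
        strict = accepted(STRICT, tag)
        print(f"{tag:8} loose={loose} strict={strict}")
```

Whether the looser pattern is acceptable depends on how disciplined the team’s tagging is; the stricter form simply rejects tags such as vnext that would otherwise qualify for a production deploy.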
Don’t be afraid to offload complex logic into scripts
If you’re struggling to fit a complicated step into your job or workflow declarations, offload that logic into a script. This can be authored in Bash or even your favorite programming language. For example:
#!/usr/bin/env python
import os
NAME = os.environ["NAME"]
print(f"Hello, {NAME}!")
For my purposes, this was helpful for orchestrating a sequence of steps that required an API client, given that my deployment target did not have a CircleCI orb available. I know I would rather debug a Python script than a cobbled-together heap of Bash in a CI configuration file when it (inevitably) breaks.
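A slightly fuller sketch of such a script is shown below. It assumes hypothetical environment variables (DEPLOY_ENV, IMAGE_TAG) and stubs out the deployment call, since the actual API client is not shown in this post. The point is the shape: validate inputs up front and return a non-zero exit code so the CI step fails loudly.

```python
#!/usr/bin/env python
import os
import sys

REQUIRED = ("DEPLOY_ENV", "IMAGE_TAG")  # hypothetical variable names


def load_settings(environ):
    """Collect required variables, raising one error that lists all missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}


def deploy(settings):
    """Placeholder for calls to a real API client."""
    print(f"Deploying {settings['IMAGE_TAG']} to {settings['DEPLOY_ENV']}")


def main(environ=os.environ):
    try:
        settings = load_settings(environ)
    except ValueError as error:
        print(f"error: {error}", file=sys.stderr)
        return 1  # a non-zero exit code marks the CI step as failed
    deploy(settings)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Keeping the logic in functions that accept the environment as an argument also means the script can be exercised locally or in unit tests without touching the pipeline.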
RTFM!
This sounds naive, but consulting the official documentation for a CircleCI configuration file proved to be the best source of information while exploring the tools available. Further, it informed me of what options were available to me and provided brief examples for their implementation.
Googling for answers tended to lead to outdated community answers. And using ChatGPT for CircleCI was often flat-out wrong. So, in this instance, doing things the old-fashioned way paid the most dividends.
If you’ve made it to the end of this post, you’ve either (hopefully) added new tools to your toolbox or (unfortunately) continued to search for answers.
Feel free to reach out to mavrick.laakso@testdouble.com in either case with feedback, praise, or condemnation (maybe you really like CircleCI - no judgement!). Until next time!
For my current engagement, I was tasked with setting up a new repository and change management processes to support an enterprise-ready data engineering and machine learning platform. My client has dozens of repositories successfully validating pull requests, promoting changes, and orchestrating deployments into their respective environments using CircleCI.