Migrating to monorepo
My journey of a successful migration from multirepo to monorepo.
Recently my job has presented me with the opportunity to migrate our codebase to a monorepo.
This kind of change should not be taken lightly, but we designed a plan and made strategic decisions to achieve it.
I am very lucky to have good teammates who helped me along the way. Here is what I learned from this project.
The task
The task was simple: merge repository A and repository B into one monorepo. As a proof of concept I created the following repositories: https://github.com/jeop10/blue and https://github.com/jeop10/green.
Each repository contains a very small demo app (just a couple of files), enough to prove that we can have files with the same name in different folders.
Starting the process
The first thing I did was create a branch in each repository and "wrap" its code in its own folder.
The result can be seen on the *-monorepo branch of each repository (for example, blue-monorepo).
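The wrapping step can be sketched roughly as follows. This is only an illustration under assumptions: the repository is a throwaway one created on the spot, and the file name index.html and the commit messages are made up for the demo. Only the folder name blue and the branch name blue-monorepo come from the post.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository standing in for "blue" (hypothetical contents).
workdir="$(mktemp -d)"
cd "$workdir"
git init -q blue
cd blue
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello from blue" > index.html
git add index.html
git commit -q -m "initial blue commit"

# Do the wrapping on a dedicated branch.
git checkout -q -b blue-monorepo

# Move every tracked top-level entry into a folder named after the project.
mkdir blue
git ls-tree --name-only HEAD | while IFS= read -r entry; do
  git mv "$entry" "blue/$entry"
done
git commit -q -m "Move blue into its own folder"

# Everything now lives under blue/
git ls-files
```

Listing entries with git ls-tree instead of a shell glob also picks up tracked dotfiles, and git mv records the moves as renames so the history of each file can still be followed.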
Once that was done, I looked for a way to merge the two repositories while keeping the commit history, which was very important for us.
A quick Google search pointed me to git merge with the --allow-unrelated-histories flag.
The flag is the key part: by default git refuses to merge two branches that share no common ancestor, and this flag lets the merge go ahead, so the commit history of the repository you are merging in is kept.
Keep in mind that before you can use that command you first need to add the repository you are merging in as a remote. Here is the full list of commands I executed:
# Assuming you are already in the monorepo folder
# and have already run git init
# Add the remote
git remote add blue-origin git@github.com:jeop10/blue.git
# Fetch the specific branch
git fetch blue-origin blue-monorepo
# Merge blue-monorepo into the current branch, keeping its history
git merge blue-origin/blue-monorepo --allow-unrelated-histories
# Commit the changes (git merge normally creates the commit itself, so this
# is only needed if the merge stopped, for example to resolve conflicts)
git commit -m "merge blue repo"
# Remove the remote
git remote remove blue-origin
Repeat the same steps to merge the green repository. After that, assuming you have already added the proper GitHub origin, you can just push the commits.
And with that the job was done, or so we thought; there were still some steps needed.
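To see what the flag actually changes, here is a small local experiment. It is a sketch under assumptions: both repositories are throwaway ones created in a temporary directory, the helper new_repo and the file names are invented for the demo, and the remote is a local path rather than GitHub.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# new_repo DIR FILE: hypothetical helper, creates a repository
# with a single commit that adds FILE.
new_repo() {
  git init -q "$1"
  git -C "$1" config user.email "demo@example.com"
  git -C "$1" config user.name "Demo"
  mkdir -p "$1/$(dirname "$2")"
  echo "demo" > "$1/$2"
  git -C "$1" add .
  git -C "$1" commit -q -m "initial commit in $1"
}

new_repo blue blue/index.html
new_repo monorepo README.md

cd monorepo
git remote add blue-origin "$workdir/blue"
git fetch -q blue-origin

# The default branch name depends on the local git configuration,
# so read it instead of assuming main or master.
branch="$(git -C "$workdir/blue" rev-parse --abbrev-ref HEAD)"

# Without the flag git refuses: the two histories share no commit.
if git merge -q --no-edit "blue-origin/$branch" 2>/dev/null; then
  echo "unexpected: merge succeeded without the flag"
else
  echo "refused without --allow-unrelated-histories"
fi

# With the flag the merge goes through and both histories are kept.
git merge -q --no-edit --allow-unrelated-histories "blue-origin/$branch"
git remote remove blue-origin

# One commit from each repository plus the merge commit.
git rev-list --count HEAD
```

Running git log in the merged repository afterwards shows the commits of both projects, which is the property we cared about.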
Don't forget about .github
Our repositories use GitHub Actions, and when you merge two or more repositories together you cannot have a .github folder in every subfolder. Only one is used: the one in the root directory.
To fix this, we created actions for each project and made them run based on changes to files in specific paths, using an awesome GitHub Action called paths-filter. Here is the link to the repository: https://github.com/dorny/paths-filter
With the path filter in place, if a merge affects just the blue project, only the blue actions are executed.
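The idea behind path-based filtering can be approximated in plain shell. This is not how paths-filter is implemented or configured; it is only a sketch of the decision it makes, and the function project_changed and the file list are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds when at least one changed path lives under the given project folder.
# $1: project folder, $2: newline-separated list of changed paths.
project_changed() {
  grep -q "^$1/" <<< "$2"
}

# In CI the list would come from something like
#   git diff --name-only "$BASE_SHA" "$HEAD_SHA"
# where BASE_SHA and HEAD_SHA are placeholders. Here it is hard-coded.
changed_files="blue/index.html
blue/style.css"

for project in blue green; do
  if project_changed "$project" "$changed_files"; then
    echo "$project: run actions"
  else
    echo "$project: skip"
  fi
done
```

With the sample list above, only the blue project is reported as changed, so the green actions would be skipped.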
One more thing
Be mindful that if the same GitHub Action runs on both push and merge, you will need some conditional steps, because the path filter detects changes differently in each case.
The result
In the end you should have something like this: https://github.com/jeop10/monorepo (an organized list of folders with all the projects in one place, each keeping its history).
Some thoughts
Strategy is everything if you are going to carry out a task like this with a team of developers. Some key points are:
- Set a date for the migration to monorepo.
- Have your team merge all the changes before that date.
- Close write access to the repositories that you want to merge when the migration has started.
- Have a backup plan: don’t delete or alter the commit history of the repositories you are merging until you have made sure that the monorepo works.
- Keep an eye on your CI/CD platform: after the change to a monorepo, some of the paths that you or your team had set before might no longer work. This sounds obvious, but sometimes we forget about those details.
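For the backup point in the list above, one option (an assumption on my side, not necessarily what we did) is to keep a mirror clone of each repository before the migration. The sketch below uses a throwaway local repository as the source. For a real project the source would be its remote URL.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for one of the repositories to be merged (hypothetical content).
git init -q blue
git -C blue config user.email "demo@example.com"
git -C blue config user.name "Demo"
echo "hello from blue" > blue/index.html
git -C blue add .
git -C blue commit -q -m "initial blue commit"

# A mirror clone copies every branch and tag into a bare repository.
git clone -q --mirror "$workdir/blue" blue-backup.git

# The backup holds the same number of commits as the original.
git -C blue-backup.git rev-list --count --all
```

If something goes wrong, the original history can be restored from the bare copy without touching the monorepo.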
And lastly, remember to automate as much as you can. I developed a quick bash script that, once executed, creates the monorepo repository and merges the projects together.
This was key: having a script let us repeat the migration and run some tests before we decided to go live, and it minimized typing errors.
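The script itself is not reproduced here, but a minimal sketch of what such automation might look like is shown below. Everything in it is an assumption for illustration: the function merge_project, the demo files, and the use of local throwaway repositories in place of the GitHub remotes so that it can be run as a dry run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# merge_project NAME URL BRANCH (hypothetical helper)
# Adds URL as a temporary remote, merges BRANCH with its history,
# then removes the remote again.
merge_project() {
  local name="$1" url="$2" branch="$3"
  local remote="${name}-origin"
  git remote add "$remote" "$url"
  git fetch -q "$remote" "$branch"
  git merge -q --no-edit --allow-unrelated-histories \
    -m "merge ${name} repo" "$remote/$branch"
  git remote remove "$remote"
}

# Dry run against throwaway local repositories.
workdir="$(mktemp -d)"
cd "$workdir"

for name in blue green; do
  git init -q "$name"
  git -C "$name" config user.email "demo@example.com"
  git -C "$name" config user.name "Demo"
  mkdir "$name/$name"
  echo "hello from $name" > "$name/$name/index.html"
  git -C "$name" add .
  git -C "$name" commit -q -m "initial $name commit"
  # Rename the branch to match the *-monorepo convention.
  git -C "$name" branch -m "${name}-monorepo"
done

git init -q monorepo
cd monorepo
git config user.email "demo@example.com"
git config user.name "Demo"
# Start from an empty commit so that every project merge is a real merge.
git commit -q --allow-empty -m "initial monorepo commit"

for name in blue green; do
  merge_project "$name" "$workdir/$name" "${name}-monorepo"
done

# Both projects side by side, same file name in different folders.
git ls-files
```

To run it against real repositories, the local paths passed to merge_project would be replaced with the remote URLs. Because the whole thing starts from a fresh directory each time, it can be rerun as often as needed while testing.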