How To: Migrating GitHub Repos to a Monorepo

by Daniel Selans

To avoid peppering too many opinions in this “how to” guide, I’ve written a separate “opinion-piece” article on monorepos — you can read it here: Mostly Terrible: The Monorepo.

Sometime in early Dec 2023, our team decided to migrate ~10 public repositories for an OSS project to a monorepo.

This article provides a “rough” outline for the meatiest parts of the migration.

Good luck! 🤞

Goals

We’ve got two goals for the migration:

  1. Put the contents of the individual repos under a subdir in the monorepo
  2. Inject the original commit log of the individual repos into the monorepo

Repos

We will be migrating the following repositories to a monorepo at github.com/streamdal/mono:

  1. streamdal/server
    • Golang
  2. streamdal/console
    • TypeScript + Deno + React
  3. streamdal/cli
    • Golang
  4. streamdal/docs
    • Astro
  5. streamdal/wasm
    • Rust + Wasm
  6. streamdal/protos
    • Protobuf schemas for Go, Python, Rust, TS
  7. streamdal/wasm-detective
    • Rust lib
  8. streamdal/wasm-transform
    • Rust lib

Requirements

  • We’ll be operating from /Users/dselans/Code, referred to as the “work-dir”
  • You will need to have git and zsh (for native chdir()) installed locally
  • Most of the migration will be handled by a migration script
  • Last, I performed this migration on macOS (Sonoma). If you’re using something else, you might need to do some tweaking 🤷‍♂

Step 1: Layout

You need to come up with a directory structure/layout for your new monorepo. This is an extremely important step and if you fuck this up now, it’ll be twice as painful to unfuck later on.

This is the layout I used for streamdal/streamdal. It is fairly common and non-controversial, so it might work for you.

┌── assets               <---- static assets used in monorepo
│   ├── img
│   └── ...
├── apps                 <---- target dir that will contain apps
│   ├── cli
│   ├── console                
│   ├── docs
│   ├── server
│   └── ...
├── docs
│   ├── install
│   │   ├── docker
│   │   └── ...
│   ├── instrument
│   └── ...
├── libs                 <--- target dir for app dependencies, common/forked libs
│   ├── protos
│   ├── wasm
│   ├── wasm-detective
│   ├── wasm-transform
│   └── ...
├── scripts
│   ├── install
│   │   ├── streamdal.sh
│   │   └── ...
│   └── ...
├── LICENSE
├── Makefile
└── README.md
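
If you want to poke at the layout before committing to it, a minimal sketch like the one below creates the skeleton in a scratch directory. The scratch location and the exact set of sub-dirs are assumptions taken from the tree above; the per-app dirs under apps and libs are left out because the migration fills those in.

```shell
# Sketch: create the skeleton of the layout above in a scratch directory.
# MONO_DIR defaults to a throwaway temp location (an assumption for this demo).
MONO_DIR="${MONO_DIR:-$(mktemp -d)/mono}"

# Directories from the tree above; apps/* and libs/* get filled by the migration
mkdir -p "$MONO_DIR/assets/img" \
         "$MONO_DIR/apps" \
         "$MONO_DIR/docs/install/docker" \
         "$MONO_DIR/docs/instrument" \
         "$MONO_DIR/libs" \
         "$MONO_DIR/scripts/install"

# Top-level files from the tree, created empty for now
touch "$MONO_DIR/LICENSE" "$MONO_DIR/Makefile" "$MONO_DIR/README.md"

# Print the directories that now exist so you can eyeball the structure
find "$MONO_DIR" -type d | sort
```

Renaming a dir at this stage costs nothing; renaming it after the histories are merged means moving files in the monorepo or rerunning the migration.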

Step 2: Prep Work

You should probably do this during off-hours when folks aren’t updating repos often. If that’s not possible, no big deal, you’ll just have to do some syncing post-migration.

Go through the list of repos and clone them to your work dir:

# Change into the work-dir
$ cd /Users/dselans/Code
# Grab the migration script
$ curl -o migrate.sh https://raw.githubusercontent.com/streamdal/streamdal/main/scripts/monorepo/migrate.sh
# Clone the "to-be-migrated" repos
$ git clone git@github.com:streamdal/server
$ ...

The migrate.sh script expects the repo dirs to already exist in the work-dir.

Open migrate.sh in your editor and update the following bits:

  1. MONO_DIR - specify the target monorepo dir (mono)
  2. BASE_DIRS - specify the dirs that the script should create (can leave as-is, if the layout above makes sense)
  3. FILES - specify what files the script should create / touch
  4. REPOS - specify a space-separated list of the “to-be-migrated” repos you cloned
  5. SUB_DIR - specify the dir you want the migrated repos to live under. For example, with REPOS="foo bar", MONO_DIR="mono" and SUB_DIR="apps", the “foo” and “bar” repos will be migrated to ./mono/apps/foo and ./mono/apps/bar.
  6. Save and exit the editor
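
For orientation, here is a rough sketch of what such a config block might look like, plus a pre-flight check. Only the variable names come from the list above; the values and the missing_repos helper are hypothetical and not part of migrate.sh.

```shell
# Hypothetical values; only the variable names come from migrate.sh
MONO_DIR="mono"
BASE_DIRS="assets apps docs libs scripts"
FILES="LICENSE Makefile README.md"
REPOS="server console cli docs"
SUB_DIR="apps"

# Hypothetical helper: print every repo from REPOS that is not cloned in the
# given work-dir (defaults to the current directory).
missing_repos() {
  work_dir="${1:-.}"
  # Unquoted $(...) is split on whitespace in both bash and zsh
  for repo in $(echo "$REPOS"); do
    [ -d "$work_dir/$repo/.git" ] || echo "$repo"
  done
}

# Show where each repo would land with the settings above
for repo in $(echo "$REPOS"); do
  echo "$repo -> ./$MONO_DIR/$SUB_DIR/$repo"
done
```

Running missing_repos from the work-dir before the migration should print nothing; any name it prints still needs a git clone.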

Step 3: Migrate!

We are ready to begin the migration.

# From work-dir
$ zsh migrate.sh

The script will attempt to do the following:

  1. Create $MONO_DIR and initialize a git repo in it
  2. Make a copy of ../$REPO as ../$REPO.clone
  3. Perform all further work from ../$REPO.clone dir
  4. Move ../$REPO.clone/* into ../$REPO.clone/$SUB_DIR/$REPO so the files already sit at their final monorepo path
  5. Commit and merge changes in ../$REPO.clone/*
  6. Chdir to $MONO_DIR and set ../$REPO.clone as a remote
  7. Merge that remote’s main (carrying ../$REPO’s history) into $MONO_DIR’s main
  8. Commit and move on to the next $REPO specified in REPOS=
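
The steps above can be sketched for a single repo as follows. This is not the real migrate.sh, just a minimal illustration of the same technique run against a throwaway repo in a temp dir. The demo repo, its file, the commit messages and the empty root commit in the monorepo are all assumptions.

```shell
# Sketch of the per-repo technique on a throwaway repo; not the real migrate.sh.
set -e
WORK_DIR="$(mktemp -d)"               # stand-in for the work-dir
MONO_DIR="mono"; SUB_DIR="apps"; REPO="server"
# Fixed identity so commits work on machines without a git config
export GIT_AUTHOR_NAME="demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="demo" GIT_COMMITTER_EMAIL="demo@example.com"
cd "$WORK_DIR"

# Stand-in for an already-cloned repo with one commit on main
git init -q "$REPO"
git -C "$REPO" symbolic-ref HEAD refs/heads/main
echo 'package main' > "$REPO/main.go"
git -C "$REPO" add main.go
git -C "$REPO" commit -q -m "server: initial commit"

# Create the monorepo with an (assumed) empty root commit
git init -q "$MONO_DIR"
git -C "$MONO_DIR" symbolic-ref HEAD refs/heads/main
git -C "$MONO_DIR" commit -q --allow-empty -m "monorepo: init"

# Work on a copy so the original clone stays untouched
cp -R "$REPO" "$REPO.clone"

# Move every top-level tracked entry under $SUB_DIR/$REPO and commit the move
# (assumes the repo has no top-level entry named like $SUB_DIR)
mkdir -p "$REPO.clone/$SUB_DIR/$REPO"
git -C "$REPO.clone" ls-tree --name-only HEAD | while IFS= read -r entry; do
  git -C "$REPO.clone" mv "$entry" "$SUB_DIR/$REPO/$entry"
done
git -C "$REPO.clone" commit -q -m "$REPO: move into $SUB_DIR/$REPO"

# Add the copy as a remote and merge its history into the monorepo
cd "$MONO_DIR"
git remote add "$REPO" "../$REPO.clone"
git fetch -q "$REPO"
git merge -q --allow-unrelated-histories -m "monorepo: merge $REPO" "$REPO/main"
git remote remove "$REPO"

# The original commits now show up in the monorepo log
git log --oneline
```

The --allow-unrelated-histories flag is needed because the monorepo and the migrated repo share no common ancestor. Since the original commits touched the old paths, per-file history across the move needs git log --follow on the new path.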

The “meaty” part of the migration is complete.

Post Monorepo Migration Hot Tipz

Performing the git part of migration is the first step in your “journey”. There will be a handful of other things you’ll need to do to get things into decent shape and ready for production.

Here are some tips to get you started:

  1. Repo size is a legitimate concern now, so start by identifying large, dupe, or garbage assets. Run du -sh * | grep M in your monorepo dir to spot them. Look for log files, build dirs, node_modules, Rust’s target dirs, accidentally checked in Docker-compose volumes, etc.
  2. When you find garbage, don’t forget to add the paths to a .gitignore
  3. Unless you have a large number of repos (20+), I would do one repo migration at a time to catch any potential issues and fix them on the spot.
  4. You will probably not like the structure or change your mind about something and ultimately have to rerun the migration. Run rm -rf $MONO_DIR && rm -rf *.clone to start fresh.
  5. Don’t forget to update the README.md files in the original (now migrated) repos to indicate that the main repo has moved to $XYZ. For good measure, “archive” each of those repos as well (via repo settings in GitHub).
  6. Attempting to update everything in one go is a huge undertaking and will take longer than you anticipated. Migrate the repos first, finish that, then tackle CI, then tackle READMEs, docs, and so on.
  7. Updating CI will probably be the heaviest lift. You first want to gate the workflows so that they run only when the PR contains changes for /apps/some-app/*. You can accomplish this by using GitHub’s path filters on a pull_request trigger (and on push to main). Take a look at our workflows for reference.
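
For the size hunt in tip 1, a slightly more thorough sketch than du | grep is below. Both helper names are hypothetical. The second one looks at blobs in the git history, which matters here because the migration keeps every old commit, so a file deleted from the working tree still counts toward repo size.

```shell
# Hypothetical helper: largest top-level entries in a dir, biggest first.
# Usage: largest_entries [dir] [count]
largest_entries() {
  # du -sk prints sizes in KiB so sort -rn can order them numerically
  find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -sk {} + |
    sort -rn | head -n "${2:-10}"
}

# Hypothetical helper: largest blobs anywhere in the history, as "bytes path".
# Usage: largest_blobs [repo-dir] [count]
largest_blobs() {
  git -C "${1:-.}" rev-list --objects --all |
    git -C "${1:-.}" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob"' | cut -d' ' -f2- |
    sort -rn | head -n "${2:-10}"
}
```

For example, largest_entries mono 20 lists the twenty biggest top-level entries in the monorepo dir, and largest_blobs mono 20 lists the twenty biggest files ever committed to it.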

UPDATE 01.2024: Gotta admit, having everything in one place is pretty nice. IntelliJ appears to be smart enough to understand that different subdirs use different languages. I would’ve imagined it would have problems, at least with indexing. Not bad.

Want to nerd out with me and other misfits about your experiences with monorepos, deep-tech, or anything engineering-related?

Join our Discord, we’d love to have you!