Patrick Desjardins Blog

Sharing Code Between TypeScript Repositories Using Git Submodules

Posted on: 2022-12-15

Scenario

Imagine you have two TypeScript projects that share similar code. You can avoid the duplication by sharing that code. A viable solution is to extract it into a reusable NPM library and use that library in both projects.

However, publishing a library has some overhead and might not fit your current process, especially if these two projects are the only consumers of the shared code and the code is private. The same applies if you are sure that only TypeScript projects will consume the code and your situation does not allow easy access to a private NPM registry. Finally, you may not have the time to build an isolated library with proper unit tests, so you exercise and debug the shared code directly while consuming it. In that case, you want first-class debugging with breakpoints, as if the code lived in your repository. In that scenario, relying on Git submodules is a viable solution.

You can follow along with two public repositories: a parent repository (demo_submodule_parent in the output shown in this post) that uses a Git submodule, and the child repository it consumes, https://github.com/MrDesjardins/demo_submodule_child.

The Concept

The solution using Git submodules is to have a separate repository for each of your projects and one for the shared code. Conceptually, a submodule is similar to copying a whole repository into a folder of another repository. The shared repository does not even need a package.json; it can contain only TypeScript files. Once you connect your repository to the submodule, the submodule's code lives in a folder on your machine, but its files are not stored in your parent Git repository.

Instead, your Git repository points to a specific commit of the other Git repository. Git keeps track of which commit the submodule is linked to, which lets each parent control when it moves to an updated version of the code. It is similar to pinning an NPM semantic version, but it relies on a Git commit instead.

In practice, you should have a full-fledged package.json file so the shared code compiles and has tests on its own, but it is not required. The advantage is that you can later produce an NPM package and create tests isolated from any parent repository. A package.json file also tells the consumer which dependencies are required. The External Dependencies section discusses two different ways to work with these dependencies.

The most important thing with a submodule is that you are making a copy while staying connected to the submodule's repository, allowing you to pull an updated version in the future.

Steps to Create the SubModule Relationship

The first step is to add the submodule to the parent project. If two projects consume the shared repository, you need to run the command in both parent projects. Here, the parent is the consumer of the shared library, and the shared library is the child.

git submodule add https://github.com/MrDesjardins/demo_submodule_child ./packages/demo_submodule_child

The second step is to add an alias in the compilerOptions of tsconfig.json. It is not required, but it lets code in the src folder of the parent repository import from any submodule package easily.

"paths": {
  "@demo_submodule_child": ["packages/demo_submodule_child/src"],
  "@demo_submodule_child/*": ["packages/demo_submodule_child/src/*"]
}
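To make the alias mechanics concrete, here is a minimal sketch of how a paths entry maps an import specifier to a folder. It is a simplified model, not the TypeScript compiler's actual code: the resolveAlias helper is hypothetical, it supports a single * per pattern, it only returns the first candidate location, and the utils/math module name is made up for the example.

```typescript
// Simplified model of tsconfig "paths" matching (illustration only).
type Paths = Record<string, string[]>;

const paths: Paths = {
  "@demo_submodule_child": ["packages/demo_submodule_child/src"],
  "@demo_submodule_child/*": ["packages/demo_submodule_child/src/*"],
};

// Hypothetical helper: returns the first candidate location for a specifier,
// or undefined when no alias pattern matches (e.g. a regular npm package).
function resolveAlias(specifier: string, mapping: Paths): string | undefined {
  // A wildcard-free pattern that equals the specifier wins outright.
  if (!specifier.includes("*") && Object.prototype.hasOwnProperty.call(mapping, specifier)) {
    return mapping[specifier][0];
  }
  let best: { prefixLength: number; target: string } | undefined;
  for (const [pattern, targets] of Object.entries(mapping)) {
    const star = pattern.indexOf("*");
    const firstTarget = targets[0];
    if (star === -1 || firstTarget === undefined) continue;
    const prefix = pattern.slice(0, star);
    const suffix = pattern.slice(star + 1);
    const matches =
      specifier.length >= prefix.length + suffix.length &&
      specifier.startsWith(prefix) &&
      specifier.endsWith(suffix);
    if (!matches) continue;
    // The text captured by "*" is substituted into the target.
    const captured = specifier.slice(prefix.length, specifier.length - suffix.length);
    // Prefer the pattern with the longest prefix before the wildcard.
    if (best === undefined || prefix.length > best.prefixLength) {
      best = { prefixLength: prefix.length, target: firstTarget.replace("*", () => captured) };
    }
  }
  return best?.target;
}

console.log(resolveAlias("@demo_submodule_child", paths));
console.log(resolveAlias("@demo_submodule_child/utils/math", paths));
console.log(resolveAlias("trim", paths));
```

The two entries complement each other: the first covers an import of the package root, and the second covers imports of anything under src.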

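The third step is to preload two register modules in Node with the -r (--require) flag. The following command compiles TypeScript into JavaScript, then calls Node with the two register modules followed by the JavaScript entry file. The tsconfig-paths/register module resolves the alias from the previous step at runtime.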

"start": "npx tsc && node -r ts-node/register/transpile-only -r tsconfig-paths/register build/src/index.js"

How to Modify the SubModule Code

There are several patterns. You can clone the shared repository into a completely different folder, make a change, and push it. Then you come back to the consumer folder and pull the change. However, that is a lot of steps.

A more straightforward pattern is to modify the submodule directly inside the consuming (parent) project. You can then commit and push from the submodule folder, and the changes become available to everyone using the child repository.

Good to Know

The .gitmodules File

When git submodule add ran in your parent repository, it created a hidden file called .gitmodules at the root. In our example, it contains:

[submodule "packages/demo_submodule_child"]
  path = packages/demo_submodule_child
  url = https://github.com/MrDesjardins/demo_submodule_child
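
If you ever need to read that file from a script (for example, to list the submodules of a project), a small parser is enough. The following is a rough sketch, not an official Git API: parseGitmodules and SubmoduleEntry are hypothetical names, and the parser only understands the simple section and key = value layout shown above.

```typescript
interface SubmoduleEntry {
  name: string;
  path?: string;
  url?: string;
}

// Hypothetical helper: parses the simple .gitmodules layout shown above.
function parseGitmodules(content: string): SubmoduleEntry[] {
  const entries: SubmoduleEntry[] = [];
  let current: SubmoduleEntry | undefined;
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#") || line.startsWith(";")) continue;
    const header = /^\[submodule "(.+)"\]$/.exec(line);
    if (header) {
      current = { name: header[1] };
      entries.push(current);
      continue;
    }
    if (line.startsWith("[")) {
      // Any other section: stop attributing keys to the previous submodule.
      current = undefined;
      continue;
    }
    const pair = /^([A-Za-z][A-Za-z0-9-]*)\s*=\s*(.*)$/.exec(line);
    if (pair && current) {
      const key = pair[1].toLowerCase(); // Git config keys are case-insensitive
      if (key === "path") current.path = pair[2];
      if (key === "url") current.url = pair[2];
    }
  }
  return entries;
}

const sample = [
  '[submodule "packages/demo_submodule_child"]',
  "  path = packages/demo_submodule_child",
  "  url = https://github.com/MrDesjardins/demo_submodule_child",
].join("\n");

console.log(parseGitmodules(sample));
```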

Pushing Changes From SubModule

When you change files in a submodule folder (our packages/demo_submodule_child), git status at the root of the parent repository reports the submodule as modified alongside your own repository changes. For example, here I modified the readme.MD of the parent and the index.ts of the child:

git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)
        modified:   packages/demo_submodule_child (modified content)
        modified:   readme.MD

no changes added to commit (use "git add" and/or "git commit -a")

Running git add . at the root of the parent project only stages the files from the parent.

Running git status after the add results in:

git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   readme.MD

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)
        modified:   packages/demo_submodule_child (modified content)

To commit and push the change for the Submodule, you need to go to the Submodule folder and perform the git commands desired.

cd packages/demo_submodule_child 
git status

The output:

 /mnt/c/code/demo_submodule_parent  master ⇡1 +1 ─────────────────────────────────────────────────── base  
cd packages/demo_submodule_child 

/mnt/c/code/demo_submodule_parent/packages/demo_submodule_child  master !1 ───────────────────────── base   
git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   src/index.ts

no changes added to commit (use "git add" and/or "git commit -a")

You can see that after cd, the prompt changed, which shows that we left the parent Git repository and entered the submodule's. From here, you can run git add ., git commit, and git push origin master to make the change available to everyone.

Version

The parent records the exact submodule commit it uses. Because that commit reference is part of the submodule system, a parent can keep referring to a previous version. Hence, there is no obligation to be on the latest version of the submodule.

For example, suppose the submodule has two commits and the parent's last commit still references the first one. A git log in the submodule folder gives:

commit d9024dbcc10578f18abb1c29fa08f43929f80f6c (HEAD -> master, origin/master, origin/HEAD)

    Small change from the parent repository to the submodule

commit 795569b2fa0111b1db23424722793795a6f9ec4a

    First commit

Moving back to the root of the project (parent), we notice that, because the change was made from within the parent project, the parent knows that it is now pointing to a new version (commit). A git diff shows:

diff --git a/packages/demo_submodule_child b/packages/demo_submodule_child
index 795569b..d9024db 160000
--- a/packages/demo_submodule_child
+++ b/packages/demo_submodule_child
@@ -1 +1 @@
-Subproject commit 795569b2fa0111b1db23424722793795a6f9ec4a
+Subproject commit d9024dbcc10578f18abb1c29fa08f43929f80f6c

However, any other project that uses the submodule still refers to 795569b2fa0111b1db23424722793795a6f9ec4a, which leaves each project free to update whenever it wants.
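
The pointer move in the diff above is regular enough to extract programmatically, for instance to mention the submodule bump in a changelog. Here is a small sketch under that assumption: parseSubmodulePointerChanges is a hypothetical helper, not a Git feature, and it only looks at the Subproject commit lines of a git diff.

```typescript
interface SubmodulePointerChange {
  path: string; // submodule folder inside the parent
  from: string; // commit recorded before the change
  to: string;   // commit recorded after the change
}

// Hypothetical helper: extracts submodule pointer moves from `git diff` text.
function parseSubmodulePointerChanges(diffText: string): SubmodulePointerChange[] {
  const changes: SubmodulePointerChange[] = [];
  let path: string | undefined;
  let from: string | undefined;
  for (const line of diffText.split(/\r?\n/)) {
    const header = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
    if (header) {
      // A new file section starts: remember its path, forget the old commit.
      path = header[2];
      from = undefined;
      continue;
    }
    const removed = /^-Subproject commit ([0-9a-f]{40})/.exec(line);
    if (removed) {
      from = removed[1];
      continue;
    }
    const added = /^\+Subproject commit ([0-9a-f]{40})/.exec(line);
    if (added && path !== undefined && from !== undefined) {
      changes.push({ path, from, to: added[1] });
    }
  }
  return changes;
}

const diffOutput = [
  "diff --git a/packages/demo_submodule_child b/packages/demo_submodule_child",
  "index 795569b..d9024db 160000",
  "--- a/packages/demo_submodule_child",
  "+++ b/packages/demo_submodule_child",
  "@@ -1 +1 @@",
  "-Subproject commit 795569b2fa0111b1db23424722793795a6f9ec4a",
  "+Subproject commit d9024dbcc10578f18abb1c29fa08f43929f80f6c",
].join("\n");

console.log(parseSubmodulePointerChanges(diffOutput));
```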

Loading a Project with Submodules

The first time you clone a project (or pull a project that has a new submodule), the submodules are not automatically initialized. You must initialize them locally to fetch the submodules' Git repositories. A single command is enough, even if you have several submodules.

git submodule update --init

Execute the command at the root of your project. It reads the .gitmodules configuration and fetches the code of each submodule.

External Dependencies

If the submodule has external dependencies, then there are two different approaches.

  1. The parent declares the same dependencies as the submodule
  2. You install the dependencies inside the submodule

The first solution is attractive if you want the parent to decide the exact version of each dependency. The trade-off is that a version chosen by the parent may break the build if it contains breaking changes for the submodule's code.

The second solution is viable if you do not want to manage the dependencies of many submodules in the parent.

Solution 1: Install Child Dependencies into Parent Dependencies

You can run npm install with the file: format, or add the dependencies manually to the parent's package.json.

npm install --save file:packages/demo_submodule_child

More often than not, private repositories that need to share code have about the same dependencies. Thus, it is manageable to add the dependencies manually.
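
When you copy the dependencies by hand, a quick check can tell you what the parent still lacks. The sketch below assumes you have already loaded the dependencies sections of both package.json files into plain objects; compareDependencies is a hypothetical helper, the version ranges are made up, and ranges are compared as plain strings rather than evaluated as semantic versions.

```typescript
type Dependencies = Record<string, string>;

interface DependencyReport {
  missing: string[];    // required by the child, absent from the parent
  mismatched: string[]; // declared by both with a different range string
}

// Hypothetical helper: compares the child's dependencies with the parent's.
function compareDependencies(parentDeps: Dependencies, childDeps: Dependencies): DependencyReport {
  const report: DependencyReport = { missing: [], mismatched: [] };
  for (const [pkg, childRange] of Object.entries(childDeps)) {
    const parentRange = Object.prototype.hasOwnProperty.call(parentDeps, pkg)
      ? parentDeps[pkg]
      : undefined;
    if (parentRange === undefined) {
      report.missing.push(pkg);
    } else if (parentRange !== childRange) {
      report.mismatched.push(pkg);
    }
  }
  return report;
}

// Hypothetical dependency sections; the package versions are invented.
const parentDependencies: Dependencies = { graphql: "^16.0.0", typescript: "^4.9.0" };
const childDependencies: Dependencies = { graphql: "^15.0.0", trim: "^1.0.0" };

const dependencyReport = compareDependencies(parentDependencies, childDependencies);
console.log(dependencyReport.missing, dependencyReport.mismatched);
```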

Solution 2: Install Submodule Dependencies

With the current demo, you can move into the submodule directory and run npm install.

cd packages/demo_submodule_child/
npm install

Then npm run start in the parent works: the compilation succeeds because it finds the node_modules directory of the submodule. You can confirm it with npx tsc --traceResolution | grep "trim". The last line shows that it finds trim in demo_submodule_parent/packages/demo_submodule_child/node_modules/trim.

❯ npx tsc --traceResolution | grep "trim"
======== Resolving module 'trim' from '/mnt/c/code/demo_submodule_parent/packages/demo_submodule_child/src/index.ts'. ========
'baseUrl' option is set to '/mnt/c/code/demo_submodule_parent', using this value to resolve non-relative module name 'trim'.
'paths' option is specified, looking for a pattern to match module name 'trim'.
'baseUrl' option is set to '/mnt/c/code/demo_submodule_parent', using this value to resolve non-relative module name 'trim'.
Resolving module name 'trim' relative to base url '/mnt/c/code/demo_submodule_parent' - '/mnt/c/code/demo_submodule_parent/trim'.
Loading module as file / folder, candidate module location '/mnt/c/code/demo_submodule_parent/trim', target file type 'TypeScript'.
File '/mnt/c/code/demo_submodule_parent/trim.ts' does not exist.
File '/mnt/c/code/demo_submodule_parent/trim.tsx' does not exist.
File '/mnt/c/code/demo_submodule_parent/trim.d.ts' does not exist.
Directory '/mnt/c/code/demo_submodule_parent/trim' does not exist, skipping all lookups in it.
Loading module 'trim' from 'node_modules' folder, target file type 'TypeScript'.
Found 'package.json' at '/mnt/c/code/demo_submodule_parent/packages/demo_submodule_child/node_modules/trim/package.json'.

Word of Caution

There are some specific scenarios, such as the graphql library, which performs a runtime check to ensure that only a single copy of the library is loaded. Relying on two node_modules directories (parent + child) then causes a runtime (not compilation time) error. Hence, in some situations, solution #1 might be the only possible option.
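
One way to spot that situation early is to look for packages that resolve from more than one node_modules folder. The following is a sketch only: findDuplicatePackages is a hypothetical helper, it works on a list of resolved file paths that you collect yourself (for example, from module resolution traces), and the index.js paths in the sample are invented for the illustration.

```typescript
// Extracts the package name and its install folder from a resolved file path.
function packageLocation(filePath: string): { name: string; root: string } | undefined {
  const normalized = filePath.replace(/\\/g, "/");
  const marker = "/node_modules/";
  const index = normalized.lastIndexOf(marker);
  if (index === -1) return undefined;
  const parts = normalized.slice(index + marker.length).split("/");
  const first = parts[0];
  if (first === undefined || first === "") return undefined;
  let name = first;
  if (first.startsWith("@")) {
    // Scoped packages (@scope/name) span two path segments.
    const second = parts[1];
    if (second === undefined || second === "") return undefined;
    name = `${first}/${second}`;
  }
  return { name, root: normalized.slice(0, index + marker.length) + name };
}

// Hypothetical helper: maps each package found in several folders to those folders.
function findDuplicatePackages(filePaths: string[]): Map<string, string[]> {
  const rootsByName = new Map<string, Set<string>>();
  for (const filePath of filePaths) {
    const found = packageLocation(filePath);
    if (found === undefined) continue;
    const roots = rootsByName.get(found.name) ?? new Set<string>();
    roots.add(found.root);
    rootsByName.set(found.name, roots);
  }
  const duplicates = new Map<string, string[]>();
  rootsByName.forEach((roots, pkg) => {
    if (roots.size > 1) duplicates.set(pkg, Array.from(roots).sort());
  });
  return duplicates;
}

const resolvedFiles = [
  "/mnt/c/code/demo_submodule_parent/node_modules/graphql/index.js",
  "/mnt/c/code/demo_submodule_parent/packages/demo_submodule_child/node_modules/graphql/index.js",
  "/mnt/c/code/demo_submodule_parent/packages/demo_submodule_child/node_modules/trim/package.json",
];

console.log(findDuplicatePackages(resolvedFiles));
```

If a package such as graphql shows up in the result, installing the dependency only in the parent avoids the second copy.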

Advantages and Disadvantages

The main advantage is how easy it is to modify the code without relying on NPM. Working on a submodule is like working with the local code of your project. A significant advantage is that if you rely on VS Code to debug your code, breakpoints work because, for the VS Code debugger, that code is part of your project. Another advantage is that the submodule can use the dependencies of your project. While it is an excellent practice to compile the submodule independently with its own package.json, versioning is easier to handle when the submodule is shared between projects you own. There is also no need to rely on npm link, which can be brittle and requires compiling between changes, along with constraints such as using the same NPM version.

The main disadvantage is that it only works well up to a certain scale. Therefore, I recommend using submodules only when developing a small system with a few dependencies. However, if you have a couple of Node projects that need shared files, it is easy to set up and does not require a private NPM registry. Hence, this solution works well at a small scale for private code.

Relying on NPM to have a well-tested, isolated, and atomic library is ideal. Still, it requires a bundler, a private NPM registry, another continuous integration pipeline, and tooling to ease code modification and debugging, none of which the Git submodule solution needs.

Conclusion

Like every tool, there are situations where it is appropriate and others where it is not. If you have time constraints or infrastructure limitations, you can still share code without duplicating it by using Git submodules. At a larger scale, with many people and many projects involved, it might be wise to create a proper library if time and process allow.