A simple git subtree tutorial

Here is a quick overview on how to create a git subtree.

I created two public repos to play around with:

https://github.com/CariZa/testing-subtrees-main-repo
https://github.com/CariZa/testing-subtrees-sub-repo

Repository inception

Normally you use a subtree to pull one repo into another. A “parent” repo creates a subtree inside itself, and that subtree holds a copy of the other repo’s code and history.

Use this command in your terminal to see the subtree commands:

$ git subtree -h

Using subtrees to isolate code

What I tried to do is mock a working development environment with source files, and then move just the built “dist” folder into another repo for isolated use.

Empty parent repo (testing-subtrees-main-repo):

This could be where you have your src files and then where you have your dist folder after it builds. You may then want to pull the dist folder into another repo so certain users/systems only have access to dist files.

Repo:

https://github.com/CariZa/testing-subtrees-main-repo

Created a few empty folders to mimic a complex project structure

$ mkdir dist
$ mkdir src
$ mkdir someotherstuff

Add a mock final index.html in dist:

$ touch dist/index.html
$ echo "Hello World" > dist/index.html

Push updates to parent repo:

$ git add .
$ git commit -am "Added some test folders and file"
$ git push origin master

Turn dist/ into a subtree on a second repo.

Empty sub repo (testing-subtrees-sub-repo):

Sub Repo:

https://github.com/CariZa/testing-subtrees-sub-repo

I cloned the second repo, navigated to the root of that project and added the main repo as a subtree. The “prefix” is the folder inside the current repo where the other repo’s content will be placed.

$ git subtree add --prefix=dist https://github.com/CariZa/testing-subtrees-main-repo master

This pulls the master branch of “testing-subtrees-main-repo” into a folder named “dist” in the sub repo. Because the main repo has its own dist folder, the file ends up at dist/dist/index.html.

Make a change to the sub repo, commit the change, and push it.

$ vi dist/dist/index.html

Then commit the change

$ git commit -am "Updated text"

Then push the committed change back to the parent:

$ git subtree push --prefix=dist git@github.com:CariZa/testing-subtrees-main-repo.git master

Go to the main repo and pull latest changes and you should see the same change in the main repo.

You don’t need to do anything fancy in the main repo. You should just need to run the normal “git pull origin master” to get the changes.
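
If you want to try the whole round trip without touching GitHub, here is a minimal sandbox sketch. It assumes git subtree is installed, and it swaps the two GitHub repos for throwaway local ones: main-work, main.git and sub-work are made-up names, and so is the demo identity. The empty first commit in sub-work stands in for whatever the cloned sub repo already contained, since git subtree add needs an existing commit to merge into.

```shell
#!/bin/sh
# Sandbox version of the walkthrough above, using local throwaway repos.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# "Main" repo: one commit containing dist/index.html.
git init -q main-work
cd main-work
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Demo User"           # hypothetical identity for the sandbox
git config user.email "demo@example.com"
mkdir dist
echo "Hello World" > dist/index.html
git add .
git commit -q -m "Added some test folders and file"
cd ..

# Bare copy that plays the role of the main repo on GitHub.
git clone -q --bare main-work main.git

# "Sub" repo: pull the main repo in under the dist/ prefix.
git init -q sub-work
cd sub-work
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git subtree add --prefix=dist "$sandbox/main.git" master

# The main repo's own dist folder now sits inside the prefix folder.
test -f dist/dist/index.html

# Change the file, commit, and push the subtree back to the main repo.
echo "Updated text" > dist/dist/index.html
git commit -q -am "Updated text"
git subtree push --prefix=dist "$sandbox/main.git" master

# Read the file straight out of the main repo's master branch.
main_copy=$(git --git-dir="$sandbox/main.git" show master:dist/index.html)
echo "$main_copy"
```

The last command reads dist/index.html from the main repo’s master branch, which is where the change made in the sub repo should now show up.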

Using git subtree to deploy a dist folder

I have a simpler version here 🙂 http://www.yeahshecodes.com/git/a-simple-git-subtree-tutorial

Defining the problem

We have to deploy code from within a VPN environment. This has meant we are reliant on a proxy to access external links. This has proven to be messy, and we often spend over a day trying to deploy a project from within the VPN and attempting to connect to external resources (npm, bower, github etc).

Additionally, our VPN users have a capped data allowance.

Some problems we had to solve for:

  • We need to be able to deploy code without relying on a proxy and without getting restricted by a data allowance.
  • We need to be able to deploy to more than one environment, eg a staging environment and a production environment.
  • We need to deploy a “dist” folder that is listed in .gitignore, so it is never pushed to our repo on GitHub, only to the server we want to deploy from.
  • We did not want to use tcp to just copy files over, we wanted something more structured than that.
  • We need an Ansible script that automates the process (not covered in this post).

Deciding on a solution

Git subtrees sounded like a good solution. I glanced over basic tutorials and skim read through some resources.

I went through a few attempts to use subtrees and realised no one pointed out how NOT to use subtrees, and this left me spending a considerable amount of time doing the wrong thing. So here is a quick run down of what not to do. And then what you can do instead.

Wrong approach

I have to emphasise that this proved to be the wrong way to use subtrees: moving a specific folder into a subtree within the same repo AND into an external repo.

I added two destinations to my origin remote, so that when I committed, my changes in the “dist” folder were pushed to a subtree on my current repo and to a repo on a different server.

[Diagram: wrong approach]

After a lot of struggling I realised this was the wrong approach.

Better approach

From the origin repo, push a specific folder into a subtree on an external repo only.

Have two remotes: one is “origin” and one is “server”. When I push the subtree, it goes to “server”; when I push my working/source code, it goes to origin (aka GitHub).

Both remotes can be hosted on GitHub. In our case we have one repo hosted on our GitHub account and one bare git repo on a private server we manage.

I set up the bare repo on a private server that can only be accessed while you are logged in on the VPN.

[Diagram: better approach]

The approach in motion

Setting up a bare git repo

On the server, go to the folder where you want to keep your bare repos.

Hint: the command “pwd” will tell you the file path where you currently are.

Then run this command:

git init --bare REPONAME.git

Replace REPONAME with the name you want to give the repo.

This creates a new folder inside the folder you are currently in.

Eg /REPONAME.git/

Then go into that folder:

cd REPONAME.git

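And run:

git symbolic-ref HEAD refs/heads/BRANCHNAME

Replace BRANCHNAME with the branch the bare repo should treat as its default, for example the branch you will push the subtree to. This sets the HEAD of the repo (else you will keep getting an error when you try to use this bare repo). Run without the second argument, the command only prints the ref that HEAD currently points to.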
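
To see the difference between setting and reading HEAD, here is a small sketch you can run in a scratch folder. The temp folder is an assumption for the demo, and staging-dist is simply the branch name used later in this post; use your own.

```shell
#!/bin/sh
# Create a bare repo in a scratch folder and point its HEAD at a chosen branch.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare REPONAME.git
cd REPONAME.git

# With two arguments, symbolic-ref SETS what HEAD points to.
git symbolic-ref HEAD refs/heads/staging-dist

# With one argument, it only READS the current value back.
current_head=$(git symbolic-ref HEAD)
echo "$current_head"

# Confirm the repo really is bare (prints "true" or "false").
is_bare=$(git rev-parse --is-bare-repository)
echo "$is_bare"
```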

Preparing to use git subtree

Now you have a location for the subtree. As mentioned, you can also use a GitHub-hosted repo as your location; in that case, copy its url.

In our case, because we are using a private server for our subtree, we will use ssh to access the location:

[ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git

You will need to replace the following values with your own:

  • ssh_username
  • ip_address_or_domain
  • /file/path/to/repo/from/root/
  • REPONAME.git

Add a new remote

git remote add [remote-name] [location]

eg

git remote add server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git

Updating an existing remote

If you have already added the remote and want to change its url (not origin, but the separate remote used for the subtree), use this command:

git remote set-url [remote-name] [location]

e.g.

git remote set-url server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git

I have chosen to call the remote “server”. You can call it whatever you choose, just not origin, as origin is the original GitHub project you are pushing from.

If you type this to see your current remotes:

git remote -v

you will see your current origin values pointing to your current repo.

Add the server remote (if it already exists, use set-url instead of add):

git remote add server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git

Replacing the values as mentioned above.

Now you can run this again:

git remote -v

And you will see your origin and your server locations.

Do not set the remote origin to two locations (this is a mistake I made at first). You should not use your existing origin for two different locations (in this case at least). Remote origin should only point to your current GitHub repo (not the subtree repo).
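
Here is a quick sketch of the two-remote setup in a scratch folder. The origin url is a made-up placeholder and a local bare repo stands in for the private server, so nothing here touches a real machine.

```shell
#!/bin/sh
# Two separate remotes: origin for source code, server for the subtree.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare server.git     # local stand-in for the private server repo
git init -q project
cd project

# Hypothetical GitHub url, only here so that origin exists in the demo.
git remote add origin https://github.com/example/project.git

# Add the second remote, then show how its url would be changed later.
git remote add server "$sandbox/server.git"
git remote set-url server "$sandbox/server.git"

git remote -v

server_url=$(git remote get-url server)
remote_count=$(git remote | grep -c .)
```

Each remote keeps exactly one url, which is the point: origin never needs to know about the subtree location.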

Some notes on subtrees

The environments are now prepared for subtrees.

In your project decide what folder you want to move into the subtree. In our case we wanted to use “/dist” in the root of our repo.

If the folder sits deeper in your project structure, reference it by its path from the root of the repo. Eg “files/app/dist”

I’ve seen a youtube video of a wordpress project referencing a plugin using a subtree. This meant they pointed to the wordpress plugins folder and then the specific plugin they wanted to subtree into the project. “wp-content/plugins/plugin-folder”

Using git subtree

If you type in this command, you will see the following help information:

git subtree -h

usage: git subtree add   --prefix=<prefix> <commit>
   or: git subtree add   --prefix=<prefix> <repository> <ref>
   or: git subtree merge --prefix=<prefix> <commit>
   or: git subtree pull  --prefix=<prefix> <repository> <ref>
   or: git subtree push  --prefix=<prefix> <repository> <ref>
   or: git subtree split --prefix=<prefix> <commit...>

    -h, --help            show the help
    -q                    quiet
    -d                    show debug messages
    -P, --prefix ...      the name of the subdir to split out
    -m, --message ...     use the given message as the commit message for the merge commit

options for 'split'
    --annotate ...        add a prefix to commit message of new commits
    -b, --branch ...      create a new branch from the split subtree
    --ignore-joins        ignore prior --rejoin commits
    --onto ...            try connecting new tree to an existing one
    --rejoin              merge the new branch back into HEAD

options for 'add', 'merge', and 'pull'
    --squash              merge subtree changes as a single commit

The main commands to keep in mind:

git subtree add --prefix [path_to_folder] [remote] [remote-branch]
git subtree pull --prefix [path_to_folder] [remote] [remote-branch]
git subtree push --prefix [path_to_folder] [remote] [remote-branch]

Eg (remember I called the remote “server” above, you can call it anything you prefer):

git subtree add --prefix dist server staging-dist
git subtree pull --prefix dist server staging-dist
git subtree push --prefix dist server staging-dist

I have chosen to push to branches using the naming convention “origin-branch”-“folder”. “staging-dist” is the name of the subtree, but it is also the branch that the subtree will exist in.

So the branches I would subtree from would be staging (for staging deploys) and master (for production deploys)
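
If you script your deploys, the naming convention can be derived instead of typed. This is only a sketch: subtree_branch is a helper name I made up, and it assumes you are on a branch (not a detached HEAD) and that the folder name contains no slashes.

```shell
#!/bin/sh
set -e

# Prints "<current-branch>-<folder>", e.g. staging-dist.
subtree_branch() {
  printf '%s-%s\n' "$(git symbolic-ref --short HEAD)" "$1"
}

# Demo in a scratch repo whose current branch is "staging".
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/staging

target=$(subtree_branch dist)
echo "$target"
```

The result could then be used as the last argument of git subtree push, in place of a hard-coded branch name.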

One of the problems we had to solve using this approach was not deploying “dist” to our main repo. With subtrees, you can deploy code there without deploying that code to your main repo.

From within the project, “dist” is added to .gitignore

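Then when your dist folder is ready for deploy (usually after running “grunt build”) you can force-add dist using -f.

If the subtree branch already exists on the server and you do not have a dist folder locally yet, you can add the subtree first. (git subtree add refuses to run when the prefix folder already exists, and git subtree push creates the remote branch on the first push, so for a fresh setup you can skip this step.)

git subtree add --prefix dist server staging-dist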

Then add your dist files and commit them (do not push to your current branch, only push to the subtree):

git add dist -f
git commit -am "Added the dist files"

Then push to the subtree:

git subtree push --prefix dist server staging-dist

If you have an existing subtree with files you will need to pull first by running:

git subtree pull --prefix dist server staging-dist

Again, just replace the values as you need to (mentioned above).
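
Putting those steps together, here is a minimal sandbox sketch of the deploy flow. It assumes git subtree is installed and uses a local bare repo in place of the private server; src/app.js, the demo identity and the folder names are invented for the example, and a plain echo stands in for the real build step.

```shell
#!/bin/sh
# Ignored dist folder, force-added locally, pushed to the server as a subtree.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare server.git            # stand-in for the private server repo

git init -q project
cd project
git symbolic-ref HEAD refs/heads/staging
git config user.name "Demo User"         # hypothetical identity for the sandbox
git config user.email "demo@example.com"

# Source files are tracked, dist is ignored.
mkdir src
echo "console.log('hi');" > src/app.js
echo "dist" > .gitignore
git add .
git commit -q -m "Source files only"
git remote add server "$sandbox/server.git"

# Pretend build step, then force-add the ignored folder and commit it.
mkdir dist
echo "Hello World" > dist/index.html
git add -f dist
git commit -q -m "Added the dist files"

# Push only the contents of dist/ to the staging-dist branch on the server.
git subtree push --prefix dist server staging-dist

# List what actually landed on the server branch.
server_files=$(git --git-dir="$sandbox/server.git" ls-tree -r --name-only staging-dist)
echo "$server_files"
```

The listing at the end shows the files on the server branch: the contents of dist sit at the root of that branch, and the source files are not there at all.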

Deploying from the subtree

So now we have code in a subtree, we just need to push those files to the location you need them. I won’t go into how to setup nginx to point to this location, that will be for another post, but if you do want to learn how to do that just google using nginx for file management.

If you want your files to be copied to the right location automatically whenever you push to the subtree, you can read through this DigitalOcean post:

https://www.digitalocean.com/community/tutorials/how-to-set-up-automatic-deployment-with-git-with-a-vps

I would not recommend that approach for a production project, but perhaps for a staging one. I will always enforce a manual deploy step for production using a system like jenkins.

For this project I created an ansible script that would be run by jenkins in order to deploy.

- name: Checkout git repo into destination
  git: repo=/file/path/to/repo/from/root/{{repo_name}}
       dest=/file/path/to/destination/from/root/{{repo_dest}}
       version={{github_branch}}
       force=yes

This script makes sure the files in the repo, on the subtree branch (e.g. staging-dist), get checked out to the location required.
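
If you do not use Ansible, roughly the same result can be had with plain git commands. This sketch is my reading of what the task above asks for (check out one branch into a destination and discard local changes), not a description of how Ansible does it internally. The seed repo, the www folder and the demo identity are invented so the example can run in a scratch folder.

```shell
#!/bin/sh
# Check out the staging-dist branch of a bare repo into a destination folder.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the bare server repo that already has a staging-dist branch.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/staging-dist
git config user.name "Demo User"          # hypothetical identity for the sandbox
git config user.email "demo@example.com"
echo "Hello World" > index.html
git add .
git commit -q -m "Added the dist files"
cd ..
git clone -q --bare seed server.git

# First deploy: clone only the branch we want into the destination.
git clone -q --branch staging-dist server.git www

# Later deploys: fetch, then throw away any local edits (like force=yes).
echo "local edit" > www/index.html
git -C www fetch -q origin staging-dist
git -C www reset -q --hard origin/staging-dist

deployed=$(cat www/index.html)
echo "$deployed"
```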

Questions?

This process took me a little while to wrap my head around, especially going down the wrong path at first and then going back and rethinking the solution.

The solution should always feel like it fits. And this second approach fitted for our project.

This post focused on subtrees, but the overall solution required ansible, nginx, a private server, git and github.

If you have any questions (or suggested improvements) feel free to pop them in the comments section.

Onwards, to more solutions.

Configuring git to work through a proxy

Edit the .gitconfig file on Windows to work with a proxy

Open your .gitconfig file. You can find this file in your C:/Users/[your user] folder.

Edit the .gitconfig file on Ubuntu/Linux to work with a proxy

Open your .gitconfig file. You can find this file in your home ~/ folder.

You can open it from the command line (no sudo needed, the file belongs to your user):

nano ~/.gitconfig

Edit the .gitconfig file

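Insert the following lines. Git reads both settings from the [http] section, which also covers https:// urls; it has no [https] section, so proxy or sslVerify lines placed there are ignored.

[http]
proxy = http://username:password@proxyurl:port
sslVerify = false

Setting sslVerify to false switches off certificate checking, so only keep that line if your proxy breaks SSL verification.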

Username would be your proxy username.

Password would be your proxy password.

Proxyurl would be the proxy url.

Port would be the proxy port.

View your git settings

You can check your git settings in the command line by running the following:

git config -l
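
The same settings can also be written with git config instead of editing the file by hand. In this sketch the values go into a throwaway file via --file so your real settings stay untouched; swap --file "$cfg" for --global to write to your own .gitconfig. The proxy value is the same placeholder as above, not a working address.

```shell
#!/bin/sh
# Write the proxy settings into a scratch config file and read them back.
set -e
cfg=$(mktemp)

git config --file "$cfg" http.proxy "http://username:password@proxyurl:port"
git config --file "$cfg" http.sslVerify false

# List everything stored in the scratch file.
git config --file "$cfg" -l

proxy=$(git config --file "$cfg" http.proxy)
ssl_verify=$(git config --file "$cfg" http.sslVerify)
```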

Commit just one file to git

Run this command if you would like to commit just one file to git:

git commit -m 'commit comment' file-path

With ‘commit comment’ being the commit message you want to add. And ‘file-path’ being the path to your file and the file name.
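
A small sketch to show the effect, run in a scratch repo: two files are changed, only one is committed, and the other stays modified. The file names and the demo identity are made up for the example.

```shell
#!/bin/sh
# Commit one of two modified files and leave the other one untouched.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q demo
cd demo
git config user.name "Demo User"          # hypothetical identity for the sandbox
git config user.email "demo@example.com"

echo "one" > a.txt
echo "one" > b.txt
git add .
git commit -q -m "Initial commit"

# Change both files, but commit only a.txt.
echo "two" > a.txt
echo "two" > b.txt
git commit -q -m 'Update a.txt only' a.txt

# Files that still have uncommitted changes.
still_modified=$(git diff --name-only)
echo "$still_modified"
```

Note that this works for files git already tracks; a brand new file needs a git add first.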