
Keeping Local Repos in Sync With Open Source GitHub Repos

Theo Despoudis October 24, 2019

DevOps Release Collaboration

Git is a version control system widely used by organizations and millions of developers worldwide to securely and consistently manage their private codebases. In many cases, businesses adopt open-source projects for their infrastructure and library requirements. Given the convenience, it’s not an overstatement to say that we stand on the shoulders of giants.

In many cases, though, relying on an open-source library as it ships isn’t enough. Oftentimes, the people who maintain those projects don’t have time to fix existing bugs or add new features right when we want them.

This can cause friction in both the development and the incident management lifecycle. Eventually, we may find a bug in something that’s critical to our business, or need a feature that doesn’t exist yet. So we clone the project and add the fixes or missing features on top of a branch that we control.

This is the point where we could potentially end up introducing more problems – let’s dive into it.

The problem with keeping local and GitHub repositories in sync

The main problem we have when we clone an external open-source repository is that we have to maintain our own commit history when we add extra functionality. The commits we add are part of our separate history, so if the original repository ever merges different commits into the branch we’re using, then we’ll end up with a different lineage. So, there could be conflicts if we wanted to merge any updates or changes from the remote branch into our local branch.

To showcase what we mean, let’s simulate this problem by creating a repo and adding some changes:

1) Create a repo on GitHub using the UI – I named mine sync-example.

2) Clone the repo locally:

$ git clone git@github.com:theodesp/sync-example.git

3) Add a README.md file and add some text:

$ touch README.md
$ echo "example" >> README.md
$ git add .
$ git commit -m "example"
$ git push origin master

4) Once the remote repo is in sync in GitHub, go to the Project page and modify the README.md in place using the UI, and then click the Commit Changes button.

README file modification screenshot in GitHub

5) Now go back to the README.md locally and add a different modification to the same file, then try to push to master:

$ vi README.md
$ git add .
$ git commit -m "feature"
$ git push origin master
To github.com:theodesp/sync-example.git
    ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'git@github.com:theodesp/sync-example.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Typically, the remote branch would be an external repo which we don’t own, so we wouldn’t be able to push to that repo. The external repo can be updated at any time. In our local repo, we want to have a set of changes (bug fixes or features) that are not yet available in the remote repo. Thus, our local git history would be like this:

$ git log
commit __7801fa64f7d03719e0b39ac977015dfc24a33903__ (HEAD -> master)
Author: Theo <theo.despoudis@teckro.com>
Date:   Mon Sep 16 16:52:29 2019 +0100

    feature

commit c2a6a4b4ec0da69a2e3d5c6351def0cccb3730a0 (origin/master)
Author: Theo <theo.despoudis@teckro.com>
Date:   Mon Sep 16 16:47:53 2019 +0100

    Example

But the remote history would be:

$ git fetch
$ git log origin/master
commit __404aeebe02d46da222b83848a26c9e4b432a7035__ (origin/master)
Author: Theofanis Despoudis <thdespou@hotmail.com>
Date:   Mon Sep 16 16:48:45 2019 +0100

    Update README.md

commit c2a6a4b4ec0da69a2e3d5c6351def0cccb3730a0
Author: Theo <theo.despoudis@teckro.com>
Date:   Mon Sep 16 16:47:53 2019 +0100

    Example

I’ve highlighted the hashes of the two commits that conflict with each other: 7801fa6 (our local "feature" commit) and 404aeeb (the remote "Update README.md" commit). Both build on the same parent commit, c2a6a4b, so the two histories have diverged and Git rejects the push because it isn’t a fast-forward. We have to integrate the remote changes before we can push.
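
To make the divergence concrete, here is a minimal sketch that reproduces the situation in throwaway repositories and counts the commits on each side. The directory names (`upstream`, `local`) and the commit identity are assumptions made for illustration; `upstream` is only a stand-in for the repo on GitHub.

```shell
#!/bin/sh
# Sketch: measure how far a local branch and its remote counterpart have diverged.
# "upstream" and "local" are hypothetical throwaway repos, not the real sync-example.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# "upstream" plays the role of the repository on GitHub.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "example" > upstream/README.md
git -C upstream add README.md
git -C upstream commit -q -m "Example"

# Our local clone, taken before upstream moves on.
git clone -q upstream local

# Upstream gets a new commit (the edit made in the GitHub UI).
echo "- Made some modifications here" >> upstream/README.md
git -C upstream commit -q -am "Update README.md"

# We add our own commit locally.
echo "- This is my local changes" >> local/README.md
git -C local commit -q -am "feature"

# Fetch, then count the commits that exist on only one side of the fork point.
git -C local fetch -q origin
counts=$(git -C local rev-list --left-right --count master...origin/master)
ahead=$(echo "$counts" | cut -f1)    # commits only on our branch
behind=$(echo "$counts" | cut -f2)   # commits only on the remote branch
fork_point=$(git -C local merge-base master origin/master)

echo "ahead=$ahead behind=$behind fork_point=$fork_point"
```

When both counts are non-zero, the branches have diverged and a plain push will be rejected, which is the situation shown in the two logs above.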

Let’s now take a peek at how we can solve those out-of-sync problems.


The solution(s) to out of sync GitHub and local repos

Let’s see how we can overcome those issues without losing our local changes. We have a few options, each with its own pros and cons. Let’s start with the less favourable one:

Technique 1: Manually updating the new commits on top of the latest changes:

When the remote repo gets ahead of us with bug fixes that we need to include in our local repo, we can save our own commits in a patch file and re-apply them on top of a fresh clone. For example:

1) Create a patch with our latest changes:

$ git format-patch -1 HEAD
0001-feature.patch

$ cat 0001-feature.patch
From 7801fa64f7d03719e0b39ac977015dfc24a33903 Mon Sep 17 00:00:00 2001
From: Theo <theo.despoudis@teckro.com>
Date: Mon, 16 Sep 2019 16:52:29 +0100
Subject: [PATCH] feature

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 33a9488..912f505 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,4 @@
 example
+
+- This is my local changes
+- I added new features as well
--
2.20.1 (Apple Git-117)

2) Delete our repo and clone the remote repo again:

$ mv 0001-feature.patch ../
$ cd ..
$ rm -rf sync-example
$ git clone git@github.com:theodesp/sync-example.git
$ cd sync-example

3) Check if we can apply the patch with our changes:

$ git apply --check ../0001-feature.patch
error: patch failed: README.md:1
error: README.md: patch does not apply

4) Uh oh! We cannot apply the patch automatically since there are conflicts. Let’s use the --reject flag to let Git apply only the parts that have no conflicts:

$ git apply --reject ../0001-feature.patch
Checking patch README.md...
error: while searching for:
example

error: patch failed: README.md:1
Applying patch README.md with 1 reject...
Rejected hunk #1.

5) Finally, inspect the README.md.rej file and apply the other changes manually:

$ cat README.md.rej
diff a/README.md b/README.md	(rejected hunks)
@@ -1 +1,4 @@
 example
+
+- This is my local changes
+- I added new features as well

$ vi README.md

// … after fixes

$ cat README.md
example

- Made some modifications here
- This is my local changes
- I added new features as well
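
The steps above can be sketched as one script. This is only an illustration under assumptions: the `upstream`, `local` and `fresh` directories are hypothetical stand-ins for the GitHub repo, our old clone and the new clone, and upstream here rewrites the very line our patch uses as context so that the dry run fails.

```shell
#!/bin/sh
# Sketch of Technique 1: carry a local commit over to a fresh clone as a patch.
# All repos are hypothetical throwaway directories created under $work.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "example" > upstream/README.md
git -C upstream add README.md
git -C upstream commit -q -m "Example"

git clone -q upstream local
echo "- This is my local changes" >> local/README.md
git -C local commit -q -am "feature"

# Upstream rewrites the line our patch relies on as context.
echo "example (updated upstream)" > upstream/README.md
git -C upstream commit -q -am "Update README.md"

# 1) Save our latest commit as a patch file outside the repo.
patch=$(git -C local format-patch -o "$work" -1 HEAD)

# 2) Start again from a fresh clone of upstream.
git clone -q upstream fresh
cd fresh

# 3) Dry run first; only fall back to --reject when the check fails.
if git apply --check "$patch" 2>/dev/null; then
    git apply "$patch"
    status=clean
else
    # --reject applies the hunks that fit and writes the rest to *.rej files.
    # It still exits non-zero when a hunk is rejected, hence the "|| true".
    git apply --reject "$patch" || true
    status=rejected
fi

echo "status=$status"
```

Any `*.rej` files left behind still have to be merged into the working tree by hand, exactly as in step 5.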

Technique 2: Using rebase

As you can imagine, performing all of those steps every time we need to sync with the remote branch would be very cumbersome and prone to errors. Luckily for us, we can do it more quickly and with more guidance using the rebase command. Let’s follow the steps again:

1) Move our branch back one commit, re-apply our patch on top of it, and commit the result so that we have a local commit to rebase:

$ git reset HEAD~1
Unstaged changes after reset:
M	README.md

$ git apply --reject ../0001-feature.patch
Checking patch README.md...
Applied patch README.md cleanly.

$ git add README.md
$ git commit -m "Our local Features"

2) Rebase our local branch onto the remote branch:

$ git rebase origin/master
First, rewinding head to replay your work on top of it...
Applying: Our local Features
Using index info to reconstruct a base tree...
M	README.md
Falling back to patching base and 3-way merge...
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
error: Failed to merge in the changes.
Patch failed at 0001 Our local Features
hint: Use 'git am --show-current-patch' to see the failed patch

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

3) Good news! We’re now at step 5 of our previous way of handling conflicts. Now, we just fix them and run git rebase --continue:

$ cat README.md
example

<<<<<<< HEAD
- Made some modifications here
=======
- This is my local changes
- I added new features as well
>>>>>>> Our local Features

$ vi README.md
// … after fixes

$ cat README.md
example

- Made some modifications here
- This is my local changes
- I added new features as well

$ git add README.md
$ git rebase --continue

Ultimately, rebasing achieves the same result as the manual approach, but with a more streamlined workflow and clearer guidance from Git. We could also have used cherry-pick if we only wanted to apply one specific commit from the remote history on top of ours:

$ git cherry-pick 404aeebe02d46da222b83848a26c9e4b432a7035
error: could not apply 404aeeb... Update README.md
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

$ cat README.md
example

<<<<<<< HEAD
- This is my local changes
- I added new features as well
=======
- Made some modifications here
>>>>>>> 404aeeb... Update README.md
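
For contrast, here is a small sketch of a cherry-pick that applies cleanly: upstream gains two commits and we take only one of them. The repos, file names (`bugfix.txt`, `refactor.txt`) and commit messages are invented for the example.

```shell
#!/bin/sh
# Sketch: pick a single upstream commit without taking everything after it.
# Repos and file names are hypothetical and created under a temp directory.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "example" > upstream/README.md
git -C upstream add README.md
git -C upstream commit -q -m "Example"

git clone -q upstream local

# Upstream adds two commits; we only want the first one (the bug fix).
echo "fix" > upstream/bugfix.txt
git -C upstream add bugfix.txt
git -C upstream commit -q -m "Fix bug"
echo "big change" > upstream/refactor.txt
git -C upstream add refactor.txt
git -C upstream commit -q -m "Large refactor"

cd local
git fetch -q origin

# origin/master~1 is the commit just before the tip, i.e. "Fix bug".
wanted=$(git rev-parse origin/master~1)
git cherry-pick "$wanted" >/dev/null

# Show the subject of the commit that now sits on top of our branch.
git log -1 --format=%s
```

The picked change lands on our branch as a new commit with its own hash, so the later upstream commit stays out of our history until we choose to bring it in.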

Rebasing is suitable if we have two or more commits to apply on top of the remote branch and we want to keep the history in that order.
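
As a rough sketch of that multi-commit case, the script below replays two local commits on top of a newer upstream and prints the resulting order. The repos, file names and commit messages are hypothetical, and aborting on conflict is just one possible policy; in practice you would usually resolve the conflicts and run git rebase --continue instead.

```shell
#!/bin/sh
# Sketch: replay several local commits on top of the updated remote branch.
# Repos and file names are hypothetical and created under a temp directory.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "example" > upstream/README.md
git -C upstream add README.md
git -C upstream commit -q -m "Example"

git clone -q upstream local

# Two local commits that touch files upstream does not have.
echo "fix" > local/fix.txt
git -C local add fix.txt
git -C local commit -q -m "local fix"
echo "feature" > local/feature.txt
git -C local add feature.txt
git -C local commit -q -m "local feature"

# Upstream moves ahead in the meantime.
echo "- Made some modifications here" >> upstream/README.md
git -C upstream commit -q -am "Update README.md"

cd local
git fetch -q origin

# Replay our commits onto origin/master; back out cleanly if a conflict appears.
if ! git rebase -q origin/master; then
    git rebase --abort
    echo "rebase hit conflicts; resolve them by hand" >&2
    exit 1
fi

# Newest first: our two commits now sit on top of the upstream history.
git log --format=%s
```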

The Commit Strip

Commit Strip comic for open source code and repositories

(Image Source: CommitStrip)

Having to maintain a local Git repository in sync with a public open-source project is not ideal, since it substantially increases the technical debt of our codebase. In some cases where we have to offer extra functionality not provided by the original library, we have a few options that we can implement, as we described in this tutorial.

However, it doesn’t have to be that way – we can always contribute to the community by fixing bugs and adding those features that we implemented in our repository. Ultimately, actively participating in popular open-source projects will not only give us more credibility and respect in the community, but also a chance to share our values with the world.
