Moving Files And History Between Git Repos

Although primarily used for revision control of source code, git repos are also often used to store a wide array of other things. For example, I use them for notes, a wiki, my website and even as an auditable means of sharing my public keys.

When troubleshooting, a revision log that shows when I changed something and (if I've written a decent commit message) why I changed it is invaluable.

However, just occasionally, you may find that you want to move a file out of one repo into another - whether to absorb it into a broader project, or simply as a result of a tidy up leading to some form of repository consolidation.

Copying files between repos with cp takes seconds, but may be something that you later come to regret: git log on the copied files will no longer reveal why you chose to write $obviously_bizarre_code in the way that you did, destroying a lot of the value that using a git repo is supposed to provide.

It's therefore desirable to preserve histories where possible. Here, I'll detail how to copy a file from one repo to another without losing its history.


Process Overview

The process itself is quite simple in principle: we're basically just going to push from the source repo into the destination.

There are, however, some extra steps involved if we want to do either of the following:

  • Push the files into a subdirectory of the destination repo
  • Only push certain files across

In practice, it's quite likely that you're going to want to do at least one of these.


Setup

To begin with, we're going to fetch a copy of the source repo. We may be making changes in it that we don't want accidentally pushed back to the origin, so it's much safer to work in a separate copy.

Clone the source repo into a directory called source_repo and then cd into it:

git clone https://example.invalid/myrepo.git source_repo
cd source_repo

Remove the origin so that anything we do here can't accidentally be pushed back:

git remote rm origin
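
As a quick sanity check, it's possible to confirm that no remotes are left before rewriting anything. The following is a minimal sketch run against a throwaway repo: the scratch directory and the demo_source name are made up for illustration, and in practice you would simply run git remote inside source_repo.

```shell
#!/bin/sh
# Sketch: confirm that a repo has no remotes left after `git remote rm origin`.
# The scratch directory and repo name below are hypothetical stand-ins.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo_source
cd demo_source
git remote add origin https://example.invalid/myrepo.git
git remote rm origin
# `git remote` prints one remote name per line, so empty output means none
remotes=$(git remote)
if [ -z "$remotes" ]; then
    echo "no remotes configured: nothing can be pushed back by accident"
fi
```

If the check prints nothing, a remote is still configured and is worth removing before carrying on.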

Organise Files: Only Push A Subset Of Files

We might only want to copy some of the files in source_repo to the destination.

If that's the case, we need to filter the history to only include relevant files and directories. Traditionally, this was done with git filter-branch, but that comes with enough headaches that git now warns about using it:

WARNING: git-filter-branch has a glut of gotchas generating mangled history
     rewrites.  Hit Ctrl-C before proceeding to abort, then use an
     alternative filtering tool such as 'git filter-repo'
     (https://github.com/newren/git-filter-repo/) instead.  See the
     filter-branch manual page for more details; to squelch this warning,
     set FILTER_BRANCH_SQUELCH_WARNING=1.

Instead, it's better to use the filter-repo plugin:

pip3 install git-filter-repo

Before proceeding though, it's important to note that either tool rewrites history, so commit hashes will very likely change.

When ready, call filter-repo, passing in the path of each file or directory that you want to retain. Paths are relative to the root of the repo, so they shouldn't start with a slash:

git filter-repo \
--path dir1/ \
--path README.md \
--path dir2/

The plugin will then work through the repo's history, removing anything that doesn't match your filters. Once it's complete, running ls should show only the files and directories that you passed to filter-repo.
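
ls only shows the current working tree, so to decide what to pass to --path (or to check the result afterwards) it can help to list every path that appears anywhere in history. The sketch below uses plain git log rather than filter-repo itself, and runs against a throwaway repo whose name and contents are invented for the example.

```shell
#!/bin/sh
# Sketch: list every path that any commit has touched, including deleted files.
# The repo, file names and identity below are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)
cd "$work"
git init -q demo_source
cd demo_source
mkdir dir1
echo one > dir1/a.txt
echo readme > README.md
git add .
git commit -q -m "add dir1 and README"
git rm -q README.md
git commit -q -m "remove README"
# An empty --format leaves only the file names; sed drops any blank lines
paths=$(git log --all --name-only --format= | sed '/^$/d' | sort -u)
echo "$paths"
```

README.md still appears in the list even though it has been deleted from the working tree, which is exactly the sort of thing that ls alone would miss.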


Organise Files: Pushing To A Subdirectory

Whether or not the repo has been filtered with filter-repo, we'll often want the files to land in a different directory within the destination repository: it's not impossible, but it's fairly unlikely, that they belong in exactly the same location.

In this example, we want everything in source_repo to appear in a directory called notes.

Achieving this is quite simple: we just need to move the files within the source repo and then commit that move:

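mkdir notes
# mv will complain that it can't move notes into itself; that's harmless.
# Note that * doesn't match dotfiles (e.g. .gitignore), so move any of
# those separately if you want them to come across too.
mv * notes
git add .
git commit -m "chore: moving files into subdir ready for repo migration"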
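
Because the move is just another commit, the earlier history is still reachable through it. As a rough illustration (using a throwaway repo, a made-up todo.txt file, and git mv in place of mv followed by git add, which has the same effect), git log --follow can be used to confirm that a file's history survives the move:

```shell
#!/bin/sh
# Sketch: check that a file's history is still visible after moving it.
# The repo, file name and identity below are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)
cd "$work"
git init -q demo_source
cd demo_source
echo "first draft" > todo.txt
git add todo.txt
git commit -q -m "add todo"
mkdir notes
git mv todo.txt notes/
git commit -q -m "chore: moving files into subdir ready for repo migration"
# --follow asks git log to keep tracking the file across the rename
commits=$(git log --follow --format=%s -- notes/todo.txt)
echo "$commits"
```

Both the original commit and the move should be listed. Without --follow, git log on the new path stops at the commit that performed the move.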

Pushing To The Remote Repo

We now have our source_repo organised, with things laid out the way that we want them to appear in the destination repo.

The next step, then, is to clone a copy of the destination repo next to the source repo:

cd ..
git clone https://example.invalid/mynewrepo.git dest_repo
cd dest_repo

Next, we're going to create a branch to push the changes into. This is good practice for a number of reasons:

  • If you're using something like GitHub, GitLab or Codeberg you can create a Pull Request as an additional record of the move
  • It allows easier rollback (just checkout main and create a new branch) if something goes wrong
  • Committing directly to master/main is not good practice

Create and check out a branch:

git branch repo-import
git checkout repo-import

Now, we're going to add the local copy of source_repo as a remote:

git remote add src ../source_repo

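With that added, we can pull history in, essentially copying the filtered files and histories into this repo. Substitute master for main if that's what the source repo's branch is called. The --no-rebase flag tells newer versions of git, which otherwise refuse to pull divergent branches until a strategy has been chosen, to merge:

git pull --no-rebase src main --allow-unrelated-histories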
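
To see the whole flow in one place, here's a self-contained sketch that builds two throwaway repos and pulls one into the other. Everything in it (the scratch directory, the file contents, the demo identity) is invented for illustration. It also passes --no-rebase, so that newer git versions merge rather than asking for a pull strategy, and --no-edit, so that the merge doesn't open an editor for the commit message.

```shell
#!/bin/sh
# Sketch: pull an unrelated history into a branch of another repo.
# Repo contents and identity are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)
cd "$work"

# Stand-in for the prepared source repo: one commit under notes/
git init -q source_repo
cd source_repo
git symbolic-ref HEAD refs/heads/main   # name the branch main regardless of config
mkdir notes
echo "remember the milk" > notes/todo.txt
git add .
git commit -q -m "add todo note"
cd ..

# Stand-in for the destination repo, with its own unrelated history
git init -q dest_repo
cd dest_repo
git symbolic-ref HEAD refs/heads/main
echo "dest" > README.md
git add .
git commit -q -m "initial commit in dest"
git checkout -q -b repo-import

git remote add src ../source_repo
git pull -q --no-rebase --no-edit --allow-unrelated-histories src main

# The imported file arrives along with the commit that created it
imported=$(git log --format=%s -- notes/todo.txt)
echo "$imported"
```

The final git log shows the commit made in the source repo, which is the history that a plain cp would have lost.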

Now, we need to push the branch to the origin so that we can create a pull request:

git push origin repo-import

Or, if you're not using pull requests, you can instead merge the branch into main / master and push that:

git checkout main
git merge repo-import
git push origin main

Assuming no errors were returned, you can now safely remove the local copies. We're still inside dest_repo at this point, so step back out of it first:

cd ..
rm -rf source_repo dest_repo