Moving Files And History Between Git Repos
Although primarily used for revision control of source code, git repos are also often used to store a wide array of other things. For example, I use them for notes, a wiki, my website and even as an auditable means of sharing my public keys.
When troubleshooting, having a revision log showing when I changed something and (if I've written a decent commit message) why is invaluable.
However, just occasionally, you may find that you want to move a file out of one repo into another - whether to absorb it into a broader project, or simply as a result of a tidy up leading to some form of repository consolidation.
Copying files between repos with cp takes seconds, but may be something that you later come to regret: git log on the copied files will no longer reveal why you chose to write $obviously_bizarre_code in the way that you did, destroying a lot of the value that using a git repo is supposed to provide.
It's therefore desirable to preserve histories where possible. In this documentation, I'll detail how to copy a file from one repo to another without losing its history.
Process Overview
The process itself is quite simple in principle: we're basically just going to push from the source repo into the destination.
There are, however, some extra steps involved if we want to do either of the following:
- Push the files into a subdirectory of the destination repo
- Only push certain files across
In practice, it's quite likely that you're going to want to do at least one of these.
Setup
To begin with, we're going to fetch a copy of the source repo. We're potentially going to be making changes in it that we don't want accidentally pushed back to the origin, so it's much safer to use a copy.
Clone the source repo into a directory called source_repo and then cd into it:
git clone https://example.invalid/myrepo.git source_repo
cd source_repo
Remove the origin so that anything we do here can't accidentally be pushed back
git remote rm origin
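It's worth confirming that the remote really has gone before changing anything. The following is a minimal sketch that uses a throwaway repo in a temporary directory as a stand-in for source_repo (the temporary path and the demo variable are purely illustrative):

```shell
# Hypothetical throwaway repo standing in for source_repo
demo=$(mktemp -d)
git init -q "$demo/source_repo"
cd "$demo/source_repo"

# Recreate the situation from the steps above: add an origin, then remove it
git remote add origin https://example.invalid/myrepo.git
git remote rm origin

# With no remotes configured, git remote -v prints nothing
remotes=$(git remote -v)
echo "remotes: '${remotes}'"
```

In the real source_repo, running git remote -v on its own is enough: empty output means there's nowhere for an accidental push to go.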
Organise files: Only Push Subset of Files
We might only want to copy some of the files in source_repo to the destination.
If that's the case, we need to filter the history to only include relevant files and directories. Traditionally, this was done with git filter-branch, but that comes with enough headaches that git now warns about using it:
WARNING: git-filter-branch has a glut of gotchas generating mangled history
rewrites. Hit Ctrl-C before proceeding to abort, then use an
alternative filtering tool such as 'git filter-repo'
(https://github.com/newren/git-filter-repo/) instead. See the
filter-branch manual page for more details; to squelch this warning,
set FILTER_BRANCH_SQUELCH_WARNING=1.
Instead, it's better to use the filter-repo plugin:
pip3 install git-filter-repo
Before proceeding though, it's important to note that using either will result in history being rewritten, with the result that commit hashes will very likely change.
When ready, call filter-repo, passing in the path of each file or directory that you want to retain. Paths are relative to the root of the repo, so they shouldn't start with a slash:
git filter-repo \
--path dir1/ \
--path README.md \
--path dir2/
The plugin will then work through the repo's history removing anything that doesn't match your filters. Once it's complete, if you run ls you should only see the files and directories that you passed into filter-repo.
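Because ls only shows the working tree, it can't tell us whether an unwanted path still lurks somewhere in the history. One way to check is to list every path that any commit has ever touched. The sketch below builds a small throwaway repo (the file names and the demo identity are made up for illustration) in which old.txt has been deleted but is still present in history:

```shell
# Identity for the throwaway commits (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q

# First commit: three files
mkdir dir1
echo a > dir1/a.txt
echo b > README.md
echo c > old.txt
git add .
git commit -q -m "add files"

# Second commit: old.txt disappears from the working tree
git rm -q old.txt
git commit -q -m "remove old file"

# Every path touched by any commit on any branch, de-duplicated
paths=$(git log --all --pretty=format: --name-only | grep . | LC_ALL=C sort -u)
echo "$paths"
```

Here old.txt still shows up in the list even though ls no longer shows it. Running the same git log pipeline after filter-repo should list only the paths that were passed in.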
Organise files: pushing to a subdirectory
Whether the repo is filtered with filter-repo or not, it's sometimes desirable to push files into a different directory within the destination repository - although not impossible, it's probably fairly unlikely that we're going to want files to be in the exact same location.
In this example, we want everything in source_repo to appear in a directory called notes.
Achieving this is quite simple: we just need to move files within the source repo and then commit that move:
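mkdir notes
# mv will complain that it can't move notes into itself, which is harmless
# Note that * doesn't match hidden files (dotfiles), so move those separately if needed
mv * notes
git add .
git commit -m "chore: moving files into subdir ready for repo migration"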
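One consequence of moving files with a commit is that git treats the move as a rename, so a plain git log on the new path stops at the move commit unless --follow is passed. The following sketch demonstrates this in a throwaway repo (todo.txt and the demo identity are hypothetical):

```shell
# Identity for the throwaway commits (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q

# Two commits of history on a file at the top level
echo "first draft" > todo.txt
git add todo.txt
git commit -q -m "add todo list"
echo "second item" >> todo.txt
git commit -q -am "extend todo list"

# Move it into the subdirectory and commit the move
mkdir notes
git mv todo.txt notes/
git commit -q -m "chore: moving files into subdir ready for repo migration"

# Count the commits git log reports for the new path, with and without --follow
without_follow=$(git log --format=%s -- notes/todo.txt | grep -c '')
with_follow=$(git log --follow --format=%s -- notes/todo.txt | grep -c '')
echo "without --follow: $without_follow, with --follow: $with_follow"
```

Without --follow only the move commit is reported; with it, the two earlier commits are included as well. If that's a nuisance, filter-repo also has a --to-subdirectory-filter option, which rewrites history so that the files appear to have always lived in the subdirectory.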
Pushing To The Remote Repo
We now have our source_repo organised, with things laid out the way that we want them to appear in the destination repo.
The next step, then, is to clone a copy of the destination repo down next to the source repo:
cd ..
git clone https://example.invalid/mynewrepo.git dest_repo
cd dest_repo
Next, we're going to create a branch to push the changes into. This is good practice for a number of reasons:
- If you're using something like GitHub, GitLab or Codeberg you can create a Pull Request as an additional record of the move
- It allows easier rollback (just checkout main and create a new branch) if something goes wrong
- Committing directly to master/main is not good practice
Create and checkout a branch
git branch repo-import
git checkout repo-import
Now, we're going to add the local copy of source_repo as a remote:
git remote add src ../source_repo
With that added, we can pull history in, essentially copying the filtered files and histories into this repo
git pull src main --allow-unrelated-histories
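To see the whole flow in miniature, the sketch below builds two throwaway repos with unrelated histories, pulls one into the other and then checks that the imported file still carries its original commit. The file names and demo identity are hypothetical, and --no-rebase and --no-edit are my additions: they make the pull merge without stopping to ask how to reconcile the branches or opening an editor for the merge message.

```shell
set -e
# Identity for the throwaway commits (hypothetical values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Stand-in for the prepared source repo: one file, already under notes/
git init -q source_repo
git -C source_repo symbolic-ref HEAD refs/heads/main
mkdir source_repo/notes
echo "first draft" > source_repo/notes/todo.txt
git -C source_repo add .
git -C source_repo commit -q -m "add todo list"

# Stand-in for the destination repo, with its own unrelated history
git init -q dest_repo
git -C dest_repo symbolic-ref HEAD refs/heads/main
echo "dest" > dest_repo/README.md
git -C dest_repo add .
git -C dest_repo commit -q -m "initial commit"

# Same steps as above: branch, add the remote, pull
cd dest_repo
git checkout -q -b repo-import
git remote add src ../source_repo
git pull -q --no-rebase --no-edit src main --allow-unrelated-histories

# The imported file should still show the commit made in the source repo
imported_log=$(git log --format=%s -- notes/todo.txt)
echo "$imported_log"
```

Running the same git log against one of the imported files in the real dest_repo is a quick sanity check before pushing anything.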
Now, we need to push to the origin, so that we can create a pull request
git push origin repo-import
Or, if you're not using such a setup, you can also merge the branch into main / master and push that:
git checkout main
git merge repo-import
git push origin main
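Assuming no errors were returned, you can now safely remove the local copies. We're still inside dest_repo, so move up a level first; the -f stops rm prompting about git's write-protected object files
cd ..
rm -rf source_repo dest_repo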