How do you merge two Git repositories while preserving history?
I have developed a small experimental project A in its own Git repository. It has now matured, and I’d like to add it as a subdirectory to a larger project B, which has its own repository. How can I merge repository A into repository B without losing the history of either repository?
To merge two Git repositories while preserving history, you can use the git subtree command or the git filter-repo tool to integrate one repository into a subdirectory of another while maintaining complete commit history. The subtree approach is simpler and more widely compatible, while filter-repo offers more advanced functionality for complex repository transformations.
Contents
- Understanding Repository Merging Options
- Using Git Subtree for Repository Integration
- Alternative Methods with Git Filter-Repo
- Step-by-Step Implementation Guide
- Best Practices and Considerations
- Troubleshooting Common Issues
Understanding Repository Merging Options
When you need to merge two Git repositories while preserving complete history, you have several approaches to consider. Each method has its own advantages and trade-offs in terms of complexity, compatibility, and functionality.
The most common approaches are:
- Git Subtree - Integrates one repository as a subdirectory of another, preserving all commit history
- Git Filter-Repo - More powerful tool for rewriting repository history before merging
- Git Submodule - Links to the external repository as a reference, but doesn’t integrate history
- Manual Import with Git Archive - Exports a snapshot of the files only, so none of repository A’s commit history is carried over
For your specific use case of adding project A as a subdirectory to project B while preserving history, git subtree is often the most straightforward and effective solution.
Key Consideration: Unlike simple file copying, the subtree and filter-repo approaches preserve the complete commit history, allowing you to see the full evolution of your experimental project within the larger project structure.
Using Git Subtree for Repository Integration
The git subtree command is specifically designed to integrate one repository into a subdirectory of another while preserving all commit history. It’s part of Git’s contrib scripts and provides a clean way to merge repositories.
Basic Subtree Commands
First, ensure you have git subtree available. It ships with most Git installations, but because it lives in Git’s contrib directory, some distributions package it separately:
git subtree --help # Test if it's available
The key commands you’ll need are:
# Add repository A as a subdirectory in repository B
git subtree add --prefix=projectA <repositoryA_url> <branch_or_tag>
# Pull updates from repository A into the subdirectory
git subtree pull --prefix=projectA <repositoryA_url> <branch_or_tag>
# Push changes from the subdirectory back to repository A
git subtree push --prefix=projectA <repositoryA_url> <target_branch>
How Subtree Preserves History
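Unlike simple file copying, git subtree add keeps the original commits: it fetches repository A’s history and records a merge commit in repository B whose second parent is A’s tip and whose tree places A’s files under the chosen prefix. The imported commits themselves are unchanged (same hashes, original file paths), so git log shows them in full, while a path-limited query such as git log -- projectA/ stops at the merge commit. If you need per-path history under the new directory, rewrite the paths first with git filter-repo (described below).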
This approach creates a unified history where you can see the complete evolution of both projects within a single repository structure.
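To make the shape of that history concrete, here is a minimal sketch that builds two throwaway repositories in a temporary directory and imports one into the other. The names repoA, repoB and the file contents are illustrative assumptions, and the script assumes git subtree is installed.

```shell
#!/bin/sh
# Sketch: inspect the history produced by 'git subtree add' on throwaway repos.
set -e

# Fixed identity so the demo does not depend on global Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Helper (hypothetical): create an empty repository whose first branch is 'main'.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
}

# Repository A: two commits.
new_repo "$work/repoA"
echo "a1" > "$work/repoA/a.txt"
git -C "$work/repoA" add a.txt
git -C "$work/repoA" commit -q -m "A: first"
echo "a2" >> "$work/repoA/a.txt"
git -C "$work/repoA" commit -q -am "A: second"

# Repository B: one commit.
new_repo "$work/repoB"
echo "b1" > "$work/repoB/b.txt"
git -C "$work/repoB" add b.txt
git -C "$work/repoB" commit -q -m "B: first"

# Import A under projectA/ (a local path stands in for the real URL).
cd "$work/repoB"
git subtree add --prefix=projectA "$work/repoA" main

all_commits=$(git rev-list --count HEAD)               # A's commits + B's commit + merge
path_commits=$(git rev-list --count HEAD -- projectA/) # path-limited view
imported_tip=$(git rev-parse HEAD^2)                   # second parent of the merge
a_tip=$(git -C "$work/repoA" rev-parse main)

echo "all commits:            $all_commits"
echo "commits for projectA/:  $path_commits"
echo "second parent is A tip: $imported_tip = $a_tip"
```

The path-limited count is smaller than the full count because the imported commits still record their files at A’s original paths; only the merge commit touches projectA/.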
Alternative Methods with Git Filter-Repo
For more complex scenarios, git filter-repo offers powerful capabilities for repository history rewriting before merging. This tool is particularly useful when you need to:
- Rewrite author information or commit dates
- Filter out specific files or directories
- Change commit messages en masse
- Restructure repository layout before merging
Filter-Repo Workflow
# First, install git-filter-repo if not available
pip install git-filter-repo
# Clone repository A and rewrite its history
git clone <repositoryA_url> temp-repo
cd temp-repo
git filter-repo --to-subdirectory-filter projectA
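# Now merge the rewritten repository into repository B.
# Run these from inside repository B; ../temp-repo assumes the clone sits next to it.
cd /path/to/projectB
git remote add temp-repo ../temp-repo
git fetch temp-repo
# Replace main if repository A uses a different default branch
git merge temp-repo/main --allow-unrelated-histories
# Optional: drop the temporary remote once the merge is done
git remote remove temp-repo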
Note: Git filter-repo is more powerful but also more complex. It’s recommended for situations where you need fine-grained control over the history rewriting process.
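The merge step in the workflow above assumes repository A’s branch is called main. A small sketch of how the clone could be asked for its checked-out branch instead; the temp-repo directory here is a throwaway stand-in created by the script, and the branch name trunk is an illustrative assumption.

```shell
#!/bin/sh
# Sketch: detect the branch name of the rewritten clone instead of assuming 'main'.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Stand-in for the temp-repo clone, deliberately using a non-default branch name.
git init -q "$work/temp-repo"
git -C "$work/temp-repo" symbolic-ref HEAD refs/heads/trunk
echo "a1" > "$work/temp-repo/a.txt"
git -C "$work/temp-repo" add a.txt
git -C "$work/temp-repo" commit -q -m "A: first"

# Ask the clone which branch HEAD points at.
source_branch=$(git -C "$work/temp-repo" symbolic-ref --short HEAD)
echo "merge target would be: temp-repo/$source_branch"
```

In the real workflow, the value could then be used as git merge "temp-repo/$source_branch" --allow-unrelated-histories.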
Step-by-Step Implementation Guide
Let’s walk through a complete implementation using the git subtree approach, which is ideal for your use case.
Prerequisites
Before starting, ensure both repositories are accessible and you have write permissions:
# Navigate to your main project B repository
cd /path/to/projectB
# Verify repository status (git subtree add refuses to run with uncommitted changes)
git status
git remote -v
Step 1: Add Repository A as Subdirectory
# Add repository A as a subdirectory named 'projectA'
git subtree add --prefix=projectA https://github.com/yourusername/repositoryA.git main
# The command will automatically:
# 1. Fetch the remote repository
# 2. Create a merge commit
# 3. Add all files from repository A to the projectA/ subdirectory
# 4. Preserve all commit history from repository A
Step 2: Verify the Integration
# Check that files are in the correct location
ls -la projectA/
# View the commit history to see merged commits
git log --oneline --graph --all
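# Check that the original history is preserved: right after the add, the second
# parent of the subtree merge commit (HEAD^2) is repository A's tip.
# Note that 'git log projectA/' would list only the merge commit itself.
git log --oneline HEAD^2 | head -10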
Step 3: Push to Remote Repository
# Push the merged history to the remote. git subtree add works on the current
# branch and creates no additional branches, so this single push is enough.
git push origin main
Step 4: Ongoing Maintenance
As repository A evolves, you can pull updates:
# Pull latest changes from repository A
git subtree pull --prefix=projectA https://github.com/yourusername/repositoryA.git main
# If you make changes in projectA/ that should go back to repository A
git subtree push --prefix=projectA https://github.com/yourusername/repositoryA.git main
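Repeating the full URL on every pull is easy to get wrong, so one option is to register repository A as a named remote once. The sketch below runs the whole cycle on throwaway repositories; the remote name projectA-upstream and all paths are assumptions for illustration, and git subtree must be installed.

```shell
#!/bin/sh
# Sketch: subtree add, then pull a later upstream commit through a named remote.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
export GIT_MERGE_AUTOEDIT=no   # keep the pull from opening an editor

work=$(mktemp -d)

# Helper (hypothetical): create an empty repository whose first branch is 'main'.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
}

new_repo "$work/repoA"
echo "a1" > "$work/repoA/a.txt"
git -C "$work/repoA" add a.txt
git -C "$work/repoA" commit -q -m "A: first"

new_repo "$work/repoB"
echo "b1" > "$work/repoB/b.txt"
git -C "$work/repoB" add b.txt
git -C "$work/repoB" commit -q -m "B: first"

cd "$work/repoB"

# Register A once; a local path stands in for the real URL.
git remote add projectA-upstream "$work/repoA"
git subtree add --prefix=projectA projectA-upstream main

# Upstream gains a commit after the import.
echo "a2" >> "$work/repoA/a.txt"
git -C "$work/repoA" commit -q -am "A: second"

# Bring it into projectA/ using the remote name instead of the URL.
git subtree pull --prefix=projectA projectA-upstream main

pulled_line=$(tail -n 1 projectA/a.txt)
total_commits=$(git rev-list --count HEAD)
echo "last line of projectA/a.txt: $pulled_line"
echo "commits reachable from HEAD: $total_commits"
```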
Best Practices and Considerations
When merging repositories while preserving history, consider these important best practices:
Branch Strategy
- Consider creating a feature branch before performing the merge to isolate the work
- Test the merge in a staging environment before committing to main
- Document the merge process for future reference
Conflict Resolution
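- Expect few conflicts on the initial import, since project A’s files land in a new, previously empty subdirectory; conflicts are more likely on later git subtree pull runs if projectA/ was also edited inside repository B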
- Resolve conflicts carefully, preserving the intent of both codebases
- Test thoroughly after resolving conflicts to ensure functionality is preserved
History Management
- Keep commit messages clear and descriptive of what was merged
- Consider tagging important points in the history before major operations
- Regularly backup your repositories before performing complex operations
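The tagging and backup advice above could look roughly like this. It is a sketch on a throwaway repository; the tag name pre-projectA-import and the bundle file name are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: mark the pre-merge state with a tag and write a single-file backup.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Stand-in for repository B.
git init -q "$work/repoB"
cd "$work/repoB"
git symbolic-ref HEAD refs/heads/main
echo "b1" > b.txt
git add b.txt
git commit -q -m "B: first"

# Annotated tag marking the state before the import.
git tag -a pre-projectA-import -m "State of B before importing project A"

# Bundle every ref into one file, then check that the file is self-consistent.
git bundle create "$work/repoB-backup.bundle" --all
git bundle verify "$work/repoB-backup.bundle"

tagged_commit=$(git rev-parse "pre-projectA-import^{commit}")
head_commit=$(git rev-parse HEAD)
echo "tag points at: $tagged_commit"
```

If a local, unpushed import goes wrong, git reset --hard pre-projectA-import returns the branch to the tagged state, and the bundle can be cloned like any other repository.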
Performance Considerations
- Large repositories may take longer to merge due to history processing
- Network connectivity is crucial for remote repository operations
- Disk space requirements increase with preserved history
Troubleshooting Common Issues
Subtree Not Available
If git subtree is not available:
# For macOS with Homebrew
brew install git
# For Ubuntu/Debian
sudo apt-get install git
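# On Fedora/RHEL, subtree is a separate package
sudo dnf install git-subtree
# Or build it from the contrib/subtree directory of a Git source checkout
make -C contrib/subtree install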
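If installing git subtree is not an option, a similar result can be reached with core Git commands alone. This is not git subtree itself but the manual subtree-merge recipe (an ours-strategy merge followed by read-tree with a prefix), sketched here on throwaway repositories; the remote name projectA-src and all paths are assumptions.

```shell
#!/bin/sh
# Sketch: import repo A into projectA/ of repo B without the subtree command.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Helper (hypothetical): create an empty repository whose first branch is 'main'.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
}

new_repo "$work/repoA"
echo "a1" > "$work/repoA/a.txt"
git -C "$work/repoA" add a.txt
git -C "$work/repoA" commit -q -m "A: first"
echo "a2" >> "$work/repoA/a.txt"
git -C "$work/repoA" commit -q -am "A: second"

new_repo "$work/repoB"
echo "b1" > "$work/repoB/b.txt"
git -C "$work/repoB" add b.txt
git -C "$work/repoB" commit -q -m "B: first"

cd "$work/repoB"
git remote add projectA-src "$work/repoA"
git fetch -q projectA-src

# Start a merge that records A as a parent but keeps B's tree untouched...
git merge -q -s ours --no-commit --allow-unrelated-histories projectA-src/main
# ...then read A's tree into the index and working tree under projectA/.
git read-tree --prefix=projectA/ -u projectA-src/main
git commit -q -m "Merge repository A into projectA/"

total_commits=$(git rev-list --count HEAD)
echo "commits reachable from HEAD: $total_commits"
ls projectA
```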
Merge Conflicts
If you encounter merge conflicts:
# Check conflicted files
git status
# Resolve conflicts manually in your editor
git add projectA/path/to/conflicted/file
# Complete the merge
git commit
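For a quick list of just the conflicted paths, or to back out of the merge entirely, the following sketch may help. It manufactures a conflict in a throwaway repository; the branch and file names are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: list conflicted files during a merge, then abort the merge.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/main

echo "base" > f.txt
git add f.txt
git commit -q -m "base"

# Two branches change the same line in different ways.
git checkout -q -b other
echo "other" > f.txt
git commit -q -am "change on other"
git checkout -q main
echo "main" > f.txt
git commit -q -am "change on main"

# The merge stops with a conflict; '|| true' keeps 'set -e' from ending the script.
git merge other >/dev/null 2>&1 || true

# Paths that are still unmerged.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted files: $conflicted"

# Back out completely and return to the pre-merge state.
git merge --abort
state_after_abort=$(git status --porcelain)
```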
History Issues
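If you need to rewrite history after a subtree merge, do it before pushing, because both commands below change commit IDs:
# Interactive rebase to clean up recent commits; --rebase-merges recreates the
# subtree merge commit instead of flattening it into a linear history
git rebase -i --rebase-merges HEAD~3
# Or use git filter-repo for more complex operations.
# Caution: --path keeps ONLY projectA/ and removes everything else from history,
# so run this on a fresh clone, never on your only copy of repository B
git filter-repo --path projectA/ --force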
Remote Repository Issues
If you have trouble accessing the remote repository:
# Verify remote URL
git remote -v
# Update remote URL if needed
git remote set-url origin https://new-url.com/repository.git
# Test connectivity
git fetch origin
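To check that a repository is reachable and that the branch you plan to import exists, without fetching any objects, git ls-remote can be used. A sketch follows, with a local throwaway repository standing in for the real URL; the variable names are assumptions.

```shell
#!/bin/sh
# Sketch: probe a remote for a branch with 'git ls-remote --exit-code'.
set -e

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q "$work/repoA"
git -C "$work/repoA" symbolic-ref HEAD refs/heads/main
echo "a1" > "$work/repoA/a.txt"
git -C "$work/repoA" add a.txt
git -C "$work/repoA" commit -q -m "A: first"

remote_url="$work/repoA"   # stand-in for the real repository URL

# --exit-code makes ls-remote return non-zero when no matching ref is found.
if git ls-remote --exit-code --heads "$remote_url" main >/dev/null; then
    branch_found=yes
else
    branch_found=no
fi

if git ls-remote --exit-code --heads "$remote_url" no-such-branch >/dev/null; then
    missing_found=yes
else
    missing_found=no
fi

echo "main present: $branch_found, no-such-branch present: $missing_found"
```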
Conclusion
Merging two Git repositories while preserving history is entirely achievable using modern Git tools. For your specific use case of integrating project A into project B as a subdirectory, the git subtree approach provides the best balance of simplicity and functionality.
Key takeaways:
- Git subtree maintains complete commit history while integrating repositories as subdirectories
- The process creates a unified history where both projects’ evolution is visible
- Regular maintenance allows you to pull updates from the original repository or push changes back
- Proper planning and testing help avoid common issues like merge conflicts
Recommended next steps:
- Create a backup of both repositories before starting
- Test the merge in a local clone first
- Document the process for future reference
- Consider establishing a regular sync schedule if both repositories continue to evolve independently
By following these methods, you can successfully combine your experimental project with the larger project while preserving the valuable history that shows how your work has evolved over time.