
How to Remove Git Commit History While Preserving Code

Learn how to completely remove Git commit history in GitHub while preserving your current code state. Step-by-step guides using orphan branches, git filter-branch, and repository re-initialization.


How to completely remove commit history in GitHub while preserving the current code state? I have too many unused commits in my history and want to clean it up. What Git commands can I use to achieve this, such as git filter-branch or git rebase? My repository is hosted on GitHub.

Completely removing commit history in GitHub while preserving your current code state is achievable using several Git commands, with orphan branches being the most straightforward approach. The process involves creating a fresh branch with no history, committing your current code state, and then replacing your main branch with this clean version. While tools like git filter-branch can also accomplish this, they’re more complex and require careful execution to avoid losing code permanently.


Understanding Commit History Removal

When you’re dealing with excessive commit history in your GitHub repository, you might be tempted to use git rebase to clean things up. An interactive rebase can squash commits, but it replays them one at a time, which is slow and conflict-prone across an entire history. What you really need is a method that discards all old commits in one step while keeping your current code intact.

The key concept here is that Git doesn’t immediately “delete” commits. It makes them unreachable: no branch or tag points to them any more, and garbage collection removes them later. When you remove commit history, you’re creating a new starting point for your repository, detached from all previous commits. This is why the process is described as “rewriting history,” and why the usual tool for it is an orphan branch.
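The difference between “unreachable” and “deleted” can be seen in a throwaway repository. The following is a minimal sketch, not a procedure for a real project: the sandbox directory, the file name app.txt, and the demo identity are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in any environment (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Throwaway sandbox repository; nothing here touches a real project.
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main

echo "v1" > app.txt
git add app.txt
git commit -q -m "First"
echo "v2" > app.txt
git add app.txt
git commit -q -m "Second"
old_sha="$(git rev-parse HEAD)"

# Replace main with a single parentless commit.
git checkout -q --orphan clean-history
git add -A
git commit -q -m "Initial commit with current code state"
git branch -q -D main
git branch -m main

# No branch contains the old commit, yet the object still exists.
branches_with_old="$(git branch --contains "$old_sha")"
type_before_gc="$(git cat-file -t "$old_sha")"

# Only after the reflog is expired and unreachable objects are pruned
# does the old commit disappear from the object database.
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$old_sha" 2>/dev/null; then
  after_gc="present"
else
  after_gc="gone"
fi

echo "before gc: $type_before_gc, after gc: $after_gc"
```

This only covers the local repository. Other clones and the remote keep their own copies until they are overwritten and pruned as well.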

Why would you want to remove commit history? There are several common reasons:

  • You have many experimental or test commits that are no longer relevant
  • Your repository contains sensitive information in commit messages or files
  • You want a clean, linear history for better collaboration
  • You’re starting fresh with a new project but want to keep existing code

The most important thing to understand is that you should treat this process as irreversible. Once the remote has been overwritten and the old objects have been garbage-collected, the removed commits cannot be restored from the repository. Make sure you’ve backed up any important information before proceeding.


Method 1: Using Orphan Branches (Recommended)

Creating an orphan branch is the most straightforward and reliable method for removing commit history while preserving your current code state. An orphan branch is essentially a branch that has no parent commits—it starts fresh from your current working directory.

Here’s how it works step by step:

First, create an orphan branch. This command creates a new branch with no history, based on your current working tree:

bash
git checkout --orphan clean-history

What happens when you run this? Git creates a new branch called “clean-history” that starts from your current working directory state. All your files remain exactly as they were, but the branch has zero commit history attached to it.

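Next, stage all your files. After git checkout --orphan, the index still holds the files from the previous branch, so they show up as new files waiting to be committed. Running git add -A also stages any modified, deleted, or untracked files, so the commit matches your working tree exactly:

bash
git add -A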

Now commit your current state. This will be the only commit in your new branch’s history:

bash
git commit -m "Initial commit with current code state"

The magic here is that this single commit contains all your current files and changes, but it has no parents—no connection to any previous commits in your repository.

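Finally, you need to replace your main branch (usually called “main” or “master”) with this clean version. Don’t merge the clean branch into main: a merge with --allow-unrelated-histories keeps every old commit reachable. Instead, delete the old branch, rename the clean one, and force-push:

bash
git branch -D main # or git branch -D master
git branch -m main # rename clean-history to main
git push -f origin main # overwrite the remote history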

This approach has several advantages over other methods:

  • It’s simpler and less error-prone than git filter-branch
  • It doesn’t require complex configuration
  • It creates a clean, single-commit history
  • It preserves all your current files and changes

As the Xebia guide explains, “Create an orphan branch – no history, starts from the current working tree. Stage everything (including deletions). Commit once – this becomes the sole commit. Replace the old default branch. Force-push to GitHub – overwrites the remote history.” This method is perfect for when you need a completely clean slate while keeping your actual code intact.
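To try Method 1 without risking a real project, the steps can be run against a disposable repository. This is a sketch under stated assumptions: the sandbox directory, the file app.txt, and the demo identity are hypothetical, and there is no remote, so the force-push step is left out.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"   # throwaway directory, safe to delete afterwards
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main

# Build a small history of three commits.
for i in 1 2 3; do
  echo "version $i" > app.txt
  git add app.txt
  git commit -q -m "Commit $i"
done
commits_before="$(git rev-list --count HEAD)"
tree_before="$(git rev-parse 'HEAD^{tree}')"

# Method 1: orphan branch, one commit, then swap it in as main.
git checkout -q --orphan clean-history
git add -A
git commit -q -m "Initial commit with current code state"
git branch -q -D main
git branch -m main

commits_after="$(git rev-list --count HEAD)"
tree_after="$(git rev-parse 'HEAD^{tree}')"
parents_after="$(git log -1 --format=%P)"   # empty for a root commit

echo "commits: $commits_before -> $commits_after"
```

Comparing the two tree hashes is a quick way to confirm that the files are byte-for-byte identical before and after the rewrite, even though the commit count dropped to one.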


Method 2: Using Git Filter-Branch

Git filter-branch is a more powerful but complex tool for rewriting Git history. While it can be used to remove commit history, it’s generally not recommended for this specific purpose unless you need to selectively remove certain commits while keeping others.

The git filter-branch command rewrites branches by applying a filter to each commit. That makes it suited to changing what every commit contains or how commits connect, not to discarding commits wholesale.

For example, the --parent-filter option rewrites a commit’s parent list. This form, adapted from the Git documentation, re-parents one commit and passes all others through unchanged (<commit-id> and <graft-id> are placeholders for real commit hashes):

bash
git filter-branch --parent-filter 'test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>" || cat' HEAD

This changes where part of the history attaches, but every rewritten commit is still there.

The --subdirectory-filter option is sometimes suggested for history removal as well (foodir is a placeholder directory name):

bash
git filter-branch --subdirectory-filter foodir -- --all

This rewrites history so that foodir becomes the project root and keeps only the commits that touched it. It does not collapse the history into a single commit, so neither command gives you the clean slate that an orphan branch does.

The official Git documentation on git-filter-branch opens with a warning: the command has many pitfalls that can mangle a history rewrite in non-obvious ways, and the documentation recommends an alternative tool such as git filter-repo. When using git filter-branch, you need to be extremely careful because:

  1. It’s easy to accidentally lose work if you make a mistake
  2. It can be slow for repositories with many commits
  3. It requires more configuration than orphan branches
  4. It’s overkill if you just want to remove all history and start fresh

The GitHub community guide notes that “Method 2 – Re-initialize the Repository: Clone the repository. Delete the .git directory. Re-initialize Git, add the remote, and commit. Force-push to overwrite the remote history.” This is actually a simpler alternative to git filter-branch for complete history removal.

Unless you have a specific reason to use git filter-branch (like selectively removing certain commits while preserving others), the orphan branch method is strongly recommended for completely removing commit history while preserving code state.


Method 3: Repository Re-initialization

Repository re-initialization is another effective method for completely removing commit history while preserving your current code state. This approach is particularly useful if you want to start with a completely fresh Git repository but keep all your existing files and directory structure.

Here’s how the re-initialization process works:

First, clone your repository to a temporary location if you’re working directly on the repository:

bash
git clone /path/to/your/repo temp-repo
cd temp-repo

Now, delete the .git directory. This completely removes all Git history:

bash
rm -rf .git

Re-initialize Git in the same directory:

bash
git init

Add the original remote repository:

bash
git remote add origin https://github.com/yourusername/yourrepo.git

Stage all your files:

bash
git add .

Create your initial commit:

bash
git commit -m "Initial commit with current code state"

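Finally, name the branch to match the remote’s default branch and force-push to overwrite the remote history. A fresh git init may start on master or main depending on your Git configuration, so set the name explicitly:

bash
git branch -M main # or git branch -M master
git push -f origin main # or git push -f origin master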

This method essentially creates a brand new Git repository in your existing directory, preserving all your files but completely resetting the commit history. The force push is necessary because the branch histories don’t match—your local repository now has a completely different history than the remote one.

As the GitHub gist explains, “Delete the .git directory. Re-initialize Git, add the remote, and commit. Force-push to overwrite the remote history. This removes all previous commits from the remote.”

There are some advantages to this approach:

  • It creates a completely fresh Git repository
  • It’s straightforward to understand
  • It doesn’t require complex Git commands
  • It works well for repositories with complex history issues

However, there are also some disadvantages:

  • You need to have write access to the repository
  • All previous commit history is permanently lost
  • Collaborators will need to re-clone the repository
  • Force pushes can be dangerous if not done correctly

This method is particularly useful when:

  • Your repository has a very complex or corrupted history
  • You want to start completely fresh
  • You don’t need to preserve any commit information
  • You’re the only one working on the repository

For most cases, the orphan branch method is simpler and less disruptive. But repository re-initialization can be a good alternative when you need a completely fresh start.
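The re-initialization steps can be rehearsed end to end with a local bare repository standing in for GitHub. In this sketch the bare repository remote.git, the working directories, the file app.txt, and the demo identity are all assumptions for illustration; on a real project the remote URL would be your GitHub repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"

# A local bare repository plays the role of the GitHub remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Seed the remote with three commits.
git init -q work
cd work
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
for i in 1 2 3; do
  echo "version $i" > app.txt
  git add app.txt
  git commit -q -m "Commit $i"
done
git push -q origin main
cd "$sandbox"

# Method 3: clone, drop .git, re-initialize, commit once, force-push.
git clone -q remote.git temp-repo
cd temp-repo
rm -rf .git
git init -q .
git remote add origin "$sandbox/remote.git"
git add -A
git commit -q -m "Initial commit with current code state"
git branch -M main            # make sure the branch name matches the remote
git push -q -f origin main

remote_count="$(git --git-dir="$sandbox/remote.git" rev-list --count main)"
echo "commits on remote main: $remote_count"
```

Unlike the orphan branch method, this also discards local configuration stored under .git, such as hooks and remotes, which is why the remote has to be added again.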


Important Considerations and Warnings

Before you proceed with removing commit history from your GitHub repository, there are several critical considerations and warnings you need to understand. These operations are irreversible and can have significant consequences if not performed correctly.

Irreversibility of History Removal

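Treat history removal as permanent. Locally, Git doesn’t delete the old commits right away: it makes them unreachable, and they stay recoverable through the reflog until it expires and garbage collection prunes them. Once that has happened and the remote has been overwritten, there is no supported way to bring them back. The reverse matters too if you are removing sensitive data: until the old objects are pruned everywhere, including other people’s clones and forks, the data can still be retrieved, so rotate any exposed credentials.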

If there’s any chance you might need information from old commits (like commit messages, author information, or specific changes), make sure to extract that information before proceeding. You can use commands like git log to export commit information, or git format-patch to create patches of specific changes.
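As an illustration of exporting that information, the sketch below writes a one-line-per-commit log and a set of patch files before any rewrite happens. The output names commit-log.txt and patches/, the sample file, and the demo identity are assumptions chosen for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
mkdir "$sandbox/repo"
cd "$sandbox/repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
for i in 1 2 3; do
  echo "version $i" > app.txt
  git add app.txt
  git commit -q -m "Commit $i"
done

# One line per commit: abbreviated hash, author, date, subject.
git log --format='%h %an %ad %s' --date=short > "$sandbox/commit-log.txt"

# One patch file per commit, starting from the root commit.
git format-patch -q --root -o "$sandbox/patches" HEAD

log_lines="$(wc -l < "$sandbox/commit-log.txt" | tr -d ' ')"
patch_files="$(ls "$sandbox/patches" | wc -l | tr -d ' ')"
echo "exported $log_lines log lines and $patch_files patches"
```

The log file preserves who changed what and when; the patch files preserve the changes themselves and can be re-applied later with git am if needed.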

Impact on Collaborators

When you force-push rewritten history to GitHub, you completely change the branch structure. This creates significant problems for anyone else who has cloned the repository:

  1. They can re-clone: The easiest solution for collaborators is to delete their local repository and clone it fresh from GitHub.
  2. They can rebase: If they have local commits that haven’t been pushed, they can fetch the new history and transplant only their own commits onto it with git rebase --onto origin/main <old-base> <their-branch>, where <old-base> is the last old commit their work was built on.
  3. They should not merge: Merging the old history into the new branch and pushing the result brings every old commit back, which undoes the cleanup.

The Xebia guide warns that “All previous commits are permanently lost. Collaborators must re-clone or re-base.” Make sure all collaborators are aware of what you’re planning to do so they can prepare accordingly.

Force Push Risks

Force pushing (git push -f) is a dangerous operation that overwrites the remote branch with your local history. If done incorrectly, it can:

  1. Lose recent commits: If someone else has pushed commits to the remote branch after your last pull, force pushing will overwrite those commits.
  2. Create divergent histories: If multiple people are working on the same branch, force pushing can create complex merge conflicts and divergent histories.
  3. Break CI/CD pipelines: Many continuous integration and deployment systems rely on consistent branch histories. Force pushing can break these systems.

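Before force pushing, fetch so that your remote-tracking branch is current, and prefer --force-with-lease over a plain -f. With --force-with-lease, Git refuses the push if the remote branch has moved since your last fetch, so you can’t silently overwrite someone else’s new commits. Don’t run git reset --hard origin/main at this point: it would discard the rewritten history you are about to push.

bash
git fetch origin
git push --force-with-lease origin main # or master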
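The protection that git push --force-with-lease offers over a plain force push can be demonstrated with two clones of a local stand-in remote. This is a sketch only: the “alice” and “bob” directories, the bare repository remote.git, and the demo identity are hypothetical names for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"

# Local bare repository standing in for GitHub.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# "alice" creates the first commit and pushes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
echo "v1" > app.txt
git add app.txt
git commit -q -m "First"
git push -q origin main
cd "$sandbox"

# "bob" clones and pushes a commit that alice never fetches.
git clone -q remote.git bob
cd bob
echo "bob's work" > bob.txt
git add bob.txt
git commit -q -m "Bob's commit"
git push -q origin main
cd "$sandbox/alice"

# alice rewrites history from her stale copy.
git checkout -q --orphan clean-history
git add -A
git commit -q -m "Initial commit with current code state"
git branch -q -D main
git branch -m main

# The lease compares the remote branch with alice's origin/main.
# They differ because bob pushed, so the push is refused.
if git push -q --force-with-lease origin main 2>/dev/null; then
  result="accepted"
else
  result="rejected"
fi
remote_count="$(git --git-dir="$sandbox/remote.git" rev-list --count main)"
echo "push $result; remote still has $remote_count commits"
```

A plain git push -f in the same situation would have succeeded and silently dropped the second commit, which is the failure mode described in the list above.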

GitHub Pull Requests and Issues

If you have open pull requests or issues that reference specific commits, removing those commits will break those references. The pull requests will still exist, but they may not display correctly since the referenced commits no longer exist.

Consider closing any open pull requests before rewriting history, or at least inform reviewers about what you’re planning to do.

Branch Protection Rules

If your repository has branch protection rules enabled (like requiring PR reviews or preventing force pushes), you may not be able to force push to the branch. You’ll need to temporarily disable these rules or use a different branch name.

Backup Your Work

Before performing any history rewriting operations:

  1. Create a backup of your repository
  2. Export any important commit information
  3. Verify that all important changes are in your working directory
  4. Make sure collaborators are informed

As the DEV Community guide emphasizes, “This approach creates a single new commit with only your current files, erasing all previous history (including sensitive data) but preserving your current project state.” Make sure that “current project state” includes everything you need before you erase the history.
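One way to keep a complete backup is a mirror clone, which copies every branch, tag, and commit into a bare repository. The sketch below assumes a sandbox repository and a backup path named repo-backup.git; both names, the sample file, and the demo identity are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
mkdir "$sandbox/repo"
cd "$sandbox/repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
for i in 1 2 3; do
  echo "version $i" > app.txt
  git add app.txt
  git commit -q -m "Commit $i"
done

# Full backup of all refs and objects, taken before the rewrite.
git clone -q --mirror "$sandbox/repo" "$sandbox/repo-backup.git"

# Rewrite the working repository down to a single commit.
git checkout -q --orphan clean-history
git add -A
git commit -q -m "Initial commit with current code state"
git branch -q -D main
git branch -m main

backup_count="$(git --git-dir="$sandbox/repo-backup.git" rev-list --count main)"
current_count="$(git rev-list --count main)"
echo "backup: $backup_count commits, rewritten repo: $current_count commit"
```

If the backup exists only because the history contained sensitive data, store it somewhere private, since it still contains everything the rewrite removed.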


Step-by-Step Implementation Guide

Let’s walk through a complete implementation of the recommended orphan branch method for removing commit history while preserving your current code state. This step-by-step guide will help you perform the operation safely and effectively.

Preparation Phase

Before you start, make sure you’ve completed these preparatory steps:

  1. Backup your repository: Create a backup of your current repository in case anything goes wrong.
bash
cp -r /path/to/your/repo /path/to/your/repo-backup
  2. Check your current status: Verify that your working tree contains everything you want to keep, since the new commit is built from it.
bash
git status
git log --oneline -10
  3. Update your local repository: Make sure you have the latest changes from GitHub.
bash
git fetch origin
git pull origin main # or git pull origin master
  4. Inform collaborators: If you’re working with others, let them know you’re rewriting history so they can prepare accordingly.

Implementation Steps

Now follow these steps in order:

  1. Create an orphan branch:
bash
git checkout --orphan clean-history

This creates a new branch with no history, based on your current working tree. Your files remain exactly as they were, but the branch has zero commits.

  2. Stage all files:
bash
git add -A

The index still holds the files from the previous branch, so they appear as new files. The -A flag also stages any modifications, deletions, and untracked files.

  3. Commit your current state:
bash
git commit -m "Initial commit with current code state"

This creates the only commit in your new branch’s history, containing all your current files and changes.

  4. Delete the old main branch:
bash
git branch -D main # or git branch -D master

This removes the local branch that points at the old history. Don’t merge clean-history into main instead: a merge with --allow-unrelated-histories keeps every old commit reachable.

  5. Rename the clean branch to main:
bash
git branch -m main # or git branch -m master

  6. Force push to GitHub:
bash
git push -f origin main # or git push -f origin master

The force push is necessary because you’re changing the branch history. This overwrites the remote branch with your clean history.

  7. Verify the result:
bash
git log --oneline

You should now see only a single commit with your current code state.
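The whole sequence, including the push and the final verification, can be rehearsed against a local bare repository before touching GitHub. In this sketch remote.git stands in for the GitHub remote; that name, the work directory, the sample file, and the demo identity are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"

# Stand-in for the GitHub remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

git init -q work
cd work
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
for i in 1 2 3; do
  echo "version $i" > app.txt
  git add app.txt
  git commit -q -m "Commit $i"
done
git push -q origin main

# Steps 1-3: orphan branch with a single commit.
git checkout -q --orphan clean-history
git add -A
git commit -q -m "Initial commit with current code state"

# Steps 4-5: drop the old branch and take over its name.
git branch -q -D main
git branch -m main

# Step 6: overwrite the remote branch.
git push -q -f origin main

# Step 7: verify locally and on the remote.
local_count="$(git rev-list --count main)"
local_sha="$(git rev-parse HEAD)"
remote_sha="$(git ls-remote origin refs/heads/main | cut -f1)"
echo "local commits: $local_count"
echo "remote main:   $remote_sha"
```

Checking the remote hash with git ls-remote confirms that the remote branch now points at the new root commit, not only that the local log looks clean.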

Handling Common Issues

If you encounter any issues during this process:

  • “fatal: You have unstaged changes”:
bash
git add .
git commit -m "Temporary commit"
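  • “fatal: refusing to merge unrelated histories”:
    This appears when you merge or pull the old history into the new branch. Don’t work around it with --allow-unrelated-histories, because that merge brings the old commits back. Replace the branch and force-push instead.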
  • “error: failed to push some refs to”:
    You need to use --force or -f with your push command because you’re rewriting history.
  • “Permission denied”:
    Make sure you have write access to the repository and that you’re authenticated to GitHub.

Post-Implementation Steps

After successfully removing the commit history:

  1. Update local clones: Anyone who has cloned the repository should re-clone it to get the clean history.
  2. Check CI/CD pipelines: Verify that your continuous integration and deployment systems still work correctly with the rewritten history.
  3. Update documentation: If you have documentation that references specific commits or commit messages, update it accordingly.
  4. Review branch protection: If you had branch protection rules, you may need to re-enable them now that the history is clean.

As the GitHub gist explains, “This removes all previous commits from the remote. Method 2 – Re-initialize the Repository: Clone the repository. Delete the .git directory. Re-initialize Git, add the remote, and commit. Force-push to overwrite the remote history.” The orphan branch method achieves the same result but is generally safer and more straightforward.


Best Practices for Clean Git History

After you’ve removed the commit history from your repository, it’s important to establish good practices to maintain a clean, manageable Git history going forward. These practices will help you avoid the need for drastic history cleanup operations in the future.

Commit Early and Often

One of the best ways to maintain a clean Git history is to make frequent, small commits rather than large, infrequent ones. This approach has several benefits:

  1. Better granularity: Small commits make it easier to understand what changed and when.
  2. Easier reverting: If something goes wrong, it’s easier to revert a small, specific commit than a large one with multiple changes.
  3. Better collaboration: Team members can review and merge changes more easily when they’re broken into logical units.

Instead of making one huge commit with ten different changes, make ten separate commits, each with a single logical change. This creates a clearer, more understandable history.

Write Clear Commit Messages

Good commit messages are crucial for maintaining a useful Git history. Follow these guidelines for effective commit messages:

  1. Be descriptive: Explain what changed and why, not just what you did.
  2. Keep it concise: Aim for one line (under 50 characters) for the summary, with additional details in the body if needed.
  3. Use the imperative mood: Write commands as if you’re telling Git what to do (e.g., “Add user authentication” instead of “Added user authentication”).
  4. Reference issues: Include GitHub issue numbers if applicable.

A good commit message might look like:

Fix user login authentication issue

- Update password hashing algorithm to use bcrypt
- Add rate limiting to prevent brute force attacks
- Fix session expiration handling

Closes #42

Use Branches Effectively

Branches are one of Git’s most powerful features for maintaining clean history. Use branches to:

  1. Isolate work: Create separate branches for features, bug fixes, and experiments.
  2. Review changes: Use pull requests to review changes before merging them.
  3. Try things out: Create branches for experiments without cluttering your main branch.

A good branching strategy might include:

  • main or master: Always stable, production-ready code
  • develop: Integration branch for completed features
  • feature/*: Branches for new features
  • bugfix/*: Branches for bug fixes
  • hotfix/*: Branches for emergency production fixes

Regularly Clean Up Local History

Even with good practices, your local Git history can accumulate unnecessary commits. Clean up commits you haven’t pushed yet using these techniques:

  1. Interactive rebase: Use git rebase -i to combine, reorder, or edit commits.
bash
git rebase -i HEAD~5
  2. Squash small commits: Combine multiple small commits into a single, logical commit.
bash
git rebase -i HEAD~5
# In the editor, change "pick" to "squash" for commits to combine
  3. Fixup commits: Use git commit --fixup to mark a commit that should be folded into an earlier one, then let an interactive rebase with --autosquash reorder and combine them.
bash
git commit --fixup HEAD~1
git rebase -i --autosquash HEAD~5
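The fixup workflow is easier to follow on a concrete history. The sketch below builds three commits in a sandbox, records a fix for the middle one, and folds it in. Setting GIT_SEQUENCE_EDITOR=true is an assumption made so the example runs without opening an editor; the file names and demo identity are placeholders as well.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main

echo "readme" > README.txt
git add README.txt
git commit -q -m "Add README"

echo "feature v1" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# A later correction that belongs with "Add feature" (HEAD~1 at this point).
echo "feature v2" > feature.txt
git add feature.txt
git commit -q --fixup HEAD~1

# --autosquash moves the fixup next to its target and marks it "fixup".
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

commit_count="$(git rev-list --count HEAD)"
fixups_left="$(git log --format=%s | grep -c '^fixup!' || true)"
git log --format=%s
```

After the rebase the history has three commits again, and the correction lives inside “Add feature” instead of as a separate entry.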

Avoid Committing Sensitive Information

One common reason for removing commit history is accidentally committing sensitive information like passwords, API keys, or personal data. To avoid this:

  1. Add sensitive files to .gitignore: Create a .gitignore file to exclude sensitive files and directories.
  2. Use environment variables: Store sensitive data in environment variables rather than in code.
  3. Review commits before pushing: Always check what you’re about to commit using git diff and git status.
  4. Use Git hooks: Set up pre-commit hooks to check for sensitive information.

If you do accidentally commit sensitive information, treat it as compromised: revoke or rotate the credential first, then remove it from the repository and rewrite history to remove it completely.
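A pre-commit hook for the last point in the list above might look like the following sketch. The pattern it searches for (API_KEY, SECRET, or PASSWORD followed by an equals sign) is a deliberately simple assumption and will miss many real secrets; the file settings.conf and the demo identity are also placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main

# Install a hook that inspects only the lines being added in this commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Illustrative pattern only; real projects need a broader rule set.
if git diff --cached -U0 | grep -E '^\+.*(API_KEY|SECRET|PASSWORD)[[:space:]]*=' >/dev/null; then
  echo "pre-commit: possible secret in staged changes" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# A harmless commit passes the hook.
echo "greeting=hello" > settings.conf
git add settings.conf
git commit -q -m "Add settings"

# A commit that adds something key-like is blocked.
echo "API_KEY=not-a-real-key" >> settings.conf
git add settings.conf
if git commit -q -m "Add key" 2>/dev/null; then
  outcome="committed"
else
  outcome="blocked"
fi
echo "second commit was $outcome"
```

Hooks live in .git/hooks and are not cloned with the repository, so each collaborator has to install them, and they can be bypassed with git commit --no-verify. They are a safety net, not a guarantee.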

Use .gitignore Effectively

A well-maintained .gitignore file is essential for keeping your repository clean. Include patterns for:

  1. Build artifacts: Compiled code, temporary files, etc.
  2. Dependencies: If you’re not committing dependencies, add them to .gitignore
  3. IDE settings: Editor-specific files and directories
  4. System files: .DS_Store, Thumbs.db, etc.
  5. Sensitive files: Configuration files with passwords, etc.

Example .gitignore:

# Dependencies
node_modules/
vendor/

# Build artifacts
dist/
build/
*.pyc

# IDE settings
.vscode/
.idea/
*.swp

# System files
.DS_Store
Thumbs.db

# Sensitive files
.env
config/secrets.json
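To check that patterns like these match what you expect, git check-ignore reports which paths are ignored. The sketch below uses a shortened version of the example file and a few hypothetical paths in a sandbox repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .

# A subset of the patterns from the example above.
cat > .gitignore <<'EOF'
node_modules/
*.pyc
.env
EOF

# Hypothetical files: three that should be ignored, one that should not.
mkdir -p node_modules/pkg src
touch node_modules/pkg/index.js src/app.pyc .env src/app.py

# check-ignore prints each path that matches an ignore rule.
ignored="$(git check-ignore node_modules/pkg/index.js src/app.pyc .env src/app.py)"
ignored_count="$(printf '%s\n' "$ignored" | wc -l | tr -d ' ')"

# With -q it prints nothing and signals the result through its exit status.
if git check-ignore -q src/app.py; then
  source_file="ignored"
else
  source_file="not ignored"
fi

printf '%s\n' "$ignored"
echo "src/app.py is $source_file"
```

Adding -v to git check-ignore also shows which line of which .gitignore file matched, which helps when a pattern is broader than intended.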

Regular Repository Maintenance

Even with good practices, repositories can accumulate unnecessary files and clutter over time. Regular maintenance tasks include:

  1. Pruning remote references: Remove stale references to deleted branches.
bash
git remote prune origin
  2. Garbage collection: Remove unreachable objects from your local repository.
bash
git gc --aggressive
  3. Clean up local branches: Delete merged local branches, skipping the current branch and main or master.
bash
git branch --merged | grep -vE '^\*|^ +(main|master)$' | xargs git branch -d
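The branch cleanup pipeline can be tested safely in a sandbox. In this sketch the branch names feature/done and feature/wip, the sample file, and the demo identity are hypothetical; the grep pattern also protects main and master so they are never passed to git branch -d.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity for the sandbox (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

# A finished branch that gets merged into main.
git checkout -q -b feature/done
echo "done" >> app.txt
git commit -q -am "Finish feature"
git checkout -q main
git merge -q feature/done

# A branch with unmerged work that must survive the cleanup.
git checkout -q -b feature/wip
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "Work in progress"
git checkout -q main

# Delete every merged branch except the current one and main/master.
git branch --merged \
  | grep -vE '^\*|^ +(main|master)$' \
  | xargs git branch -d >/dev/null

remaining="$(git branch --format='%(refname:short)' | sort | tr '\n' ' ')"
echo "remaining branches: $remaining"
```

Because git branch -d (lowercase) refuses to delete unmerged branches, the pipeline fails safe even if the filter lets an unexpected name through.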

By following these best practices, you can maintain a clean, manageable Git history that doesn’t require drastic cleanup operations. This will make collaboration easier and your repository more pleasant to work with.


Sources

  1. How to Remove Git Commit History While Keeping Your Main Branch Intact — Step-by-step guide for removing Git history while preserving current code: https://dev.to/documendous/how-to-remove-git-commit-history-while-keeping-your-main-branch-intact-4lk0

  2. How To Remove Git Commit History: Step-by-Step Guide For GitHub Users — Comprehensive guide with multiple methods for removing Git history: https://xebia.com/blog/deleting-your-commit-history/

  3. GitHub - Delete commits history with git commands — Practical commands and instructions for removing Git history: https://gist.github.com/heiswayi/350e2afda8cece810c0f6116dadbe651

  4. Git - git-filter-branch Documentation — Official documentation for Git’s history rewriting tool: https://git-scm.com/docs/git-filter-branch


Conclusion

Completely removing commit history in GitHub while preserving your current code state is achievable through several methods, with orphan branches being the most straightforward and recommended approach. The process involves creating a fresh branch with no history, committing your current code state, and then replacing your main branch with this clean version.

While tools like git filter-branch can also accomplish this, they’re more complex and require careful execution to avoid losing code permanently. The repository re-initialization method provides another alternative, especially useful when you need a completely fresh start.

Remember that these operations are irreversible and can have significant consequences if not performed correctly. Always backup your work, inform collaborators, and understand the risks before proceeding with history removal. The orphan branch method generally provides the best balance of simplicity and effectiveness for most use cases.
