What Is Git? A Beginner’s Guide to Version Control

I’ve spent countless hours helping devs understand what Git is and how it can transform their workflows. Git isn’t just another tool. It’s a distributed version control system that has fundamentally changed how we build software.

Created by Linus Torvalds (the same genius behind Linux), Git tracks changes in your source code during software development.

Unlike older systems, Git gives each developer a complete copy of the repository history on their local machine. This means you can work offline, commit changes, and only push to a remote repository when you’re ready.

Git excels at branching and merging, letting teams work on multiple features simultaneously without stepping on each other’s toes. This is why it’s become essential in modern software engineering and collaborative development.

Whether you’re a solo coder or part of a massive team using GitHub, GitLab, or Bitbucket, understanding Git basics will make you more efficient. From creating a simple commit to resolving complex merge conflicts, this guide covers everything you need to know.

Want to implement continuous integration in your DevOps pipeline? Git’s got you covered. Need to track down who introduced a bug and when? Git’s powerful version tracking makes this simple.

Let’s dive into the essentials of Git and transform how you manage your codebase.

What is Git?

Git is a version control system. It’s open-source and widely used by developers for managing source code.

Whether you’re dealing with a single project or multiple repositories, Git has options for branching, merging, and collaboration. It tracks your code’s history from the first commit through every release.

Core Concepts of Git

How Git Thinks About Data

Snapshots vs. changes

Unlike older version control systems that track line-by-line changes, Git takes snapshots of your entire project each time you commit. Think of it like a series of complete project photos rather than just noting what changed.

I’ve seen this approach make branching and merging much smoother in real-world projects. When I worked on a payment processing system with 6 developers, we could branch freely without the merge headaches common in older systems.

# In old VCS: Track that line 5 changed from X to Y
# In Git: Store a complete snapshot of all files at commit time

Because every clone holds the full set of snapshots locally, most operations never touch the network, which is why Git works offline and feels so fast.

The Git object model

Git’s repository is built on four types of objects that work together:

  • Blobs: Store file contents (like the actual code)
  • Trees: Directory listings that point to blobs or other trees
  • Commits: Snapshots that point to trees, plus metadata
  • Tags: Named references to specific commits (often used for releases)

These objects are identified and linked by their hashes (SHA-1 by default), forming the commit graph that is your project history. When using GitHub or GitLab, you’re working with these same objects behind the scenes.
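To make the four object types less abstract, here is a minimal sketch that builds a throwaway repository and asks Git what kind of object sits behind each name. The file name app.js, the tag v0.1, and the demo identity are placeholders I made up for illustration.

```shell
set -e
repo=$(mktemp -d)                       # throwaway demo repository
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo 'console.log("hi");' > app.js
git add app.js
git commit -q -m "Add app.js"
git tag -a v0.1 -m "First tagged snapshot"   # -a creates an annotated tag object

commit_type=$(git cat-file -t HEAD)          # the commit itself
tree_type=$(git cat-file -t 'HEAD^{tree}')   # the directory listing it points to
blob_type=$(git cat-file -t HEAD:app.js)     # the file contents
tag_type=$(git cat-file -t v0.1)             # the annotated tag
echo "$commit_type $tree_type $blob_type $tag_type"
```

Swapping -t for -p in any of these calls prints the object's contents instead of its type, which is a handy way to explore how commits point at trees and trees point at blobs.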

The Three States in Git

Modified state

When you first change a tracked file in your working directory, Git sees it as modified but not yet staged for the next commit.

I can check which files are in this state using git status. These changes exist only on my computer at this point – my teammates won’t see anything until I commit and push.

Staged state

This middle ground is where Git prepares changes for the next commit. It’s like a loading dock where you gather exactly what you want to include.

For example, if I’ve changed 5 files but only want to commit 3 of them:

# Add specific files to the staging area
git add file1.js file2.js file3.js

# Or add all changes in the current directory
git add .

The staging area (also called the index) gives you control over which changes become part of your project’s permanent history.
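As a rough illustration of the "5 changed, 3 committed" scenario, the sketch below edits five files in a scratch repository, stages three, and counts what ended up on each side of the index. The file names and demo identity are assumptions for the example.

```shell
set -e
repo=$(mktemp -d)                       # throwaway demo repository
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

for n in 1 2 3 4 5; do echo "v1" > "file$n.js"; done
git add .
git commit -q -m "Add five files"

# Edit all five files, but stage only three of them
for n in 1 2 3 4 5; do echo "v2" >> "file$n.js"; done
git add file1.js file2.js file3.js

staged=$(git diff --cached --name-only | wc -l | tr -d ' ')   # index vs last commit
unstaged=$(git diff --name-only | wc -l | tr -d ' ')          # working tree vs index
echo "staged: $staged, still unstaged: $unstaged"
```

A commit at this point would record only the three staged files; the other two edits stay in the working directory for a later commit.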

Committed state

Once you run git commit, your staged changes become part of the repository history. Each commit gets a unique identifier (a SHA-1 hash) and includes:

  • Who made the change (author)
  • When it happened (timestamp)
  • Why it happened (commit message)
  • A pointer to the parent commit (or two parents, for a merge commit)

Here’s a real-world example from a team I worked with at a fintech startup:

# Good commit message with context
git commit -m "Add two-factor authentication to login flow"

# Bad commit message (avoid this!)
git commit -m "Fixed stuff"

Good commit messages make it much easier to use git log later when troubleshooting issues in your codebase.
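To see the metadata listed above on a real commit, here is a small sketch that makes two commits in a scratch repository and reads the fields back with git log format placeholders. The identity and file name are invented for the demo.

```shell
set -e
repo=$(mktemp -d)                       # throwaway demo repository
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
first=$(git rev-parse HEAD)             # hash of the first commit

echo "more" >> notes.txt
git commit -q -am "Add two-factor authentication to login flow"

author=$(git log -1 --format='%an <%ae>')   # who
when=$(git log -1 --format='%ad')           # when
subject=$(git log -1 --format='%s')         # why (summary line)
parent=$(git log -1 --format='%P')          # pointer to the parent commit
echo "$author | $when | $subject | parent $parent"
```

The parent hash printed for the second commit is the hash of the first one, which is how Git chains snapshots into a history.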

Git’s File System Structure

Working directory

This is your project folder where you actually edit files. Some files here might be tracked by Git, while others might be ignored through .gitignore.

When integrating with continuous integration systems, I often add build artifacts and dependency folders to .gitignore:

# Example .gitignore entries
node_modules/
build/
.env

This keeps your repository clean and focused on just source code.

Staging area (Index)

The staging area holds all the changes you’ve marked to include in your next commit using git add. It’s stored as a file in the .git directory.

I use this like a checkpoint system. When implementing a complex feature in an agile development workflow, I might:

  1. Write the core function
  2. Stage and commit that
  3. Write the tests
  4. Stage and commit those
  5. Add the UI components
  6. Stage and commit those

This creates a clean history that’s easier to review in pull requests.

Git repository

The .git directory contains your complete project history and configuration. It’s what gets copied when you git clone from a remote repository.

Inside this directory, Git stores:

  • All commits
  • Branches (like main, develop, feature branches)
  • Configuration settings
  • Remote server information

When working in a team collaboration environment, each developer has their own complete copy of this repository, making Git truly distributed.

Setting Up and Using Git

Installing and Configuring Git

Downloading and installing Git

Getting Git on your system is straightforward:

  • Windows: Download the installer from git-scm.com
  • Mac: Install via Homebrew with brew install git or download the package
  • Linux: Use your package manager (sudo apt-get install git for Ubuntu)

I recommend also installing a visual Git GUI client like GitKraken or SourceTree if you prefer graphical interfaces alongside the command line interface.

Initial setup (git config, setting up username and email)

Before you make your first commit, tell Git who you are:

git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

These details get attached to every commit you make. When using GitHub or Bitbucket, the email should match one registered on your account so your commits are linked to your profile.

For team projects, I also configure line ending behavior:

# On Windows
git config --global core.autocrlf true

# On Mac/Linux
git config --global core.autocrlf input

This prevents annoying line ending issues when your team uses different operating systems.

Initializing a Repository

Creating a new repository (git init)

Starting from scratch? Navigate to your project folder and run:

git init

This creates a hidden .git folder that tracks everything. I used this approach when converting an existing legacy project to use source code management for the first time.

After initializing, I typically create an initial commit with the basic project structure:

git add .
git commit -m "Initial project structure"

Cloning an existing repository (git clone)

More commonly, you’ll start by cloning an existing project:

git clone https://github.com/username/repository.git

This downloads the entire project history and sets up a remote named origin that points back to the URL you cloned from.

If you’re working with a large codebase, you can save time with a shallow clone:

git clone --depth=1 https://github.com/username/repository.git

This gets just the latest version without the full history, which is useful for large open source development projects when you just need the current code.
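If you want to see the difference without downloading anything, the sketch below clones a small local repository twice, once in full and once with --depth=1, and counts the commits in each. It uses a file:// URL because Git ignores --depth for plain local paths; the directory names are placeholders.

```shell
set -e
work=$(mktemp -d)                       # scratch area for the demo
cd "$work"
git init -q source
cd source
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
for n in 1 2 3; do
  echo "change $n" >> history.txt
  git add history.txt
  git commit -q -m "Change $n"
done
cd ..

git clone -q "file://$work/source" full
git clone -q --depth=1 "file://$work/source" shallow

full_count=$(git -C full rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "full clone: $full_count commits, shallow clone: $shallow_count commit"
```

If you later need the rest of the history in a shallow clone, git fetch --unshallow retrieves it.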

Managing Changes in Git

Checking file status (git status)

The command I use most frequently is:

git status

This shows:

  • Which files are modified but not staged
  • Which files are staged for commit
  • Which files are not being tracked

I run this constantly throughout development to keep track of where I am in the Git workflow.

Adding changes to the staging area (git add)

When your changes are ready to commit, add them to the staging area:

# Add a specific file
git add filename.js

# Add all new, modified, and deleted files under the current directory
git add .

# Add parts of files interactively
git add -p

The interactive mode (-p) is particularly useful when you’ve made multiple unrelated changes to a single file but want to commit them separately.

Committing changes (git commit)

With changes staged, create a permanent record in your history:

git commit -m "Add user authentication feature"

For more complex changes, I skip the -m flag to open a text editor where I can write a more detailed commit message:

Add user authentication feature

- Implement JWT token generation and validation
- Create login and registration endpoints
- Add password hashing with bcrypt
- Set up refresh token rotation

Relates to issue #45

Detailed messages like this make your project history much more valuable when onboarding new team members to your software development project.

Ignoring Files in Git

Using .gitignore

Some files shouldn’t be in version control:

  • Build artifacts
  • Dependencies (node_modules, vendor)
  • User-specific settings
  • Environment files with secrets
  • Large binary files

Create a .gitignore file in your project root:

# Dependencies
node_modules/
vendor/

# Build directory
dist/
build/

# Environment variables and secrets
.env
.env.local
config.secret.js

# OS files
.DS_Store
Thumbs.db

# IDE settings
.vscode/
.idea/

This helps keep your repository clean and focused on just the important source code.
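A quick way to confirm that a .gitignore does what you expect is git check-ignore. The sketch below uses a trimmed-down version of the patterns above in a scratch repository; the left-pad folder and the secret value are made up for the demo.

```shell
set -e
repo=$(mktemp -d)                       # throwaway demo repository
cd "$repo"
git init -q

cat > .gitignore <<'EOF'
node_modules/
build/
.env
EOF

mkdir -p node_modules/left-pad src
echo "module.exports = 1;" > node_modules/left-pad/index.js
echo "SECRET=example" > .env
echo "console.log('app');" > src/app.js

# check-ignore prints only the paths that match an ignore rule
ignored=$(git check-ignore node_modules/left-pad/index.js .env src/app.js || true)
# what Git still sees as untracked (column 4 onward is the path)
visible=$(git status --porcelain --untracked-files=all | cut -c4-)

echo "ignored:"; echo "$ignored"
echo "visible:"; echo "$visible"
```

Adding -v to git check-ignore also shows which file and line produced the match, which helps when a pattern ignores more than intended.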

Best practices for managing ignored files

Working in a team, I’ve found it useful to:

  1. Include language-specific ignores (like *.pyc for Python)
  2. Commit example configuration files (like .env.example) to show what’s needed
  3. Use global ignores for developer-specific files:
git config --global core.excludesfile ~/.gitignore_global

In microservice architectures, each service might need its own tailored .gitignore file based on the programming language and frameworks used. This pattern works well in modern Agile development environments where each service might use different tech stacks.

Branching and Merging in Git

Understanding Git Branching

What is a branch?

A branch in Git is basically a separate line of development. I use branches every day to build features without disturbing the main codebase. Technically speaking, a branch is just a lightweight movable pointer to a commit in your Git repository.

Working on a large e-commerce site recently, I created separate branches for:

  • Payment processing integration
  • User profile updates
  • Shopping cart optimization

Each branch existed completely isolated from the others. This is core to how version control systems work effectively.

# See all branches in your repository
git branch

# The * indicates your current branch
* main
  feature/payments
  feature/profiles
  bugfix/login-issue

The default branch is usually called main (or master in older repos). This is your production-ready code in most Git workflows.

Benefits of using branches

Branches aren’t just a technical feature—they transform how teams work together. The main benefits I’ve seen:

  1. Isolation: Work on new features without breaking the stable code
  2. Experimentation: Try risky changes safely
  3. Parallel development: Multiple developers can work simultaneously
  4. Code review: Review changes before merging to main
  5. Controlled releases: Keep unfinished features out of main until they’re ready

When working with continuous integration systems like Jenkins or GitHub Actions, branches help ensure that only tested code reaches production.

I recently led a project where 8 developers worked on different parts of an app simultaneously. Without branching, we would have constantly broken each other’s code. Instead, we completed 2 months of work without major conflicts.

Creating and Managing Branches

Creating a new branch (git branch)

To create a new branch, I use:

git branch feature/user-auth

This doesn’t switch to the branch. It only creates a new pointer to the commit you’re currently on; no files are copied.
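Here is a minimal sketch of what git branch actually does, run in a scratch repository: after creating the branch, both names resolve to the same commit hash and the current branch is unchanged. The symbolic-ref line is only there so the first branch is called main on any Git version.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main"
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

echo "v1" > app.js
git add app.js
git commit -q -m "Initial commit"

git branch feature/user-auth             # create the branch, stay where we are

main_tip=$(git rev-parse main)
feature_tip=$(git rev-parse feature/user-auth)
current=$(git symbolic-ref --short HEAD)
echo "main:    $main_tip"
echo "feature: $feature_tip"
echo "current branch: $current"
```

The two hashes only start to differ once a new commit is made on one of the branches.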

For real-world codebase management, I follow a naming convention:

  • feature/ for new features
  • bugfix/ for bug fixes
  • hotfix/ for urgent production fixes
  • release/ for release preparation

This makes it clear what each branch is for when viewing them on GitHub or GitLab.

Switching between branches (git checkout / git switch)

After creating a branch, I switch to it with:

# Traditional way
git checkout feature/user-auth

# Newer way (Git 2.23+)
git switch feature/user-auth

I can also create and switch in one command:

# Create and checkout
git checkout -b feature/payment-gateway

# Create and switch
git switch -c feature/payment-gateway

When switching branches, Git changes the files in your working directory to match that branch’s state. It’s like having multiple versions of your project that you can flip between instantly.

Renaming and deleting branches

Sometimes I need to clean up or rename branches:

# Rename the current branch
git branch -m new-branch-name

# Rename a specific branch
git branch -m old-name new-name

# Delete a branch (after merging)
git branch -d feature/completed

# Force delete an unmerged branch
git branch -D feature/abandoned

When working in teams, I usually delete feature branches after they’re merged through a pull request. This keeps the repository tidy. On GitHub, you can set this to happen automatically when PRs are merged.

Merging and Handling Conflicts

Merging branches (git merge)

When a feature is ready, I merge it back into the main branch:

# First switch to the destination branch
git checkout main

# Then merge the feature branch
git merge feature/user-authentication

Git uses different strategies for merging:

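  • Fast-forward: When main has no new commits since the branch was created, Git just moves the main pointer forward
  • Three-way merge: When both branches have new commits, Git creates a merge commit (the default strategy is ort since Git 2.34, recursive before that)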

For important features in agile development, I prefer to create an explicit merge commit, even when fast-forward is possible:

git merge --no-ff feature/important-feature

This makes the feature’s history clearly visible in tools like GitLens or git log --graph.
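To make the difference concrete, the sketch below merges two small branches in a scratch repository, one with a plain merge and one with --no-ff, and counts the parents of the resulting commit on main. Branch and file names are placeholders.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main"
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
echo base > app.txt
git add app.txt
git commit -q -m "Base"

# Case 1: main has not moved, so a plain merge fast-forwards
git checkout -q -b feature/a
echo a >> app.txt
git commit -q -am "Feature A"
git checkout -q main
git merge -q feature/a
ff_parents=$(git log -1 --format='%P' | wc -w | tr -d ' ')

# Case 2: same situation, but --no-ff forces a merge commit
git checkout -q -b feature/b
echo b >> app.txt
git commit -q -am "Feature B"
git checkout -q main
git merge -q --no-ff -m "Merge feature/b" feature/b
noff_parents=$(git log -1 --format='%P' | wc -w | tr -d ' ')

echo "fast-forward tip has $ff_parents parent, --no-ff tip has $noff_parents parents"
```

A commit with two parents is what shows up as a visible "join" in git log --graph.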

Merge conflicts and how to resolve them

Merge conflicts happen when the same part of a file is changed differently in both branches. Git can’t automatically decide which change to keep.

I hit this regularly when multiple people work on popular files. When Git reports a conflict, I:

  1. Open the conflicted file(s)
  2. Look for conflict markers (<<<<<<<, =======, >>>>>>>)
  3. Edit the file to keep the correct code
  4. Remove the conflict markers
  5. Save the file
  6. git add the resolved file
  7. Continue the merge with git merge --continue

Here’s what conflict markers look like. The block after <<<<<<< HEAD is the current branch’s version, and the block after ======= is the incoming version from the branch being merged:

<<<<<<< HEAD
function getUserName() {
  return user.firstName + ' ' + user.lastName;
}
=======
function getUserName() {
  return `${user.firstName} ${user.lastName}`;
}
>>>>>>> feature/template-literals

I need to choose one version or combine them, remove the markers, then save.
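The resolution steps can be rehearsed safely in a scratch repository. This sketch deliberately creates a conflict on a one-line file, lists the conflicted path, then resolves it by writing the final content and committing. The file config.txt and its contents are invented for the demo.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main"
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" > config.txt
git add config.txt
git commit -q -m "Base"

# Change the same line differently on two branches
git checkout -q -b feature/template-literals
echo "greeting = hi there" > config.txt
git commit -q -am "Friendlier greeting"
git checkout -q main
echo "greeting = good day" > config.txt
git commit -q -am "Formal greeting"

git merge feature/template-literals >/dev/null 2>&1 || true   # stops with a conflict
conflicted=$(git diff --name-only --diff-filter=U)            # unmerged paths
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' config.txt) # marker lines in the file

# Resolve: write the content we want, stage it, conclude the merge
echo "greeting = hi there, good day" > config.txt
git add config.txt
git commit -q --no-edit
parents=$(git log -1 --format='%P' | wc -w | tr -d ' ')
echo "conflicted: $conflicted, marker lines: $markers, parents after merge: $parents"
```

I use git commit --no-edit here so the script never opens an editor; interactively, git merge --continue does the same job.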

Modern tools like VS Code make this easier with visual conflict resolution. When working with GitHub, their web interface also offers tools to resolve conflicts directly in the browser.

Rebasing vs. Merging

What is rebasing? (git rebase)

Rebasing is an alternative to merging that rewrites history. Instead of creating a merge commit, it replays your branch’s commits on top of the target branch.

git checkout feature/user-profiles
git rebase main

This makes it look like you started your work from the current state of main, creating a cleaner, linear history.

I use this mostly for:

  • Keeping feature branches up-to-date with main
  • Cleaning up messy commits before sharing
  • Maintaining a linear project history

It’s like saying “pretend I started this branch from the latest main, not from three weeks ago.”
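As a small sketch of that "pretend I started from the latest main" effect, the script below lets main move ahead of a feature branch, rebases the feature branch, and then checks where the two branches fork. All names are placeholders in a scratch repository.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main"
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature/user-profiles
echo profile > profile.txt
git add profile.txt
git commit -q -m "Add profile page"
old_tip=$(git rev-parse HEAD)

git checkout -q main                     # main moves on in the meantime
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix on main"

git checkout -q feature/user-profiles
git rebase -q main                       # replay the feature commit on top of main
new_tip=$(git rev-parse HEAD)
fork_point=$(git merge-base main feature/user-profiles)
main_tip=$(git rev-parse main)
echo "fork point is main's tip: $fork_point"
echo "feature commit was rewritten: $old_tip -> $new_tip"
```

The changed hash is the "rewrites history" part: the replayed commit has the same content change but a new parent, so Git treats it as a different commit.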

Differences and best use cases for merging and rebasing

The choice between merging and rebasing affects your project’s history:

Merge | Rebase
Preserves history exactly as it happened | Creates a cleaner, linear history
Easier to understand for beginners | More complex but cleaner result
Non-destructive (safer) | Rewrites commit history (can be risky)
Creates “merge bubbles” in history graph | Creates a straight line in history graph

My personal rules for software development teams:

  1. Use rebase to keep feature branches updated with main
  2. Use merge (with --no-ff) when completing a feature

The most important rule: Never rebase commits that you’ve shared with others (pushed to a remote repository). This can cause major headaches for your team.

Working with Remote Repositories

Understanding Remote Repositories

The role of remote repositories

Remote repositories are copies of your project hosted on a server. They’re crucial for collaborative development because they:

  • Act as a central point for sharing code
  • Back up your work
  • Enable CI/CD pipelines
  • Allow issue tracking and code reviews
  • Facilitate open source contributions

When I work with clients, we always set up a remote repository before coding starts. It’s where all my branches and commits get shared with teammates.

My typical workflow involves constant interaction with remote repositories:

# Start the day by getting the latest changes
git pull

# Work on code locally and commit
git commit -am "Add feature X"

# Share my work at the end of the day
git push

This pattern ensures my code is backed up and visible to the team.

Common Git hosting services (GitHub, GitLab, Bitbucket)

Several platforms provide repository hosting:

  • GitHub: Owned by Microsoft, very popular for open source
  • GitLab: Complete DevOps platform with CI/CD built in
  • Bitbucket: Popular in enterprise, integrates with Jira and other Atlassian tools
  • Azure DevOps: Microsoft’s enterprise offering

I’ve used all of these professionally. Each has pros and cons:

GitHub has the largest community and best open source visibility. Its pull request system is the industry standard for code reviews.

GitLab offers the most complete built-in CI/CD pipeline system. I’ve set up entire deployment processes without leaving the platform.

Bitbucket works great if you’re already using Jira for issue tracking, giving you tight integration between code and tickets.

The underlying Git commands work the same regardless of which service you use—the differences are in the web interfaces and additional features.

Synchronizing Changes with Remote Repositories

Fetching updates (git fetch)

git fetch downloads changes from a remote repository without integrating them into your working files:

# Fetch updates from origin
git fetch origin

# Fetch from all remotes
git fetch --all

I use fetch when I want to see what’s changed before merging. After fetching, I can:

  • Compare my branch with the remote (git diff origin/main)
  • Decide whether to merge or rebase
  • Check out remote branches

This is safer than git pull because it doesn’t automatically merge changes into my working branch.
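Here is a rough sketch of that behavior using two local repositories in a temp folder, where "team" plays the role of the remote and "mine" is my clone. After a teammate's commit lands in team, a fetch updates origin/main but leaves my own branch where it was.

```shell
set -e
work=$(mktemp -d)                        # scratch area for the demo
cd "$work"
git init -q team
cd team
git symbolic-ref HEAD refs/heads/main    # name the first branch "main"
git config user.name "Teammate"          # placeholder identity
git config user.email "teammate@example.com"
echo one > log.txt
git add log.txt
git commit -q -m "First"
cd ..

git clone -q team mine                   # "origin" now points at team

cd team                                  # a teammate adds a commit afterwards
echo two >> log.txt
git commit -q -am "Teammate change"
cd ../mine

before=$(git rev-parse HEAD)
git fetch -q origin                      # download, but do not integrate
behind=$(git rev-list --count HEAD..origin/main)
after=$(git rev-parse HEAD)
echo "my branch is $behind commit behind origin/main"
echo "my HEAD before/after fetch: $before / $after"
```

From here git log HEAD..origin/main shows exactly what a merge or rebase would bring in.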

Pulling changes (git pull)

git pull is a combination of fetch + merge. It gets remote changes and updates your current branch:

# Pull from origin
git pull origin main

# Pull from current branch's tracking branch
git pull

If you want to use rebase instead of merge:

git pull --rebase origin main

This creates a cleaner history by applying your local commits on top of the remote commits.

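I set Git to always rebase on pull. With --global this applies to every repository on my machine; run the same command without --global inside a repository to limit it to that project: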

git config --global pull.rebase true

This setting helps avoid unnecessary merge commits when keeping branches in sync.

Pushing changes (git push)

When I’m ready to share my commits, I use git push:

# Push current branch to its tracking branch
git push

# Push to a specific remote and branch
git push origin feature/user-profiles

# Push a local branch to a differently named remote branch
git push origin local-branch:remote-branch

The first time you push a new branch, you need to set its upstream tracking:

git push -u origin feature/shopping-cart

The -u (or --set-upstream) flag links your local branch to the remote branch, so future pushes and pulls don’t need to specify the remote.

If someone else has pushed to the same branch, Git will reject your push. You’ll need to pull their changes first, then push again.
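The rejection is easy to reproduce locally. This sketch sets up a bare repository as the shared remote plus two clones (hypothetical users Alice and Bob); Bob pushes first, Alice's push is rejected, and she recovers with a rebasing pull. I pass --rebase explicitly because recent Git versions refuse to guess how to reconcile diverged branches.

```shell
set -e
work=$(mktemp -d)                        # scratch area for the demo
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

git clone -q remote.git alice 2>/dev/null   # silence the "empty repository" warning
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Base"
git push -q -u origin main
cd ..

git clone -q remote.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo bob > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
git push -q origin main                  # Bob gets there first
cd ../alice

echo alice > alice.txt
git add alice.txt
git commit -q -m "Alice's change"
if git push -q origin main 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

git pull -q --rebase origin main         # bring in Bob's commit, replay Alice's on top
git push -q origin main
remote_commits=$(git -C ../remote.git rev-list --count main)
echo "first push: $first_push, commits on remote afterwards: $remote_commits"
```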

Working with Multiple Remote Repositories

Adding a remote (git remote add)

Projects can connect to multiple remotes. This is useful when:

  • Forking open source projects
  • Migrating between hosting services
  • Setting up multiple deployment targets

To add a new remote:

git remote add gitlab https://gitlab.com/username/project.git

Now I can push to or pull from either remote:

git push origin main  # Push to GitHub
git push gitlab main  # Push to GitLab

Working on open source, I often have:

  • origin pointing to my fork
  • upstream pointing to the original project

This lets me keep my fork updated:

git fetch upstream
git rebase upstream/main
git push origin main

Changing or removing a remote

I sometimes need to update a remote’s URL or remove it entirely:

# Change a remote URL
git remote set-url origin https://github.com/new-username/repo.git

# Remove a remote
git remote remove gitlab

Changing URLs is common when:

  • Moving repositories
  • Switching between HTTPS and SSH
  • Changing usernames

To view all configured remotes:

# List remotes
git remote -v

This shows names and URLs for fetch and push (they’re sometimes different).
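Since none of these remote commands contact the network, they are safe to try in a scratch repository. The sketch below runs the add, set-url, and remove steps in sequence, reusing the placeholder URLs from the examples above, and then reads back what is configured.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q

git remote add origin https://github.com/username/project.git
git remote add gitlab https://gitlab.com/username/project.git
count_before=$(git remote | wc -l | tr -d ' ')

git remote set-url origin https://github.com/new-username/repo.git
git remote remove gitlab

remotes=$(git remote)                                # names only
origin_url=$(git config --get remote.origin.url)     # the stored URL
echo "remotes before cleanup: $count_before"
echo "remaining: $remotes -> $origin_url"
```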

For complex projects, like microservices with distributed systems, I might maintain multiple remotes for different deployment environments or related services. This flexibility is one reason Git has become the standard for source code management.

Git Workflow and Best Practices

Basic Git Workflow

Modifying files in the working directory

The working directory is where I spend most of my day. This is where I edit code, add features, and fix bugs. Git doesn’t track these changes automatically—they stay local until I explicitly tell Git about them.

I often have multiple changes happening at once:

// Adding a new feature
function authenticateUser(credentials) {
  // Implementation details
}

// Fixing a bug elsewhere
function calculateTotal(items) {
  return items.reduce((sum, item) => sum + item.price, 0); // Fixed
}

When working in a team environment, I’ve learned to keep related changes together. This makes the version history more useful when someone needs to understand why a particular change was made months later.

Files in the working directory can be:

  • Tracked (Git knows about them)
  • Untracked (New files Git hasn’t seen)
  • Ignored (Matching patterns in .gitignore)

I check this with git status regularly during development.

Staging changes and committing them

Once I’ve made changes, I move them to the staging area before committing:

# Stage specific changes
git add src/auth.js src/cart.js

# Stage all changes in the current directory and subdirectories
git add .

The staging area lets me carefully select which changes belong together in a commit. I can make 20 edits but commit them as 3 logical groups.

Real world example:

# First logical change
git add src/api/users.js
git commit -m "Add support for case-insensitive username lookup"

# Second logical change
git add src/components/LoginForm.js src/styles/auth.css
git commit -m "Improve login form validation feedback"

This compartmentalization makes the codebase history much more useful. When reviewing code later, I can see exactly why changes were made at the same time.

The staging area is also perfect for catching mistakes before they become permanent:

# See what you're about to commit
git diff --staged

Synchronizing with the remote repository

After making local commits, I share them with the team through the remote repository:

# Push commits to the remote (usually origin)
git push

# If this is a new branch
git push -u origin feature/auth-improvements

Throughout the day, I also pull changes others have made:

# Get latest changes
git pull

When working with a continuous integration setup, pushes often trigger automated tests and deployments. In some teams I’ve worked with, we configured Slack notifications for key branches, so everyone knows when new code is available.

For open source development, I adjust this workflow slightly:

# For contributors to open source projects
git fetch upstream
git rebase upstream/main
# Make changes, commit
git push origin feature-branch
# Then create a pull request on GitHub/GitLab

Centralized workflow

The simplest Git workflow uses a single branch (usually main). Everyone pulls, commits, and pushes to this branch.

I’ve seen this work for small teams (2-3 people) on simple projects. The advantage is simplicity—there’s only one branch to manage.

The big downside? No isolation. If I commit broken code, everyone gets blocked. This lack of separation makes it risky for anything but the smallest projects.

# Centralized workflow
git pull origin main
# Make changes
git commit -am "Add feature X"
git push origin main

Feature branch workflow

This is what I recommend for most teams. Each feature or bug fix gets its own branch off main.

# Create a feature branch
git checkout -b feature/user-authentication

# Work, commit changes
git add .
git commit -m "Implement JWT authentication"

# When ready, merge back to main
git checkout main
git pull
git merge feature/user-authentication
git push

Benefits:

  • Isolation – broken code doesn’t affect others
  • Code review – PRs before merging
  • Clear organization – each branch has a purpose

When using GitHub or GitLab, this workflow naturally integrates with pull/merge requests, which add a layer of review before code reaches the main branch.

Gitflow workflow

For larger projects with regular releases, Gitflow adds structure with specialized branches:

  • main – production code only
  • develop – next release features
  • feature/* – new features
  • release/* – preparing a release
  • hotfix/* – emergency fixes for production

The workflow looks like:

# Start a feature
git checkout -b feature/login develop

# Complete feature
git checkout develop
git merge feature/login

# Prepare release
git checkout -b release/1.0 develop

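# Finalize release: merge into main, tag, then merge back into develop
git checkout main
git merge release/1.0
git tag -a v1.0 -m "Release 1.0"
git checkout develop
git merge release/1.0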

# Hotfix for production
git checkout -b hotfix/login-crash main
# Fix bug
git checkout main
git merge hotfix/login-crash
git checkout develop
git merge hotfix/login-crash

I’ve implemented this at companies with formal release cycles and QA processes. The structure helps, but it adds complexity. For smaller teams, it’s often overkill.

Forking workflow

Common in open source development, this workflow has each developer fork the main repository, rather than working directly with it.

# Clone your fork
git clone https://github.com/your-username/project.git

# Add the original as upstream
git remote add upstream https://github.com/original-org/project.git

# Keep your fork updated
git fetch upstream
git rebase upstream/main

# Create feature branch, work on it
git checkout -b feature/improvement

# Push to your fork
git push origin feature/improvement

Then create a pull request from your fork to the original repository.

I use this workflow when contributing to open source. It gives project maintainers control over what gets merged, while allowing anyone to propose changes.

Writing Meaningful Commit Messages

Importance of clear commit messages

Good commit messages are like documentation that writes itself. They explain why changes happened, not just what changed (the diff already shows that).

Bad messages I’ve seen in real projects:

"Fix stuff"
"WIP"
"asdfghjkl"
"It works now"

These are useless months later when trying to understand why code changed.

Clear messages help with:

  • Troubleshooting (finding when bugs were introduced)
  • Knowledge transfer (helping new team members)
  • Code reviews (understanding intent)
  • Release notes (summarizing changes)

Best practices for writing commit messages

After years of working with different teams, here’s what I recommend:

  1. Use the imperative mood (“Add feature” not “Added feature”)
  2. Start with a concise summary line (50 chars or less)
  3. Leave a blank line after the summary
  4. Add a detailed explanation if needed
  5. Reference issue numbers if applicable

Example of a good commit message:

Add two-factor authentication to login process

- Generate and send verification codes via SMS
- Add verification step to login flow
- Store backup codes for recovery

Closes #142

For major features, I even include what was considered but not implemented, and why:

Implement JWT-based authentication

Uses RS256 signing for tokens with 1-hour expiration.
Considered OAuth but decided against it due to complexity
for our use case.

Ref #97

This context saves hours of confusion later when someone wonders “why didn’t they just use OAuth?”

For routine software development, I follow a convention where commit messages start with:

  • feat: for new features
  • fix: for bug fixes
  • docs: for documentation
  • style: for formatting/style changes
  • refactor: for code restructuring
  • test: for adding tests
  • chore: for maintenance tasks

This makes the repository history scannable at a glance.
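If you want to enforce these conventions rather than rely on memory, a small shell check is one option. The function below is a hypothetical helper of my own, not part of Git; it only tests the 50-character limit and the prefix list from this section, and could be adapted for use in a commit-msg hook.

```shell
# Hypothetical helper: validate a commit summary line against the conventions above
check_summary() {
  summary=$1
  # Rule: summary line of 50 characters or fewer
  [ "${#summary}" -le 50 ] || return 1
  # Rule: must start with one of the listed prefixes, followed by a space
  case "$summary" in
    feat:\ *|fix:\ *|docs:\ *|style:\ *|refactor:\ *|test:\ *|chore:\ *) return 0 ;;
    *) return 1 ;;
  esac
}

if check_summary "feat: add two-factor authentication"; then good=ok; else good=rejected; fi
if check_summary "Fixed stuff"; then bad=ok; else bad=rejected; fi
echo "$good $bad"
```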

Undoing Changes and Managing History

Reverting and Resetting Changes

Undoing changes in the working directory (git checkout / git restore)

Made changes but want to discard them? I have two options:

# Traditional way (Git < 2.23)
git checkout -- filename.js

# Modern way (Git >= 2.23)
git restore filename.js

These commands discard uncommitted changes in specific files, replacing them with the last committed version. I use this when exploring ideas that didn’t work out.

For multiple files:

# Discard all unstaged changes to tracked files under the current directory
git restore .

This is permanent, so be careful! I’ve made the mistake of discarding hours of work this way. No undo button exists for this.

Resetting changes (git reset)

git reset moves the branch pointer to a different commit. It has three modes:

# Soft reset: Keep changes in staging area
git reset --soft HEAD~1

# Mixed reset (default): Keep changes in working directory
git reset HEAD~1

# Hard reset: Discard all changes
git reset --hard HEAD~1

I use these for different purposes:

  • --soft: “Oops, I committed to the wrong branch” (keeps changes staged)
  • --mixed: “I want to re-organize these changes” (keeps changes unstaged)
  • --hard: “I need to completely abandon these commits” (discards everything)
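To compare the three modes side by side, this sketch undoes the same commit three times in a scratch repository and records what git status --porcelain reports afterwards. In that format the first column is the staged state and the second is the working-directory state. File names and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)                        # throwaway demo repository
cd "$repo"
git init -q
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
echo v1 > app.txt
git add app.txt
git commit -q -m "First"

echo change >> app.txt
git commit -q -am "Second"

git reset -q --soft HEAD~1
soft_state=$(git status --porcelain)     # change is kept and still staged
git commit -q -m "Second"                # put the commit back

git reset -q HEAD~1                      # --mixed is the default
mixed_state=$(git status --porcelain)    # change is kept but unstaged
git commit -q -am "Second"               # put the commit back again

git reset -q --hard HEAD~1
hard_state=$(git status --porcelain)     # nothing left: the change is gone

echo "soft:  [$soft_state]"
echo "mixed: [$mixed_state]"
echo "hard:  [$hard_state]"
```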

For example, when I accidentally committed to main instead of a feature branch:

# Save the commit message to a text file
git log -1 --pretty=%B > msg.txt

# Reset to remove the commit but keep changes staged
git reset --soft HEAD~1

# Create and switch to new branch
git switch -c feature/user-profiles

# Commit with the same message
git commit -F msg.txt

# Clean up
rm msg.txt

Reverting a commit (git revert)

Unlike reset, git revert creates a new commit that undoes a previous commit. This is safer for shared branches because it doesn’t rewrite history.

# Revert the most recent commit
git revert HEAD

# Revert a specific commit
git revert abc123

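I recently used this when we accidentally merged a feature that wasn’t ready for production. Rather than trying to manipulate history, we reverted the merge commit, which was cleaner and safer. Note that reverting a merge commit requires the -m option (for example, git revert -m 1 <merge-commit>) to tell Git which parent to treat as the mainline.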

For multiple commits:

# Revert a range of commits
git revert HEAD~3..HEAD

This creates separate revert commits for each original commit.
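
Reverting a merge commit, as in the production incident mentioned earlier, can be sketched in a throwaway repository. The branch and file names below are hypothetical; the key detail is -m 1, which tells Git to treat the first parent (main before the merge) as the line to keep.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main
git config user.name "Example User"
git config user.email "example@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "Base"

# A feature branch that gets merged too early
git checkout -q -b feature/unfinished
echo "half-done feature" > feature.txt
git add feature.txt
git commit -q -m "Add unfinished feature"

git checkout -q main
git merge -q --no-ff -m "Merge feature/unfinished" feature/unfinished

# Undo the merge with a new commit; history is not rewritten
git revert --no-edit -m 1 HEAD

git log --oneline
```

After the revert, feature.txt is gone from main, but the merge commit and the feature commit are still visible in the log.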

Stashing Changes

What is git stash?

git stash temporarily shelves changes so you can switch tasks without committing incomplete work.

I use this constantly when I need to switch contexts:

# Save current changes
git stash

# Switch to another branch for urgent fix
git checkout hotfix/critical-bug

# Make fix, commit, push
git commit -am "Fix critical login bug"
git push

# Return to original branch
git checkout feature/profiles

# Restore stashed changes
git stash pop

This workflow has saved me countless times when priorities suddenly change.

Stashes can include:

  • Modified tracked files
  • Staged changes
  • Untracked files (with --include-untracked)
  • Ignored files (with --all)
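
The difference between a plain stash and one that includes untracked files is easy to check in a scratch repository. This is a small sketch; tracked.txt and notes.txt are made-up file names.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo v1 > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

echo v2 > tracked.txt     # modified tracked file
echo draft > notes.txt    # brand-new untracked file

# Plain stash: only the tracked change is shelved
git stash -q
if [ -e notes.txt ]; then after_plain="still present"; else after_plain="stashed"; fi
git stash pop -q

# With --include-untracked the new file is shelved too
git stash -q --include-untracked
if [ -e notes.txt ]; then after_untracked="still present"; else after_untracked="stashed"; fi
git stash pop -q

echo "plain stash:        notes.txt $after_plain"
echo "--include-untracked: notes.txt $after_untracked"
```

After the final pop, both the modified file and the untracked file are back in the working directory.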

Applying and managing stashed changes

If you stash frequently, stashes can pile up:

# List all stashes
git stash list

# stash@{0}: WIP on feature/user-profiles: abc123 Add form validation
# stash@{1}: WIP on feature/payment: def456 Integrate Stripe API

To apply a specific stash:

# Apply but keep the stash
git stash apply stash@{1}

# Apply and remove the stash
git stash pop stash@{1}

I sometimes create a branch directly from a stash:

git stash branch feature/from-stash stash@{0}

This creates a new branch, applies the stash, then drops it if successful.

To remove stashes:

# Remove a specific stash
git stash drop stash@{1}

# Clear all stashes
git stash clear

Inspecting and Modifying History

Viewing commit history (git log)

The basic git log shows commits in reverse chronological order, but I rarely use it plain.

Instead, I customize it:

# Compact one-line format
git log --oneline

# Show branching graph
git log --graph --oneline --all

# Filter by author
git log --author="Jane"

# Filter by date
git log --since="2 weeks ago"

# Filter by content
git log -S"getUserProfile"

That last one, -S, is particularly useful. It finds commits that added or removed the specified string, perfect for tracking down when a specific function changed.

For more complex queries:

# Find commits that modified a specific file
git log -- filename.js

# Show the changes in each commit
git log -p

# Show stats (files changed, insertions, deletions)
git log --stat

I often create aliases for commonly used log formats:

git config --global alias.overview "log --oneline --graph --decorate --all"

Then I can just type git overview.

Interactive rebasing (git rebase -i)

Interactive rebasing lets you modify a series of commits. I use this to clean up my work before sharing it:

# Start an interactive rebase for the last 3 commits
git rebase -i HEAD~3

This opens an editor where you can:

  • pick – keep the commit as is
  • reword – change the commit message
  • edit – pause to amend the commit
  • squash – combine with previous commit
  • fixup – combine with previous commit, discard message
  • drop – remove the commit

For example, I often convert a series of “WIP” commits into a single clean commit before creating a pull request:

pick abc123 Add user profile database model
squash def456 WIP on profile form
squash ghi789 Fix validation issues
reword jkl012 WIP profile image upload

This would combine the first three commits, then stop to let me rename the fourth one.

Be careful though! Only rebase commits that haven’t been pushed to a shared repository. Rewriting public history causes major problems for teammates.
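
When the cleanup is just folding a follow-up fix into an earlier commit, the same result can be scripted without editing the todo list by hand. The sketch below is one possible approach, not the only one: it uses git commit --fixup together with git rebase --autosquash, and sets GIT_SEQUENCE_EDITOR=true so the generated todo list is accepted unchanged. File names and messages are placeholders.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo base > README.md
git add README.md
git commit -q -m "Initial commit"

echo "model" > profile.txt
git add profile.txt
git commit -q -m "Add user profile model"

# A follow-up fix, recorded as "fixup! Add user profile model"
echo "model with validation" > profile.txt
git commit -q -a --fixup=HEAD

# --autosquash moves the fixup next to its target and marks it "fixup";
# "true" as the sequence editor accepts that todo list as-is
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline
```

The log ends up with two commits instead of three, and the profile commit contains the fixed content. The same warning applies: only do this to commits that have not been shared.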

Recovering lost commits (git reflog)

Git keeps a record of all reference changes in the reflog. This has saved me multiple times after making mistakes with reset or rebase.

# View the reflog
git reflog

For example, after a hard reset:

# Oops, deleted some commits
git reset --hard HEAD~3

# View reflog to find the lost commit
git reflog
# e1f234d (HEAD -> feature/profiles) HEAD@{0}: reset: moving to HEAD~3
# abc123d HEAD@{1}: commit: Add avatar upload
# def456d HEAD@{2}: commit: Fix validation
# ghi789d HEAD@{3}: commit: Add form fields

# Recover by resetting to where HEAD pointed before the mistake (HEAD@{1})
git reset --hard abc123d

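The reflog only exists locally and its entries expire eventually (by default after 90 days, or 30 days for entries no longer reachable from the current branch tip, which is the case after a hard reset), so it’s not for long-term recovery. But for “oops” moments in the last few days or weeks, it’s invaluable.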

I once recovered an entire feature branch this way after accidentally deleting it during cleanup:

git checkout -b recovered-feature abc123d

Where abc123d was the commit hash found in the reflog.
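
The hard-reset recovery can be rehearsed safely in a scratch repository. In this sketch the commit messages mirror the reflog listing shown earlier, and HEAD@{1} is used instead of a copied hash, since it always names the position HEAD had just before the most recent move.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo base > work.txt
git add work.txt
git commit -q -m "Initial commit"

for msg in "Add form fields" "Fix validation" "Add avatar upload"; do
  echo "$msg" >> work.txt
  git commit -q -am "$msg"
done

before=$(git rev-parse HEAD)

# The mistake: throw away the last three commits
git reset -q --hard HEAD~3

# The reflog still remembers where HEAD was
git reflog -n 2

# HEAD@{1} is the position just before the reset
git reset -q --hard 'HEAD@{1}'
after=$(git rev-parse HEAD)

git log --oneline
```

All three commits are back, and the branch points at the same hash as before the mistake.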

Advanced Git Techniques

Using Git Tags

Lightweight vs. annotated tags

Tags in Git mark specific points in your project history. They’re commonly used for version releases, but serve different purposes based on their type.

Lightweight tags are simple pointers to commits:

# Create lightweight tag
git tag v1.0.0-beta

They’re just names stored in your repository with no extra information.

Annotated tags store additional metadata:

# Create annotated tag with message
git tag -a v1.0.0 -m "First stable release"

These include:

  • Tagger name and email
  • Creation date
  • A message explaining the tag
  • A checksum

I prefer annotated tags for official releases because they give more context. On a recent project, we used annotated tags to mark sprint releases, including full notes about key features and bug fixes in the tag message.
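
One way to see the difference between the two tag types is to ask Git what kind of object each name refers to. A minimal sketch in a scratch repository, reusing the tag names from the examples in this section:

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo v1 > app.txt
git add app.txt
git commit -q -m "Release candidate"

git tag v1.0.0-beta                           # lightweight
git tag -a v1.0.0 -m "First stable release"   # annotated

# A lightweight tag is just a name for the commit itself;
# an annotated tag is a separate "tag" object that points at the commit
light_type=$(git cat-file -t v1.0.0-beta)
annotated_type=$(git cat-file -t v1.0.0)

echo "v1.0.0-beta -> $light_type"
echo "v1.0.0      -> $annotated_type"

# Both still resolve to the same commit
git rev-parse 'v1.0.0^{commit}' v1.0.0-beta
```

The lightweight tag reports commit and the annotated tag reports tag, which is where the tagger, date, and message are stored.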

Creating and managing tags (git tag)

To list tags in your repository:

# List all tags
git tag

# List tags matching a pattern
git tag -l "v1.8.*"

# Show tag details
git show v1.0.0

By default, git push doesn’t transfer tags to remote repositories. You need to explicitly push them:

# Push a specific tag
git push origin v1.0.0

# Push all tags
git push origin --tags

If you need to update a published tag (rare, but happens), you can force it:

# Create a new tag pointing to a different commit
git tag -a v1.0.0 -f -m "Updated release" commit-hash

# Force push the updated tag
git push origin v1.0.0 --force

Warning: Changing published tags can cause problems for others. I only do this when absolutely needed, like when a critical bug was found immediately after tagging.

Working with Submodules

What are Git submodules?

Submodules let you include other Git repositories within your main repository. They’re useful for:

  • Including third-party libraries
  • Sharing code between projects
  • Breaking large projects into smaller pieces

Each submodule is a pointer to a specific commit in the external repository. This ensures everyone working on your project gets the exact same version of the dependency.

In a recent microservices architecture, I used submodules to include shared utilities across several services. This kept the code DRY (Don’t Repeat Yourself) while still maintaining separate repositories for each service.

Adding and updating submodules

To add a submodule to your project:

git submodule add https://github.com/username/library.git libs/library

This creates a .gitmodules file tracking the submodule configuration and clones the repository to the specified path.

When cloning a project with submodules, you need additional steps:

# Clone the main repo
git clone https://github.com/username/project.git

# Initialize submodules 
git submodule init

# Fetch submodule contents
git submodule update

Or more simply:

git clone --recurse-submodules https://github.com/username/project.git

To update a submodule to its latest version:

# Navigate to submodule directory
cd libs/library

# Pull latest changes
git pull origin main

# Return to main project
cd ../..

# Commit the updated submodule reference
git add libs/library
git commit -m "Update library submodule to latest version"

For distributed systems with shared components, submodules provide a way to maintain consistency across services. I’ve used this approach in enterprise environments where different teams owned different parts of the codebase.

Git Hooks

Understanding Git hooks

Git hooks are scripts that run automatically when specific Git events occur. They help enforce policies, automate tasks, and integrate with other systems.

Hooks live in the .git/hooks directory of your repository. Each hook corresponds to a specific Git command and runs either:

  • Before the command executes (pre-hooks)
  • After the command completes (post-hooks)

Common hook points include:

  • pre-commit: Runs before commit is created
  • prepare-commit-msg: Runs before commit message editor opens
  • commit-msg: Validates the commit message
  • post-commit: Runs after commit is completed
  • pre-push: Runs before pushing commits
  • post-checkout: Runs after checking out a branch

To use a hook, you create an executable script with the hook name. For example, a basic pre-commit hook in bash:

#!/bin/bash
# .git/hooks/pre-commit

echo "Running pre-commit checks..."
npm run lint

if [ $? -ne 0 ]; then
  echo "Linting failed! Commit aborted."
  exit 1
fi

Common use cases for pre-commit and post-commit hooks

I’ve implemented various hooks in professional environments to improve code quality and team workflow:

Pre-commit hooks:

  • Run linters to ensure code style
  • Check for debugging statements (console.log, debugger)
  • Run unit tests affected by changes
  • Validate commit message format
  • Prevent direct commits to protected branches
  • Check for sensitive information (API keys, credentials)

Example: preventing commits to the main branch:

#!/bin/bash
# .git/hooks/pre-commit

branch="$(git rev-parse --abbrev-ref HEAD)"
if [ "$branch" = "main" ]; then
  echo "Cannot commit directly to main branch!"
  exit 1
fi
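
Another item from the list, checking for debugging statements, could look roughly like the sketch below. It installs a pre-commit hook in a scratch repository and then tries two commits. The patterns (console.log, debugger) and the file name app.js are illustrative; a real project would tune them.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Look only at lines being added in the staged diff (skip the "+++ b/file" headers)
if git diff --cached -U0 | grep -E '^\+' | grep -vE '^\+\+\+ ' \
     | grep -qE 'console\.log|debugger'; then
  echo "Debugging statement found in staged changes. Commit aborted." >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# First attempt: the staged file contains a debugging statement
echo 'console.log("debug");' > app.js
git add app.js
if git commit -q -m "Add app"; then first=committed; else first=rejected; fi

# Second attempt: the statement has been removed
echo 'export const ready = true;' > app.js
git add app.js
if git commit -q -m "Add app"; then second=committed; else second=rejected; fi

echo "with console.log:    $first"
echo "without console.log: $second"
```

Because the hook inspects git diff --cached, it judges what is about to be committed rather than whatever happens to be in the working directory.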

Post-commit hooks:

  • Notify team members of changes
  • Trigger CI/CD pipelines
  • Update documentation
  • Generate changelogs

Example: notifying the team on Slack:

#!/bin/bash
# .git/hooks/post-commit

author=$(git log -1 --pretty=format:'%an')
message=$(git log -1 --pretty=format:'%s')
# Note: this simple payload breaks if the message contains double quotes or backslashes
curl -X POST -H 'Content-type: application/json' \
  --data "{\"text\":\"$author just committed: $message\"}" \
  https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK

When working with multiple developers, hooks in the .git/hooks directory aren’t automatically shared. To solve this, I store hooks in a shared directory in the repository and use a setup script to symlink them:

#!/bin/bash
# setup-hooks.sh

ln -sf ../../hooks/pre-commit .git/hooks/pre-commit
ln -sf ../../hooks/commit-msg .git/hooks/commit-msg
chmod +x .git/hooks/pre-commit
chmod +x .git/hooks/commit-msg

This approach helps maintain consistent standards across the team when developing with Git workflow practices.
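
An alternative to symlinking, available since Git 2.9, is the core.hooksPath setting, which points Git at a hooks directory that lives inside the repository. The sketch below assumes a tracked directory named hooks and a commit-msg hook that enforces a type prefix; the list of allowed types is an assumption made for the example.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

# Hooks live in a normal, committable directory
mkdir hooks
cat > hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file holding the proposed commit message
if ! head -n 1 "$1" | grep -qE '^(feat|fix|docs|test|chore): '; then
  echo "Commit message must start with a type such as 'feat: ' or 'fix: '." >&2
  exit 1
fi
EOF
chmod +x hooks/commit-msg

# Each developer runs this once per clone
git config core.hooksPath hooks

echo hello > README.md
git add README.md hooks
if git commit -q -m "update stuff"; then bad=accepted; else bad=rejected; fi
if git commit -q -m "docs: add README"; then good=accepted; else good=rejected; fi

echo "\"update stuff\"     -> $bad"
echo "\"docs: add README\" -> $good"
```

Each clone still needs the one-time git config step, so this replaces the symlinks rather than the setup script itself.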

Managing Large Repositories

Using Git Large File Storage (LFS)

Git wasn’t designed for large binary files. When dealing with assets like videos, images, or datasets, Git LFS (Large File Storage) helps maintain repository performance.

It works by replacing large files with text pointers in the repository, while storing the actual file content on a separate server.

To start using Git LFS:

# Set up Git LFS for your user account (the git-lfs extension must already be installed)
git lfs install

# Track specific file types
git lfs track "*.psd"
git lfs track "*.mp4"
git lfs track "datasets/*.csv"

# Make sure .gitattributes is tracked
git add .gitattributes

When you commit and push, Git LFS automatically:

  1. Uploads the large files to the LFS server
  2. Replaces them with pointers in your repository
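
Under the hood, git lfs track records its patterns as ordinary .gitattributes entries. The sketch below writes the same kind of entry by hand so that it runs even without the LFS extension installed, then asks Git which filter applies to two hypothetical paths.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d) && cd "$tmp"
git init -q

# The line that `git lfs track "*.psd"` adds to .gitattributes
printf '%s\n' '*.psd filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask Git which filter would handle each path (the files need not exist)
psd_filter=$(git check-attr filter -- design/mockup.psd)
js_filter=$(git check-attr filter -- src/app.js)

echo "$psd_filter"
echo "$js_filter"
```

Paths matching the pattern are routed through the lfs filter, which is what swaps file content for a pointer; everything else is stored by Git as usual. This is also why .gitattributes has to be committed along with the files.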

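To clone a repository using LFS, a standard clone is enough once Git LFS is installed; the large files are downloaded as part of the checkout:

git clone https://github.com/username/repository.git

The older git lfs clone command is deprecated. If the repository was cloned before Git LFS was installed, fetch the large files afterwards:

cd repository
git lfs pull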

I implemented LFS on a project with 3D models and texture assets, reducing our repository size from 2.2GB to 50MB while keeping the assets accessible. This made common Git commands run much faster.

Optimizing performance with git gc and git prune

Over time, Git repositories accumulate unnecessary objects that slow down operations. Two maintenance commands help keep things fast:

git gc (garbage collection) compresses file revisions and removes unnecessary files:

# Run garbage collection
git gc

# More aggressive collection
git gc --aggressive

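git prune removes loose objects that are no longer reachable from any branch, tag, or other reference. git gc already runs it for you with a two-week grace period by default, so calling it directly is rarely needed:

# Remove all unreachable loose objects immediately (no grace period)
git prune

# Only remove unreachable objects older than the given age
git prune --expire=1.day.ago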

I typically run these commands when:

  • The repository feels sluggish
  • After large merges or rebases
  • Before sharing a repository

A practical maintenance script I use for large projects:

#!/bin/bash
# repo-maintenance.sh

echo "Cleaning up Git repository..."
# Remove branches already merged to main (skipping main itself and any checked-out branch)
git branch --merged main | grep -vE '^[*+]|^[[:space:]]*main$' | xargs -n 1 git branch -d

# Pack references
git pack-refs --all

# Run garbage collection
git gc --aggressive --prune=now

echo "Done! Repository optimized."

For very large monorepo projects, you might consider Git partial clone and shallow clone features to further improve performance:

# Partial clone (blobless: fetch commits and trees now, file contents on demand)
git clone --filter=blob:none https://github.com/username/large-repo.git

# Shallow clone (only recent history)
git clone --depth=1 https://github.com/username/large-repo.git

These techniques have helped me manage repositories with thousands of commits and files while maintaining reasonable performance.
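
The effect of a shallow clone is easy to measure locally. This sketch builds a small source repository with five commits and clones it twice, once normally and once with --depth=1. A file:// URL is used because Git ignores --depth for plain local-path clones; the directory and file names are placeholders.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d)
git init -q "$tmp/source"
cd "$tmp/source"
git config user.name "Example User"
git config user.email "example@example.com"

for i in 1 2 3 4 5; do
  echo "change $i" >> history.txt
  git add history.txt
  git commit -q -m "Commit $i"
done

cd "$tmp"
git clone -q "file://$tmp/source" full
git clone -q --depth=1 "file://$tmp/source" shallow

full_count=$(git -C full rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)

echo "full clone:    $full_count commits"
echo "shallow clone: $shallow_count commits"
```

The shallow clone has the same files but only the most recent commit. If the full history is needed later, git fetch --unshallow retrieves the rest.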

Git Security and Integrity

How Git Ensures Data Integrity

SHA-1 Hashing and its role in Git

Every object in Git (commits, trees, blobs) is identified by a SHA-1 hash, which is a 40-character hexadecimal string like:

e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

This hash is generated based on the content of the object. Even a tiny change produces a completely different hash. This design provides several benefits:

  1. Content verification: Git can detect if a file has been corrupted or tampered with
  2. Deduplication: Identical files share the same hash and are stored only once
  3. Distributed trust: Everyone has the same hashes for the same content

For a commit, the hashed content includes not just a reference to the project snapshot, but also metadata such as the commit message, author information, and the hashes of its parent commits. This creates a chain where each commit’s hash depends on all previous commits.

I’ve seen this save projects when hard drives failed. We were able to verify that the recovered data wasn’t corrupted because the SHA-1 hashes matched across all developer machines.
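
Content addressing can be observed directly with git hash-object, which prints the ID Git would assign to some content without storing it. A small sketch, assuming the default SHA-1 object format:

```shell
#!/bin/sh
set -e

# Work outside any existing repository so its settings cannot interfere
cd "$(mktemp -d)"

# Same content always gives the same ID, on any machine
empty_hash=$(git hash-object --stdin </dev/null)
hello_hash=$(printf 'hello\n' | git hash-object --stdin)
hello_again=$(printf 'hello\n' | git hash-object --stdin)

# Changing a single character gives an unrelated ID
changed_hash=$(printf 'Hello\n' | git hash-object --stdin)

echo "empty file: $empty_hash"
echo "hello:      $hello_hash"
echo "hello:      $hello_again"
echo "Hello:      $changed_hash"
```

The two hello lines match exactly, which is what makes deduplication and cross-machine verification possible, while the capitalised variant shares nothing recognisable with them.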

Preventing accidental or malicious changes

Git’s hash-based structure makes it tamper-evident. If someone modifies a previous commit, its hash and the hashes of all subsequent commits change, so history can’t be altered without anyone who holds the original hashes noticing.

This integrity system helps in several ways:

  • Detecting corruption: If a file gets damaged, its hash won’t match
  • Preventing tampering: Can’t change history without leaving evidence
  • Ensuring consistency: Everyone has identical copies of the repository

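SHA-1 is no longer considered collision-resistant: a practical collision was demonstrated in 2017. In response, recent Git versions use a hardened SHA-1 implementation that detects known collision attacks, and Git is moving toward the more secure SHA-256. In practical terms, the current system remains secure for most version control needs.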

To verify the integrity of your repository:

# Check for corruption
git fsck

# Verify connectivity and validity of objects
git fsck --full

I run this periodically, especially before major releases, to ensure everything is intact.

Securing a Git Repository

Managing access control and permissions

Security isn’t just about technology—it’s about people and processes. When working on sensitive projects, proper access controls are essential.

For remote repositories on platforms like GitHub, GitLab, or Bitbucket, I implement:

Repository-level permissions:

  • Viewers: Can only read code
  • Contributors: Can submit pull requests
  • Maintainers: Can merge code
  • Administrators: Full control

Branch protection rules:

  • Require pull request reviews before merging
  • Require status checks to pass
  • Prevent force pushes
  • Restrict who can push to specific branches

On GitHub, you can set this up under Settings → Branches → Branch protection rules:

Branch name pattern: main
✓ Require pull request reviews before merging
  ✓ Require approvals (2)
  ✓ Dismiss stale pull request approvals when new commits are pushed
✓ Require status checks to pass before merging
  ✓ Require branches to be up to date before merging
✓ Include administrators

For self-hosted Git repositories, similar controls can be implemented using server-side hooks and authentication systems.

Using signed commits for verification

Signed commits prove that code changes come from trusted contributors. This adds another layer of security to your version control system.

To set up commit signing with GPG:

  1. Generate a GPG key:

gpg --full-generate-key

  2. Tell Git about your key:

# List your keys
gpg --list-secret-keys --keyid-format LONG

# Configure Git to use your key
git config --global user.signingkey YOUR_KEY_ID

  3. Sign commits:

# Sign a single commit
git commit -S -m "Add secure feature"

# Sign all commits by default
git config --global commit.gpgsign true

  4. Share your public key with your team or upload it to GitHub/GitLab.

In GitHub, signed commits show a “Verified” badge, giving teammates confidence in the code’s origin.

On a recent healthcare project with strict compliance requirements, we made signed commits mandatory for all developers. This gave us an audit trail of exactly who made each change, which was important for regulatory purposes.

When working with continuous integration systems, you can add checks that reject unsigned commits to sensitive branches, further enhancing your security posture.

For even stronger security in enterprise environments, consider implementing:

  • Hardware security keys for Git authentication
  • Pre-receive hooks to validate all pushed code
  • Regular security audits of repository access
  • Automated secret scanning to prevent credential leaks

The right approach depends on your security needs, but even basic measures like branch protection and signed commits significantly improve your repository’s security.

Conclusion

Understanding what Git is isn’t just helpful, it’s necessary for any modern developer. After spending years working with dozens of teams across various projects, I’ve seen how Git transforms the entire software development process.

Git’s version control capabilities go beyond just saving history. They create a safety net that lets developers try new ideas without fear. I’ve watched junior devs grow more confident simply by knowing they could always return to a previous working state.

# Quick reference for daily Git commands
git status        # What's changed?
git add .         # Stage all changes
git commit -m "Description of changes"  # Save changes
git push          # Share with team
git pull          # Get team changes

The branching system alone revolutionizes how teams collaborate. On a recent healthcare project, our team of 12 developers worked simultaneously on different features without stepping on each other’s toes. We created over 200 feature branches, each isolated until ready for review.

GitHub, GitLab, and Bitbucket extend Git’s capabilities by adding:

  • Issue tracking
  • Code review tools
  • CI/CD pipelines
  • Project management features
  • Team collaboration spaces

But the real power comes from understanding Git’s underlying concepts:

  • Repository structure: Working directory, staging area, commit history
  • Object model: Blobs, trees, commits, and tags
  • Remote vs. local: The distributed nature that makes Git resilient

When you grasp these fundamentals, even complex operations like resolving merge conflicts or performing an interactive rebase become logical steps rather than mysterious incantations.

Git shines in these real-world scenarios:

  1. Debugging production issues by identifying exactly when and why a bug was introduced
  2. Collaborating across time zones without waiting for teammates to come online
  3. Implementing continuous integration where every commit triggers automatic tests
  4. Managing multiple release versions simultaneously
  5. Contributing to open source development through forks and pull requests

For new developers, I recommend starting with basic commands and gradually exploring more advanced features like Git hooks or submodules as needed. Git’s power comes from its flexibility—you can use just the parts that solve your current problems.

As software engineering practices evolve, Git remains at the center, enabling modern DevOps workflows and supporting teams of all sizes. It’s not just a tool—it’s the foundation of how we build software together.
