Git Clone — Silent Corruption from Disk Limits
A 40GB monorepo clone on a 10GB CI disk caused silent corruption and intermittent 500 errors.
- Downloads the entire object database (commits, trees, blobs)
- Creates a remote called 'origin' pointing to the source URL
- Checks out the default branch so you have files to work with
- Wires up remote-tracking references for all branches
- --depth 1 — shallow clone, only the latest commit (CI pipelines)
- --branch — check out a specific branch or tag on clone
- --single-branch — fetch only one branch's history
- --no-tags — skip downloading release tag objects
Imagine a Google Doc that your whole team works on, but instead of everyone editing the same live file, Git hands each person a complete printed copy of the entire history — every draft, every edit, every version ever saved. Git clone is the moment you walk up to the printer and say 'give me my copy.' You now have everything offline, locally, and nothing you do to your copy touches anyone else's until you deliberately send changes back.
git clone creates a complete local copy of a remote repository. It downloads every commit, every branch, every tag — the entire object database going back to the first commit. This is not a file download; it's a full history replication.
In production, clone misconfigurations cause real outages. A shallow clone in a pipeline that later needs full history breaks git blame and git bisect. A clone on a disk without enough space leaves a corrupted repository that passes CI silently. Understanding what clone actually does under the hood prevents these failures.
Common misconceptions: that clone only downloads one branch (it downloads all branch data but only checks out the default), that shallow clones are always safe for CI (they break anything that traverses history), and that HTTPS and SSH clones are interchangeable (they have different authentication models and network requirements).
What Git Clone Actually Does (And Why You Need to Know)
Before you touch a terminal, understand what you're asking Git to do. Because if you think clone just 'downloads code,' you're going to make bad decisions later.
Every Git repository is a database of snapshots. Every time someone commits, Git stores a compressed snapshot of the entire project — not just the diff — plus metadata: who, when, what message, and a pointer to the parent commit. Clone copies all of it. Every snapshot. Every commit. Every branch tip. Every tag. The full history going back to the very first commit, potentially years ago.
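If you want to see that object database for yourself, a few read-only plumbing commands make it concrete. A minimal sketch, assuming you are inside any cloned repository (the directory name is a placeholder):

    cd my-repo                        # any cloned repository
    git count-objects -vH             # how many objects Git stores, and how much disk they use
    git cat-file -t HEAD              # type of the object HEAD points to: commit
    git cat-file -p HEAD              # the commit itself: tree pointer, parent, author, message
    git cat-file -p 'HEAD^{tree}'     # the tree: one entry per file/directory in that snapshot

Nothing here touches the network; everything clone downloaded is already on your disk.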
When you run git clone <url>, Git does five things in sequence: connects to the remote server, downloads every object in the repo's object database (commits, trees, blobs), reconstructs the history graph locally, creates a remote called origin that points back to the URL you used, and checks out the default branch so you have actual files to work with. That last step — the checkout — is why you see files appear. But the real value is everything Git stored before that step.
Why does this matter for you right now? Because understanding that clone downloads history explains every flag you'll need: why --depth exists, why --branch is useful, and why cloning without thinking can pull gigabytes you'll never need.
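A quick way to confirm those five steps actually happened is to clone and then inspect the result. A sketch, with the URL and directory name as placeholders:

    git clone https://github.com/your-org/repo.git
    cd repo
    git remote -v                 # 'origin' was created, pointing at the URL you cloned
    cat .git/config               # the same origin definition, stored as plain text
    git branch                    # the local branch that was checked out (usually main)
    git branch -r                 # remote-tracking refs for every branch on the remote
    git log --oneline -5          # history was replicated locally, no network needed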
- Origin is a named reference stored in .git/config
- You can have multiple remotes: origin, upstream, fork, etc.
- Renaming origin breaks every script and teammate workflow that assumes the convention
- git remote set-url origin <new-url> changes where origin points without renaming it
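For instance, a fork-based workflow typically carries two remotes. The names and URLs below are illustrative, not prescribed:

    git remote add upstream https://github.com/original-org/repo.git    # second remote alongside origin
    git remote -v                                                       # origin (your fork) plus upstream (the source)
    git remote set-url origin git@github.com:your-org/repo.git          # repoint origin without renaming it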
Use --depth 1 in CI — the full history is never needed for a build. Run git count-objects -vH to see how much space your clone is using.
Cloning with Control: The Flags That Actually Matter in Production
The basic clone works. But in production environments, CI pipelines, and large teams, naked git clone is often the wrong tool. Here's why: it downloads everything, always, unconditionally. A repo with five years of history and large binary assets can be several gigabytes. On a CI server spinning up a fresh container for every build, that's minutes of wasted time on every single pipeline run.
The fix isn't clever — it's just flags most people never learn about. --depth creates a shallow clone: it only fetches the most recent N commits instead of the full history. For a CI pipeline that just needs to build and test the current code, a depth of 1 is all you ever need. I've seen pipeline times drop from 4 minutes to 40 seconds on repos with long histories, just by adding --depth 1.
--branch lets you clone directly onto a specific branch or tag instead of the default. This is critical when your pipeline needs to build a release tag, or when a developer needs to start work on a feature branch without switching after the clone. --single-branch pairs with --depth to tell Git not to fetch any branch information except the one you asked for — keeping the clone tight and fast.
There's also --no-tags, which stops Git from downloading all the tag objects. Tags can add surprising size to a repo with lots of releases. And cloning into a specific directory name — by passing a path as the second argument — is underused. Your folder name should communicate intent, not just inherit whatever name the repo happened to have.
- --depth 1 downloads only the latest commit — perfect for CI builds
- --single-branch skips all other branches — reduces fetch size further
- --no-tags skips release tag objects — useful on repos with hundreds of releases
- --filter=blob:none (Git 2.25+) defers large file downloads until checkout
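Put together, a production-leaning clone might look like the sketch below. The URL, tag name, and directory names are placeholders:

    # CI build: latest commit only, one branch, no tags
    git clone --depth 1 --single-branch --no-tags https://github.com/your-org/repo.git app

    # Release pipeline: build a specific tag (v2.3.1 is illustrative)
    git clone --depth 1 --branch v2.3.1 https://github.com/your-org/repo.git release-build

    # Sanity check: how much did we actually download?
    cd app && git count-objects -vH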
The --filter=blob:none flag is the most underused production clone optimization. It tells Git to download commit and tree objects but defer blob (file content) downloads until checkout. On a monorepo with 100,000 files, this can reduce initial clone size by 90%+. The blobs are fetched on demand as you check out files. The trade-off: the first checkout of any file triggers a network fetch, which adds latency. For CI pipelines that check out the entire tree anyway, --depth 1 is simpler. For developers who only work in specific directories of a monorepo, --filter=blob:none saves significant time and disk.
Use --depth 1 for CI (read-only builds), --branch for specific release tags, --single-branch to minimize fetch, --no-tags to skip tag objects. Shallow clones are read-only — never push from them. --filter=blob:none (Git 2.25+) defers large file downloads for massive space savings on monorepos.
SSH vs HTTPS: Pick the Right Protocol Before You Waste an Hour
Every repository URL comes in two flavours and the choice between them matters more than most beginners realise. The wrong choice means re-entering passwords on every push, broken CI pipelines, or authentication failures that are genuinely confusing to debug.
HTTPS URLs look like https://github.com/your-org/repo.git. They work everywhere — through corporate proxies, firewalls, and restricted networks. The downside: they require credential authentication on every push and pull unless you configure a credential helper or use a personal access token baked into the URL (which is a security hazard you should never do — I've seen tokens committed to Dockerfiles this way and rotated in a panic).
SSH URLs look like git@github.com:your-org/repo.git. They use a keypair: a private key that stays on your machine, and a public key you register with GitHub/GitLab/Bitbucket once. After that, every clone, push, and pull is seamless — no passwords, no tokens, no prompts. For daily development, SSH is almost always the right choice. For CI/CD systems, HTTPS with a machine-level access token scoped to read-only is the standard — because private keys on ephemeral containers are operational debt.
You can always switch after the fact with git remote set-url, so getting this wrong isn't permanent. But getting it right from the start saves you the detour.
- ssh -T git@github.com — one command to verify SSH is working
- ed25519 keys are preferred over RSA — shorter, faster, more secure
- GitHub deprecated password auth in 2021 — HTTPS now requires personal access tokens
- CI systems use HTTPS with machine tokens injected as environment variables, never hardcoded
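As a concrete starting point, the sequence below generates an ed25519 key and verifies it. The email address and URLs are placeholders, and the public key still has to be pasted into your Git host's SSH-key settings page:

    ssh-keygen -t ed25519 -C "you@example.com"       # generate a keypair (accept the default path)
    cat ~/.ssh/id_ed25519.pub                        # copy this public key into GitHub/GitLab/Bitbucket
    ssh -T git@github.com                            # should greet you by username, not ask for a password
    git remote set-url origin git@github.com:your-org/repo.git    # switch an existing HTTPS clone to SSH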
Run ssh -T git@github.com before your first clone. You can switch protocols anytime with git remote set-url. Never hardcode tokens in source code or Dockerfiles.
What Happens After Clone: Getting Oriented Fast
Cloning is step one. Where developers get lost — especially when joining an existing project — is what to do immediately after. You have a local copy of the repo, but you might be missing context: which branches exist, what the project structure looks like, and how remote tracking actually works.
Right after cloning, you're on the default branch (usually main or master). But there are almost certainly other branches on the remote that aren't checked out locally yet. A common misconception: beginners think git clone only downloads one branch. It doesn't — it downloads all branch data, but only checks out the default one. The other branches exist as remote-tracking references like origin/feature/payment-retry. You can create a local branch from any of them without another network call.
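For example, turning one of those remote-tracking references into a local branch is a purely local operation. A sketch, using the branch name from the example above:

    git branch -a                                                       # local branches plus remote-tracking refs like origin/feature/payment-retry
    git switch -c feature/payment-retry origin/feature/payment-retry   # create a local branch from the remote-tracking ref; no network call
                                                                        # (git checkout -b works the same on Git older than 2.23)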
Understanding remote-tracking branches is what separates someone who's memorised clone from someone who actually knows Git. A remote-tracking branch like origin/main is Git's local snapshot of where main was on the remote the last time you fetched. It doesn't update automatically. That's what git fetch is for — and it's completely separate from git pull. Pull fetches and then merges. Fetch just updates your picture of the remote without touching your working files. In a codebase with active collaborators, git fetch before you start work is discipline, not optional.
- Clone: one-time operation to create a local repo from a remote
- Fetch: updates origin/main, origin/feature-x, etc. — no working directory changes
- Pull: fetch + merge in one step — convenient but hides what's about to change
- Production preference: fetch first, review incoming commits with git log main..origin/main, then merge explicitly (see the commands after this list)
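That fetch-first preference, written out as a command sequence, assuming the default branch is main:

    git fetch origin                       # update remote-tracking refs only; working tree untouched
    git log --oneline main..origin/main    # incoming commits you have not merged yet
    git merge origin/main                  # merge explicitly, now that you know what is coming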
The risk: git pull brings in changes that can break your local working tree. In teams with high commit velocity, pulling without fetching first means you merge blind. The safer workflow: git fetch origin to update your remote-tracking branches, git log main..origin/main to see what's incoming, review the commits, then git merge origin/main explicitly. This takes 30 seconds more and prevents the 'my code was working, I pulled, now it's broken' debugging sessions.
Run git branch -a to see all available branches and git fetch origin to update your remote-tracking references. Remote-tracking branches (like origin/main) are your local snapshot of the remote — they don't update automatically. Use git fetch before starting work, not git pull, so you can review incoming changes before merging.
40GB Monorepo Clone on 10GB CI Disk: Silent Corruption During Deploy
1. Added a pre-clone disk space check to the CI script: df -h / | awk 'NR==2 {print $4}' | grep -q '^[0-9]*G' && echo 'OK' || (echo 'INSUFFICIENT DISK' && exit 1).
2. Changed the clone command to use --depth 1 --single-branch --no-tags for all CI builds — reduced clone size from 40GB to 200MB.
3. Added a post-clone verification step: git fsck --full to detect repository corruption before proceeding.
4. Increased the CI server disk to 50GB as a safety margin.
5. Added set -o pipefail to the CI shell scripts so that failed git commands would stop the pipeline instead of being silently swallowed.
- Always check disk space before cloning large repositories. A pre-clone disk check costs nothing and prevents silent corruption.
- Shallow clones (--depth 1) are essential for CI pipelines on large repos. The full history is never needed for a build.
- Post-clone verification (git fsck) detects corruption that git status and git checkout miss. Add it to your CI pipeline.
- CI shell scripts must use set -o pipefail to catch command failures. Without it, failed git commands are silently ignored.
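Combined, those fixes amount to one small, defensive clone step. A sketch of what such a CI script might look like, assuming GNU df, a GitHub-style HTTPS URL, and a roughly 10GB free-space threshold; adjust all three to your environment:

    #!/usr/bin/env bash
    set -euo pipefail                                  # any failed command, even mid-pipe, stops the build

    # 1. Refuse to start without enough free disk (GNU df; threshold is illustrative)
    avail_kb=$(df --output=avail -k / | tail -n 1)
    (( avail_kb > 10 * 1024 * 1024 )) || { echo 'INSUFFICIENT DISK'; exit 1; }

    # 2. Shallow, single-branch, tag-free clone
    git clone --depth 1 --single-branch --no-tags https://github.com/your-org/repo.git app

    # 3. Verify the repository is not silently corrupted before building
    cd app && git fsck --full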
2. Delete it: rm -rf <directory-name> and re-clone.
3. Or clone into a new directory: git clone <url> <new-directory-name>.
4. If the directory has uncommitted work you need: copy it elsewhere before deleting.
1. Verify SSH is working: ssh -T git@github.com.
2. If it fails, your SSH key isn't registered or isn't being found.
3. Check if key exists: ls ~/.ssh/id_ed25519.pub.
4. If no key: generate with ssh-keygen -t ed25519 -C your@email.com and add to GitHub.
5. If key exists but not found: check ~/.ssh/config for correct IdentityFile setting.
1. Likely cause: the repository was cloned with --depth N (shallow clone).
2. Verify: git rev-parse --is-shallow-repository returns true.
3. To fetch full history: git fetch --unshallow.
4. Warning: on a large repo, this can take minutes and download gigabytes.
5. Prevention: don't use --depth for development clones where you need full history.
1. Check basic connectivity: ping github.com and traceroute github.com.
2. Ensure Git is using the newer wire protocol: git config --global protocol.version 2.
3. Try a shallow clone first: git clone --depth 1 <url> to verify connectivity.
4. If behind a corporate proxy: configure git config --global http.proxy http://proxy:port.
5. If cloning via SSH is slow: try HTTPS instead (or vice versa) to isolate protocol issues.
1. Likely cause: the repository was cloned with --depth (shallow clone). Shallow clones cannot push.
2. Option A: deepen the clone: git fetch --unshallow then push.
3. Option B: delete and re-clone without --depth.
4. Prevention: never use --depth for repos where you'll commit and push.