Beginner · 5 min · March 28, 2026

Git Clone — Silent Corruption from Disk Limits

A 40GB monorepo clone on a 10GB CI disk caused silent corruption and intermittent 500 errors.

Naren · Founder
Plain-English first. Then code. Then the interview question.
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Downloads the entire object database (commits, trees, blobs)
  • Creates a remote called 'origin' pointing to the source URL
  • Checks out the default branch so you have files to work with
  • Wires up remote-tracking references for all branches
  • --depth 1 — shallow clone, only latest commit (CI pipelines)
  • --branch — check out a specific branch or tag on clone
  • --single-branch — fetch only one branch's history
  • --no-tags — skip downloading release tag objects
Plain-English First

Imagine a Google Doc that your whole team works on, but instead of everyone editing the same live file, Git hands each person a complete printed copy of the entire history — every draft, every edit, every version ever saved. Git clone is the moment you walk up to the printer and say 'give me my copy.' You now have everything offline, locally, and nothing you do to your copy touches anyone else's until you deliberately send changes back.

git clone creates a complete local copy of a remote repository. It downloads every commit, every branch, every tag — the entire object database going back to the first commit. This is not a file download; it's a full history replication.

In production, clone misconfigurations cause real outages. A shallow clone in a pipeline that later needs full history breaks git blame and git bisect. A clone on a disk without enough space leaves a corrupted repository that passes CI silently. Understanding what clone actually does under the hood prevents these failures.

Common misconceptions: that clone only downloads one branch (it downloads all branch data but only checks out the default), that shallow clones are always safe for CI (they break anything that traverses history), and that HTTPS and SSH clones are interchangeable (they have different authentication models and network requirements).

What Git Clone Actually Does (And Why You Need to Know)

Before you touch a terminal, understand what you're asking Git to do. Because if you think clone just 'downloads code,' you're going to make bad decisions later.

Every Git repository is a database of snapshots. Every time someone commits, Git stores a compressed snapshot of the entire project — not just the diff — plus metadata: who, when, what message, and a pointer to the parent commit. Clone copies all of it. Every snapshot. Every commit. Every branch tip. Every tag. The full history going back to the very first commit, potentially years ago.

When you run git clone <url>, Git does five things in sequence: connects to the remote server, downloads every object in the repo's object database (commits, trees, blobs), reconstructs the history graph locally, creates a remote called origin that points back to the URL you used, and checks out the default branch so you have actual files to work with. That last step — the checkout — is why you see files appear. But the real value is everything Git stored before that step.

Why does this matter for you right now? Because understanding that clone downloads history explains every flag you'll need: why --depth exists, why --branch is useful, and why cloning without thinking can pull gigabytes you'll never need.

io/thecodeforge/git/BasicClone.sh (bash)
# io.thecodeforge — Git Clone Basics

# The most basic clone — downloads the full repo with all history
# Replace the URL with any real repository URL you have access to
git clone https://github.com/your-org/your-repo.git

# By default, this creates a folder named after the repo (your-repo)
# and puts all files inside it. cd into it to start working.
cd your-repo

# Verify the clone worked — see which branch you're on
# and confirm the remote 'origin' was configured automatically
git status
git remote -v

# Check that you have the full history
# This shows the last 5 commits on the current branch
git log --oneline -5

# Inspect what clone actually stored locally
git count-objects -vH
# Shows: count (loose objects), size (disk usage), in-pack (packed objects)
# This tells you how much space the clone is using.
Output
Cloning into 'your-repo'...
remote: Enumerating objects: 1482, done.
remote: Counting objects: 100% (1482/1482), done.
remote: Compressing objects: 100% (731/731), done.
remote: Total 1482 (delta 619), reused 1389 (delta 540), pack-reused 0
Receiving objects: 100% (1482/1482), 4.23 MiB | 3.11 MiB/s, done.
Resolving deltas: 100% (619/619), done.
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
origin https://github.com/your-org/your-repo.git (fetch)
origin https://github.com/your-org/your-repo.git (push)
a3f91c2 Add retry logic to payment processor
88c4d01 Fix null pointer in order validation
3b2e771 Refactor checkout flow into separate service
9d0a114 Add integration tests for cart service
1ca8823 Initial commit
Origin Is a Nickname, Not a Server
  • Origin is a named reference stored in .git/config
  • You can have multiple remotes: origin, upstream, fork, etc.
  • Renaming origin breaks every script and teammate workflow that assumes the convention
  • git remote set-url origin <new-url> changes where origin points without renaming it
Production Insight
Understanding that clone downloads the full object database explains why monorepo clones are slow and why shallow clones exist. A five-year-old monorepo with 50,000 commits and large binary assets can be 40GB+ in object data. Clone downloads all of it — every snapshot, every blob, every tree. There is no 'partial clone' without explicit flags. This is why CI pipelines on large repos must use --depth 1 — the full history is never needed for a build.
Key Takeaway
Clone copies the entire object database — all commits, all branches, all tags. It's not a file download; it's a full history replication. Understanding this explains every flag, every failure mode, and why monorepo clones can be gigabytes. Use git count-objects -vH to see how much space your clone is using.
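You can measure this yourself by cloning the same repository twice, once in full and once shallow, and comparing what Git stored. A sketch; the URL is a placeholder for any repository you have access to:

```shell
# Sketch: compare a full clone against a shallow one (placeholder URL).
git clone https://github.com/your-org/your-repo.git full-copy
git clone --depth 1 https://github.com/your-org/your-repo.git shallow-copy

# Disk usage of each object database
du -sh full-copy/.git shallow-copy/.git

# Git's own accounting — compare the in-pack and size-pack figures
git -C full-copy count-objects -vH
git -C shallow-copy count-objects -vH

# Confirm the shallow copy really carries a single commit of history
git -C shallow-copy rev-list --count HEAD
```

On a repository with years of history, the shallow copy's size-pack figure is typically a small fraction of the full one's.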

Cloning with Control: The Flags That Actually Matter in Production

The basic clone works. But in production environments, CI pipelines, and large teams, naked git clone is often the wrong tool. Here's why: it downloads everything, always, unconditionally. A repo with five years of history and large binary assets can be several gigabytes. On a CI server spinning up a fresh container for every build, that's minutes of wasted time on every single pipeline run.

The fix isn't clever — it's just flags most people never learn about. --depth creates a shallow clone: it only fetches the most recent N commits instead of the full history. For a CI pipeline that just needs to build and test the current code, a depth of 1 is all you ever need. I've seen pipeline times drop from 4 minutes to 40 seconds on repos with long histories, just by adding --depth 1.

--branch lets you clone directly onto a specific branch or tag instead of the default. This is critical when your pipeline needs to build a release tag, or when a developer needs to start work on a feature branch without switching after the clone. --single-branch pairs with --depth to tell Git not to fetch any branch information except the one you asked for — keeping the clone tight and fast.

There's also --no-tags, which stops Git from downloading all the tag objects. Tags can add surprising size to a repo with lots of releases. And cloning into a specific directory name — by passing a path as the second argument — is underused. Your folder name should communicate intent, not just inherit whatever name the repo happened to have.

io/thecodeforge/git/ProductionCloneFlags.sh (bash)
# io.thecodeforge — Production Clone Flags

# --- SCENARIO: CI/CD pipeline building a Node.js checkout service ---
# We only need the current state of main. Full history wastes time and disk.

# Shallow clone: only fetch the single most recent commit (depth=1)
# --single-branch: skip all other branch refs — keeps the fetch minimal
# --no-tags: skip downloading release tags — we don't need them for a build
git clone \
  --depth 1 \
  --single-branch \
  --no-tags \
  https://github.com/your-org/checkout-service.git

# --- SCENARIO: Developer needs to start work on a specific feature branch ---
# --branch accepts a branch name OR a tag name
# Clones directly onto the feature branch — no need to checkout after
git clone \
  --branch feature/payment-retry \
  https://github.com/your-org/checkout-service.git

# --- SCENARIO: Clone into a custom directory name ---
# Second positional argument overrides the folder name
# Useful when the repo name is generic or conflicts with another local folder
git clone \
  https://github.com/your-org/checkout-service.git \
  checkout-service-v2

# --- SCENARIO: Clone a specific release tag for a deployment ---
# Perfect for reproducible deployments — you get exactly what was tagged
git clone \
  --depth 1 \
  --branch v2.4.1 \
  --single-branch \
  https://github.com/your-org/checkout-service.git \
  checkout-service-release

# Verify the shallow clone only has 1 commit in history
cd checkout-service-release
git log --oneline

# --- SCENARIO: Partial clone — download large files on demand ---
# Git 2.25+ supports filter-based partial clones
# --filter=blob:none: don't download file contents until needed
# Saves massive space on repos with large binary assets
git clone \
  --filter=blob:none \
  https://github.com/your-org/monorepo.git

# Verify the partial clone: list objects Git knows about from history
# but has not downloaded yet (they print with a '?' prefix)
git rev-list --objects --all --missing=print | grep '^?' | head -5
# These blobs are fetched from the remote the first time a checkout needs them
Output
# Output for shallow clone:
Cloning into 'checkout-service'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 47 (delta 3), reused 36 (delta 0), pack-reused 0
Receiving objects: 100% (47/47), 312.44 KiB | 5.22 MiB/s, done.
Resolving deltas: 100% (3/3), done.
# Output for the release tag clone + git log:
Cloning into 'checkout-service-release'...
remote: Enumerating objects: 47, done.
remote: Total 47 (delta 3), reused 36 (delta 0), pack-reused 0
Receiving objects: 100% (47/47), 312.44 KiB | 5.22 MiB/s, done.
Note: switching to 'v2.4.1'.
You are in 'detached HEAD' state.
f91a3c8 (HEAD, tag: v2.4.1) Release v2.4.1 — payment retry with backoff
Production Trap: Shallow Clone Plus Push Equals Rejected
  • --depth 1 downloads only the latest commit — perfect for CI builds
  • --single-branch skips all other branches — reduces fetch size further
  • --no-tags skips release tag objects — useful on repos with hundreds of releases
  • --filter=blob:none (Git 2.25+) defers large file downloads until checkout
Production Insight
The --filter=blob:none flag is the most underused production clone optimization. It tells Git to download commit and tree objects but defer blob (file content) downloads until checkout. On a monorepo with 100,000 files, this can reduce initial clone size by 90%+. The blobs are fetched on-demand as you checkout files. The trade-off: first checkout of any file triggers a network fetch, which adds latency. For CI pipelines that checkout the entire tree anyway, --depth 1 is simpler. For developers who only work in specific directories of a monorepo, --filter=blob:none saves significant time and disk.
Key Takeaway
Production clones need flags: --depth 1 for CI (read-only builds), --branch for specific release tags, --single-branch to minimize fetch, --no-tags to skip tag objects. Treat shallow clones as read-only: pushing from one can be rejected with 'fatal: shallow update not allowed'. --filter=blob:none (Git 2.25+) defers large file downloads for massive space savings on monorepos.
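For the developer-in-a-monorepo case, partial clone pairs naturally with sparse-checkout. A minimal sketch, assuming Git 2.25+; the URL and directory names are hypothetical:

```shell
# Sketch: partial clone + sparse-checkout for a monorepo (Git 2.25+).
# The URL and directory names below are placeholders.
git clone --filter=blob:none --sparse https://github.com/your-org/monorepo.git
cd monorepo

# --sparse starts you with only top-level files checked out.
# Opt in to the directories you actually work in; their blobs are
# fetched from the remote on demand.
git sparse-checkout set services/checkout services/payments

# See which paths are currently included in the working tree
git sparse-checkout list
```

The combination means your working tree and your object database both stay proportional to the part of the monorepo you touch, not to the whole repository.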

SSH vs HTTPS: Pick the Right Protocol Before You Waste an Hour

Every repository URL comes in two flavours and the choice between them matters more than most beginners realise. The wrong choice means re-entering passwords on every push, broken CI pipelines, or authentication failures that are genuinely confusing to debug.

HTTPS URLs look like https://github.com/your-org/repo.git. They work everywhere — through corporate proxies, firewalls, and restricted networks. The downside: they require credential authentication on every push and pull unless you configure a credential helper or use a personal access token baked into the URL (which is a security hazard you should never do — I've seen tokens committed to Dockerfiles this way and rotated in a panic).

SSH URLs look like git@github.com:your-org/repo.git. They use a keypair: a private key that stays on your machine, and a public key you register with GitHub/GitLab/Bitbucket once. After that, every clone, push, and pull is seamless — no passwords, no tokens, no prompts. For daily development, SSH is almost always the right choice. For CI/CD systems, HTTPS with a machine-level access token scoped to read-only is the standard — because private keys on ephemeral containers are operational debt.

You can always switch after the fact with git remote set-url, so getting this wrong isn't permanent. But getting it right from the start saves you the detour.

io/thecodeforge/git/CloneProtocolComparison.sh (bash)
# io.thecodeforge — Clone Protocol Comparison

# --- HTTPS clone ---
# Works immediately, no setup required
# GitHub will prompt for username + personal access token on push
git clone https://github.com/your-org/inventory-service.git

# --- SSH clone ---
# Requires SSH key already added to your GitHub/GitLab account
# If your key is set up, this never prompts for a password
git clone git@github.com:your-org/inventory-service.git

# --- Check which URL your clone is currently using ---
cd inventory-service
git remote -v

# --- Switch from HTTPS to SSH after cloning ---
# Useful if you cloned HTTPS and now want seamless pushes
git remote set-url origin git@github.com:your-org/inventory-service.git

# --- Switch from SSH back to HTTPS (common fix in restricted networks) ---
git remote set-url origin https://github.com/your-org/inventory-service.git

# --- Verify the change took effect ---
git remote -v

# --- Test your SSH key is correctly configured BEFORE cloning ---
# This handshakes with GitHub without needing a repo
# Look for: "Hi your-username! You've successfully authenticated"
ssh -T git@github.com

# --- CI/CD pattern: HTTPS with machine token via environment variable ---
# Never hardcode tokens. Store as CI secret, inject at clone time.
git clone https://x-access-token:${GITHUB_TOKEN}@github.com/your-org/repo.git
# GITHUB_TOKEN is a CI environment variable, never in source code.
Output
# After HTTPS clone, git remote -v:
origin https://github.com/your-org/inventory-service.git (fetch)
origin https://github.com/your-org/inventory-service.git (push)
# After switching to SSH, git remote -v:
origin git@github.com:your-org/inventory-service.git (fetch)
origin git@github.com:your-org/inventory-service.git (push)
# SSH test output:
Hi your-username! You've successfully authenticated, but GitHub does not provide shell access.
# If your SSH key isn't set up, you'll see:
Permission denied (publickey).
Senior Shortcut: Test SSH Before Your First Clone
  • ssh -T git@github.com — one command to verify SSH is working
  • ed25519 keys are preferred over RSA — shorter, faster, more secure
  • GitHub deprecated password auth in 2021 — HTTPS now requires personal access tokens
  • CI systems use HTTPS with machine tokens injected as environment variables, never hardcoded
Production Insight
The protocol choice has security implications beyond convenience. HTTPS with tokens stored in URLs or config files is a common source of credential leaks — I've seen tokens committed to Dockerfiles, .gitconfig files, and CI YAML. SSH keys are safer because the private key never leaves your machine. But SSH on ephemeral CI containers creates operational debt: you need to inject the private key, ensure correct permissions (chmod 600), and clean up after the build. The industry standard for CI is HTTPS with a short-lived machine token scoped to read-only access, stored as a CI secret and injected at runtime.
Key Takeaway
SSH for daily development (seamless auth after key setup). HTTPS with machine tokens for CI/CD (no private key management on ephemeral containers). Always test SSH with ssh -T git@github.com before your first clone. You can switch protocols anytime with git remote set-url. Never hardcode tokens in source code or Dockerfiles.

What Happens After Clone: Getting Oriented Fast

Cloning is step one. Where developers get lost — especially when joining an existing project — is what to do immediately after. You have a local copy of the repo, but you might be missing context: which branches exist, what the project structure looks like, and how remote tracking actually works.

Right after cloning, you're on the default branch (usually main or master). But there are almost certainly other branches on the remote that aren't checked out locally yet. A common misconception: beginners think git clone only downloads one branch. It doesn't — it downloads all branch data, but only checks out the default one. The other branches exist as remote-tracking references like origin/feature/payment-retry. You can create a local branch from any of them without another network call.

Understanding remote-tracking branches is what separates someone who's memorised clone from someone who actually knows Git. A remote-tracking branch like origin/main is Git's local snapshot of where main was on the remote the last time you fetched. It doesn't update automatically. That's what git fetch is for — and it's completely separate from git pull. Pull fetches and then merges. Fetch just updates your picture of the remote without touching your working files. In a codebase with active collaborators, git fetch before you start work is discipline, not optional.

io/thecodeforge/git/PostCloneOrientation.sh (bash)
# io.thecodeforge — Post-Clone Orientation

# --- After cloning a team repo, orient yourself immediately ---
cd your-repo

# See all branches — local AND remote-tracking
# -a flag shows both; remote branches appear as remotes/origin/branch-name
git branch -a

# See just the remote-tracking branches that exist
git branch -r

# --- Check out a remote branch to work on it locally ---
# Git is smart enough to create the local branch and track the remote one
# automatically when the branch name is unambiguous
git checkout feature/order-validation

# The long-form version of the above — explicit about what's happening:
# Creates local branch 'feature/order-validation' tracking 'origin/feature/order-validation'
git checkout -b feature/order-validation origin/feature/order-validation

# --- Update your view of the remote without touching your local files ---
# Do this at the start of every working session on a shared repo
git fetch origin

# After fetching, see what commits exist on origin/main that aren't in your local main
# Double-dot notation: show commits reachable from origin/main but NOT from main
git log main..origin/main --oneline

# --- See the full project layout immediately after cloning ---
# Shows top-level structure — helps you find entry points fast on an unfamiliar repo
ls -la
git log --oneline --graph --decorate -10

# --- Understand the remote configuration ---
git remote show origin
# Shows: fetch URL, push URL, HEAD branch, remote branches, local branches tracking remote
Output
# git branch -a output:
* main
remotes/origin/HEAD -> origin/main
remotes/origin/main
remotes/origin/feature/order-validation
remotes/origin/feature/payment-retry
remotes/origin/hotfix/cart-null-check
# After: git checkout feature/order-validation
Branch 'feature/order-validation' set up to track remote branch 'feature/order-validation' from 'origin'.
Switched to a new branch 'feature/order-validation'
# git fetch output:
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Total 7 (delta 2), reused 6 (delta 1)
Unpacking objects: 100% (7/7), done.
From https://github.com/your-org/your-repo
a3f91c2..d88b41c main -> origin/main
# git log main..origin/main:
d88b41c Add rate limiting to order submission endpoint
c71f903 Update README with new environment variables
Clone vs Fetch vs Pull
  • Clone: one-time operation to create a local repo from a remote
  • Fetch: updates origin/main, origin/feature-x, etc. — no working directory changes
  • Pull: fetch + merge in one step — convenient but hides what's about to change
  • Production preference: fetch first, review incoming commits with git log main..origin/main, then merge explicitly
Production Insight
The fetch-then-review workflow prevents 'surprise merges' where git pull brings in changes that break your local working tree. In teams with high commit velocity, pulling without fetching first means you merge blind. The safer workflow: git fetch origin to update your remote-tracking branches, git log main..origin/main to see what's incoming, review the commits, then git merge origin/main explicitly. This takes 30 seconds more and prevents the 'my code was working, I pulled, now it's broken' debugging sessions.
Key Takeaway
After cloning, run git branch -a to see all available branches and git fetch origin to update your remote-tracking references. Remote-tracking branches (like origin/main) are your local snapshot of the remote — they don't update automatically. Use git fetch before starting work, not git pull, so you can review incoming changes before merging.
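The fetch-then-review discipline this section describes condenses to three commands. A sketch, assuming your default branch is main:

```shell
# 1. Update remote-tracking refs only; your working tree is untouched
git fetch origin

# 2. Review what's incoming: commits on origin/main that your main lacks
git log --oneline main..origin/main

# 3. Merge explicitly once you've read the incoming commits
git merge origin/main
```

The extra step between fetch and merge is the whole point: it is the moment where you see what git pull would have applied blind.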
● Production Incident · POST-MORTEM · Severity: high

40GB Monorepo Clone on 10GB CI Disk: Silent Corruption During Deploy

Symptom
The deploy pipeline completed with green status. The service started but returned 500 errors on 30% of requests. No error logs from the application — the errors were from missing source files that the checkout step never completed. Monitoring showed intermittent failures, not total failure, because only some files were missing.
Assumption
The team initially assumed a code bug in the latest release. They rolled back to the previous release, but the rollback deploy used the same pipeline — which ran the same clone onto the same full disk. The rollback also had missing files. The team spent two hours investigating application code before checking the CI disk.
Root cause
1. The CI server had a 10GB disk. The monorepo was 40GB (five years of history with large binary assets).
2. The clone started downloading objects. At approximately 9.2GB, the disk filled up.
3. Git's clone operation failed mid-transfer but did not exit with a clear error in the CI environment (the CI runner swallowed the exit code).
4. The partial clone left a .git directory with incomplete objects and a working tree with missing files.
5. Subsequent git commands (checkout, status) operated on the corrupted repository without detecting the corruption.
6. The build step compiled whatever files were present — the Java compiler skipped missing files silently (they were in different modules).
7. The deploy step deployed the partial build. The service started but crashed on requests that hit missing code paths.
Fix
1. Immediate: added a pre-clone disk space check to the CI pipeline that fails the build below a threshold, e.g. [ "$(df -BG --output=avail / | tail -1 | tr -dc '0-9')" -ge 45 ] || { echo 'INSUFFICIENT DISK'; exit 1; }
2. Changed the clone command to use --depth 1 --single-branch --no-tags for all CI builds — reduced clone size from 40GB to 200MB.
3. Added a post-clone verification step: git fsck --full to detect repository corruption before proceeding.
4. Increased the CI server disk to 50GB as a safety margin.
5. Added set -eo pipefail to the CI shell scripts so that failed git commands stop the pipeline instead of being silently swallowed.
Key lesson
  • Always check disk space before cloning large repositories. A pre-clone disk check costs nothing and prevents silent corruption.
  • Shallow clones (--depth 1) are essential for CI pipelines on large repos. The full history is never needed for a build.
  • Post-clone verification (git fsck) detects corruption that git status and git checkout miss. Add it to your CI pipeline.
  • CI shell scripts must run with set -eo pipefail so any failed git command stops the pipeline. Without it, failures are silently ignored.
Production Debug Guide: systematic recovery paths for clone failures, disk issues, and authentication problems · 5 entries
Symptom · 01
Clone fails with 'fatal: destination path already exists and is not an empty directory'
Fix
1. A previous clone or directory creation left a partial .git folder.
2. Remove the directory: rm -rf <directory-name> and re-clone.
3. Or clone into a new directory: git clone <url> <new-directory-name>.
4. If the directory has uncommitted work you need: copy it elsewhere before deleting.
Symptom · 02
Clone fails with 'Permission denied (publickey)' on SSH URL
Fix
1. Test SSH connectivity: ssh -T git@github.com.
2. If it fails, your SSH key isn't registered or isn't being found.
3. Check if a key exists: ls ~/.ssh/id_ed25519.pub.
4. If no key: generate one with ssh-keygen -t ed25519 -C your@email.com and add it to GitHub.
5. If the key exists but isn't found: check ~/.ssh/config for the correct IdentityFile setting.
Symptom · 03
Clone succeeds but git log shows truncated history (only recent commits)
Fix
1. You likely cloned with --depth N (shallow clone).
2. Verify: git rev-parse --is-shallow-repository returns true.
3. To fetch full history: git fetch --unshallow.
4. Warning: on a large repo, this can take minutes and download gigabytes.
5. Prevention: don't use --depth for development clones where you need full history.
Symptom · 04
Clone hangs or is extremely slow on a large repository
Fix
1. Check the network: ping github.com and traceroute github.com.
2. Make sure Git uses the v2 wire protocol: git config --global protocol.version 2.
3. Try a shallow clone first: git clone --depth 1 <url> to verify connectivity.
4. If behind a corporate proxy: configure git config --global http.proxy http://proxy:port.
5. If cloning via SSH is slow: try HTTPS instead (or vice versa) to isolate protocol issues.
Symptom · 05
Push from cloned repo fails with 'fatal: shallow update not allowed'
Fix
1. You cloned with --depth (shallow clone); servers commonly reject pushes from shallow clones.
2. Option A: deepen the clone with git fetch --unshallow, then push.
3. Option B: delete and re-clone without --depth.
4. Prevention: never use --depth for repos where you'll commit and push.
★ Git Clone Triage Cheat Sheet: fast recovery for clone failures in CI and development environments
Clone fails — disk full mid-transfer
Immediate action
Free disk space and re-clone with shallow depth.
Commands
df -h / (check available disk space)
du -sh <repo-dir> (check partial clone size)
Fix now
Re-clone with --depth 1 --single-branch --no-tags. Add disk check to CI pipeline.
'Permission denied (publickey)' on SSH clone
Immediate action
Test SSH key registration before troubleshooting further.
Commands
ssh -T git@github.com (test SSH connectivity)
ls ~/.ssh/id_ed25519.pub (check if key exists)
Fix now
If no key: ssh-keygen -t ed25519. If key exists: add to GitHub Settings > SSH Keys.
Clone hangs for minutes on large repository
Immediate action
Kill the clone and try shallow clone to verify connectivity.
Commands
Ctrl+C to kill, then git clone --depth 1 <url> (test with shallow)
git config --global protocol.version 2 (use Git v2 protocol)
Fix now
If shallow works: use --depth 1 for CI. If still slow: check proxy/network config.
'fatal: shallow update not allowed' on push
Immediate action
You pushed from a shallow clone. Deepen or re-clone.
Commands
git rev-parse --is-shallow-repository (confirm shallow status)
git fetch --unshallow (download full history)
Fix now
After unshallow: push normally. Prevention: never --depth for development clones.
CI pipeline clone succeeds but build fails with missing files
Immediate action
Check if the clone completed fully — disk may have filled mid-transfer.
Commands
git fsck --full (detect repository corruption)
git status (check for missing or incomplete files)
Fix now
If corrupted: delete and re-clone. Add git fsck to CI as post-clone verification.
HTTPS vs SSH Clone Protocol
Aspect | HTTPS Clone | SSH Clone
URL format | https://github.com/org/repo.git | git@github.com:org/repo.git
Initial setup required | None — works immediately | SSH key generation + GitHub registration
Authentication on push | Username + personal access token prompt | Seamless — no prompt after key setup
Works through corporate proxy/firewall | Yes — uses port 443 | Sometimes blocked — uses port 22
Best for | CI/CD pipelines, quick one-off clones | Daily development on your own machine
Credential storage risk | Token can leak if stored in URL | Private key stays on your machine only
Switching after clone | git remote set-url origin <ssh-url> | git remote set-url origin <https-url>

Key takeaways

1. Git clone doesn't just download files: it copies the entire object database, all history, and all branch references. Understanding that is why every flag and every failure mode makes sense.
2. Shallow clones with --depth 1 are a legitimate production tool for CI pipelines, but treat them as read-only: commit and push from one and you can hit 'fatal: shallow update not allowed' at the worst possible moment.
3. Reach for SSH when you're doing daily development on your own machine. Reach for HTTPS with a scoped access token when you're configuring a CI/CD system. The decision is about the environment, not personal preference.
4. Remote-tracking branches like origin/main are Git's local memory of what the remote looked like last time you talked to it. They don't update automatically. Run 'git fetch' at the start of every session; 'git pull' is a shortcut that skips the moment where you check what you're about to merge.
5. Always verify disk space before cloning large repositories. A pre-clone disk check and post-clone git fsck prevent silent corruption that passes CI with green status.
6. --filter=blob:none (Git 2.25+) defers large file downloads until checkout: the most underused optimization for monorepo clones.

FAQ · 5 QUESTIONS

Frequently Asked Questions

01 · How do I git clone a specific branch instead of main?
02 · What's the difference between git clone and git pull?
03 · How do I clone a private repository?
04 · Can a shallow clone cause problems if you later need the full history?
05 · What is --filter=blob:none and when should I use it?