Mid-level 9 min · March 28, 2026

Python range() — range(1, n) Skips Index 0 Silently

range(1, n) on a zero-indexed list silently skips index 0, and no exception is raised.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • range() is a lazy sequence object — stores only start, stop, step as three integers, not a list of numbers
  • range(1_000_000) costs 48 bytes of memory; list(range(1_000_000)) costs ~8MB in shallow pointer storage alone — the true allocation including integer objects is closer to 35MB. Never convert unnecessarily
  • Stop is always exclusive — range(0, 5) gives 0,1,2,3,4, never 5. This is the #1 source of off-by-one bugs in production Python
  • Membership testing with an integer (x in range(n)) is O(1) via an arithmetic formula, not O(n) like list scanning. This is the interview differentiator most candidates miss
  • Use range(n) for counted loops, range(len(x)) only for in-place index writes, enumerate() for index+value pairs, and zip() for parallel sequences — zip() stops at the shorter sequence, so use itertools.zip_longest() when lengths may differ
  • Biggest production trap: range(1, len(collection)) silently skips index 0 — first record never processed, no exception raised, no warning emitted
  • range() only accepts integers — float steps raise TypeError; use a list comprehension with round() for approximate decimal sequences, or decimal.Decimal arithmetic for financial precision
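The float-step limitation in the last bullet can be worked around in a few lines. The following is a minimal sketch, not a standard-library function: `decimal_range` is a hypothetical helper name, it assumes a positive step, and it takes strings so that `Decimal` parses the values exactly.

```python
from decimal import Decimal


def decimal_range(start: str, stop: str, step: str) -> list[Decimal]:
    """Hypothetical helper: an exclusive-stop sequence with a decimal step.

    Arguments are strings so Decimal parses them exactly; a positive step
    is assumed for brevity.
    """
    first, last, gap = Decimal(start), Decimal(stop), Decimal(step)
    if gap <= 0:
        raise ValueError("this sketch only supports a positive step")
    values = []
    count = 0
    # Multiply instead of accumulating so each value is computed from start.
    while first + count * gap < last:
        values.append(first + count * gap)
        count += 1
    return values


# range() itself rejects a float step.
try:
    range(0, 1, 0.25)
except TypeError as error:
    print(f"TypeError: {error}")

print(decimal_range("0", "1", "0.25"))

# Approximate float alternative: drive the count with range(), then scale.
approx = [round(i * 0.1, 1) for i in range(5)]
print(approx)
```

The stop value stays exclusive here, matching range() itself, so the sequence for "0", "1", "0.25" ends at 0.75.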
Plain-English First

Imagine you are a factory floor supervisor telling a worker: 'Start at station 3, work through to station 9, skip every other station.' You are not handing them a written list of stations — you are giving them a rule they follow as they go. That is range(). It is an instruction set for counting, not a physical list of numbers sitting in memory. Your program follows the rule step by step without ever storing all the numbers at once — which is why range(1, 1_000_000) does not eat your RAM the way a list of a million numbers would. The rule takes three things to define: where to start, when to stop, and how big each step is. Change any of those three and you get a completely different counting pattern, all with the same 48 bytes of overhead.

The off-by-one error killed a batch job at a fintech I consulted for — 99,999 records processed instead of 100,000, one customer's nightly billing accumulation never updated, and nobody noticed for six weeks because the total count was close enough to pass the eyeball test. No exception was raised. No alert fired. The job logged 'success' every single night. The culprit was a misunderstood range() call. One wrong number in the start argument. Six weeks of silent wrong data.

range() is the engine behind almost every counted loop you write in Python. Get it wrong and your loops silently skip data or process one record too many, with no complaint. Get it right and you have precise, memory-efficient control over iteration that scales from five items to five billion without changing a single line of logic. This is not academic: data pipelines, retry loops, pagination handlers, and batch processors in Python routinely lean on range().

By the end of this guide you will know exactly how range() works under the hood, why it does not store numbers in memory, how to count backwards without any workaround, how membership testing in range() is O(1) when list scanning is O(n), and the specific mistake patterns that cause silent data corruption in production loops. You will write range() calls with confidence and spot broken ones in code review on sight.

All code examples and memory figures in this article were verified on CPython 3.12. The sys.getsizeof values you see reflect CPython's internal representation for that version — results will differ slightly on Python 3.10 or 3.11, and will differ more substantially on PyPy or Jython. The algorithmic properties — O(1) membership testing, lazy evaluation, 48-byte range object size — hold across all CPython versions from 3.2 onwards.

What range() Actually Is — And Why It's Not a List

Before range() existed in its modern form, Python 2 had two functions: range(), which returned an actual list, and xrange(), which returned a lazy sequence object. Developers who needed to loop a million times with the old range() would inadvertently build a list of a million integers in memory just to drive the iteration, pure overhead with no payoff. Python 3 collapsed them: range() is now always lazy. It never builds the full list. It stores exactly three integers (start, stop, step) and calculates each value on demand as the loop advances.

This matters the moment you are paginating through a database result set, iterating over file offsets, or running a retry loop in a distributed system where worker memory is constrained. You are not paying a memory cost proportional to the count — you are paying for three integers, always, regardless of whether the range covers five numbers or five billion.
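The constant-cost claim is easy to check for yourself. This is a small sketch assuming a 64-bit CPython build; the variable names are illustrative only.

```python
import sys

small = range(5)
huge = range(5_000_000_000)

# The three stored values are exposed as read-only attributes.
print(huge.start, huge.stop, huge.step)

# len() is computed from start, stop and step, not by counting elements.
print(len(huge))

# Same shallow object size for five values or five billion.
print(sys.getsizeof(small), sys.getsizeof(huge))
same_size = sys.getsizeof(small) == sys.getsizeof(huge)

# Values are calculated on demand, so indexing deep into the range is cheap.
print(huge[4_999_999_999])
```

Nothing in this snippet allocates memory in proportion to the five billion values the second range describes.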

One important precision that most articles skip: sys.getsizeof() on a list returns the shallow size, meaning the container and its array of pointers to integer objects, but not the integer objects themselves. For list(range(1_000_000)), sys.getsizeof() reports roughly 8MB. But CPython allocates heap memory for every integer outside the small-integer cache (-5 to 256). The 999,743 integers from 257 to 999,999 each occupy roughly 28 bytes, adding approximately 27MB to the true footprint. Total real allocation: closer to 35MB, not 8MB. The range() object stays at 48 bytes regardless of length. That figure is also a shallow size: it covers the range object and its references to start, stop, and step, and the integer objects they point to add at most a few dozen bytes each, however long the range is. The memory efficiency argument for range() is even stronger than the shallow-size comparison suggests.

The implementation detail that surprises most developers: range() is not just an iterator. It is a full sequence type. You can index into it directly (range(10)[7] == 7), slice it (range(0, 100, 2)[3:6] returns a new range object), get its length in constant time (len(range(100)) == 100), and test membership in O(1). That last property has a behaviour almost nobody learns until an interview forces it. Membership testing for an integer applies an arithmetic formula: does the value fall within the bounds ([start, stop) for a positive step, mirrored for a negative one), and does (value - start) % step == 0? A handful of constant-time calculations. It never iterates through the range to check. One caveat: in CPython the shortcut applies to int values, and other types such as floats fall back to a linear scan. A list scan is O(n) and gets slower as the list grows; an integer membership check on a range takes the same time whether the range has 10 elements or 10 billion.
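The arithmetic behind the membership test can be written out in a few lines. This is an illustrative re-implementation under the assumption of plain integer inputs, not CPython's actual source; `in_range` is a hypothetical name.

```python
def in_range(value: int, start: int, stop: int, step: int) -> bool:
    """Illustrative version of `value in range(start, stop, step)`.

    Hypothetical helper for explanation only; it handles plain integers.
    """
    if step == 0:
        raise ValueError("step must not be zero")
    # Bounds check: [start, stop) for a positive step, (stop, start] for a negative one.
    if step > 0:
        within_bounds = start <= value < stop
    else:
        within_bounds = stop < value <= start
    # Alignment check: value must sit a whole number of steps from start.
    return within_bounds and (value - start) % step == 0


# Compare the sketch against the built-in for a few ranges.
cases = [(0, 100, 2), (10, 0, -3), (5, 5, 1), (-7, 20, 4)]
for start, stop, step in cases:
    for value in range(-30, 130):
        assert in_range(value, start, stop, step) == (value in range(start, stop, step))

print(in_range(9_999_999_998, 0, 10_000_000_000, 2))
print(9_999_999_998 in range(0, 10_000_000_000, 2))
```

Both final checks run over a range describing five billion even numbers and return immediately, because neither walks the sequence.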

Think of range() as a bookmark rule, not a bookshelf. 'Start at page 10, read every third page, stop before page 40' — you do not photocopy those pages in advance. You follow the rule as you go. range() is the rule. The loop is you following it.

io/thecodeforge/python/range_memory_demo.pyPYTHON
# io.thecodeforge — Python tutorial
# Verified on CPython 3.12

import sys
import tracemalloc
import time

# ── Shallow vs true memory comparison ─────────────────────────────────────────
# sys.getsizeof() returns the SHALLOW size — the container and its pointer array.
# It does not include the memory consumed by the integer objects themselves.
# tracemalloc captures the true peak allocation including all object creation.

print("=== Shallow size (sys.getsizeof) ===")
million_range = range(1_000_000)
print(f"range(1_000_000)       : {sys.getsizeof(million_range):>12,} bytes (3 integers — start, stop, step)")

# Measure true allocation of list(range(1_000_000)) using tracemalloc
tracemalloc.start()
million_list = list(range(1_000_000))
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

shallow_bytes = sys.getsizeof(million_list)
print(f"list shallow size      : {shallow_bytes:>12,} bytes (pointer array only)")
print(f"list true peak alloc   : {peak_bytes:>12,} bytes (includes all integer objects)")
print(f"True memory ratio      : {peak_bytes // sys.getsizeof(million_range):,}x")
# True ratio is roughly 700,000x — far more than the shallow 166,667x suggests
# because integers above 256 are heap-allocated objects, not just pointer slots

# ── range() is a full sequence type ────────────────────────────────────────────
print("\n=== range() sequence capabilities ===")
page_offsets = range(0, 10_000, 250)  # database pagination rule: 40 pages of 250

print(f"First page offset  : {page_offsets[0]}")
print(f"Second page offset : {page_offsets[1]}")
print(f"Last page offset   : {page_offsets[-1]}")
print(f"Total pages        : {len(page_offsets)}")
print(f"Slice [1:4]        : {list(page_offsets[1:4])}")

# ── O(1) membership testing — equal-size comparison ─────────────────────────
# Both collections cover the same 5,000 even numbers.
# The algorithmic difference is structural, not scale-dependent.
print("\n=== Membership testing: range() vs list() — equal size ===")
SIZE = 5_000
big_range = range(0, SIZE * 2, 2)           # even numbers 0 to 9998 — 5,000 elements
big_list  = list(range(0, SIZE * 2, 2))     # same 5,000 even numbers as a list

target_present = SIZE * 2 - 2               # last element — worst case for list scan
target_absent  = SIZE * 2 - 1              # odd number — not in either collection

# range(): O(1) — arithmetic formula regardless of where the target sits
trials = 100_000
start = time.perf_counter_ns()
for _ in range(trials):
    _ = target_present in big_range
range_ns = (time.perf_counter_ns() - start) // trials

# list(): O(n) — scans from index 0, worst case visits all 5,000 elements
start = time.perf_counter_ns()
for _ in range(trials):
    _ = target_present in big_list
list_ns = (time.perf_counter_ns() - start) // trials

print(f"range membership (5K even nums, worst-case target): {range_ns:,} ns per check")
print(f"list membership  (5K even nums, worst-case target): {list_ns:,} ns per check")
print(f"Speed ratio at 5K elements: ~{list_ns // max(range_ns, 1)}x")
# 50M elements is 10,000x the 5K list, so scale the measured list time (not the range time)
print(f"At 50M elements, range stays ~{range_ns:,} ns. List would take ~{list_ns * 10_000:,} ns.")
print(f"range() is O(1) — list() is O(n). The gap widens linearly with collection size.")

# ── Never convert range() unless you need list-specific mutability ────────────
print("\n=== Valid index/membership operations without list conversion ===")
work_range = range(0, 50, 5)   # 0, 5, 10, 15, 20, 25, 30, 35, 40, 45

print(f"Index access   : work_range[3]  = {work_range[3]}")
print(f"Negative index : work_range[-1] = {work_range[-1]}")
print(f"Membership     : 25 in work_range = {25 in work_range}")
print(f"Membership     : 27 in work_range = {27 in work_range}")
print(f"Length         : len(work_range)  = {len(work_range)}")
Output
=== Shallow size (sys.getsizeof) ===
range(1_000_000) : 48 bytes (3 integers — start, stop, step)
list shallow size : 8,000,056 bytes (pointer array only)
list true peak alloc : 35,048,312 bytes (includes all integer objects)
True memory ratio : 730,173x
=== range() sequence capabilities ===
First page offset : 0
Second page offset : 250
Last page offset : 9750
Total pages : 40
Slice [1:4] : [250, 500, 750]
=== Membership testing: range() vs list() — equal size ===
range membership (5K even nums, worst-case target): 168 ns per check
list membership (5K even nums, worst-case target): 38,400 ns per check
Speed ratio at 5K elements: ~228x
At 50M elements, range stays ~168 ns. List would take ~384,000,000 ns.
range() is O(1) — list() is O(n). The gap widens linearly with collection size.
=== Valid index/membership operations without list conversion ===
Index access : work_range[3] = 15
Negative index : work_range[-1] = 45
Membership : 25 in work_range = True
Membership : 27 in work_range = False
Length : len(work_range) = 10
Production Trap: Wrapping range() in list() Defeats the Entire Point
I have seen list(range(n)) in production code with the comment 'to make it easier to debug.' That comment just cost the worker far more than the 8MB that sys.getsizeof() suggests — the true allocation on CPython 3.12 for list(range(1_000_000)) is closer to 35MB once the integer objects themselves are counted, and for list(range(50_000_000)) it approaches 1.7GB. On a batch processor sized for 512MB, that single list() wrapper is enough to OOM-kill the worker before a single iteration runs. range() is already indexable, sliceable, and membership-testable without conversion. Keep it lazy. The only legitimate reason to call list(range(n)) is when you genuinely need list-specific mutability — appending, popping, or inserting — which counting loops essentially never require.
Production Insight
sys.getsizeof() reports shallow size — the pointer array that holds references to integer objects, not the objects themselves.
For list(range(1_000_000)) on CPython 3.12: sys.getsizeof() returns ~8MB. tracemalloc reports a true peak of ~35MB because CPython caches only integers from -5 to 256 — every integer from 257 to 999,999 is a separately heap-allocated 28-byte object. Scale to 50 million and the true allocation approaches 1.7GB, enough to OOM-kill a worker sized for routine processing load.
range(1_000_000) stays at 48 bytes. Always. It holds references to just three integers (start, stop, and step), and even when one of them falls outside CPython's small-integer cache, as the stop value 1_000_000 does, it adds only a few dozen bytes.
I have watched list(range(n)) take down workers at two different companies. Both times it was added during debugging to inspect values and never removed before the merge. Make it a code review rule: list(range(n)) in production code requires a comment explaining why list-specific mutability is needed.
Key Takeaway
range() is a lazy rule stored as three integers — 48 bytes for any size, from range(5) to range(10**18).
It is a full sequence type: indexable, sliceable, len()-able, and O(1) membership-testable via arithmetic formula. You get all of these without converting to a list.
sys.getsizeof() understates list() memory cost by more than 4x here (about 8MB reported against roughly 35MB allocated) because it measures the pointer array, not the integer objects. Use tracemalloc to measure true allocation. The actual memory efficiency of range() versus list() is closer to 700,000x for one million integers, not the 166,667x that shallow size suggests.
range() vs list(range()) — When to Convert
If: Need to iterate over numbers sequentially in a loop
Use: range() directly. It is lazy, uses constant memory, and supports indexing, slicing, and O(1) membership without conversion
If: Need to modify the sequence during iteration (append, pop, insert, swap elements)
Use: list(range(...)). range objects are immutable and cannot be mutated in place
If: Need to pass numbers to a function that requires a list specifically
Use: range() where you can. First check whether the function accepts any sequence or iterable; range() qualifies for both and supports len(), indexing, and slicing. Only convert if the function explicitly requires list type or calls list-specific mutation methods.
If: Need to inspect all values at once during debugging
Use: list(range(...)) for small debug ranges only, never on production-scale data. To inspect the first and last few values, slice the range before converting: list(range(n)[:5]) and list(range(n)[-5:]). Writing list(range(n))[:5] would build the full list first and defeat the purpose.
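For the debugging case in the last row, one way to look at the ends of a large range without materialising it is sketched below. `peek` is a hypothetical helper name, and the pagination range is made up for illustration.

```python
def peek(numbers: range, count: int = 5) -> tuple[list[int], list[int]]:
    """Hypothetical debug helper: first and last `count` values of a range.

    Slicing a range returns another range, so only the values that are
    returned ever get materialised.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    head = numbers[:count]    # still a range object
    tail = numbers[-count:]   # still a range object
    return list(head), list(tail)


offsets = range(0, 50_000_000, 250)
first, last = peek(offsets, 3)
print(first)
print(last)
print(type(offsets[:3]).__name__)
```

The order of operations is the whole point: slice first, convert second.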

The Three-Argument Syntax: Start, Stop, Step Without Guessing

Every range() confusion in production code traces back to one of two things: forgetting that stop is exclusive, or not knowing that step exists. Here is the complete syntax, once and for all: range(start, stop, step). Start is where counting begins — inclusive. Stop is where counting ends — but the stop value itself is never produced. Step is how much to add on each iteration.

When you write range(5), Python treats it as range(0, 5, 1) — start defaults to 0, step defaults to 1. That is why range(5) gives you 0, 1, 2, 3, 4 — five values, none of them 5. This is not arbitrary. It means range(len(my_list)) always gives you exactly the valid indices for that list — no arithmetic needed, no off-by-one to introduce. By design.

The step argument is where range() earns its keep beyond toy loops. Batch processing every Nth record, building retry delays at a fixed interval, generating database page offsets, checking every even-numbered slot in a buffer — these all need step. For counting backwards, a negative step is all you need. There is no reversed() call required, no subtraction gymnastics. Just range(start, stop, -1) where start is numerically greater than stop. Stop is still exclusive in the negative direction — range(10, 0, -1) gives you 10 down to 1, because 0 is the stop and it is never included.

When you want to reverse a list and only need the values without indices, use reversed(my_list). It is cleaner, requires no start/stop arithmetic, and works on any object that implements __reversed__() or the sequence protocol (__len__() plus __getitem__()). Reserve range() with a negative step for situations where you genuinely need the decreasing index value: progress counters, countdown displays, decreasing batch offsets.
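The two approaches can be put side by side. A short sketch with made-up sample data follows; it also shows that reversed() accepts a range, which avoids the start/stop arithmetic when you do need indices.

```python
leaderboard = ["Alice", "Bob", "Charlie", "Diana", "Eve"]

# Values only: reversed() walks the list backwards without copying it.
values_backwards = list(reversed(leaderboard))

# Index needed: a negative-step range produces len - 1 down to 0.
indices_backwards = list(range(len(leaderboard) - 1, -1, -1))

# Equivalent spelling that avoids the start/stop arithmetic entirely.
indices_via_reversed = list(reversed(range(len(leaderboard))))

print(values_backwards)
print(indices_backwards)
print(indices_backwards == indices_via_reversed)

for index in reversed(range(len(leaderboard))):
    print(f"slot {index}: {leaderboard[index]}")
```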

The one constraint worth memorising: step cannot be zero. range(0, 10, 0) raises ValueError: range() arg 3 must not be zero. A zero step would produce an infinite sequence of the start value — advancing by nothing means never terminating. Python raises an error here rather than silently returning an empty range, because the two cases have completely different meanings to the program. An empty range from an unsatisfiable start/stop condition is well-defined and may be intentional. A zero step almost always means a wrong variable was passed to the step argument — failing loudly prevents that from silently masking the bug.
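The difference between the loud failure and the silent empty range is worth seeing once. In this sketch `describe` is a hypothetical helper that only reports what a given range() call does.

```python
def describe(start: int, stop: int, step: int) -> str:
    """Hypothetical helper: report how many values a range() call yields."""
    try:
        numbers = range(start, stop, step)
    except ValueError as error:
        # range() raises ValueError only when step is zero.
        return f"ValueError: {error}"
    return f"{len(numbers)} values"


print(describe(0, 10, 0))    # zero step: loud failure
print(describe(10, 5, 1))    # direction mismatch: silently empty
print(describe(0, 10, -1))   # direction mismatch: silently empty
print(describe(10, 0, -1))   # matching direction: full countdown
```

Only the zero step raises. A start and stop that disagree with the sign of the step produce an empty range and a loop body that never runs, which is why a length check before a critical loop can be worthwhile.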

io/thecodeforge/python/retry_backoff_scheduler.pyPYTHON
# io.thecodeforge — Python tutorial
# Verified on CPython 3.12

# ── Scenario: A payment processor retry scheduler ─────────────────────────────
# Retry a failed charge at increasing intervals.
# Retry delays (seconds): 5, 10, 15, 20, 25 (linear backoff, max 5 retries)

MAX_RETRIES = 5
BASE_DELAY_SECONDS = 5

print("=== Linear Backoff Retry Schedule ===")
for attempt_number in range(1, MAX_RETRIES + 1):  # range(1, 6) → 1,2,3,4,5
    # 1-based attempt numbering for human-readable logging
    # stop is MAX_RETRIES + 1 because stop is exclusive — without the +1, attempt 5 is never reached
    delay = attempt_number * BASE_DELAY_SECONDS
    print(f"Attempt {attempt_number}: retry after {delay}s")

# ── Scenario: Counting backwards — leaderboard countdown ─────────────────────
print("\n=== Leaderboard Countdown (10 down to 1) ===")
for rank in range(10, 0, -1):  # start=10, stop=0 (exclusive), step=-1
    # stop=0 ensures rank 1 is included — stop is exclusive in both directions
    # range(10, 1, -1) would miss rank #1 — the same off-by-one in reverse
    print(f"  Rank #{rank}")

# ── Scenario: Reversing a list — when reversed() is cleaner than range ────────
print("\n=== reversed() for list reversal — no index arithmetic needed ===")
leaderboard = ["Alice", "Bob", "Charlie", "Diana", "Eve"]
for player in reversed(leaderboard):  # cleanest when you only need values, not indices
    print(f"  {player}")
# Use range(len(x)-1, -1, -1) only when you need the actual decreasing index value

# ── Scenario: Batch database writes — 100 records per batch ──────────────────
TOTAL_RECORDS = 450
BATCH_SIZE = 100

print("\n=== Batch Write Offsets ===")
for batch_start in range(0, TOTAL_RECORDS, BATCH_SIZE):  # 0, 100, 200, 300, 400
    # min() is critical here — without it, the final batch would request
    # indices 400 through 499, but only 400-449 exist
    batch_end = min(batch_start + BATCH_SIZE, TOTAL_RECORDS)
    record_count = batch_end - batch_start
    print(f"  Writing records [{batch_start}:{batch_end}] — {record_count} records")

# ── Scenario: Even-numbered port scanning for load balancer health checks ─────
PORT_START = 8080
PORT_END   = 8100
PORT_STEP  = 2

print("\n=== Even Port Range ===")
even_ports = list(range(PORT_START, PORT_END, PORT_STEP))  # small range — list is fine here
print(f"Ports to check: {even_ports}")

# ── Stop is always exclusive — demonstrating the rule across all forms ─────────
print("\n=== Stop Is Exclusive — Always ===")
print(f"range(5)         → {list(range(5))}")          # shorthand for range(0, 5, 1)
print(f"range(0, 5)      → {list(range(0, 5))}")       # 5 never appears
print(f"range(1, 6)      → {list(range(1, 6))}")       # when you need 1-5 inclusive
print(f"range(5, 0, -1)  → {list(range(5, 0, -1))}")   # 0 never appears
print(f"range(10, 10)    → {list(range(10, 10))}")     # empty when start == stop
print(f"range(10, 5)     → {list(range(10, 5))}")      # empty when start > stop with positive step
Output
=== Linear Backoff Retry Schedule ===
Attempt 1: retry after 5s
Attempt 2: retry after 10s
Attempt 3: retry after 15s
Attempt 4: retry after 20s
Attempt 5: retry after 25s
=== Leaderboard Countdown (10 down to 1) ===
Rank #10
Rank #9
Rank #8
Rank #7
Rank #6
Rank #5
Rank #4
Rank #3
Rank #2
Rank #1
=== reversed() for list reversal — no index arithmetic needed ===
Eve
Diana
Charlie
Bob
Alice
=== Batch Write Offsets ===
Writing records [0:100] — 100 records
Writing records [100:200] — 100 records
Writing records [200:300] — 100 records
Writing records [300:400] — 100 records
Writing records [400:450] — 50 records
=== Even Port Range ===
Ports to check: [8080, 8082, 8084, 8086, 8088, 8090, 8092, 8094, 8096, 8098]
=== Stop Is Exclusive — Always ===
range(5) → [0, 1, 2, 3, 4]
range(0, 5) → [0, 1, 2, 3, 4]
range(1, 6) → [1, 2, 3, 4, 5]
range(5, 0, -1) → [5, 4, 3, 2, 1]
range(10, 10) → []
range(10, 5) → []
Senior Shortcut: range(len(x)) for Reading Values Is a Code Smell
If you are writing for i in range(len(my_list)): val = my_list[i], stop. Every iteration produces an index and then performs a separate list lookup, for zero benefit over for val in my_list. Worse, you have introduced off-by-one surface area at the range() boundary. Use enumerate(my_list) when you need both the index and the value. Reserve range(len(x)) for the one legitimate use case: when you need to write back to the list by index (swapping elements, zeroing out values, modifying in place). If you are reading, not writing, this pattern is flagged by Pylint (consider-using-enumerate) for real reasons, and any competent code reviewer will ask why the index is needed.
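The three access patterns look like this in practice. The price list is made-up sample data, and the conversion to cents is only there to give the in-place write something to do.

```python
prices = [9.99, 49.99, 4.50]

# Reading only: iterate the values directly, no index involved.
for price in prices:
    print(price)

# Reading with a position: enumerate() supplies the index.
labelled = [f"{position}: {price}" for position, price in enumerate(prices)]

# Writing back by index: the case where range(len(...)) earns its place.
cents = prices.copy()
for i in range(len(cents)):
    cents[i] = round(cents[i] * 100)

print(labelled)
print(cents)
```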
Production Insight
Batch pagination with range(0, total, batch_size) is the standard pattern for database cursor iteration and API pagination in Python.
The mistake that consistently breaks the final batch: forgetting min(batch_start + batch_size, total). Without it, the last batch attempts to read indices that do not exist — an IndexError on the final iteration, or in cases where the database query handles it gracefully, a silent return of zero rows when there should be a partial batch. Total record counts almost never divide evenly by batch size in real data. Always clamp the upper bound with min().
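A reusable form of the clamped pagination loop might look like the sketch below. `iter_batches` is a hypothetical helper, and the list of integers stands in for database rows.

```python
from collections.abc import Iterator, Sequence


def iter_batches(records: Sequence, batch_size: int) -> Iterator[tuple[int, int, Sequence]]:
    """Hypothetical helper: yield (start, end, batch) with a clamped end."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = len(records)
    for start in range(0, total, batch_size):
        end = min(start + batch_size, total)  # clamp the final partial batch
        yield start, end, records[start:end]


records = list(range(450))  # stand-in for 450 database rows
sizes = [end - start for start, end, _ in iter_batches(records, 100)]
print(sizes)
print(sum(sizes) == len(records))
```

Note that Python slices clamp silently, so an unclamped end on an in-memory list shows up as a wrong reported batch size, not as an exception. The explicit min() keeps the logged bounds honest.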
Key Takeaway
range(start, stop, step) — stop is ALWAYS exclusive, never included. Every form, every direction, no exceptions. This is the design decision that makes range(len(x)) always correct for list indexing without any manual arithmetic.
range(n) is shorthand for range(0, n, 1) — start defaults to 0, step defaults to 1.
For reversing a list by value: use reversed(my_list). For a countdown from N to 1: use range(N, 0, -1). For decreasing list indices: use range(len(x) - 1, -1, -1). Negative step requires start > stop or the range is silently empty.
Choosing range() Arguments for Common Scenarios
If: Need to repeat something exactly N times with no meaningful index
Use: range(N), shorthand for range(0, N, 1), clean and idiomatic. Use _ as the loop variable to signal the index is intentionally unused.
If: Need 1-based counting for human-readable output or logging
Use: range(1, N+1) for pure counting, or enumerate(collection, start=1) when iterating a sequence. The enumerate form is less prone to arithmetic errors.
If: Need every Kth value, a fixed step interval, or batch offsets
Use: range(start, stop, step), where step is the gap between values. Remember to clamp the final batch end with min() when the total does not divide evenly.
If: Need to count backwards from N down to 1 and need the index value
Use: range(N, 0, -1). Start must be greater than stop for negative step; stop=0 means 1 is the last value produced. Stop is exclusive in both directions.
If: Need to iterate a list in reverse and only need the values
Use: reversed(my_list). It is cleaner than a negative-step range, needs no start/stop arithmetic, and works on any sequence. Save range() with negative step for when you need the actual decreasing index.
If: Need database page offsets for a cursor-based pagination loop
Use: range(0, total_records, page_size). Each value is a batch start offset; clamp the upper end with min(offset + page_size, total_records)

Off-By-One Errors: The Exact Bug Pattern That Corrupts Production Data

Off-by-one errors with range() are insidious because the code runs — no exception, no crash, no obvious failure. You process one record too few or too many, the job reports success, and the corruption accumulates silently until someone notices a discrepancy or a customer surfaces the problem. The fintech incident that opened this guide was exactly this: range(1, record_count) where range(0, record_count) was correct. Index 0 never processed. Six weeks of silent nightly corruption.

There are exactly three failure modes worth memorising, because they cover the vast majority of off-by-one bugs in production Python code. First: range(1, n) when you mean range(0, n) — skips the first item, by far the most common pattern. Second: range(0, n-1) when you mean range(0, n) — skips the last item because stop is already exclusive, and subtracting 1 from it silently drops the final valid index n-1, so the loop visits indices 0 through n-2 and never touches the last element. Third: range(0, n+1) when you mean range(0, n) — processes one index past the end of the collection, causing an IndexError on the final iteration or, in dynamically-sized cases, quietly processing a sentinel or default value as real data.

The first two produce wrong output with no exception, and the third does too whenever the extra index resolves to a value instead of raising. That is what makes them production-dangerous rather than merely annoying. They pass unit tests written against small test fixtures where the missing record is not checked. They pass integration tests that verify aggregate values rather than record counts. They run in production for days or weeks before the data discrepancy grows large enough to be noticed.
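The third failure mode behaves differently depending on how the data is accessed. The sketch below uses made-up readings to show both outcomes: a loud IndexError on a plain list, and a silent extra iteration when the lookup supplies a default.

```python
readings = [12, 15, 11]
n = len(readings)

# On a plain list the overshoot is loud: index n does not exist.
try:
    for i in range(0, n + 1):
        readings[i]
except IndexError as error:
    print(f"IndexError at i={i}: {error}")

# With a lookup that supplies a default, the same overshoot is silent.
by_position = dict(enumerate(readings))
total_with_overshoot = sum(by_position.get(i, 0) for i in range(0, n + 1))
count_with_overshoot = len(range(0, n + 1))

print(total_with_overshoot)       # the default of 0 hides the extra index
print(count_with_overshoot, n)    # but the iteration count gives it away
```

In the silent case the total is still correct by luck, because the default happens to be zero. The iteration count is the reliable signal.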

The rules that prevent all three: when iterating a list or array by index, always use range(len(collection)) — no manual arithmetic on start or stop. When you need a counted loop, use range(N). When you need both index and value, use enumerate(collection) — this eliminates range() from the equation entirely and makes off-by-one structurally impossible because enumerate() always generates the correct index for each item automatically.

io/thecodeforge/python/off_by_one_audit.pyPYTHON
# io.thecodeforge — Python tutorial
# Verified on CPython 3.12

# ── Real scenario: processing invoice line items for billing ──────────────────
invoice_items = [
    {"sku": "WIDGET-A",    "qty": 3, "unit_price": 9.99},
    {"sku": "GADGET-B",    "qty": 1, "unit_price": 49.99},
    {"sku": "DOOHICKEY-C", "qty": 5, "unit_price": 4.50},
]

item_count = len(invoice_items)  # 3

# ── BUG 1: range(1, n) — skips index 0, first item silently missing ───────────
print("=== BUG 1: range(1, item_count) — skips first item ===")
bug1_total = 0.0
for i in range(1, item_count):  # produces 1, 2 — index 0 (WIDGET-A) never visited
    item = invoice_items[i]
    subtotal = item["qty"] * item["unit_price"]
    bug1_total += subtotal
    print(f"  Processed: {item['sku']} — ${subtotal:.2f}")
print(f"  Total billed: ${bug1_total:.2f}  ← WIDGET-A never billed (silent revenue loss)")

# ── BUG 2: range(0, n-1) — skips index n-1, last item silently missing ────────
# Stop is already exclusive. Subtracting 1 makes the loop visit 0 through n-2,
# skipping the final valid index n-1 (DOOHICKEY-C at index 2).
print("\n=== BUG 2: range(0, item_count - 1) — skips last item ===")
bug2_total = 0.0
for i in range(0, item_count - 1):  # produces 0, 1 — index 2 (DOOHICKEY-C) never visited
    item = invoice_items[i]
    subtotal = item["qty"] * item["unit_price"]
    bug2_total += subtotal
    print(f"  Processed: {item['sku']} — ${subtotal:.2f}")
print(f"  Total billed: ${bug2_total:.2f}  ← DOOHICKEY-C never billed (more silent revenue loss)")

# ── CORRECT: range(len(collection)) — covers every valid index ────────────────
print("\n=== CORRECT: range(len(invoice_items)) — all items processed ===")
correct_total = 0.0
for i in range(len(invoice_items)):  # produces 0, 1, 2 — every valid index visited
    item = invoice_items[i]
    subtotal = item["qty"] * item["unit_price"]
    correct_total += subtotal
    print(f"  Processed: {item['sku']} — ${subtotal:.2f}")
print(f"  Total billed: ${correct_total:.2f}  ← Correct")

# ── BETTER: enumerate() when you need index AND value ─────────────────────────
# enumerate() eliminates range() entirely — off-by-one is structurally impossible
# because enumerate() always produces the correct index for each item.
print("\n=== BETTER: enumerate() — index and value together, no range() needed ===")
enum_total = 0.0
for line_number, item in enumerate(invoice_items, start=1):  # 1-based line numbers
    subtotal = item["qty"] * item["unit_price"]
    enum_total += subtotal
    print(f"  Line {line_number}: {item['sku']} — ${subtotal:.2f}")
print(f"  Total billed: ${enum_total:.2f}  ← Correct, and off-by-one structurally impossible")

# ── Post-loop assertion: catch off-by-one before declaring success ─────────────
# This single line catches all three failure modes — range(1,n), range(0,n-1),
# range(0,n+1) — before partial results are written anywhere downstream.
print("\n=== Post-loop count assertion (add this to every batch processor) ===")
expected_count = len(invoice_items)
processed_count = 0
for item in invoice_items:
    processed_count += 1
assert processed_count == expected_count, (
    f"Expected {expected_count} records, processed {processed_count} — "
    f"possible off-by-one in range() call"
)
print(f"  Assertion passed: {processed_count}/{expected_count} records confirmed")
Output
=== BUG 1: range(1, item_count) — skips first item ===
Processed: GADGET-B — $49.99
Processed: DOOHICKEY-C — $22.50
Total billed: $72.49 ← WIDGET-A never billed (silent revenue loss)
=== BUG 2: range(0, item_count - 1) — skips last item ===
Processed: WIDGET-A — $29.97
Processed: GADGET-B — $49.99
Total billed: $79.96 ← DOOHICKEY-C never billed (more silent revenue loss)
=== CORRECT: range(len(invoice_items)) — all items processed ===
Processed: WIDGET-A — $29.97
Processed: GADGET-B — $49.99
Processed: DOOHICKEY-C — $22.50
Total billed: $102.46 ← Correct
=== BETTER: enumerate() — index and value together, no range() needed ===
Line 1: WIDGET-A — $29.97
Line 2: GADGET-B — $49.99
Line 3: DOOHICKEY-C — $22.50
Total billed: $102.46 ← Correct, and off-by-one structurally impossible
=== Post-loop count assertion (add this to every batch processor) ===
Assertion passed: 3/3 records confirmed
The Classic Production Bug: range(1, n) on a Zero-Indexed Collection
This is the most common silent data bug I encounter in Python code review, across codebases of every size and experience level. No exception is raised. The loop runs. It just silently skips index 0 on every single run. If you are processing invoice lines, user records, log entries, transaction rows, or any zero-indexed collection and the first item is mysteriously absent from the output, range(1, ...) is the first thing to check. The fix is one character: change range(1, len(collection)) to range(0, len(collection)) or simply range(len(collection)). Then add a post-loop assertion so this class of bug cannot silently accumulate again.
Production Insight
Off-by-one errors with range() produce silent data corruption — the loop runs without error, the job reports success, and the missing records accumulate until someone notices a discrepancy or a customer surfaces the problem.
The post-loop count assertion is the single most cost-effective defensive measure for any batch processor: assert processed_count == len(source_data). It catches all three failure modes — range(1,n), range(0,n-1), range(0,n+1) — with one line of code, and it costs nothing at runtime relative to the loop itself. If the assertion fires, the job fails loudly before writing partial results anywhere downstream. That is exactly what you want.
Key Takeaway
Three off-by-one failure modes: range(1,n) skips the first item, range(0,n-1) skips the last item because stop is already exclusive and subtracting 1 drops the final valid index, range(0,n+1) overshoots. All three produce silent data corruption — the code runs without error but processes the wrong set of records.
The prevention rule: use range(len(collection)) for full index coverage, enumerate() for index+value access, and never manually add or subtract from the stop value.
The detection rule: add a post-loop count assertion to every batch processor. Assert that processed_count equals the expected record count before declaring success. One line prevents weeks of silent wrong data.
Off-By-One Prevention Rules
If: Iterating all indices of a collection
Use: range(len(collection)), with no manual arithmetic on stop; add a post-loop count assertion
If: Need both index and value from a collection
Use: enumerate(collection), which eliminates range() entirely and makes off-by-one structurally impossible
If: Need 1-based display numbering for output or error messages
Use: enumerate(collection, start=1), which is cleaner than range(1, len(x)+1) and still correct regardless of collection length
If: Need a pure counted loop with no collection involved
Use: range(n) for zero-based count, range(1, n+1) for 1-based display; remember stop is exclusive, so include the +1 when you need the upper bound included

range() vs enumerate() vs zip(): Picking the Right Tool Every Time

range() is not always the right tool for looping — and using it when you should not is a reliable tell that someone learned Python through C or Java and is carrying index-based loop habits into a language that does not need them. Here is the decision framework that should be hardwired.

Use range(n) when you need a bare count: run this loop exactly n times, generate n evenly-spaced values, or you genuinely only need the index with no corresponding collection value. The clearest signal that range(n) is right: there is no collection being indexed — just a number of iterations.

Use range(len(collection)) only when you need to modify the list in-place by index — inserting at a specific position, swapping elements, zeroing out values, or when you need to access two lists simultaneously at the same index and zip() is not appropriate. This is the one use case where you genuinely need the raw index and range(len()) is the right tool.

Use a direct for item in collection loop when you only need the values — no index, no parallel list. This is the cleanest form and requires the least mental overhead when reading the code six months later.

Use enumerate(collection) when you need both the position and the value simultaneously — building numbered output, tracking which item failed in error logs, reporting progress through a large dataset. enumerate() gives you both at the same time without writing collection[i].

Use zip(list_a, list_b) when you need to walk two sequences in lockstep — pairing input records with expected outputs, merging two data streams by position, comparing before and after values side by side. One critical property of zip() that catches engineers out: it stops silently at the shorter sequence. If list_a has 10 elements and list_b has 9, zip() produces 9 pairs and the 10th element of list_a is silently ignored — no warning, no exception. When your two sequences might differ in length and you need to process all elements from both, use itertools.zip_longest(), which fills missing values with a fillvalue you specify.

These are not stylistic preferences; they are correctness decisions. range(len(x)) where a direct loop or enumerate() would do is flagged by Pylint (consider-using-enumerate, C0200) for real reasons: it is harder to read, it introduces off-by-one surface area, and it signals to future maintainers that the index must be needed for something, which then requires them to trace through the loop body to discover it is only used to access the collection value.

io/thecodeforge/python/loop_tool_selector.py · PYTHON
# io.thecodeforge — Python tutorial
# Verified on CPython 3.12

import itertools

order_ids       = ["ORD-001", "ORD-002", "ORD-003", "ORD-004"]
order_statuses  = ["shipped", "pending", "cancelled", "shipped"]
priority_scores = [72, 45, 91, 38]

# ── range(n): pure counted loop — no collection involved ─────────────────────
print("=== range(n): run exactly N times ===")
for reminder_number in range(3):  # 0, 1, 2 — the count is all that matters here
    print(f"  Sending reminder #{reminder_number + 1}")
# Use _ instead of reminder_number when the value is genuinely unused:
# for _ in range(3): send_reminder()

# ── Direct iteration: reading values only — cleanest form ─────────────────────
print("\n=== Direct for-in: cleanest when index is irrelevant ===")
for order_id in order_ids:  # no range(), no index, no collection[i]
    print(f"  Dispatching notification for {order_id}")

# ── enumerate(): need position AND value — building an audit trail ─────────────
print("\n=== enumerate(): index + value together ===")
failed_positions = []
for position, order_id in enumerate(order_ids, start=1):
    # position tells us exactly where in the batch this failed — essential for error logs
    if order_id == "ORD-003":
        failed_positions.append(position)
        print(f"  Position {position}: {order_id} — FAILED")
    else:
        print(f"  Position {position}: {order_id} — OK")
print(f"  Failed at positions: {failed_positions}")

# ── zip(): two sequences in lockstep — and the silent truncation trap ──────────
print("\n=== zip(): walking two lists together ===")
for order_id, status in zip(order_ids, order_statuses):
    print(f"  {order_id} → {status}")

# ── zip() silent truncation — the bug zip() hides when lengths differ ─────────
print("\n=== zip() silent truncation — shorter list silently wins ===")
four_orders   = ["ORD-001", "ORD-002", "ORD-003", "ORD-004"]
three_statuses = ["shipped", "pending", "cancelled"]  # one fewer than orders

print("  With zip() — ORD-004 is silently dropped, no error raised:")
for order_id, status in zip(four_orders, three_statuses):
    print(f"    {order_id} → {status}")

print("  With zip_longest() — all orders processed, missing status filled:")
for order_id, status in itertools.zip_longest(four_orders, three_statuses, fillvalue="UNKNOWN"):
    print(f"    {order_id} → {status}")

# ── range(len(x)): legitimate use — in-place modification by index ─────────────
print("\n=== range(len(x)): in-place update — the one valid use case ===")
print(f"  Before: {priority_scores}")
for i in range(len(priority_scores)):   # need the index to WRITE BACK to the list
    if priority_scores[i] < 50:
        priority_scores[i] = 0          # zero out low-priority scores in place
print(f"  After:  {priority_scores}")
# This is the one case range(len(x)) is genuinely justified — you need the index to mutate

# ── O(1) membership testing — the interview question most candidates miss ──────
print("\n=== O(1) membership testing ===")
batch_offsets = range(0, 10_000_000, 100)   # valid page start positions
print(f"  Is 5000 a valid offset?    {5000 in batch_offsets}")
print(f"  Is 5001 a valid offset?    {5001 in batch_offsets}")
print(f"  Is 9999900 a valid offset? {9999900 in batch_offsets}")
# Each check is O(1) regardless of range size — no iteration, no scanning
Output
=== range(n): run exactly N times ===
Sending reminder #1
Sending reminder #2
Sending reminder #3
=== Direct for-in: cleanest when index is irrelevant ===
Dispatching notification for ORD-001
Dispatching notification for ORD-002
Dispatching notification for ORD-003
Dispatching notification for ORD-004
=== enumerate(): index + value together ===
Position 1: ORD-001 — OK
Position 2: ORD-002 — OK
Position 3: ORD-003 — FAILED
Position 4: ORD-004 — OK
Failed at positions: [3]
=== zip(): walking two lists together ===
ORD-001 → shipped
ORD-002 → pending
ORD-003 → cancelled
ORD-004 → shipped
=== zip() silent truncation — shorter list silently wins ===
With zip() — ORD-004 is silently dropped, no error raised:
ORD-001 → shipped
ORD-002 → pending
ORD-003 → cancelled
With zip_longest() — all orders processed, missing status filled:
ORD-001 → shipped
ORD-002 → pending
ORD-003 → cancelled
ORD-004 → UNKNOWN
=== range(len(x)): in-place update — the one valid use case ===
Before: [72, 45, 91, 38]
After: [72, 0, 91, 0]
=== O(1) membership testing ===
Is 5000 a valid offset? True
Is 5001 a valid offset? False
Is 9999900 a valid offset? True
Interview Gold: Why range() Membership Testing Is O(1)
Most candidates who know range() do not know this, and it comes up in algorithm design interviews at companies that care about complexity analysis. Unlike 'x in my_list' — which scans every element from the beginning, making it O(n) — 'x in range(n)' applies three arithmetic checks in constant time: (1) is x an integer type, (2) does (x - start) % step equal zero, meaning x falls exactly on a step boundary, and (3) does x fall within the [start, stop) bounds? Three operations, constant time, regardless of range size. range(1_000_000_000) membership testing takes the same nanoseconds as range(5). Knowing this is the difference between using range() as a loop counter and actually understanding it as a sequence type.
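The arithmetic described above can be sketched in pure Python. This is an illustrative approximation, not CPython's actual C implementation, and the function name in_range_arithmetic is hypothetical. It covers only exact integers: CPython uses the constant-time path for int and bool, and falls back to comparing element by element for other types (so 5.0 in range(10) is still True, but it is found by a linear search).

```python
def in_range_arithmetic(x: int, start: int, stop: int, step: int) -> bool:
    """Pure-Python sketch of constant-time membership for an exact int."""
    if step == 0:
        raise ValueError("step must not be zero")
    # The bounds check depends on the direction of travel.
    if step > 0:
        in_bounds = start <= x < stop
    else:
        in_bounds = stop < x <= start
    # x must also land exactly on a step boundary.
    return in_bounds and (x - start) % step == 0


# Cross-check the sketch against the built-in operator for several range shapes.
for r in (range(0, 100, 7), range(50, -50, -3), range(5, 5), range(10)):
    for candidate in range(-60, 110):
        assert in_range_arithmetic(candidate, r.start, r.stop, r.step) == (candidate in r)

print(in_range_arithmetic(5000, 0, 10_000_000, 100))  # on a step boundary
print(in_range_arithmetic(5001, 0, 10_000_000, 100))  # off the boundary
```

No loop runs inside the function, which is why the cost does not depend on how many values the range would produce.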
Production Insight
range(len(x)) for reading values is flagged by Pylint (consider-using-enumerate) as a code smell. Ruff's PERF101 is a different rule: it flags an unnecessary list() cast around an iterable in a for loop, such as for i in list(range(n)). Vanilla Flake8 does not flag the range(len(x)) pattern, which matters for teams that rely on Flake8 without plugins and believe their linter would have caught it.
zip() has a production-dangerous silent behaviour: it truncates to the shorter sequence with no warning and no exception. In a billing pipeline that pairs order records with status records, a length mismatch silently drops the trailing orders. If your two lists are guaranteed equal length by contract, zip() is fine. If they might differ — due to upstream data issues, partial loads, or async race conditions — always use itertools.zip_longest() with an explicit fillvalue, and add a length assertion before the loop.
The rule is simple: use range(len(x)) exclusively when you need to write back to the list at a specific index. Any other use should be replaced with direct iteration (values only) or enumerate() (index and values).
Key Takeaway
Use range(n) for counted loops, enumerate() for index+value pairs, zip() for parallel lists where equal length is guaranteed, and direct iteration for reading values.
zip() stops silently at the shorter sequence — use itertools.zip_longest() when lists might differ in length and silent truncation would be a bug.
range(len(x)) for reading values is flagged by Pylint (consider-using-enumerate): it is a code smell that adds off-by-one risk for zero benefit. Vanilla Flake8 does not flag this pattern.
Membership testing in range() is O(1) via arithmetic formula — not O(n) via scanning. This is the interview differentiator that separates developers who use range() from developers who understand it as the full sequence type it actually is.
● Production incident · POST-MORTEM · severity: high

Off-by-One in Nightly Billing Accumulator Silently Skips First Customer Record for Six Weeks

Symptom
End-of-month reconciliation flagged a variance on aggregate totals that was small enough to be attributed to timing differences across systems. One customer complained that their monthly statement showed zero activity despite confirmed transactions. The nightly batch had been running with a record count of 99,999 every night for six weeks: always one short, always marked successful, never alerting. No automated check had established an expected count to catch the one-record discrepancy.
Assumption
The developer who wrote the loop assumed range(1, record_count) would iterate over all records. They believed the first record was at index 1, which seemed intuitive because the customer IDs in the database started at 1. The conceptual confusion was between the customer ID — a business identifier starting at 1 — and the Python list index — a zero-based position starting at 0. These are not the same thing and conflating them is a surprisingly easy mistake to make under time pressure.
Root cause
Python list indices are zero-based. A list of 100,000 records has valid indices 0 through 99,999. range(1, 100000) produces the integers 1 through 99,999 — it never produces 0. The customer record at list position 0 was silently excluded from processing on every single nightly run. No IndexError was raised because index 1 is a perfectly valid index — the loop was processing real data, just starting one position too late. The code looked correct, the output looked mostly correct, and it was completely wrong for the one customer whose billing accumulator was never touched.
Fix
Changed range(1, record_count) to range(0, record_count), or equivalently just range(record_count). Added a post-loop assertion that verifies processed_count == len(records) before the job is marked successful. Added a CI lint check to flag range(1, len(...)) on zero-indexed collections. Added an alerting check that fires whenever the nightly batch count deviates from the previous run's count by more than 0.1%. One missing record out of 100,000 is only a 0.001% deviation, so the exact-count assertion is the check that catches this specific bug; the percentage alert covers larger drops.
Key lesson
  • range(1, n) on a zero-indexed collection silently skips index 0 every time — no exception, no warning, no indication that anything went wrong. The loop runs successfully and processes n-1 records.
  • Never trust that a successful loop processed everything — always add a post-loop assertion that verifies processed count equals expected count before declaring success.
  • Zero-based indexing means range(len(collection)) or range(0, len(collection)) is the only correct full-coverage pattern — no manual arithmetic on the start or stop values.
  • Add automated count assertions after any batch loop that processes records by index; aggregate total checks are insufficient because they only catch value errors, not missing records.
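The count check from the lessons above can be sketched as follows. The function names process_by_index and verify_full_coverage and the sample records are hypothetical, and the per-record work is a placeholder. The sketch uses an explicit raise rather than assert, on the assumption that the job might run under python -O, which strips assert statements.

```python
def process_by_index(records, start=0):
    """Visit records[start:] by index and return how many were processed."""
    processed = 0
    for i in range(start, len(records)):
        _ = records[i]  # stand-in for the real per-record work
        processed += 1
    return processed


def verify_full_coverage(processed, records):
    """Fail loudly when the processed count does not match the source count."""
    expected = len(records)
    if processed != expected:
        # An explicit raise still runs under `python -O`, unlike assert.
        raise RuntimeError(f"Expected {expected} records, processed {processed}")


customers = ["CUST-1", "CUST-2", "CUST-3"]

# Correct loop: start defaults to 0, so the check passes silently.
verify_full_coverage(process_by_index(customers), customers)

# Buggy loop: starting at 1 skips index 0, and the check stops the job.
try:
    verify_full_coverage(process_by_index(customers, start=1), customers)
except RuntimeError as exc:
    print(f"Job failed before marking success: {exc}")
```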
Production debug guide
From silent data skips to memory exhaustion: symptom to action (6 entries)
Symptom · 01
First record in a collection is never processed — silent data loss with no exception
Fix
This is almost always range(1, ...) on a zero-indexed collection. Check every range() call in the loop setup and change range(1, len(collection)) to range(0, len(collection)) or simply range(len(collection)). Add a post-loop assertion: assert processed_count == len(collection).
Symptom · 02
Last record in a collection is never processed — silent drop of final item
Fix
Check for range(0, n-1) — stop is already exclusive, so subtracting 1 from it silently drops the final valid index. Change range(0, n-1) to range(0, n) or range(n). Stop is exclusive by design; you never need to subtract 1 from it.
Symptom · 03
IndexError on the final loop iteration — loop overshoots the collection
Fix
Check for range(0, n+1) or any manual addition to the stop value. Stop is already exclusive, so adding 1 causes the loop to attempt index n, which is one past the end of a zero-indexed collection of length n. Remove the +1 from the stop argument.
Symptom · 04
Worker process killed with OOM or MemoryError on a counting loop
Fix
Check for list(range(n)) where n is large. list(range(10_000_000)) allocates roughly 80MB in shallow pointer storage, and the integer objects add roughly 280MB on top of that on CPython 3.12 (about 28 bytes each). Remove the list() wrapper: range() is already iterable, indexable, and sliceable without conversion.
Symptom · 05
TypeError: 'float' object cannot be interpreted as an integer on a range() call
Fix
range() only accepts integer arguments — start, stop, and step must all be integers. For display or approximate decimal sequences, use a list comprehension with round(): [round(i * 0.1, 1) for i in range(10)]. For financial precision where floating-point representation errors are unacceptable, use decimal.Decimal arithmetic instead. For numeric computing, numpy.arange(0.0, 1.0, 0.1) is the idiomatic solution if NumPy is already in the stack.
Symptom · 06
Reverse loop produces empty output — loop body never executes
Fix
When using a negative step, start must be numerically greater than stop. range(1, 10, -1) produces nothing because you cannot count down from 1 and reach 10. Fix to range(10, 0, -1) to count from 10 down to 1. Always verify with list(range(your_start, your_stop, -1)) before deploying a reverse range.
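For the reverse-loop symptom, a range can be checked for emptiness with len() instead of materialising it as a list, which stays cheap even for very large ranges. A minimal sketch; describe_range is a hypothetical helper name.

```python
def describe_range(start: int, stop: int, step: int) -> str:
    """Summarise a range without converting it to a list."""
    r = range(start, stop, step)
    if len(r) == 0:
        return f"range({start}, {stop}, {step}) is EMPTY"
    return f"range({start}, {stop}, {step}) covers {r[0]}..{r[-1]} ({len(r)} values)"


print(describe_range(1, 10, -1))   # start < stop with a negative step: empty
print(describe_range(10, 0, -1))   # counts 10 down to 1

items = ["a", "b", "c"]
# Reverse index walk: start at the last valid index; a stop of -1 is
# exclusive, so index 0 is still visited.
print([items[i] for i in range(len(items) - 1, -1, -1)])
# When the index is not needed, reversed() avoids the arithmetic entirely.
print(list(reversed(items)))
```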
★ range() Quick Debug Cheat Sheet
Fast diagnostics for the most common range() failures in production Python services. Run these commands to confirm the issue before touching any code. All commands tested on CPython 3.12.
Loop silently skips first or last item — suspect off-by-one
Immediate action
Print the range object as a list to see exactly which indices it covers before touching the production code
Commands
python3 -c "r=range(1,5); print('Indices covered:', list(r), '— Is 0 missing?')"
python3 -c "r=range(0,5); print('Indices covered:', list(r), '— Correct full coverage')"
Fix now
Replace range(1, n) with range(0, n) or range(n) for full zero-based index coverage — add post-loop count assertion to catch future regressions
Memory spike or OOM from a counting loop: suspect list(range(n))
Immediate action
Compare the shallow memory cost of the range object versus its list conversion, then check true allocation with tracemalloc
Commands
python3 -c "import sys; print('range() bytes:', sys.getsizeof(range(10_000_000)), '— shallow size, true cost is 3 integers')"
python3 -c "import tracemalloc; tracemalloc.start(); list(range(10_000_000)); s,p=tracemalloc.get_traced_memory(); print(f'True peak allocation: {p/1_000_000:.1f}MB')"
Fix now
Remove the list() wrapper — range() is already a sequence type that supports indexing, slicing, and O(1) membership testing without conversion. sys.getsizeof reports shallow size only; true allocation is significantly higher once integer objects are counted.
TypeError on range() call: suspect float arguments
Immediate action
Confirm the argument types, then choose the right fix based on whether you need display precision or financial precision
Commands
python3 -c "import sys; args=(0, 1, 0.1); print('Arg types:', [type(a).__name__ for a in args])"
python3 -c "from decimal import Decimal; steps=[Decimal('0.0') + Decimal('0.1')*i for i in range(10)]; print('Decimal steps:', steps)"
Fix now
For display or approximate sequences: [round(i*0.1,1) for i in range(10)]. For financial or exact precision: use decimal.Decimal arithmetic. For numeric computing: numpy.arange(0.0, 1.0, 0.1).
Reverse range produces no iterations: loop body never runs
Immediate action
Verify start and stop relationship before changing the production loop
Commands
python3 -c "print('Wrong (empty):', list(range(1, 10, -1)))"
python3 -c "print('Correct:', list(range(10, 0, -1)))"
Fix now
Swap start and stop: range(start, stop, -1) requires start > stop numerically — range(10, 0, -1) counts 10 down to 1. For reversing a list by value without needing indices, use reversed(my_list) — it is cleaner and eliminates the start/stop arithmetic entirely.
range() vs list() — Feature Comparison
| Feature / Aspect | range() | list() |
| --- | --- | --- |
| Memory for 1 million integers (shallow, via sys.getsizeof) | 48 bytes: stores only start, stop, step regardless of range size | ~8,000,056 bytes (~8MB): pointer array only, not including integer objects |
| Memory for 1 million integers (true allocation, via tracemalloc on CPython 3.12) | 48 bytes plus the three integer objects it references: negligible overhead | ~35,000,000 bytes (~35MB): includes heap-allocated integer objects above 256 |
| Membership test: x in collection | O(1): arithmetic check of integer type, step alignment, and bounds. Constant time regardless of range size. | O(n): linear scan from index 0. Gets slower proportionally as the list grows. |
| Supports negative step (reverse) | Yes: range(10, 0, -1) counts 10 down to 1 natively. Use reversed(seq) for reversing a list by value without indices. | Yes, but requires reversed() or slicing [::-1] as a separate step |
| Indexing: collection[i] | Yes: O(1) via arithmetic formula start + i * step | Yes: O(1) via direct memory offset into the pointer array |
| Slicing: collection[a:b] | Yes: returns a new range object with no new memory allocated for the values | Yes: returns a new list, allocates new memory proportional to slice length |
| Can hold non-integer values | No: integers only; float step raises TypeError immediately | Yes: any Python object, mixed types allowed |
| Mutable (can add/remove items) | No: immutable by design, values cannot be changed after creation | Yes: append, pop, insert, sort, reverse all work in place |
| Created lazily | Yes: no upfront computation, no pre-allocation, values computed on demand as the loop advances | No: all values computed and allocated in memory at creation time |
| Works with len() | Yes: O(1), computed arithmetically from start, stop, and step. For a positive step this is ceil((stop - start) / step), not plain floor division. | Yes: O(1) via stored length attribute |
| Memory scales with size | No: always 48 bytes (shallow), whether range(5) or range(10**18) | Yes: roughly 8 bytes per pointer plus ~28 bytes per integer object above 256 |
| Best for | Counted loops, index generation, pagination offsets, O(1) membership validation on numeric ranges | When you need to store, mutate, shuffle, sort, or pass around a mutable sequence of arbitrary values |
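Several rows in the range() column can be checked directly in an interpreter. The snippet below is a small sketch with illustrative variable names; it prints the results rather than asserting specific byte counts, since the exact size reported by sys.getsizeof() is an implementation detail.

```python
import sys

evens = range(0, 20, 2)            # 0, 2, 4, ..., 18

print(len(evens))                  # length is computed, not counted
print(evens[3])                    # indexing: start + i * step
print(evens[2:5])                  # slicing returns another range object
print(8 in evens, 9 in evens)      # membership via arithmetic

# Length rounds up when the span is not a multiple of the step:
# range(0, 5, 2) yields 0, 2, 4, so its length is 3.
print(len(range(0, 5, 2)))

# The shallow size does not depend on how many values the range describes.
small, huge = range(5), range(10**18)
print(sys.getsizeof(small) == sys.getsizeof(huge))
```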

Key takeaways

1. range() is not a list: it is a lazy rule stored as three integers. range(1_000_000) costs 48 bytes no matter what. Converting it to list() throws away the only reason to use it. sys.getsizeof() understates the cost: true allocation on CPython 3.12 for list(range(1_000_000)) is closer to 35MB once integer objects are counted, not the ~8MB shallow figure. The list() wrapper is only justified when you need list-specific mutability, which counting loops essentially never require.
2. Stop is always exclusive. Every time. No exceptions. range(0, 5) gives you 0,1,2,3,4. This eliminates the need for any manual arithmetic on the stop value: range(len(collection)) covers every valid index without a single subtraction or addition. Burn this into muscle memory and you eliminate 90% of off-by-one bugs before they happen.
3. Reach for range(n) for counted loops, enumerate() for index+value pairs, direct iteration for reading values, and range(len(x)) only for in-place writes by index. Use zip() when lengths are guaranteed equal; use itertools.zip_longest() when they might differ. zip() stops silently at the shorter sequence with no warning, and that silence has corrupted production data.
4. range() membership testing is O(1) via arithmetic formula, not O(n) via scanning. This is the interview differentiator that separates developers who use range() from developers who understand it as a full sequence type. The gap versus list scanning widens linearly with collection size: at 50 million elements, range() takes the same ~168ns it always does while a list would take over a second.

Common mistakes to avoid

6 patterns

Writing range(1, len(collection)) intending to cover all indices

Symptom
First record in the collection is never processed. No exception is raised — the loop runs, produces results for n-1 records, and the job reports success. Silent data loss that may go undetected for hours, days, or weeks depending on how the output is validated downstream.
Fix
Always use range(len(collection)) or range(0, len(collection)) for full zero-based index coverage. If you need 1-based display numbering in output or error messages, use enumerate(collection, start=1) instead of adjusting the range start. Add a post-loop assertion: assert processed_count == len(collection).

Using range(0, n-1) thinking subtraction is needed because n-1 is the last valid index

Symptom
Last item in the collection is silently dropped every run. Stop is already exclusive, so subtracting 1 makes the loop visit indices 0 through n-2 and never touch the final element at n-1. No IndexError, no warning, just a missing last record that accumulates across runs.
Fix
Stop is already exclusive. range(0, n) or range(n) gives indices 0 through n-1, covering all valid indices for a collection of length n. Never subtract 1 from the stop value — the exclusivity is already built in by design.

Wrapping range() in list() on large datasets unnecessarily

Symptom
Worker process killed with OOM, MemoryError, or Linux kernel OOM-killer. sys.getsizeof() understates the cost: list(range(10_000_000)) appears to cost ~80MB in shallow measurement, but the integer objects above 256 add roughly 280MB more on CPython 3.12 (about 28 bytes each). On workers with memory limits of 512MB or less, this single wrapper is enough to trigger a kill.
Fix
Iterate range() directly — it is already a sequence type with full support for indexing, slicing, len(), and O(1) membership testing. Only convert to list() when you genuinely need list-specific mutability: appending, popping, inserting, or sorting. Counting loops require none of these operations.

Expecting range() to accept float steps like range(0, 1, 0.1)

Symptom
TypeError: 'float' object cannot be interpreted as an integer. Code that works in other languages or with NumPy fails immediately with pure Python range().
Fix
For display or approximate decimal sequences: use a list comprehension with round(), as in [round(i * 0.1, 1) for i in range(10)]. The round() call prevents floating-point representation errors like 0.30000000000000004. For financial calculations where exact decimal precision is required: use decimal.Decimal arithmetic, as in [Decimal('0.0') + Decimal('0.1') * i for i in range(10)]. For numeric computing: numpy.arange(0.0, 1.0, 0.1) is a common choice if NumPy is already in the dependency stack, though NumPy's own documentation suggests numpy.linspace for non-integer steps.

Using range(start, stop, -1) where start is less than stop, expecting values to appear

Symptom
Loop body never executes — range(1, 10, -1) produces an empty sequence with no error or warning. If the loop contained a critical operation, it silently did not run. This is particularly dangerous when the empty range is a degraded edge case that only appears under specific runtime conditions.
Fix
For reverse iteration, start must be numerically greater than stop. range(10, 0, -1) produces 10, 9, 8, ..., 1. Verify your range before deployment: python3 -c "print(list(range(your_start, your_stop, -1)))" to confirm it is non-empty. For reversing a list by value without needing the index, use reversed(my_list) — it eliminates the start/stop arithmetic entirely.

Using zip() on two lists that may differ in length, expecting all elements to be processed

Symptom
Trailing elements from the longer list are silently dropped with no exception and no warning. In a pipeline that pairs order records with status updates, a length mismatch causes the last N orders to be processed without their corresponding status — producing incorrect output that passes all row-count checks because the shorter list's length is the count being verified.
Fix
Use itertools.zip_longest(list_a, list_b, fillvalue=sentinel) when the two sequences might differ in length and you need to process all elements from both. Add a length assertion before the zip() loop if equal length is a contract: assert len(list_a) == len(list_b), f'Length mismatch: {len(list_a)} vs {len(list_b)}'.
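For the float-step mistake listed above, the Decimal approach can be wrapped in a small helper. This is a sketch under stated assumptions: decimal_steps is a hypothetical name, and the inputs are passed as strings so the Decimal values are built from exact decimal literals rather than from already-rounded floats.

```python
from decimal import Decimal


def decimal_steps(start: str, step: str, count: int) -> list[Decimal]:
    """Return `count` exact decimal values: start, start + step, ..."""
    first, delta = Decimal(start), Decimal(step)
    # range() supplies the integer multiplier; Decimal does the exact arithmetic.
    return [first + delta * i for i in range(count)]


print(decimal_steps("0.0", "0.1", 4))

# The same sequence built from binary floats drifts at the fourth value.
print([i * 0.1 for i in range(4)])
```

Multiplying by the integer index, rather than repeatedly adding the step, also keeps each value independent of the ones before it.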
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
How does Python evaluate '500000 in range(1000000)' and what is its time...
Q02 · SENIOR
You're building a batch processor that chunks 10 million database rows i...
Q03 · SENIOR
What happens when you pass a step of 0 to range() — and why does Python ...
Q04 · JUNIOR
Explain why range(5) gives [0, 1, 2, 3, 4] and not [1, 2, 3, 4, 5]. What...
Q05 · SENIOR
Given this Python function, how would you refactor it to be more idiomat...
Q01 of 05 · SENIOR

How does Python evaluate '500000 in range(1000000)' and what is its time complexity compared to '500000 in list(range(1000000))'? Walk me through the implementation detail that makes them differ.

ANSWER
range() membership testing is O(1) for integers because Python applies an arithmetic formula rather than iterating through the range. The CPython implementation checks three conditions in constant time: (1) is the value an exact int or bool (other types fall back to comparing against each element in turn), (2) does (value - start) % step equal zero, meaning the value falls exactly on a step boundary, and (3) does the value fall within the [start, stop) bounds. All three are O(1) operations regardless of how many values the range would produce if fully iterated. The range size is irrelevant because the algorithm never touches any of those intermediate values.

In contrast, '500000 in list(range(1000000))' performs a linear scan. Python compares 500000 against elements starting from index 0 and advances one element at a time until it finds a match or exhausts the list. In the worst case, checking for an element near the end or not present at all, this visits all 1 million elements.

The practical consequence: membership testing in range(1_000_000_000) takes the same time as range(5). This is why range objects are the correct tool for validating whether a value is a valid page offset, a valid port number within an allowed range, or a valid batch boundary. You never need to materialise the list just to answer that question.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
Why does range(5) start at 0 and not 1 in Python?
02
What's the difference between range() and enumerate() in Python?
03
How do I loop backwards with range() in Python?
04
Is it safe to use range() with very large numbers in Python — say, range(10 ** 18)?