Python Memory Management — Event Bus That Ate 8GB
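- CPython's small-object allocator has three tiers: arenas (classically 256 KB), pools (classically 4 KB), and blocks grouped into size classes (multiples of 8 bytes). Recent 64-bit builds use larger arenas and pools.
- Reference counting frees most objects the moment their count hits zero; the cyclic GC handles reference cycles via generational mark-and-sweep.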
- Weak references and weakref.WeakValueDictionary/WeakSet break cycles without manual cleanup.
- Python uses reference counting for immediate deterministic cleanup
- Cyclic garbage collector supplements refcounting for cycles
- Objects over 512 bytes bypass pymalloc and use raw malloc
- Generational GC (3 tiers) optimises for young objects
- Weak references break cycles without manual cleanup
- tracemalloc and gc module are first-line debugging tools
Python Memory Debug Cheat Sheet
Suspected memory leak
    import tracemalloc; tracemalloc.start(); baseline_snapshot = tracemalloc.take_snapshot()
    # ... let the suspected leak run, then:
    leak_snapshot = tracemalloc.take_snapshot()
    diff = leak_snapshot.compare_to(baseline_snapshot, 'lineno')
GC not collecting cycles quickly enough
    import gc; print(gc.get_threshold()); print(gc.get_count())
    gc.collect(2)  # Force full collection
Object count growing unbounded
    import gc; from collections import Counter
    cnt = Counter(type(o).__name__ for o in gc.get_objects())
    cnt.most_common(20)
High GC pause time
    gc.get_stats()
    gc.set_threshold(500, 5, 5)  # Tune for your app
Production Debug Guide
Symptom → Action checklist for production memory issues
- Check gc.get_objects() for surviving container objects. If counts are stable, it's free lists, not a leak.
- Check gc.get_stats() for generation 2 collection time. Tune thresholds with gc.set_threshold() to collect gen1 more frequently, reducing gen2 scan size.
Python feels effortless compared to C or C++. You never call malloc, you never worry about dangling pointers, and memory just works. But that magic has a cost, and if you don't understand what's happening under the hood, you'll hit memory leaks in long-running services, inexplicable slowdowns in data pipelines, and bugs that only reproduce under load, which are the worst kind. Every production Python engineer has a horror story here.
The problem memory management solves is deceptively simple: who owns this chunk of memory, and when is it safe to give it back? Python answers that question with a two-layer system — reference counting as the fast first pass, and a cyclic garbage collector as the slower safety net for the cases reference counting can't handle. Understanding both layers — and how they interact — is what separates engineers who debug memory issues in minutes from those who spend days guessing.
By the end of this article you'll be able to explain CPython's memory allocator hierarchy, predict when the garbage collector fires and how to tune it, use weak references to break memory-leaking cycles, read tracemalloc snapshots to pinpoint leaks in production, and avoid the five most common memory traps that catch even experienced Python developers off guard.
CPython's Memory Architecture: From OS Blocks to Python Objects
CPython doesn't talk directly to the OS for every tiny allocation. That would be catastrophically slow: a system call for every integer? No. Instead it builds a three-tier pyramid.
At the base, the OS gives CPython large raw memory blocks (via malloc or mmap). CPython's arena allocator manages those blocks as arenas, classically 256 KB each. Each arena is divided into pools (classically 4 KB each), and each pool handles objects of a single size class, in multiples of 8 bytes up to 512 bytes. Recent 64-bit builds use larger arenas and pools (1 MiB arenas since CPython 3.10), but the structure is the same. This is the pymalloc subsystem, and it exists specifically to avoid the overhead of the general-purpose allocator for small, short-lived objects.
Objects larger than 512 bytes skip pymalloc entirely and go straight to malloc. This means a 600-byte bytes object and a 100-byte dict have completely different allocation paths — a fact that matters when you're profiling.
Pools maintain a free list internally. When an object is freed, its slot goes back onto the pool's free list rather than returning memory to the OS immediately. This is why Python processes sometimes look like they're holding onto memory even after you've deleted everything — the memory is logically free but still mapped to the process. Arenas are only released back to the OS when every pool inside them is completely empty, which is harder to achieve than it sounds.
    import sys
    import tracemalloc

    # Start tracing memory allocations
    tracemalloc.start()

    # --- Demonstrate size classes and sys.getsizeof ---
    small_int = 42
    large_int = 1000
    print(f"Size of integer 42: {sys.getsizeof(small_int)} bytes")
    print(f"Size of integer 1000: {sys.getsizeof(large_int)} bytes")
    print(f"Size of empty list: {sys.getsizeof([])} bytes")
    print(f"Size of empty dict: {sys.getsizeof(dict())} bytes")
    print(f"Size of empty str: {sys.getsizeof('')} bytes")
    print()

    # --- Show that small ints are the SAME object in memory ---
    # CPython caches integers from -5 to 256 to avoid repeated allocation
    a = 256
    b = 256
    print(f"a = 256, b = 256 -> same object? {a is b}")  # True: cached

    # Build 257 at runtime. Two identical literals in one script can be
    # merged by the compiler, which would hide the effect.
    c = int("257")
    d = int("257")
    print(f"c = 257, d = 257 -> same object? {c is d}")  # False: not cached
    print()

    # --- Demonstrate pymalloc vs raw malloc boundary ---
    # Objects <= 512 bytes use pymalloc pools; larger use malloc directly
    small_bytes = bytes(100)  # 100 bytes of payload -> pymalloc
    large_bytes = bytes(600)  # 600 bytes of payload -> malloc directly
    print(f"Size of 100-byte object: {sys.getsizeof(small_bytes)} bytes (pymalloc pool)")
    print(f"Size of 600-byte object: {sys.getsizeof(large_bytes)} bytes (raw malloc)")
    print()

    # --- Snapshot: see what tracemalloc recorded ---
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics('lineno')
    print("Top 3 memory allocations in this script:")
    for stat in top_stats[:3]:
        print(f"  {stat}")
    tracemalloc.stop()

Output (64-bit CPython; exact sizes and tracemalloc line numbers vary by version):
Size of integer 42: 28 bytes
Size of integer 1000: 28 bytes
Size of empty list: 56 bytes
Size of empty dict: 64 bytes
Size of empty str: 49 bytes
a = 256, b = 256 -> same object? True
c = 257, d = 257 -> same object? False
Size of 100-byte object: 133 bytes (pymalloc pool)
Size of 600-byte object: 633 bytes (raw malloc)
Top 3 memory allocations in this script:
memory_architecture_demo.py:8: size=1024 B, count=4, average=256 B
memory_architecture_demo.py:29: size=633 B, count=1, average=633 B
memory_architecture_demo.py:28: size=133 B, count=1, average=133 B
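The size-class rule described in this section can be sketched as a small helper. This is an illustration of the routing logic only, not CPython's actual code: `allocation_path`, `SMALL_REQUEST_THRESHOLD`, and `ALIGNMENT` are hypothetical names, and the 8-byte step is the figure used in this article (some 64-bit builds round to 16 bytes instead, which is why it is a parameter).

```python
import sys

SMALL_REQUEST_THRESHOLD = 512  # bytes; larger requests bypass pymalloc
ALIGNMENT = 8                  # assumed size-class step, per the text above


def allocation_path(request_size: int, alignment: int = ALIGNMENT) -> str:
    """Return a label for the allocator path a request of this size would take."""
    if request_size <= 0:
        raise ValueError("request_size must be positive")
    if request_size > SMALL_REQUEST_THRESHOLD:
        return "malloc"
    # Round up to the next multiple of the alignment to find the size class.
    rounded = -(-request_size // alignment) * alignment
    return f"pymalloc size class {rounded}"


if __name__ == "__main__":
    for payload in (100, 600):
        total = sys.getsizeof(bytes(payload))  # payload plus object header
        print(payload, total, allocation_path(total))
```

Note that the routing decision depends on the total object size, header included, not on the payload length alone.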
Reference Counting and the Cyclic Garbage Collector — How Objects Actually Die
Every Python object carries an ob_refcnt field — a simple integer baked right into the PyObject C struct. Every time you bind a name, append to a list, or pass something to a function, that counter goes up. When the binding is destroyed — scope exits, del is called, the container is cleared — it goes down. Hit zero, and CPython calls the object's destructor and frees the memory immediately. No pause, no waiting. That's reference counting's superpower: instant, deterministic cleanup.
But reference counting has one fatal blind spot: cycles. If object A holds a reference to object B, and object B holds a reference back to A, both counters stay at 1 even when nothing else in the program can reach either of them. They're orphaned but immortal under pure reference counting.
This is where CPython's generational cyclic garbage collector steps in. It supplements — never replaces — reference counting. The GC tracks container objects (lists, dicts, sets, user-defined classes) that could potentially form cycles. It ignores scalars like ints and strings, which can never form cycles on their own.
The GC runs in three generations. New objects start in generation 0. If they survive a GC pass, they're promoted to generation 1, then generation 2. The idea: most objects die young (your loop variable, your temp dict), so collecting generation 0 frequently is cheap and catches most garbage. Collecting generation 2 is rare and expensive, but that's fine because long-lived objects are unlikely to be cyclic garbage.
    import gc
    import sys

    # -- PART 1: Observe reference counts directly --

    class TrackedNode:
        """A simple node we'll use to build a reference cycle."""

        def __init__(self, label):
            self.label = label
            self.partner = None  # Will point to another TrackedNode

        def __del__(self):
            # This fires when the object is actually destroyed
            print(f"  [destructor] TrackedNode '{self.label}' was freed")

    # Create a single node and watch the refcount
    node_alpha = TrackedNode("alpha")
    # getrefcount always reports +1 because the function argument itself is a reference
    print(f"Refcount of node_alpha (just created): {sys.getrefcount(node_alpha) - 1}")

    alias = node_alpha  # Second binding: refcount goes to 2
    print(f"Refcount after creating alias: {sys.getrefcount(node_alpha) - 1}")

    del alias  # Remove one binding: refcount drops to 1
    print(f"Refcount after deleting alias: {sys.getrefcount(node_alpha) - 1}")
    print()

    # -- PART 2: Create an unreachable cycle and prove GC finds it --

    # Disable automatic GC so we can control exactly when it runs
    gc.disable()

    node_one = TrackedNode("one")
    node_two = TrackedNode("two")

    # Wire them into a cycle: one -> two -> one
    node_one.partner = node_two
    node_two.partner = node_one

    # Now remove the only external references to both nodes.
    # Reference counting CANNOT free these: each has refcount 1 from the other
    print("Deleting external references to node_one and node_two...")
    del node_one
    del node_two
    print("(No destructor fired yet: cycle keeps both alive)")
    print()

    # Manually run the collector and see how many unreachable objects it found
    unreachable_count = gc.collect()  # Collect all generations
    print(f"GC collected {unreachable_count} unreachable objects")
    print()

    # -- PART 3: Inspect GC generations --
    gc.enable()
    print("GC generation thresholds:", gc.get_threshold())
    print("GC generation counts:    ", gc.get_count())
    # Thresholds: (700, 10, 10) means:
    #   gen0 collects when tracked allocations minus deallocations exceed 700
    #   gen1 collects every 10 gen0 collections
    #   gen2 collects every 10 gen1 collections

Output (counts and thresholds vary by CPython version):
Refcount of node_alpha (just created): 1
Refcount after creating alias: 2
Refcount after deleting alias: 1
Deleting external references to node_one and node_two...
(No destructor fired yet — cycle keeps both alive)
[destructor] TrackedNode 'two' was freed
[destructor] TrackedNode 'one' was freed
GC collected 2 unreachable objects
GC generation thresholds: (700, 10, 10)
GC generation counts: (0, 0, 0)
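Two claims from this section are easy to check directly: the collector only tracks container objects, and it frees an unreachable cycle that reference counting leaves behind. The sketch below uses a weak reference as a probe; `Node` and `cycle_is_collected` are illustrative names, not part of any library.

```python
import gc
import weakref


class Node:
    def __init__(self, label):
        self.label = label
        self.partner = None


def cycle_is_collected() -> bool:
    """Build an unreachable two-node cycle and report whether gc.collect() frees it."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from racing with the check
    try:
        first, second = Node("first"), Node("second")
        first.partner, second.partner = second, first
        probe = weakref.ref(first)  # weak: does not keep `first` alive
        del first, second           # the cycle is now unreachable
        alive_before = probe() is not None  # refcounting alone did not free it
        gc.collect()
        return alive_before and probe() is None
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("int tracked?     ", gc.is_tracked(42))
    print("str tracked?     ", gc.is_tracked("text"))
    print("list tracked?    ", gc.is_tracked([]))
    print("instance tracked?", gc.is_tracked(Node("n")))
    print("cycle collected? ", cycle_is_collected())
```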
Weak References, __slots__, and Memory-Efficient Patterns in Production
Now that you know cycles kill you, let's talk about the tools that prevent them without manually breaking every back-reference.
A weak reference lets you hold a pointer to an object without incrementing its reference count. The object can still die normally; once it does, calling a weakref.ref returns None, and using a weakref.proxy raises ReferenceError. This is perfect for caches, observer patterns, and parent-child relationships where the child shouldn't keep the parent alive.
The weakref module gives you weakref.ref() for a single weak reference, weakref.WeakValueDictionary for caches where values can expire, and weakref.WeakSet for observer registries.
On a completely different axis: __slots__ is one of the highest-impact optimizations for memory-heavy code that creates thousands of instances of the same class. By default, every Python instance can carry a __dict__, a hash table, even if your object only has three fixed attributes. Depending on the CPython version and how it is measured, that dict costs roughly 100 to 300 bytes per instance. __slots__ replaces it with a fixed C-level array, dropping per-instance overhead dramatically.
The trade-off: __slots__ breaks dynamic attribute assignment, makes multiple inheritance trickier, and surprises developers who expect __dict__ to exist. Use it deliberately in hot paths — not as a default everywhere.
    import weakref
    import sys
    import gc

    # ==============================================================
    # PART 1: WeakValueDictionary as a memory-safe cache
    # ==============================================================

    class ExpensiveResource:
        """Simulates an object that's costly to create (DB connection, parsed config)."""

        def __init__(self, resource_id):
            self.resource_id = resource_id

        def __repr__(self):
            return f"ExpensiveResource(id={self.resource_id})"

    # A cache where entries vanish automatically when nothing else holds them
    resource_cache = weakref.WeakValueDictionary()

    # Create a resource and store it in the cache
    db_connection = ExpensiveResource(resource_id="db-primary")
    resource_cache["db-primary"] = db_connection
    print(f"Cache hit: {resource_cache.get('db-primary')}")
    print(f"Cache size: {len(resource_cache)}")
    print()

    # When the strong reference disappears, the cache entry cleans itself up
    del db_connection
    gc.collect()  # Force cleanup for demo purposes
    print(f"After del: {resource_cache.get('db-primary')}")
    print(f"Cache size: {len(resource_cache)}")
    print()

    # ==============================================================
    # PART 2: Breaking a parent-child cycle with weakref.ref
    # ==============================================================

    class TreeNode:
        def __init__(self, value):
            self.value = value
            self.children = []
            self._parent_ref = None  # Will hold a weak reference, not a strong one

        def add_child(self, child_node):
            child_node._parent_ref = weakref.ref(self)  # Weak: child won't keep parent alive
            self.children.append(child_node)            # Strong: parent keeps children alive

        @property
        def parent(self):
            # Dereference the weak ref; returns None if parent was collected
            if self._parent_ref is None:
                return None
            return self._parent_ref()  # Calling a weakref returns the object or None

        def __repr__(self):
            return f"TreeNode({self.value})"

    root = TreeNode("root")
    child = TreeNode("child")
    root.add_child(child)
    print(f"child.parent = {child.parent}")
    print(f"root.children = {root.children}")
    print()

    # ==============================================================
    # PART 3: __slots__ memory savings, measured
    # ==============================================================

    class RegularPoint:
        """Standard class: every instance carries a full __dict__."""

        def __init__(self, x_coord, y_coord, z_coord):
            self.x_coord = x_coord
            self.y_coord = y_coord
            self.z_coord = z_coord

    class SlottedPoint:
        """Slots class: fixed-size C array, no __dict__ overhead."""
        __slots__ = ('x_coord', 'y_coord', 'z_coord')

        def __init__(self, x_coord, y_coord, z_coord):
            self.x_coord = x_coord
            self.y_coord = y_coord
            self.z_coord = z_coord

    regular = RegularPoint(1.0, 2.0, 3.0)
    slotted = SlottedPoint(1.0, 2.0, 3.0)

    regular_size = sys.getsizeof(regular) + sys.getsizeof(regular.__dict__)
    slotted_size = sys.getsizeof(slotted)  # No __dict__ to add
    print(f"RegularPoint size (object + __dict__): {regular_size} bytes")
    print(f"SlottedPoint size (no __dict__): {slotted_size} bytes")
    print(f"Memory saved per instance: {regular_size - slotted_size} bytes")
    print()

    # Scale that up to a realistic data pipeline with 1M points
    num_instances = 1_000_000
    savings_mb = (regular_size - slotted_size) * num_instances / (1024 ** 2)
    print(f"Projected saving across {num_instances:,} instances: {savings_mb:.1f} MB")

Output (byte counts vary by CPython version):
Cache hit: ExpensiveResource(id=db-primary)
Cache size: 1
After del: None
Cache size: 0
child.parent = TreeNode(root)
root.children = [TreeNode(child)]
RegularPoint size (object + __dict__): 344 bytes
SlottedPoint size (no __dict__): 56 bytes
Memory saved per instance: 288 bytes
Projected saving across 1,000,000 instances: 274.7 MB
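The __slots__ trade-offs mentioned in this section interact with weak references in a way that is easy to miss: a slotted class cannot be weakly referenced unless '__weakref__' is listed among its slots. A minimal sketch follows; the class and helper names are made up for illustration.

```python
import weakref


class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class WeakSlottedPoint:
    # Listing '__weakref__' restores weak-reference support.
    __slots__ = ("x", "y", "__weakref__")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def accepts_dynamic_attribute(obj) -> bool:
    """Return True if a new, undeclared attribute can be assigned."""
    try:
        obj.label = "origin"
    except AttributeError:
        return False
    return True


def accepts_weakref(obj) -> bool:
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(accepts_dynamic_attribute(SlottedPoint(0, 0)))
    print(accepts_weakref(SlottedPoint(0, 0)))
    print(accepts_weakref(WeakSlottedPoint(0, 0)))
```

If slotted instances need to live in a WeakValueDictionary or WeakSet, the extra slot costs one pointer per instance, which is still far cheaper than a __dict__.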
Diagnosing Memory Leaks with tracemalloc in Production
You've got a long-running Python service. RSS memory climbs slowly over hours and never comes back down. The question is: what's holding onto that memory?
tracemalloc is the right tool for this. It has been in the standard library since Python 3.4 and gives you file-and-line-number attribution for every allocation. The typical workflow: take a baseline snapshot early in the process lifecycle, take a second snapshot after the suspected leak window, and diff them. The lines with the biggest positive delta are your culprits.
For production use, keep tracemalloc off by default (storing a traceback per live allocation adds memory overhead, on the order of 30% or more) and enable it only when diagnosing. Better: expose a signal handler or a debug endpoint that takes snapshots on demand without restarting the process.
Beyond tracemalloc, the gc module is invaluable. gc.get_objects() returns every object currently tracked by the cyclic GC. Calling it before and after a suspicious operation and comparing counts tells you exactly what object types are accumulating. Pair it with collections.Counter for instant triage.
A subtler cause of production leaks is Python's internal free lists for types like floats, lists, and frames. CPython keeps recently freed objects on these lists for reuse rather than returning to the OS. This is good for performance, but it means peak memory is sticky — after a spike, your process won't shrink even after the spike objects are gone.
    import tracemalloc
    import gc
    import collections
    import linecache
    import weakref

    # -- Helper: pretty-print the top entries of a tracemalloc snapshot --

    def display_top_allocations(snapshot, key_type='lineno', limit=5):
        """Print the top N memory consumers from a tracemalloc snapshot."""
        stats = snapshot.statistics(key_type)
        print(f"{'Rank':<5} {'Size':>10} {'Count':>8} Location")
        print("-" * 60)
        for rank, stat in enumerate(stats[:limit], start=1):
            frame = stat.traceback[0]
            # Fetch the actual source line for context
            source_line = linecache.getline(frame.filename, frame.lineno).strip()
            print(f"{rank:<5} {stat.size / 1024:>8.1f} KB {stat.count:>8} "
                  f"{frame.filename}:{frame.lineno}")
            print(f" {'':>10} {'':>8} -> {source_line}")
        print()

    # -- Simulate a leaking registry (classic production pattern) --

    class EventBus:
        """
        A naive event bus that never deregisters listeners.
        A very common cause of Python service memory leaks.
        """
        _listeners: dict = {}

        @classmethod
        def register(cls, event_name, handler_func):
            cls._listeners.setdefault(event_name, []).append(handler_func)

        @classmethod
        def listener_count(cls):
            return sum(len(v) for v in cls._listeners.values())

    # -- Take baseline snapshot --
    tracemalloc.start(5)  # keep up to 5 frames of stack context per allocation
    gc.collect()          # Clean slate before baseline
    baseline_snapshot = tracemalloc.take_snapshot()
    baseline_gc_counts = collections.Counter(
        type(obj).__name__ for obj in gc.get_objects()
    )

    print("=== Simulating 500 request cycles (leaking handlers each time) ===")
    # Simulate a web server handling requests. Each 'request' registers a
    # new handler but the old ones are never removed
    for request_number in range(500):
        def handle_user_event(event_data, req=request_number):
            """Handler closure: captures req, keeping it alive in the bus."""
            return f"request {req} handled {event_data}"
        EventBus.register("user.login", handle_user_event)

    print(f"EventBus now holds {EventBus.listener_count()} handlers")
    print()

    # -- Take leak snapshot and diff --
    leak_snapshot = tracemalloc.take_snapshot()
    leak_gc_counts = collections.Counter(
        type(obj).__name__ for obj in gc.get_objects()
    )

    print("=== Top memory allocations AFTER the leak ===")
    display_top_allocations(leak_snapshot, limit=4)

    print("=== Object count changes (GC-tracked objects) ===")
    for type_name, count in (leak_gc_counts - baseline_gc_counts).most_common(5):
        print(f" +{count:>6} {type_name}")
    print()

    # -- Show the diff between snapshots --
    print("=== Snapshot diff (new allocations since baseline) ===")
    diff_stats = leak_snapshot.compare_to(baseline_snapshot, 'lineno')
    for stat in diff_stats[:4]:
        print(stat)
    tracemalloc.stop()

    # -- The fix: use WeakSet so the bus doesn't prevent GC --
    print()
    print("=== Fix: use weakref.WeakSet for listener registry ===")

    class SafeEventBus:
        _listeners: dict = {}

        @classmethod
        def register(cls, event_name, handler_func):
            if event_name not in cls._listeners:
                cls._listeners[event_name] = weakref.WeakSet()
            cls._listeners[event_name].add(handler_func)

        @classmethod
        def listener_count(cls):
            return sum(len(v) for v in cls._listeners.values())

    print("SafeEventBus uses WeakSet: handlers are released when they go out of scope.")

Output (sizes and line numbers vary by CPython version and file layout):
=== Simulating 500 request cycles (leaking handlers each time) ===
EventBus now holds 500 handlers
=== Top memory allocations AFTER the leak ===
Rank Size Count Location
------------------------------------------------------------
1 48.2 KB 500 leak_diagnosis_demo.py:52
-> def handle_user_event(event_data, req=request_number):
2 10.1 KB 1 leak_diagnosis_demo.py:30
-> _listeners: dict = {}
3 5.3 KB 500 <frozen importlib._bootstrap>:241
->
4 1.2 KB 14 leak_diagnosis_demo.py:1
-> import tracemalloc
=== Object count changes (GC-tracked objects) ===
+ 500 function
+ 1 dict
+ 1 list
=== Snapshot diff (new allocations since baseline) ===
leak_diagnosis_demo.py:52: size=48200 B (+48200 B), count=500 (+500), average=96 B
leak_diagnosis_demo.py:30: size=10136 B (+10136 B), count=1 (+1), average=10136 B
<frozen importlib._bootstrap>:241: size=5376 B (+5376 B), count=500 (+500), average=10 B
=== Fix: use weakref.WeakSet for listener registry ===
SafeEventBus uses WeakSet — handlers are released when they go out of scope.
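The on-demand diagnosis pattern, where tracing is enabled only for a measurement window, can be sketched as a helper that a debug endpoint or signal handler might call. `capture_growth` and its report keys are hypothetical names chosen for this example; in a real service the `workload` argument would typically be a timed wait while live traffic runs.

```python
import tracemalloc


def capture_growth(workload, limit=5):
    """Trace allocations only while `workload` runs; return the top growth sites."""
    already_tracing = tracemalloc.is_tracing()
    if not already_tracing:
        tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        # Leave tracing in the state we found it.
        if not already_tracing:
            tracemalloc.stop()

    report = []
    for stat in after.compare_to(before, "lineno")[:limit]:
        frame = stat.traceback[0]
        report.append({
            "file": frame.filename,
            "line": frame.lineno,
            "size_diff_bytes": stat.size_diff,
            "count_diff": stat.count_diff,
        })
    return report


if __name__ == "__main__":
    retained = []

    def leaky_workload():
        # Stand-in for the suspected leak: buffers that are never released.
        retained.extend(bytearray(1024) for _ in range(200))

    for entry in capture_growth(leaky_workload):
        print(entry)
```

The returned list contains plain dicts, so it serializes to JSON without extra work.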
Running tracemalloc.start() permanently in a production service can increase memory usage by roughly 30–50% because it stores a traceback for every live allocation. The production-safe pattern: keep it disabled, expose a /debug/memory endpoint (behind auth) that calls tracemalloc.start(), takes a baseline snapshot, waits 60 seconds, takes a second snapshot, calls tracemalloc.stop(), and returns the diff as JSON. You get the diagnosis without the permanent cost.
- Use gc.get_objects() for lightweight object count snapshots.
- Call gc.collect() to confirm.
GC Tuning and Production Trade-offs
The default GC thresholds (700 allocations for gen0, 10 gen0 collections per gen1, 10 gen1 per gen2) work for general-purpose scripts. In production, they can cause noticeable latency spikes when gen2 collects a large heap.
You can tune with gc.set_threshold(gen0_threshold, gen1_multiplier, gen2_multiplier). Lower gen0 threshold triggers more frequent collections, which keeps each collection small but raises total overhead. Higher thresholds mean less frequent but more expensive collections.
Some high-throughput services disable the GC entirely after startup. Instagram famously did this in its pre-fork web workers, mainly to keep collections from touching memory pages shared copy-on-write with the parent process. With the GC off, any reference cycle leaks until the process exits, so it's a risky move unless you audit every library and every codepath. You can also run gc.collect(2) manually during maintenance windows.
gc.set_debug(gc.DEBUG_LEAK) reports the unreachable objects each collection finds and saves them in gc.garbage instead of freeing them, which is invaluable for seeing what your cycles are made of. But don't leave it on in production; it prints to stderr, retains the garbage, and slows everything.
Another tuning lever: gc.freeze() (Python 3.7+) moves all currently tracked objects to a permanent generation that the GC ignores in later collections. This is useful for services that preload modules and config at startup: those objects never die, so scanning them every GC cycle is wasted work. It was added with pre-fork servers in mind, where calling it in the parent just before fork() also keeps shared pages from being dirtied.
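A minimal sketch of the freeze-after-warm-up idea, assuming a single-process service; `freeze_after_warmup` and the `warmup` callable are illustrative names. For pre-fork servers the order of operations differs, so treat this as a starting point rather than a recipe.

```python
import gc


def freeze_after_warmup(warmup):
    """Run warm-up work, collect garbage, then freeze the surviving objects."""
    warmup()
    gc.collect()   # drop startup garbage so it is not frozen too
    gc.freeze()    # survivors move to the permanent generation
    return gc.get_freeze_count()


if __name__ == "__main__":
    config = {}

    def load_config():
        # Stand-in for importing modules and parsing configuration.
        config.update({f"key_{i}": [i] for i in range(1000)})

    frozen = freeze_after_warmup(load_config)
    print(f"objects in permanent generation: {frozen}")
    gc.unfreeze()  # moves them back to the oldest generation
```

gc.unfreeze() reverses the operation, which makes the pattern easy to try behind a feature flag.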
- Frequent young collections (low gen0 threshold) → lower peak pause, higher total CPU.
- Infrequent young collections (high gen0 threshold) → higher peak pause, lower total CPU.
- Gen2 pause scales with the number of objects that survive to gen2.
- gc.freeze() eliminates scanning of immortal objects entirely — use it after warm-up.
- The right trade-off depends on your latency SLO and memory allocation rate.
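To measure pause time rather than guess at it, one option is to hook gc.callbacks, which the collector invokes with a "start" and a "stop" phase around each collection. The `GcPauseRecorder` class below is a hypothetical helper written for this article, not a standard-library API.

```python
import gc
import time


class GcPauseRecorder:
    """Record (generation, seconds) for each collection while installed."""

    def __init__(self):
        self.pauses = []
        self._started_at = None

    def __call__(self, phase, info):
        if phase == "start":
            self._started_at = time.perf_counter()
        elif phase == "stop" and self._started_at is not None:
            elapsed = time.perf_counter() - self._started_at
            self.pauses.append((info["generation"], elapsed))
            self._started_at = None

    def install(self):
        gc.callbacks.append(self)

    def uninstall(self):
        gc.callbacks.remove(self)


if __name__ == "__main__":
    recorder = GcPauseRecorder()
    recorder.install()
    junk = [[i] for i in range(100_000)]  # allocate enough to trigger collections
    gc.collect()                          # force one full collection as well
    recorder.uninstall()

    worst = max(recorder.pauses, key=lambda item: item[1])
    print(f"collections observed: {len(recorder.pauses)}")
    print(f"worst pause: gen{worst[0]} took {worst[1] * 1000:.2f} ms")
    print(gc.get_stats())
```

Comparing the worst pause before and after a gc.set_threshold() change gives a direct read on whether the tuning helped.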
Check gc.get_stats() before tuning; never guess.
| Aspect | Reference Counting | Cyclic Garbage Collector |
|---|---|---|
| Mechanism | ob_refcnt field in every PyObject C struct | Mark-and-sweep over tracked container objects |
| Triggers | Every assignment, del, scope exit — immediate | After N allocations per generation (threshold-based) |
| Handles cycles? | No — orphaned cycles live forever | Yes — its entire reason for existing |
| Pause time | Zero — cleanup happens inline | Stop-the-world pause (brief but real; worse for gen2) |
| Overhead | Increment/decrement on every reference operation (plain integer arithmetic under the GIL, not atomic) | Periodic scan of all tracked containers |
| Tunable? | No — hardwired into CPython | Yes — gc.set_threshold(), gc.disable(), gc.collect() |
| Object types covered | All objects | Only container types (list, dict, set, class instances) |
| __del__ guaranteed? | Yes, immediately when refcount hits 0 (no cycles) | Eventually, but order is undefined for cycle members |
| PyPy / Jython support | No — only CPython | Different GC implementations exist in each runtime |
🎯 Key Takeaways
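- CPython's small-object allocator has three tiers: arenas (classically 256 KB), pools (classically 4 KB), and blocks grouped into size classes (multiples of 8 bytes). Recent 64-bit builds use larger arenas and pools.
- Reference counting frees most objects the moment their count hits zero; the cyclic GC handles reference cycles via generational mark-and-sweep.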
- Weak references and weakref.WeakValueDictionary/WeakSet break cycles without manual cleanup.
- __slots__ eliminates per-instance __dict__ overhead — huge savings for high-count objects.
- tracemalloc diffs pinpoint leaky code lines; gc.get_objects() shows accumulating types.
- GC tuning (thresholds, freeze) lets you trade pause time for total overhead.
- Free lists cause RSS stickiness, not true leaks — always collect before diagnosing.
- Disabling GC is risky unless you can prove zero cycles exist in all code paths.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- Q: Explain how CPython manages memory. What are arenas, pools, and blocks, and why does this three-tier system exist? (Senior)
- Q: What's the difference between reference counting and the cyclic garbage collector? Why does CPython need both? (Senior)
- Q: How would you debug a memory leak in a production Python service? Walk through the tools and steps. (Senior)
- Q: When would you use __slots__ in Python? What are the trade-offs? (Mid-level)
Frequently Asked Questions
Why does Python's memory usage not decrease after deleting large objects?
CPython keeps freed blocks in pymalloc pools and per-type free lists for reuse rather than returning memory to the OS immediately. This is by design: it avoids the cost of repeated system calls. Memory goes back to the OS only when an entire arena is empty, which may not happen if any object in that arena remains alive. The C library's allocator may also hold on to freed pages. To tell a leak from allocator retention, call gc.collect() and compare gc.get_objects() counts over time: if object counts are stable while RSS stays high, it's retention, not a leak. If counts keep growing, use tracemalloc to find the allocation sites.
How do I prevent memory leaks from event listeners and callbacks?
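Avoid storing callbacks in a plain list or dictionary with no removal path. Use weakref.WeakSet or weakref.WeakValueDictionary so that when the callback object goes out of scope (e.g., after a request finishes), it's automatically removed from the registry. Bound methods need weakref.WeakMethod instead, because a new bound-method object is created on every attribute access and a plain weak reference to it dies immediately. If you must keep strong references, implement an explicit deregistration mechanism (e.g., a context manager that unregisters on __exit__).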
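Both approaches, weak references and explicit deregistration, can live in one small registry. The sketch below is one possible shape under stated assumptions: `ListenerRegistry`, `subscribed`, and the event names are invented for illustration, and handlers are assumed to be plain functions or bound methods.

```python
import inspect
import weakref
from contextlib import contextmanager


class ListenerRegistry:
    """Event registry that holds handlers weakly and supports scoped subscription."""

    def __init__(self):
        self._handlers = {}  # event name -> list of weak references

    def register(self, event_name, handler):
        # Bound methods are recreated on each attribute access, so a plain
        # weakref.ref to one would die at once; WeakMethod handles that case.
        if inspect.ismethod(handler):
            ref = weakref.WeakMethod(handler)
        else:
            ref = weakref.ref(handler)
        self._handlers.setdefault(event_name, []).append(ref)
        return ref

    def unregister(self, event_name, ref):
        refs = self._handlers.get(event_name, [])
        refs[:] = [r for r in refs if r is not ref]

    def emit(self, event_name, payload):
        results, live = [], []
        for ref in self._handlers.get(event_name, []):
            handler = ref()          # None once the target has been freed
            if handler is not None:
                live.append(ref)
                results.append(handler(payload))
        if event_name in self._handlers:
            self._handlers[event_name] = live  # prune dead references
        return results

    @contextmanager
    def subscribed(self, event_name, handler):
        ref = self.register(event_name, handler)
        try:
            yield
        finally:
            self.unregister(event_name, ref)


class Session:
    def on_login(self, payload):
        return f"seen {payload}"


if __name__ == "__main__":
    registry = ListenerRegistry()
    session = Session()
    registry.register("user.login", session.on_login)
    print(registry.emit("user.login", "alice"))
    del session                      # handler disappears with its owner
    print(registry.emit("user.login", "bob"))
```

Weak registration suits handlers owned by some longer-lived object; the context manager suits request-scoped handlers whose lifetime is known exactly.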
When should I call gc.collect() manually?
Manually call gc.collect() when you've just released a large cyclic structure and want to free memory immediately, such as after processing a huge batch of data. Also call it before taking a memory snapshot for profiling. Don't call it on every request — it adds overhead. In production, consider running gc.collect(2) during maintenance windows to clean generation 2 without affecting response times.
What is the difference between gc.get_objects() and sys.getsizeof() for measuring memory?
gc.get_objects() returns a list of every container object tracked by the cyclic GC. It tells you what objects exist and their types, but not their sizes. sys.getsizeof() returns the shallow size of a single object (the object itself, not the objects it references). For real memory profiling, use tracemalloc (standard library) for allocated bytes per source line, or pympler (third-party) for deep per-object sizes.
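The gap between shallow and deep size is easy to see with a small recursive helper. This is a simplified sketch that only walks built-in containers; `deep_getsizeof` is a name made up here, and dedicated tools handle many more cases (instances, slots, shared buffers).

```python
import sys


def deep_getsizeof(obj, _seen=None):
    """Sum sys.getsizeof over obj and the built-in containers it references."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:          # count shared objects once, survive cycles
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size


if __name__ == "__main__":
    payload = {"ids": [1000, 2000, 3000], "name": "x" * 50}
    print("shallow:", sys.getsizeof(payload))
    print("deep:   ", deep_getsizeof(payload))
```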
Does disabling the GC improve Python performance?
It can, but only if you've measured that GC pause time is actually hurting your latency and you're confident your code doesn't accumulate reference cycles. Without cycles, the GC does nothing useful; with them, disabling it means they leak until the process exits. Many libraries (ORMs, caches, async frameworks) create cycles internally. Instagram disabled the GC in its pre-fork web workers mainly to stop collections from dirtying memory pages shared copy-on-write with the parent process, a use case that gc.freeze() (Python 3.7) was later added to address. For most teams, it's safer to tune thresholds with gc.set_threshold() than to disable the GC entirely.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.