Advanced 5 min · March 05, 2026

Python Descriptors — Missing Instance Dict Shared State Bug

Two DB servers reported identical counts despite a 300% difference in workload. The fix: store per-instance data in the instance's __dict__, not on the descriptor.

Naren · Founder
Quick Answer
  • Descriptors are objects that define __get__, __set__, and/or __delete__ to intercept attribute access
  • Data descriptors (with __set__/__delete__) always win over instance __dict__
  • Non-data descriptors (__get__ only) lose to instance __dict__ — makes methods shadowable
  • __set_name__ (Python 3.6+) auto-captures the attribute name at class creation
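  • Performance cost: each access adds a Python-level function call, often quoted as roughly 2-3x slower than a direct __dict__ lookup; the ratio depends on the Python version and the descriptor, so measure it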
  • Biggest mistake: storing per-instance state on the descriptor itself instead of in instance.__dict__

Every seasoned Python developer has written a @property and moved on. Far fewer have asked the obvious next question: how does @property actually work? The answer is descriptors, a protocol at the heart of Python's object model that lets you control what happens when an attribute is read, written, or deleted. This isn't an academic curiosity. Django model fields, SQLAlchemy's ORM columns, and ordinary method binding all rely on descriptors for their expressive, magic-looking APIs.

The problem descriptors solve is deceptively simple: attributes are dumb by default. self.temperature = -300 happily stores an impossible value with zero complaint. You could add validation logic directly inside __init__, but that falls apart the moment you have ten classes sharing the same validation rule. Descriptors let you encapsulate attribute behaviour once, in one place, and attach it to as many classes as you like — clean, reusable, and transparent to the caller.

By the end of this article you'll understand the full descriptor protocol (__get__, __set__, __delete__, __set_name__), the crucial difference between data and non-data descriptors and why that difference changes attribute lookup priority, exactly how Python's built-in property, classmethod, and staticmethod are implemented as descriptors, the performance trade-offs to consider before using descriptors in hot paths, and the production-grade patterns that separate a toy descriptor from one you'd ship in a library.

The Descriptor Protocol — What Python Actually Does When You Access an Attribute

A descriptor is any object that defines at least one of __get__, __set__, or __delete__. That's the entire entry requirement. When Python resolves instance.attr, it doesn't just rummage through instance.__dict__. It runs a precise lookup algorithm defined in object.__getattribute__.

The algorithm goes like this: first, Python walks the MRO of the instance's type looking for attr in each class namespace. If it finds an object whose type defines __set__ or __delete__, that object is a data descriptor and it wins unconditionally, even if instance.__dict__ has a same-named key. Otherwise Python checks instance.__dict__ and returns the value stored there if the key exists. Only when the instance dict has no such key is the class-level object used: if it defines __get__ (a non-data descriptor), Python calls it; if it defines none of the descriptor methods, it is returned as a plain class attribute. If nothing is found anywhere, Python calls __getattr__ when the class defines it and otherwise raises AttributeError.

This priority order — data descriptor → instance dict → non-data descriptor → class attribute — is the single most important thing to internalise about descriptors. Getting it wrong is responsible for most descriptor bugs in the wild. property is a data descriptor (it defines all three). A plain function is a non-data descriptor (it only defines __get__), which is why instance.method works but you can still shadow it with instance.method = something_else.
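The priority order can be checked directly. The sketch below uses made-up classes (DataDesc, NonDataDesc, Demo are illustrative names, not from any library) and writes into instance.__dict__ by hand so that the same key exists in both places.

```python
class DataDesc:
    """Data descriptor: defines __get__ and __set__."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return "from data descriptor"

    def __set__(self, instance, value):
        # The value is ignored on purpose; only lookup order matters here.
        pass


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return "from non-data descriptor"


class Demo:
    data = DataDesc()
    non_data = NonDataDesc()
    plain = "from class attribute"


obj = Demo()
# Write straight into the instance dict, bypassing normal assignment.
obj.__dict__["data"] = "from instance dict"
obj.__dict__["non_data"] = "from instance dict"

print(obj.data)      # from data descriptor
print(obj.non_data)  # from instance dict
print(obj.plain)     # from class attribute
```

The data descriptor ignores the instance dict entry, while the non-data descriptor is shadowed by it.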

Building a Production-Grade Validated Descriptor with __set_name__

__set_name__ was added in Python 3.6 and it changes everything about how you write reusable descriptors. Before it existed, you had to pass the attribute name as a constructor argument — price = Validated('price', ...) — which was redundant and error-prone. Now Python calls __set_name__(owner, name) automatically during class creation, handing you the exact name the descriptor was assigned to.

The classic mistake beginners make when building descriptors is storing per-instance data on the descriptor itself. Because the descriptor is a class-level object shared by all instances, storing self.value = x inside __set__ means every instance of the class would share the same variable. The correct pattern is to store data in the instance's __dict__ using a mangled key (commonly prefixed with an underscore plus the descriptor's own name).
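Here is a minimal reproduction of that mistake next to the fix. The class names (SharedCount, PerInstanceCount, BuggyServer, FixedServer) and the query_count attribute are hypothetical, chosen to mirror the two-servers scenario from the summary.

```python
class SharedCount:
    """BUGGY: keeps the value on the descriptor, shared by every instance."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.value

    def __set__(self, instance, value):
        self.value = value  # one slot for the whole class


class PerInstanceCount:
    """Fixed: keeps the value in each instance's own __dict__."""

    def __set_name__(self, owner, name):
        self.storage_key = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.storage_key]

    def __set__(self, instance, value):
        instance.__dict__[self.storage_key] = value


class BuggyServer:
    query_count = SharedCount()


class FixedServer:
    query_count = PerInstanceCount()


a, b = BuggyServer(), BuggyServer()
a.query_count = 100
b.query_count = 400
print(a.query_count, b.query_count)  # 400 400

c, d = FixedServer(), FixedServer()
c.query_count = 100
d.query_count = 400
print(c.query_count, d.query_count)  # 100 400
```

In the buggy version the second assignment overwrites the first, because both instances read and write the single descriptor object that lives on the class.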

Below is a complete, reusable TypeValidated descriptor you could drop into any project. It enforces type and optional range constraints, and because it's a class, you can extend it or compose it without touching the classes that use it. Notice how the same descriptor class powers three completely different attributes on WeatherReading.
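One way such a descriptor might look is sketched here. The constructor parameters (expected_type, min_value, max_value) and the three WeatherReading attributes are assumptions made for illustration, not a reference implementation.

```python
class TypeValidated:
    """Data descriptor enforcing a type and an optional numeric range."""

    def __init__(self, expected_type, min_value=None, max_value=None):
        self.expected_type = expected_type
        self.min_value = min_value
        self.max_value = max_value

    def __set_name__(self, owner, name):
        # Called once at class creation with the attribute's name.
        self.name = name
        self.storage_key = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        try:
            return instance.__dict__[self.storage_key]
        except KeyError:
            raise AttributeError(f"{self.name!r} has not been set") from None

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} expects {self.expected_type}, "
                f"got {type(value).__name__}"
            )
        if self.min_value is not None and value < self.min_value:
            raise ValueError(f"{self.name} must be >= {self.min_value}")
        if self.max_value is not None and value > self.max_value:
            raise ValueError(f"{self.name} must be <= {self.max_value}")
        # Per-instance storage: never on the descriptor itself.
        instance.__dict__[self.storage_key] = value


class WeatherReading:
    temperature_c = TypeValidated((int, float), min_value=-273.15)
    humidity_pct = TypeValidated((int, float), min_value=0, max_value=100)
    station_id = TypeValidated(str)

    def __init__(self, temperature_c, humidity_pct, station_id):
        self.temperature_c = temperature_c
        self.humidity_pct = humidity_pct
        self.station_id = station_id


reading = WeatherReading(21.5, 40, "station-7")
print(reading.temperature_c)  # 21.5

try:
    reading.temperature_c = -300
except ValueError as exc:
    print("rejected:", exc)

print(reading.temperature_c)  # 21.5
```

Because validation runs in __set__, the assignments inside __init__ are checked too, and a rejected value leaves the previous one untouched.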

How property, classmethod and staticmethod Are Just Descriptors in Disguise

One of the most illuminating exercises in Python is reimplementing the built-in property from scratch as a pure-Python descriptor. It instantly demystifies how getter/setter chaining works, and it proves that there's no magic — just the protocol you now understand.
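A simplified pure-Python stand-in might look like the following. It leaves out docstring handling and is only meant to show the mechanics; the real property is implemented in C. The Celsius class is a hypothetical example.

```python
class Property:
    """Simplified pure-Python stand-in for the built-in property."""

    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

    # Each helper returns a NEW Property carrying over the other functions.
    # That is all the @x.setter chaining amounts to.
    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel)


class Celsius:
    def __init__(self, degrees):
        self._degrees = degrees

    @Property
    def degrees(self):
        return self._degrees

    @degrees.setter
    def degrees(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._degrees = value


c = Celsius(20)
c.degrees = 25
print(c.degrees)  # 25
```

Note that __set__ and __delete__ exist even when no setter or deleter was supplied. That is why a read-only property is still a data descriptor and cannot be shadowed through the instance dict.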

classmethod is a non-data descriptor: __get__ returns a bound method with the class as the first argument instead of the instance. staticmethod is also a non-data descriptor: __get__ simply returns the raw underlying function, stripping both self and cls from the equation. Both are elegant proof that Python's method binding system is itself built on top of descriptors.
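Rough pure-Python equivalents of both are sketched below. ClassMethod, StaticMethod and Temperature are illustrative names; the built-ins are implemented in C and handle more edge cases.

```python
import types


class ClassMethod:
    """Non-data descriptor that binds the function to the class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        # Bind to the class, whether accessed via the class or an instance.
        return types.MethodType(self.func, owner)


class StaticMethod:
    """Non-data descriptor that performs no binding at all."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        return self.func


class Temperature:
    def __init__(self, kelvin):
        self.kelvin = kelvin

    @ClassMethod
    def from_celsius(cls, celsius):
        return cls(celsius + 273.15)

    @StaticMethod
    def is_valid(kelvin):
        return kelvin >= 0


t = Temperature.from_celsius(0)
print(type(t).__name__, t.kelvin)  # Temperature 273.15
print(t.is_valid(-5))              # False
print(Temperature.is_valid(10))    # True
```

Calling t.from_celsius(...) on an instance works as well, because __get__ receives the owner class in both cases.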

Understanding this has real production value. If you're writing a library and need a decorator that behaves differently depending on whether it's accessed on an instance or on the class, you implement __get__ and return the appropriate callable or value. The standard library's functools.cached_property is built this way: it is a class that implements __get__ and returns itself when accessed on the class. Knowing the internals means you can write your own variants when the standard library doesn't quite fit.

Performance, Caching and Production Gotchas You Won't Find in the Docs

Descriptors invoke a Python-level function call on every attribute access. For attributes hit thousands of times per second in a tight loop — think coordinate getters in a physics simulation or column accessors in a data pipeline — that overhead is real and measurable. CPython's property is implemented in C, so it's faster than a pure-Python descriptor, but it's still slower than a direct __dict__ lookup.

functools.cached_property is the standard library's answer to this: it's a non-data descriptor that on first access calls the getter, then writes the result directly into instance.__dict__ under the same name. On subsequent accesses, the instance dict wins (non-data descriptor priority) and the function is never called again, so cached reads cost the same as a plain attribute. The catch: since Python 3.12 it takes no lock (earlier versions used an undocumented per-property lock), so under a race the getter can run more than once, and it doesn't work with __slots__ unless '__dict__' is one of the slots, because slots eliminate the instance __dict__.
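The caching behaviour and the __slots__ limitation can both be observed in a few lines. Report and SlottedReport are hypothetical classes, and getter_calls is a counter added only for the demonstration.

```python
from functools import cached_property


class Report:
    getter_calls = 0  # class-level counter, only for this demonstration

    def __init__(self, rows):
        self.rows = rows

    @cached_property
    def total(self):
        Report.getter_calls += 1
        return sum(self.rows)


report = Report([1, 2, 3])
print("total" in report.__dict__)  # False
print(report.total, report.total)  # 6 6
print(Report.getter_calls)         # 1
print("total" in report.__dict__)  # True

# Deleting the instance-dict entry invalidates the cache.
del report.total
print(report.total, Report.getter_calls)  # 6 2


class SlottedReport:
    __slots__ = ("rows",)  # no '__dict__' slot, so nowhere to cache

    def __init__(self, rows):
        self.rows = rows

    @cached_property
    def total(self):
        return sum(self.rows)


slots_error = None
try:
    SlottedReport([1, 2, 3]).total
except TypeError as exc:
    slots_error = exc
    print(type(exc).__name__)  # TypeError
```

The del statement works because cached_property defines no __delete__, so the deletion falls through to the instance dict.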

Another production pattern is the lazy descriptor: heavy initialisation (database connections, file handles, parsed configs) deferred until first access. Combine __set_name__ with cached_property-style logic and you get lazy-loaded class-level resources with zero boilerplate at the call site. The section below shows both a performance benchmark and a thread-safe lazy descriptor you can actually ship.
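A sketch of both ideas follows, under stated assumptions: LazyAttribute, Service and the benchmark classes are hypothetical names, the lock is shared per descriptor rather than per instance (simpler, but it serialises first access across instances), and timings vary by machine and Python version, so none are shown.

```python
import threading
import timeit


class LazyAttribute:
    """Non-data descriptor: compute once per instance, cache in __dict__."""

    def __init__(self, factory):
        self.factory = factory
        self.lock = threading.RLock()  # one lock per descriptor (assumption)

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cache = instance.__dict__
        if self.name not in cache:          # first check, no lock
            with self.lock:
                if self.name not in cache:  # second check, under the lock
                    cache[self.name] = self.factory(instance)
        return cache[self.name]


class Service:
    created = 0  # counts factory runs, for the demonstration only

    @LazyAttribute
    def connection(self):
        Service.created += 1
        return object()  # stand-in for an expensive resource


svc = Service()
threads = [threading.Thread(target=lambda: svc.connection) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(Service.created)  # 1


# Rough benchmark: plain attribute vs property vs pure-Python descriptor.
class Plain:
    def __init__(self):
        self.x = 1


class WithProperty:
    def __init__(self):
        self._x = 1

    @property
    def x(self):
        return self._x


class PyDescriptor:
    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__["_x"]

    def __set__(self, instance, value):
        instance.__dict__["_x"] = value


class WithDescriptor:
    x = PyDescriptor()

    def __init__(self):
        self.x = 1


for cls in (Plain, WithProperty, WithDescriptor):
    obj = cls()
    seconds = timeit.timeit(lambda: obj.x, number=200_000)
    print(f"{cls.__name__:15s} {seconds:.4f}s")
```

After the first access the value sits in the instance dict, so later reads of svc.connection never reach __get__ or the lock. A per-instance lock would reduce contention at the cost of creating that lock safely.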

When Descriptors Fail: Real-World Patterns and How to Fix Them

Beyond the basic gotchas, descriptors introduce subtle failure modes that only surface under load or in complex inheritance hierarchies. Here are three patterns seen in production codebases.

1. Descriptor in an abstract base class with multiple inheritance — If two parent classes both define the same descriptor attribute, the MRO decides which one wins. If the descriptors have different implementations, the child class may behave unexpectedly. The fix: explicitly define the descriptor on the child class to resolve ambiguity.

2. Using a descriptor to replace @property for many attributes — While it's tempting to replace dozens of properties with a single descriptor factory, debugging becomes harder. The traceback shows the descriptor class, not the attribute name. The fix: override __repr__ on the descriptor to include the attribute name (stored from __set_name__).

3. Circular references in descriptor __get__ — If your descriptor's __get__ accesses another attribute on the same instance that itself triggers a descriptor get, you can create an infinite recursion. This happens when lazy-loaded descriptors reference each other. The fix: use sentinel values and check for recursion depth with a thread-local counter.
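The first two patterns are easy to see in a small sketch. Field, Audited, Cached and the child classes are made-up names; the __repr__ follows the format suggested for debugging and relies on __set_name__ having recorded the owner and attribute name.

```python
class Field:
    """Non-data descriptor with a debugging-friendly __repr__."""

    def __init__(self, label):
        self.label = label

    def __set_name__(self, owner, name):
        self.owner = owner
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.label

    def __repr__(self):
        return f"<{type(self).__name__} for {self.owner.__name__}.{self.name}>"


class Audited:
    status = Field("audited")


class Cached:
    status = Field("cached")


class Child(Audited, Cached):
    pass  # inherits whichever 'status' comes first in the MRO


class ExplicitChild(Audited, Cached):
    status = Field("explicit")  # redeclared to remove the ambiguity


print([klass.__name__ for klass in Child.__mro__])
# ['Child', 'Audited', 'Cached', 'object']
print(Child.status)            # <Field for Audited.status>
print(Child().status)          # audited
print(ExplicitChild.status)    # <Field for ExplicitChild.status>
print(ExplicitChild().status)  # explicit
```

Swapping the base class order in Child would silently switch it to the Cached descriptor, which the repr makes visible at a glance.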

| Aspect | Data Descriptor | Non-Data Descriptor |
| --- | --- | --- |
| Methods required | __get__ + __set__ and/or __delete__ | __get__ only |
| Lookup priority vs instance dict | Wins: overrides instance dict | Loses: instance dict takes precedence |
| Typical use cases | property, validated attributes, ORM fields | Methods, classmethod, staticmethod, cached_property |
| Can be shadowed per-instance? | No: descriptor always intercepts | Yes: a same-named key in instance.__dict__ shadows it |
| Accidental override risk | Low: attribute assignments go through __set__ | High: a plain assignment writes to the instance dict and silently shadows it |
| Performance overhead | Every read + write goes through Python call | Every read goes through Python call; writes bypass |
| Thread-safe caching possible? | Yes: store in instance dict inside __set__ | Yes, but requires careful double-checked locking |

Key Takeaways

  • Descriptors control attribute access via __get__, __set__, __delete__.
  • Data descriptors win over instance dict; non-data descriptors lose.
  • Use __set_name__ to avoid boilerplate and prevent name-sync bugs.
  • Store per-instance data in instance.__dict__ with a mangled key.
  • property, classmethod, staticmethod are all built-in descriptors.
  • Add __repr__ to descriptors — saves hours of debugging.
  • Thread-safe caching requires double-checked locking or per-instance locks.
  • Multiple inheritance can silently swap descriptors — check MRO.

Common Mistakes to Avoid

  • Storing per-instance data on the descriptor itself
    Symptom: Setting obj_a.attr = 1 also changes obj_b.attr to 1 — all instances share the same value.
    Fix: Always store data in instance.__dict__ using a unique key (e.g., self.storage_key set in __set_name__). Never do self.value = x inside __set__.
  • Forgetting the `if instance is None` guard in `__get__`
    Symptom: Accessing `MyClass.descriptor_attr` raises `AttributeError: 'NoneType' object has no attribute '__dict__'`.
    Fix: Add if instance is None: return self as the first line in __get__. This handles class-level access.
  • Using `cached_property` on a class with `__slots__`
    Symptom: `TypeError: No '__dict__' attribute on 'MyClass' instance to cache 'attr' property.` raised on first access.
    Fix: Either add '__dict__' to __slots__, drop __slots__, or use a custom caching descriptor that stores data in a slot or via the descriptor's own dict with locking.
  • Creating a descriptor that doesn't handle inheritance correctly
    Symptom: A child class unexpectedly inherits the descriptor's state from the parent class, or the descriptor behaves differently depending on which parent is listed first in MRO.
    Fix: Ensure the descriptor's __set_name__ records the owner class so it can differentiate. For MRO-sensitive cases, explicitly redeclare the descriptor on the child class.
  • Omitting `__repr__` on a custom descriptor
    Symptom: During debugging, `print(MyClass.attr)` shows `<__main__.MyDescriptor object at 0x...>` — no indication of which attribute it belongs to.
    Fix: Implement __repr__ to include the attribute name (captured by __set_name__) and the owner class: return f'<{type(self).__name__} for {self.owner.__name__}.{self.name}>'.

Interview Questions on This Topic

  • Q: What is a descriptor in Python, and what are the three methods that define the descriptor protocol? (Junior)
    A descriptor is an object that defines at least one of __get__, __set__, or __delete__. These methods intercept attribute access on instances of the class where the descriptor is assigned as a class attribute. __get__(self, instance, owner) handles reading, __set__(self, instance, value) handles writing, and __delete__(self, instance) handles deletion. The descriptor protocol is the mechanism behind property, classmethod, staticmethod, __slots__, and many ORM field implementations.
  • Q: Explain the difference between a data descriptor and a non-data descriptor. Give an example of when you would choose one over the other. (Mid-level)
    A data descriptor defines __set__ or __delete__, usually alongside __get__. A non-data descriptor only defines __get__. The critical difference is lookup priority: data descriptors always win over instance __dict__, while non-data descriptors lose to instance __dict__. Choose a data descriptor when you need to control attribute assignment and prevent shadowing, for example a validated field that must always run type checking. Choose a non-data descriptor when you want to allow per-instance overrides, for example a method that can be replaced on a specific instance (like mocking a method in tests). @property is a data descriptor because it defines __set__ and __delete__ even when no setter or deleter is supplied, so it always intercepts reads and writes. A plain function (non-data descriptor) can be shadowed by instance.method = lambda x: x.
  • Q: How does __set_name__ improve descriptor design, and what was the workaround before Python 3.6? (Mid-level)
    __set_name__(self, owner, name) is called automatically when the owning class is created, providing the descriptor with the owning class and the attribute name it was assigned to. Before Python 3.6, you had to pass the name explicitly as a constructor argument: class Foo: bar = MyDescriptor('bar'). This was error-prone because renaming the attribute required updating the string argument. __set_name__ eliminates that duplication and reduces bugs. One caveat: the hook only fires for descriptors present in the class body at creation time. If you attach a descriptor to an existing class afterwards (for example with setattr), you must call descriptor.__set_name__(Owner, 'name') yourself. Otherwise a descriptor that derives its storage key in __set_name__ never gets one, and depending on the implementation it either raises or lets several attributes silently overwrite each other.
  • Q: Describe a production scenario where a descriptor storing state on itself caused data corruption, and how you would prevent it. (Senior)
    Consider a descriptor that tracks how many times an attribute is accessed: class Counter: def __get__(self, instance, owner): self.count += 1; return .... If the descriptor is a class attribute, self.count is shared across all instances. When a web server processes requests for different users, the counter will be incremented globally, not per-user. The fix: store the counter in instance.__dict__ using a mangled key like '_counter_' + self.attr_name. This ensures each instance maintains its own count. The prevention rule: never store per-instance mutable state on the descriptor object itself; always use instance.__dict__ with a unique key.
  • Q: How would you implement a caching decorator for a method that is safe to use in a multi-threaded environment, using descriptors? (Senior)
    I'd create a descriptor-based decorator similar to functools.cached_property but with thread safety. The descriptor would use a per-instance lock stored in instance.__dict__ (initialized lazily). On first access via __get__, a double-checked locking pattern is used: check if the cache key exists, if not acquire the lock, check again, then compute and store. By making it a non-data descriptor (no __set__), subsequent accesses use the instance dict directly — no lock overhead. The storage key would be mangled with the attribute name to avoid collisions. This pattern is used internally by many web frameworks for lazy-loaded connection pools or configuration objects.
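For the access-counter question, a per-instance version could be sketched like this. CountedAccess and UserSession are hypothetical names, and the key format follows the '_counter_' + attribute name idea from the answer.

```python
class CountedAccess:
    """Counts reads per instance; all state lives in instance.__dict__."""

    def __set_name__(self, owner, name):
        self.value_key = "_value_" + name
        self.counter_key = "_counter_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        state = instance.__dict__
        # Increment this instance's own counter, never one on the descriptor.
        state[self.counter_key] = state.get(self.counter_key, 0) + 1
        return state[self.value_key]

    def __set__(self, instance, value):
        instance.__dict__[self.value_key] = value


class UserSession:
    token = CountedAccess()

    def __init__(self, token):
        self.token = token


alice = UserSession("a-token")
bob = UserSession("b-token")
for _ in range(3):
    alice.token
bob.token

print(alice.__dict__["_counter_token"], bob.__dict__["_counter_token"])  # 3 1
```

The increment is a read-modify-write and is not atomic across threads, so an exact count under concurrency would still need a lock.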

Frequently Asked Questions

Can I use a descriptor on a class level (not on instances)?

Yes. When you access a descriptor on the class itself (e.g., MyClass.attr), Python calls __get__ with instance=None and owner=MyClass. You must handle this in your __get__ implementation — typically by returning self (the descriptor object) or a class-level value. If you don't, you'll get an AttributeError when trying to access instance.__dict__ on None.

What's the difference between a descriptor and a property?

A property is a specific type of descriptor — it's a built-in class that implements the full descriptor protocol (__get__, __set__, __delete__). You can write your own descriptors to do anything a property does and more: validation, type checking, lazy loading, logging, and cross-attribute coordination. Properties are syntactic sugar for simple getter/setter patterns; descriptors are a general-purpose hook into the attribute system.

Why does cached_property not work with __slots__?

cached_property works by writing the cached value directly into instance.__dict__ under the attribute name. If a class defines __slots__, instances don't have a __dict__ unless explicitly included (__slots__ = ('__dict__',)). Without __dict__, cached_property has nowhere to store the computed value and raises a TypeError. Workaround: either add '__dict__' to __slots__, drop __slots__, or implement a custom caching descriptor that stores the value in a designated slot.

Can a descriptor be used on a function or method?

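Not in the way the question implies. The descriptor protocol only fires for objects found on a class (the type of the object you're accessing), so a descriptor must live as a class attribute. Functions are themselves descriptors: a function's __get__ method is what returns a bound method when it's accessed through an instance. You can assign a descriptor object as an attribute of a function, but its __get__ will not be invoked, because the lookup finds it in the function's own __dict__ rather than on its type. If you want to intercept attribute access on a module-level object, consider a module-level __getattr__ or a proxy class.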

How do I debug why a descriptor isn't being called?

First, check if the attribute is defined on the class or on the instance. Print type(instance).__dict__.get('attr') — if it returns a descriptor object, it's a class attribute. Then check if it's a data or non-data descriptor (presence of __set__ or __delete__). If it's a non-data descriptor and the instance has the same key in __dict__, the instance value wins. Temporarily add a print() inside __get__ to confirm it's being invoked. Also verify that __set_name__ was called (the storage key should not be None).
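Those checks can be wrapped in a small helper. classify_attribute is a hypothetical function written for this illustration, not a standard-library API; it mirrors the lookup order without triggering any descriptor, and it ignores __getattr__ fallbacks.

```python
_MISSING = object()


def classify_attribute(obj, name):
    """Report which lookup step would resolve obj.<name>."""
    class_attr = _MISSING
    for klass in type(obj).__mro__:
        if name in vars(klass):
            class_attr = vars(klass)[name]
            break
    in_instance = name in getattr(obj, "__dict__", {})

    if class_attr is not _MISSING:
        attr_type = type(class_attr)
        # Descriptor methods are looked up on the type, not the object.
        if hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__"):
            return "data descriptor"
    if in_instance:
        return "instance dict"
    if class_attr is _MISSING:
        return "missing"
    if hasattr(type(class_attr), "__get__"):
        return "non-data descriptor"
    return "class attribute"


class Sample:
    limit = 10

    @property
    def size(self):
        return 1

    def run(self):
        return "original"


s = Sample()
s.run = lambda: "patched"  # shadows the method on this instance only
s.extra = 1

print(classify_attribute(s, "size"))        # data descriptor
print(classify_attribute(s, "run"))         # instance dict
print(classify_attribute(Sample(), "run"))  # non-data descriptor
print(classify_attribute(s, "limit"))       # class attribute
print(classify_attribute(s, "extra"))       # instance dict
print(classify_attribute(s, "nope"))        # missing
```

If the helper reports "instance dict" for a name you expected a descriptor to handle, the descriptor is non-data and is being shadowed.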
