Java Stream filter() — Avoid NPE from Nulls
Production NPE in filter() from null element in stream source.
- filter() keeps elements where a Predicate returns true; it's lazy and doesn't run until a terminal op like collect() or count()
- Chain filter() calls for AND, or combine conditions in one predicate with && and ||; chained filter() calls are usually more readable for complex conditions
- filter(Objects::nonNull) is the standard null-safe filter before downstream ops
- Chained filter() calls stay separate pipeline stages but run in a single pass over the source, so the performance penalty from chaining is usually negligible
- filter().findFirst() short-circuits; filter().count() does not — understand the difference for streaming large datasets
- Biggest mistake: forgetting filter() is lazy — debugging filter() without a terminal op shows no filtering
Stream.filter() is the Java streams version of 'keep only the elements that match this condition'. You pass a Predicate — a function that returns true or false for each element — and filter() keeps only the true ones. It's lazy: the filtering doesn't execute until a terminal operation (collect, count, findFirst) is called.
filter() is the building block of data processing in modern Java. The power comes not from filter() alone but from composing it with map(), flatMap(), sorted(), and collect() in a readable declarative pipeline that describes what you want, not how to iterate to get it.
filter() Basics — The Predicate and Lazy Execution
Stream.filter() takes a Predicate<T> — a functional interface with a single test(T) method returning boolean. Every element is tested; only those where test returns true survive. filter() is an intermediate operation: it doesn't execute until a terminal operation like collect(), forEach(), or findFirst() is called.
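This means you can chain multiple intermediate ops (filter, map, sorted) and nothing runs until the terminal op is invoked. Stateless stages such as filter() and map() then process elements in a single pass over the source; a stateful stage like sorted() has to buffer its input first. Adjacent filter() calls are not merged into one predicate, but the per-stage overhead is small and the JIT compiler can often inline it away.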
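To make the lazy behaviour concrete, here is a minimal sketch. The class name LazyFilterDemo, the method evensOf, and the counter are illustrative assumptions, not part of any library; the counter inside the predicate exists only to observe when the predicate runs and is not a pattern to copy into production code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilterDemo {

    // Builds a pipeline that keeps even numbers.
    // The counter (demo only) records how often the predicate is invoked.
    static Stream<Integer> evensOf(List<Integer> source, AtomicInteger calls) {
        return source.stream().filter(n -> {
            calls.incrementAndGet();
            return n % 2 == 0;
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        // Only the pipeline is built here; the predicate has not run yet.
        Stream<Integer> pipeline = evensOf(List.of(1, 2, 3, 4), calls);
        System.out.println("calls before terminal op: " + calls.get()); // 0

        // collect() is the terminal operation that triggers the filtering.
        List<Integer> evens = pipeline.collect(Collectors.toList());
        System.out.println("evens: " + evens);                          // [2, 4]
        System.out.println("calls after terminal op: " + calls.get());  // 4
    }
}
```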
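- Calling filter() and not seeing output is a common source of confusion. Remember, nothing executes until a terminal operation is called.
- The predicate passed to filter() never throws on its own; an exception surfaces only when a terminal op triggers the pipeline.
- Without a terminal operation, filter() does nothing.
Multiple Conditions: Chaining vs. Single Predicate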
You can filter on multiple conditions either by chaining multiple filter() calls or by combining conditions with && inside one filter(). Both produce the same result, but readability and performance differ slightly.
Multiple filter() calls are often more readable because each expresses a single concern. They remain separate pipeline stages, but every element still flows through them in one pass, so the cost is close to that of a single combined predicate. Ordering matters when one condition is much more selective or much cheaper than the others: apply it first to reduce the number of elements reaching the subsequent filters.
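The two styles can be compared side by side. This is a sketch under stated assumptions: the Order class, its fields, and the threshold of 100.0 are hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class ChainedVsCombined {

    // Hypothetical domain type, used only for this example.
    static final class Order {
        final String status;
        final double total;

        Order(String status, double total) {
            this.status = status;
            this.total = total;
        }
    }

    // Style 1: one concern per filter() call.
    static List<Order> chained(List<Order> orders) {
        return orders.stream()
                .filter(o -> "PAID".equals(o.status))
                .filter(o -> o.total > 100.0)
                .collect(Collectors.toList());
    }

    // Style 2: the same conditions combined with && in a single predicate.
    static List<Order> combined(List<Order> orders) {
        return orders.stream()
                .filter(o -> "PAID".equals(o.status) && o.total > 100.0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("PAID", 250.0),
                new Order("PAID", 40.0),
                new Order("CANCELLED", 900.0));

        // Both styles select the same single order.
        System.out.println(chained(orders).size());   // 1
        System.out.println(combined(orders).size());  // 1
    }
}
```

Either form returns the same elements in the same order, so the choice is mostly about which reads better for the conditions at hand.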
- Multiple filter() calls vs. a single && predicate: performance is nearly identical because the JIT compiler can inline both.
- Use multiple filter() calls when conditions are conceptually separate; it improves readability and makes unit testing each condition easier.
- Chained filter() calls run in the same single pass over the data; readability wins.
- Multiple filter() calls for readability and easier debugging
- Single filter() with && for conciseness
Predicate Composition: and(), or(), negate()
Java 8 introduced Predicate methods for composition: and(), or(), and negate(). These let you build complex filters without deep nesting of && and ||, especially when predicates are stored as variables or reused.
Use Predicate.not() (Java 11+) for negation. Combine predicates with .and() and .or() for more expressive pipeline construction.
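A short sketch of composition with reusable predicates follows. The class name, the three predicate constants, and the sample words are assumptions made for illustration; Predicate.not() and String.isBlank() require Java 11 or later.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateComposition {

    // Reusable predicates stored as constants (names are illustrative).
    static final Predicate<String> STARTS_WITH_J = s -> s.startsWith("J");
    static final Predicate<String> IS_LONG = s -> s.length() > 5;
    static final Predicate<String> ENDS_WITH_S = s -> s.endsWith("s");

    static List<String> select(List<String> words) {
        // Evaluated left to right: (startsWithJ OR isLong) AND NOT endsWithS.
        Predicate<String> wanted = STARTS_WITH_J.or(IS_LONG).and(ENDS_WITH_S.negate());

        return words.stream()
                // Predicate.not() negates a method reference without a lambda.
                .filter(Predicate.not(String::isBlank))
                .filter(wanted)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("Java", "", "Kotlin", "streams", "JS", "  ");
        System.out.println(select(words)); // [Java, Kotlin, JS]
    }
}
```

Note that chained default methods do not follow operator precedence: a.or(b).and(c) means (a || b) && c, which is worth keeping in mind when translating an existing && / || expression.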
- Use Predicate.and(), .or(), and .negate() to compose filters.
Null-Safe Filtering and Optional Handling
Null values in a stream source are a common source of NPE in filter() predicates. The safest pattern is to filter out nulls with filter(Objects::nonNull) immediately after the stream creation. If you would rather not add a separate stage, put the null check inline in the predicate: p -> p != null && condition. Note that this form also drops nulls; in the rare case where nulls must be kept, use p -> p == null || condition instead.
For scenarios where you have a stream of Optionals (e.g., map() returning Optional), use flatMap(Optional::stream) (Java 9+) to unwrap and filter out empties in one step.
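Both patterns can be sketched in a few lines. The class and method names here are hypothetical, and parsePositive is an invented helper that stands in for any mapping step returning Optional; flatMap(Optional::stream) requires Java 9 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

class NullSafeFiltering {

    // Drops nulls first, so the second predicate can safely call length().
    static List<String> longNames(List<String> names) {
        return names.stream()
                .filter(Objects::nonNull)
                .filter(n -> n.length() > 3)
                .collect(Collectors.toList());
    }

    // Hypothetical helper: returns a value only for positive integers.
    static Optional<Integer> parsePositive(String raw) {
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? Optional.of(value) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Unwraps present Optionals and discards empty ones in one step.
    static List<Integer> positives(List<String> raw) {
        return raw.stream()
                .filter(Objects::nonNull)
                .map(NullSafeFiltering::parsePositive)
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Arrays.asList is used because List.of rejects null elements.
        System.out.println(longNames(Arrays.asList("Ada", null, "Grace", "Linus"))); // [Grace, Linus]
        System.out.println(positives(Arrays.asList("7", null, "x", "-2", " 10 "))); // [7, 10]
    }
}
```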
Performance: Short-Circuiting, Ordering, and Large Datasets
filter() performance depends on the predicate cost, the number of elements, and whether the pipeline ends in a short-circuiting operation such as findFirst(), anyMatch(), or limit(). These stop pulling elements as soon as the result is known (a match is found, or the requested number of elements has been produced), so they may process only a fraction of the stream.
For large datasets (millions of elements), the cost of the predicate dominates. Putting an inexpensive, selective predicate first can drastically reduce the number of elements evaluated by downstream filters. Use parallelStream() only if the predicate is stateless, the stream source is large, and the operation is CPU-intensive — otherwise parallel overhead outweighs the benefit.
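The difference between a short-circuiting and a full traversal can be measured by counting predicate evaluations on a sequential stream. This is a sketch, with class and method names chosen for illustration; the counter inside the predicate is for measurement only.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class ShortCircuitDemo {

    // Number of predicate evaluations when anyMatch() looks for target in 1..size.
    static int evaluationsWithAnyMatch(int size, int target) {
        AtomicInteger calls = new AtomicInteger();
        IntStream.rangeClosed(1, size).anyMatch(n -> {
            calls.incrementAndGet();
            return n == target;
        });
        return calls.get();
    }

    // Number of predicate evaluations when filter().count() is used instead.
    static int evaluationsWithCount(int size, int target) {
        AtomicInteger calls = new AtomicInteger();
        IntStream.rangeClosed(1, size)
                .filter(n -> {
                    calls.incrementAndGet();
                    return n == target;
                })
                .count();
        return calls.get();
    }

    public static void main(String[] args) {
        // anyMatch() stops at the first match; count() must test every element.
        System.out.println(evaluationsWithAnyMatch(1_000, 3)); // 3
        System.out.println(evaluationsWithCount(1_000, 3));    // 1000
    }
}
```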
- filter().count() > 0 processes all elements. Use anyMatch() instead of filter().findFirst().isPresent(); it's more idiomatic and signals intent.
- Use limit() and skip() for pagination, but be aware that skip() on unordered streams may not be deterministic; use order-dependent pipelines with caution.
NullPointerException in filter() predicate because upstream elements contained nulls
filter() passes every element to the predicate, including nulls. If the predicate invokes a method on the element, an NPE occurs.
Add filter(Objects::nonNull) before any filter() that accesses element methods. Also, enforce a non-null contract at the data layer and validate object construction.
- Always assume a stream source can contain nulls until proven otherwise; filter(Objects::nonNull) at the start of the pipeline is cheap insurance.
- Even if the database column is NOT NULL, object mapping or deserialization can introduce nulls. Validate at the earliest point.
- Without a terminal operation, filter() never executes. Add a .count() or .collect() at the end.
- Use peek() or debug logging to see the elements.
- Use Predicate.negate() to verify logic.
- Add collect(), forEach(), or count() to trigger lazy evaluation
Key takeaways
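- filter() is lazy: nothing runs until a terminal operation such as collect() or count() is called.
- Chain multiple filter() calls or combine conditions with && and ||; multiple filter() calls are more readable for complex conditions.
- Prefer anyMatch() over filter().count() > 0.
- Composing predicates with and(), or(), negate() promotes reuse
Common mistakes to avoid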
Calling filter() on a stream with potential null elements without filtering nulls first
- NullPointerException in the filter() predicate when trying to access methods on null elements; production incidents at 3 AM.
- Add filter(Objects::nonNull) right after stream() if the source might contain nulls. Alternatively, use an inline null check: p -> p != null && condition.
Forgetting filter() is lazy: debugging filter() without a terminal operation
- filter() produces no output; the developer thinks the condition is wrong, but the pipeline never executed.
- Add collect(), forEach(), or count() to trigger execution. Use .peek() for debugging, which is also lazy until a terminal op runs.
Mutating external state inside filter() predicate
- Keep filter() predicates stateless and non-interfering. If you need to collect additional information, use collect() with a custom collector instead of side-effecting in filter().
Using filter().findFirst() when anyMatch() suffices
- filter().findFirst().isPresent() works but is verbose and adds the overhead of creating an Optional. Also, findFirst() returns the first element even if you only care about existence.
Assuming filter() on a parallel stream is faster without checking predicate cost
Interview Questions on This Topic
What is the difference between Stream.filter() and Stream.anyMatch()?
Use filter() when you want a filtered collection; use anyMatch() when you only need to know if at least one element matches.
Frequently Asked Questions