Java Stream filter() — Avoid NPE from Nulls
Production NPE in filter() from null element in stream source.
- filter() keeps elements where a Predicate returns true; it's lazy and doesn't run until a terminal op like collect() or count()
- Chain multiple filter() calls, or combine conditions with && (AND) and || (OR) inside one predicate; separate filter() calls are often easier to read when each condition is a distinct concern
- filter(Objects::nonNull) is the standard null-safe filter before downstream ops
- Chained filter() calls run in the same single pass over the source; each call adds a pipeline stage, but the extra cost is usually negligible
- filter().findFirst() short-circuits; filter().count() does not — understand the difference for streaming large datasets
- Biggest mistake: forgetting filter() is lazy — debugging filter() without a terminal op shows no filtering
filter() is the building block of data processing in modern Java. The power comes not from filter() alone but from composing it with map(), flatMap(), sorted(), and collect() in a readable declarative pipeline that describes what you want, not how to iterate to get it.
filter() Basics — The Predicate and Lazy Execution
Stream.filter() takes a Predicate<T> — a functional interface with a single test(T) method returning boolean. Every element is tested; only those where test returns true survive. filter() is an intermediate operation: it doesn't execute until a terminal operation like collect(), forEach(), or findFirst() is called.
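This means you can chain multiple intermediate operations (filter, map) and each element flows through all of them in a single pass once the terminal operation is invoked. A stateful operation such as sorted() is the exception: it must buffer every element before passing any downstream. In the OpenJDK implementation, adjacent filter() calls stay separate pipeline stages rather than being merged into one predicate, but the per-element cost of the extra stage is normally small.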
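To make the laziness described above visible, here is a minimal sketch that counts predicate invocations before and after the terminal operation. The class name `LazyFilterDemo` and the counter are illustrative assumptions; the counter is a side effect used only for instrumentation, not something to copy into production predicates.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilterDemo {

    // Returns {predicate calls before the terminal op, calls after it, elements kept}.
    static int[] run() {
        AtomicInteger calls = new AtomicInteger(); // counts predicate invocations (demo only)

        // Building the pipeline does not test any element yet.
        Stream<Integer> pending = Stream.of(1, 2, 3, 4, 5, 6)
                .filter(n -> {
                    calls.incrementAndGet();
                    return n % 2 == 0;
                });
        int before = calls.get();

        // The terminal operation pulls every element through the predicate.
        List<Integer> evens = pending.collect(Collectors.toList());
        int after = calls.get();

        return new int[] {before, after, evens.size()};
    }

    public static void main(String[] args) {
        int[] r = run();
        // prints: before=0 after=6 kept=3
        System.out.println("before=" + r[0] + " after=" + r[1] + " kept=" + r[2]);
    }
}
```

The predicate is never called while the pipeline is only being assembled; all six calls happen inside collect().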
Multiple Conditions: Chaining vs. Single Predicate
You can filter on multiple conditions either by chaining multiple filter() calls or by combining conditions with && inside one filter(). Both produce the same result, but readability and performance differ slightly.
Multiple filter() calls are often more readable because each expresses a single concern. They still run in one pass over the data, and the cost of the extra pipeline stage is usually too small to measure. Ordering can matter more: put the cheapest, most selective condition first so that fewer elements reach the later, more expensive checks.
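The two styles can be compared side by side. This is a sketch under assumptions: the `Payment` record (Java 16+) and its `id`, `completed`, and `amount` components are hypothetical, chosen only to give the predicates something to test.

```java
import java.util.List;
import java.util.stream.Collectors;

class PaymentFilters {

    // Hypothetical payment type used only for this illustration.
    record Payment(String id, boolean completed, double amount) {}

    // One filter() per concern.
    static List<String> chained(List<Payment> payments) {
        return payments.stream()
                .filter(Payment::completed)
                .filter(p -> p.amount() > 100)
                .map(Payment::id)
                .collect(Collectors.toList());
    }

    // The same two conditions combined with && in a single predicate.
    static List<String> combined(List<Payment> payments) {
        return payments.stream()
                .filter(p -> p.completed() && p.amount() > 100)
                .map(Payment::id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Payment> payments = List.of(
                new Payment("a", true, 250.0),
                new Payment("b", false, 900.0),
                new Payment("c", true, 40.0),
                new Payment("d", true, 100.5));
        System.out.println(chained(payments));  // [a, d]
        System.out.println(combined(payments)); // [a, d]
    }
}
```

Both methods return the same ids in the same order, so the choice between them is mostly about readability.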
Predicate Composition: and(), or(), negate()
Java 8 introduced Predicate methods for composition: and(), or(), and negate(). These let you build complex filters without deep nesting of && and ||, especially when predicates are stored as variables or reused.
Use Predicate.not() (Java 11+) for negation. Combine predicates with .and() and .or() for more expressive pipeline construction.
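As a rough illustration of composing stored predicates, the sketch below builds one filter from three reusable pieces. The constant names and the sample conditions are assumptions made for the example; `String::isBlank` and `Predicate.not` both need Java 11 or later.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateComposition {

    // Reusable predicates stored as constants (names are illustrative).
    static final Predicate<String> IS_BLANK = String::isBlank;
    static final Predicate<String> IS_SHORT = s -> s.length() <= 3;
    static final Predicate<String> STARTS_WITH_J = s -> s.startsWith("J");

    // Keeps words that are non-blank AND (short OR starting with "J").
    static List<String> select(List<String> words) {
        Predicate<String> wanted = Predicate.not(IS_BLANK)
                .and(IS_SHORT.or(STARTS_WITH_J));
        return words.stream().filter(wanted).collect(Collectors.toList());
    }

    // negate() is the Java 8 instance-method form of the same inversion.
    static List<String> longWords(List<String> words) {
        return words.stream().filter(IS_SHORT.negate()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("Java", " ", "go", "Kotlin", "JVM", "");
        System.out.println(select(words));                               // [Java, go, JVM]
        System.out.println(longWords(List.of("Java", "go", "Kotlin")));  // [Java, Kotlin]
    }
}
```

Naming each predicate keeps the combined condition readable and lets the same pieces be reused in other pipelines.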
Null-Safe Filtering and Optional Handling
Null values in a stream source are a common cause of NullPointerException in filter() predicates. The safest pattern is to remove them with filter(Objects::nonNull) immediately after creating the stream. If you prefer a single predicate, put the null check inline: p -> p != null && condition, which also drops nulls. In the rare case where nulls must pass through the filter, use p -> p == null || condition instead.
For scenarios where you have a stream of Optionals (e.g., map() returning Optional), use flatMap(Optional::stream) (Java 9+) to unwrap and filter out empties in one step.
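Both patterns can be sketched in a few lines. The helper `parse` below is a hypothetical stand-in for any mapping step that returns an Optional; the class and method names are assumptions for illustration. `Arrays.asList` is used for the sample data because `List.of` rejects null elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

class NullSafeFiltering {

    // Drops nulls first, so the later predicate never sees one.
    static List<String> nonNullLongNames(List<String> names) {
        return names.stream()
                .filter(Objects::nonNull)
                .filter(s -> s.length() > 3)
                .collect(Collectors.toList());
    }

    // Hypothetical mapping step: Optional.empty() for input that is not an integer.
    static Optional<Integer> parse(String s) {
        try {
            return Optional.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // flatMap(Optional::stream) (Java 9+) unwraps present values and drops empties.
    static List<Integer> parseAll(List<String> raw) {
        return raw.stream()
                .filter(Objects::nonNull)
                .map(NullSafeFiltering::parse)
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(nonNullLongNames(Arrays.asList("Alice", null, "Bob", "Carol"))); // [Alice, Carol]
        System.out.println(parseAll(Arrays.asList("1", null, "x", " 42 ")));                // [1, 42]
    }
}
```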
Performance: Short-Circuiting, Ordering, and Large Datasets
filter() performance depends on the predicate cost, the number of elements, and whether the pipeline ends in a short-circuiting operation. findFirst() and anyMatch() stop as soon as a match is found, and limit(n) stops once n elements have passed the filter, so each may process only a fraction of the stream.
For large datasets (millions of elements), the cost of the predicate dominates. Putting an inexpensive, selective predicate first can drastically reduce the number of elements evaluated by downstream filters. Use parallelStream() only if the predicate is stateless, the stream source is large, and the operation is CPU-intensive — otherwise parallel overhead outweighs the benefit.
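The effect of short-circuiting can be measured by counting predicate calls on a sequential stream. This is a sketch with assumed names (`ShortCircuitDemo`, `SIZE`); the counters exist only for instrumentation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class ShortCircuitDemo {

    static final int SIZE = 1_000_000;

    // Predicate calls made when the pipeline ends in findFirst().
    static int callsWithFindFirst() {
        AtomicInteger calls = new AtomicInteger(); // instrumentation only
        IntStream.rangeClosed(1, SIZE)
                .filter(n -> { calls.incrementAndGet(); return n % 1000 == 0; })
                .findFirst(); // stops at the first match (1000)
        return calls.get();
    }

    // Predicate calls made when the pipeline ends in count().
    static int callsWithCount() {
        AtomicInteger calls = new AtomicInteger(); // instrumentation only
        IntStream.rangeClosed(1, SIZE)
                .filter(n -> { calls.incrementAndGet(); return n % 1000 == 0; })
                .count(); // must test every element
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("findFirst: " + callsWithFindFirst()); // findFirst: 1000
        System.out.println("count:     " + callsWithCount());     // count:     1000000
    }
}
```

With the same source and predicate, findFirst() tests a thousand elements while count() tests all one million.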
| Operation | Method | Example |
|---|---|---|
| Keep matching elements | filter(predicate) | .filter(p -> p.amount() > 100) |
| Negate condition | filter(Predicate.not(predicate)) or filter(predicate.negate()) | .filter(Predicate.not(String::isEmpty)) |
| Null-safe filter | filter(Objects::nonNull) | .filter(Objects::nonNull) |
| Multiple conditions AND | filter with && | .filter(p -> a && b) |
| Multiple conditions OR | filter with \|\| | .filter(p -> a \|\| b) |
| Combine predicates | Predicate.and/or() | p1.and(p2) |
Key Takeaways
- filter() keeps only elements where the predicate returns true; it's lazy and doesn't execute until a terminal operation like collect() or count() is called.
- Chain multiple filter() calls or combine conditions with && and ||; multiple filter() calls are more readable for complex conditions.
- filter(Objects::nonNull) is the standard idiom for removing nulls from a stream before further processing.
- For boolean checks (does any element match? do all elements match?), use anyMatch(), allMatch(), or noneMatch() instead of filter().count() > 0.
- Short-circuiting operations (findFirst, limit) can dramatically reduce processing for large datasets when you only need a subset.
- Predicate composition via and(), or(), negate() promotes reuse, but keep predicates stateless for parallel safety.
Common Mistakes to Avoid
- Calling filter() on a stream with potential null elements without filtering nulls first
Symptom: NullPointerException inside the filter() predicate when it calls a method on a null element, typically surfacing as a production incident at 3 AM.
Fix: Add filter(Objects::nonNull) as the first operation after stream() if the source might contain nulls. Alternatively, use an inline null check: p -> p != null && condition.
- Forgetting filter() is lazy and debugging it without a terminal operation
Symptom: System.out.println inside filter() produces no output; the developer thinks the condition is wrong, but the pipeline never executed.
Fix: Add a terminal operation like collect(), forEach(), or count() to trigger execution. peek() is handy for debugging, but it is also lazy and prints nothing until a terminal operation runs.
- Mutating external state inside a filter() predicate
Symptom: Inconsistent results when switching to parallelStream(); shared mutable state causes race conditions and the output varies per run.
Fix: Keep filter() predicates stateless and non-interfering. If you need to collect additional information, use collect() with a suitable collector instead of side-effecting in filter().
- Using filter().findFirst() when anyMatch() suffices
Symptom: filter().findFirst().isPresent() works but is verbose and allocates an Optional that is never used; findFirst() hands back an element even though only its existence matters.
Fix: Use anyMatch(predicate) directly. It is cleaner and short-circuits as soon as it finds a match, without needing to extract an element.
- Assuming filter() on a parallel stream is faster without checking predicate cost
Symptom: The parallel stream performs worse than the sequential one on small datasets or when the predicate is cheap (e.g., a simple integer comparison).
Fix: Only use parallelStream() when the dataset is large (tens of thousands of elements or more) and the predicate is CPU-intensive (e.g., complex regex matching). A blocking call such as a database query is I/O-bound rather than CPU-bound and ties up the shared common pool threads. For most cases, sequential is fine.
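For the third mistake above, one side-effect-free option is a built-in collector. The sketch below assumes the "additional information" wanted is the set of rejected elements; `Collectors.partitioningBy` then returns both groups from a single pass, and the same code is safe on a parallel stream. The class and method names are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SideEffectFreeSplit {

    // Splits values into accepted (key true) and rejected (key false)
    // without writing to any shared list from inside a predicate.
    static Map<Boolean, List<Integer>> splitByThreshold(List<Integer> values, int threshold) {
        return values.parallelStream()
                .collect(Collectors.partitioningBy(v -> v > threshold));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = splitByThreshold(List.of(5, 120, 80, 300), 100);
        System.out.println(parts.get(true));  // [120, 300]
        System.out.println(parts.get(false)); // [5, 80]
    }
}
```

Because the predicate only reads its argument, the result is the same whether the stream runs sequentially or in parallel.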
Interview Questions on This Topic
- Q: What is the difference between Stream.filter() and Stream.anyMatch()? (Junior)
- Q: How would you filter a List<Payment> to only include completed payments over £100 using Java streams? (Junior)
- Q: Is Stream.filter() eager or lazy? What are the implications? (Mid-level)
- Q: How do multiple filter() calls affect performance compared to a single filter with &&? (Mid-level)
- Q: Explain a production incident where filter() caused a failure and how you fixed it. (Senior)
Frequently Asked Questions
How do I filter a list in Java 8+?
Use Stream.filter(): list.stream().filter(element -> condition).collect(Collectors.toList()). The filter() method takes a Predicate (a function returning boolean) and keeps only elements where the predicate returns true.
How do I filter null values from a Java stream?
Use filter(Objects::nonNull): stream.filter(Objects::nonNull).collect(Collectors.toList()). This removes all null elements before downstream operations, preventing NullPointerException in subsequent map() or other operations.
Can I combine multiple filter conditions in one stream?
Yes, you can either chain multiple .filter() calls or use && inside a single predicate: .filter(p -> condition1 && condition2). Both work; use multiple filters when conditions are logically separate for better readability.
What's the difference between filter().findFirst() and anyMatch()?
filter().findFirst() returns an Optional containing the first element matching the condition. anyMatch() returns a boolean — it's more concise and signals the intent of 'does any element match?' without extracting an element. Use anyMatch() for existence checks.
Does filter() support parallel streams?
Yes, filter() works with parallel streams. However, the predicate must be stateless and non-interfering. Also, parallel performance gains are only realised with large datasets and expensive predicates. Always measure before committing to parallel.