Ansible Handlers: Production Traps & flush_handlers to Fix Them
Master Ansible handlers and notifications: notify, listen, flush_handlers, ordering, and the fail-trap.
20+ years shipping production infrastructure and CI/CD at scale. Notes here come from systems that actually shipped.
Handlers run only at the end of a play, not immediately after the task that notifies them.
Use notify on a task to trigger one or more handlers by name.
Use listen on a handler to allow multiple tasks to notify it via a common topic.
Handlers are skipped if a task fails in the play (unless ignore_errors: yes or force_handlers: true is set).
Use meta: flush_handlers to force all pending handlers to run immediately, e.g., before a task that depends on the change.
Handler ordering is the order they are defined in the handlers block, not the order of notifications.
Idempotency: handlers only run if the notifying task reports 'changed' status.
Use changed_when: false on tasks that should never trigger handlers (e.g., checks).
Think of Ansible handlers as the 'cleanup crew' at a party. The tasks are the guests who make messes (change files, restart services). When a guest makes a mess, they send a notification (notify) to the cleanup crew. But the crew doesn't act immediately—they wait until the party ends (end of the play). This is efficient because if multiple guests make a mess, the crew only cleans once. However, there's a trap: if a guest gets into a fight (task fails) before the party ends, the cleanup crew might never show up! In production, this means a configuration change might trigger a restart, but if a later task fails, the restart never happens, leaving the system in a half-baked state. That's where flush_handlers comes in—it's like telling the crew to clean up right now, before the next guest arrives.
I'll never forget the 3 AM outage. We had an Ansible playbook that deployed a new version of our API service. It updated the config file, notified a handler to restart nginx, then ran a database migration. The migration failed—and nginx never restarted. The old config was still active, but the database schema had changed. Users got 500 errors for 45 minutes while we scrambled. The root cause? The handler never ran because the play failed before the playbook ended. That's when I truly understood Ansible handlers—and their dangerous default behavior.
Handlers have been part of Ansible since its early releases, and they solve a common problem: how to restart a service only when its configuration actually changes. Instead of always restarting (which causes downtime), you notify a handler and it runs once at the end. It's elegant, but the 'run at end' behavior has burned many teams.
In this article, I'll cover everything you need to know about handlers: the notify keyword, the handlers block, the listen keyword for grouping, meta: flush_handlers for immediate execution, handler ordering, the infamous trap of handlers not running on failure, and idempotency best practices. Every example is production-tested.
The notify Keyword: How to Trigger a Handler
The notify keyword is used on a task to indicate that one or more handlers should be triggered if the task reports a 'changed' status. The value can be a single handler name or a list of handler names.
```yaml
- name: Update nginx config
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx
```
If the template changes the file, the 'restart nginx' handler is added to the notification queue. If the file is already up-to-date, the task reports 'ok' and the handler is not notified.
You can notify multiple handlers:
```yaml
- name: Update application config
  ansible.builtin.copy:
    src: app.conf
    dest: /opt/app/app.conf
  notify:
    - restart app
    - reload nginx
```
Gotcha: If the task uses changed_when to override the changed status, handlers are notified only when that expression evaluates to true. For example:
```yaml
- name: Check if reboot required
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: reboot_file
  changed_when: reboot_file.stat.exists
  notify: reboot server
```
In production, I've seen teams accidentally set changed_when: false on tasks that should trigger handlers, completely breaking the notification chain.
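A minimal sketch of that mistake, reusing the nginx template task from above and assuming a 'restart nginx' handler is defined in the play. Because changed_when: false forces the result to 'ok', the notify line can never fire, even when the file on disk is rewritten.

```yaml
# Anti-pattern: changed_when: false suppresses the 'changed' status,
# so the handler is never notified even when the template rewrites the file.
- name: Update nginx config
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  changed_when: false   # remove this line to restore notifications
  notify: restart nginx
```

Reserve changed_when: false for read-only checks that carry no notify at all.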
The handlers Block: Defining the Actions
Handlers are defined in a handlers block at the play level, or in a role's handlers/main.yml. Each handler is essentially a task with a name and a module action.
```yaml
handlers:
  - name: restart nginx
    ansible.builtin.service:
      name: nginx
      state: restarted
```
Handlers can use any module, but common ones are service, systemd, command, or uri. They can also include variables and conditionals.
Order matters: Handlers run in the order they are defined in the handlers block, not the order they are notified. For example:
```yaml
handlers:
  - name: restart app
    ansible.builtin.service:
      name: app
      state: restarted
  - name: restart nginx
    ansible.builtin.service:
      name: nginx
      state: restarted
```
Even if 'restart nginx' is notified first, 'restart app' will run first because it's defined first.
Gotcha: If you use include_tasks or import_tasks for handlers, the order is determined by the order of inclusion. This can be surprising if you include multiple handler files.
The listen Keyword: Grouping Multiple Triggers
The listen keyword allows a handler to be triggered by multiple tasks via a common topic. Instead of notifying a handler by name, tasks notify a 'listen' topic, and any handler that listens to that topic runs.
```yaml
handlers:
  - name: restart nginx
    listen: "restart web services"
    ansible.builtin.service:
      name: nginx
      state: restarted
  - name: restart apache
    listen: "restart web services"
    ansible.builtin.service:
      name: apache2
      state: restarted
```
Tasks can notify the topic:
```yaml
- name: Update nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: "restart web services"
```
This is useful when multiple changes require the same set of actions. It also decouples the task from the exact handler name.
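Gotcha: Keep listen topics as literal strings. The Ansible documentation notes that handler names can contain a template but listen topics cannot, so a variable in a topic is not a safe way to route notifications. Also, if two handlers listen to the same topic, they both run (in definition order).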
Use listen to group related handlers, but be careful with naming collisions: treat listen topics like a shared namespace.
meta: flush_handlers for Immediate Execution
By default, handlers run at the end of the play. But sometimes you need a handler to run immediately before a subsequent task. The meta: flush_handlers task forces all pending handlers to run right away.
```yaml
- name: Update config
  template:
    src: app.conf.j2
    dest: /opt/app/app.conf
  notify: restart app

- name: Flush handlers
  meta: flush_handlers

- name: Wait for app to be ready
  uri:
    url: http://localhost:8080/health
    status_code: 200
  register: result   # needed so the until condition can read the status
  until: result.status == 200
  retries: 10
  delay: 2
```
Without flush_handlers, the restart would not happen until the end of the play, so the health check would run against the old process and tell you nothing about the new config.
Gotcha: flush_handlers only flushes handlers that have been notified up to that point. If a later task notifies a handler, it won't run until the next flush or end of play.
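The sketch below illustrates that cutoff. It assumes 'restart app' and 'restart nginx' handlers are defined in the play; the second task is a hypothetical addition for illustration.

```yaml
tasks:
  - name: Update app config
    ansible.builtin.template:
      src: app.conf.j2
      dest: /opt/app/app.conf
    notify: restart app

  # Runs 'restart app' now, if the task above reported 'changed'.
  - name: Flush handlers notified so far
    ansible.builtin.meta: flush_handlers

  # Notified after the flush, so 'restart nginx' waits for the
  # next flush or the end of the play.
  - name: Update nginx config
    ansible.builtin.template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: restart nginx
```

If the nginx restart must also take effect before a later task, add a second flush_handlers task after it.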
Production pattern: Use flush_handlers before any task that depends on the change being active, like health checks, smoke tests, or database migrations.
Use meta: flush_handlers whenever you need a handler's effect to be visible before the next task.
Handler Ordering: Why Definition Order Matters
Handlers execute in the order they are defined in the handlers block, not the order they are notified. This is a common source of confusion.
Example:
```yaml
handlers:
  - name: restart database
    service:
      name: postgresql
      state: restarted
  - name: restart app
    service:
      name: app
      state: restarted
```
Even if 'restart app' is notified first, 'restart database' runs first because it's defined first.
Why this matters: If your app depends on the database, you want the database handler to run first. So define it first.
Gotcha: When using include_tasks or import_tasks for handlers, the order is determined by the order of inclusion. This can be tricky if you include multiple files from different roles.
Best practice: Define all handlers in a single file with explicit ordering, or use listen topics to group them logically.
The Common Trap: Handlers Not Running When Tasks Fail
This is the most dangerous handler behavior: if any task in the play fails on a host (and ignore_errors is not set), all pending handlers for that host are skipped at the end of the play. This is by design to avoid applying changes from an incomplete run.
Example:
```yaml
- hosts: all
  tasks:
    - name: Update config
      template:
        src: config.j2
        dest: /etc/app/config
      notify: restart app
    - name: Run migration
      command: /opt/app/migrate.sh  # This fails
    - name: Clean up
      file:
        path: /tmp/old_config
        state: absent
  handlers:
    - name: restart app
      service:
        name: app
        state: restarted
```
If the migration fails, the 'restart app' handler never runs. The config file is updated, but the app is still running the old version. This can lead to inconsistent state.
Solutions:
1. Use force_handlers: true at the play level to force handlers to run even on failure.
2. Use meta: flush_handlers before the risky task to ensure the handler runs before the potential failure.
3. Use ignore_errors: yes on tasks that shouldn't block handlers (but be careful).
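As a sketch of the flush_handlers approach, here is the same play with a flush placed before the migration. The task name for the flush is illustrative, and whether the app should restart before the migration runs depends on your application.

```yaml
- hosts: all
  tasks:
    - name: Update config
      ansible.builtin.template:
        src: config.j2
        dest: /etc/app/config
      notify: restart app

    # The restart happens here, before the migration can fail.
    - name: Apply pending restarts before the risky step
      ansible.builtin.meta: flush_handlers

    - name: Run migration
      ansible.builtin.command: /opt/app/migrate.sh

  handlers:
    - name: restart app
      ansible.builtin.service:
        name: app
        state: restarted
```

If the migration fails now, the play still reports a failure, but the app is already running with the config that was written.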
Set force_handlers: true on plays that manage services to ensure handlers run even on failure.
Idempotency and Handlers: Making Sure They Only Run When Needed
Handlers are inherently idempotent because they only run when the notifying task reports 'changed'. But you can break idempotency if your tasks always report 'changed'.
Common causes:
- Using command or shell modules without creates, removes, or changed_when.
- Using copy with force: yes when the source file changes every time (e.g., a build artifact).
- Using template with a source that changes every run (e.g., it includes a timestamp).
Fixes:
- Use changed_when to explicitly define when a task should be considered changed.
- For command tasks, use creates or removes to avoid running unnecessarily.
- Use register and conditionals to only notify handlers when actual changes occur.
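A rough sketch of the first two fixes, assuming a 'restart app' handler exists. The init-db.sh script, the marker file path, and the 'Applied' output string are hypothetical; substitute whatever your own commands create or print.

```yaml
# 'creates' skips the command entirely once the marker file exists,
# so the task reports 'ok' and the handler is not notified again.
- name: Initialize application database
  ansible.builtin.command:
    cmd: /opt/app/init-db.sh              # hypothetical script
    creates: /opt/app/.db_initialized     # hypothetical marker file
  notify: restart app

# changed_when derives 'changed' from the command's own output.
- name: Apply schema migrations
  ansible.builtin.command: /opt/app/migrate.sh
  register: migrate_result
  changed_when: "'Applied' in migrate_result.stdout"   # assumed output text
  notify: restart app
```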
Example of a properly idempotent task:
```yaml
- name: Download new artifact
  get_url:
    url: "{{ artifact_url }}"
    dest: /opt/app/app.jar
    checksum: "sha256:{{ artifact_checksum }}"
  register: download
  notify: restart app
```
If the checksum matches, the file is not downloaded, and the handler is not notified.
Use changed_when, creates, and checksums to ensure tasks only report 'changed' when something actually changes.
Using force_handlers to Guarantee Handler Execution
The force_handlers play-level directive ensures that all notified handlers run even if a task fails. This is critical for plays that must apply changes regardless of subsequent failures.
```yaml
- hosts: all
  force_handlers: true
  tasks:
    - name: Update config
      template:
        src: config.j2
        dest: /etc/app/config
      notify: restart app
    - name: Risky migration
      command: /opt/app/migrate.sh  # This might fail
  handlers:
    - name: restart app
      service:
        name: app
        state: restarted
```
With force_handlers: true, even if the migration fails, the 'restart app' handler will run. This prevents the app from running with an old config.
Caveat: force_handlers only ensures handlers run. It does not prevent the play from failing—Ansible will still report a failure. But the handler's effect is applied.
When to use: Always use on plays that manage production services. Reserve the default behavior for plays where you want to roll back on failure (e.g., database schema changes).
Set force_handlers: true on any play where handler execution is critical, especially service restarts.
Handlers with include_tasks and import_tasks
Handlers can be defined in separate files and included using include_tasks or import_tasks. This is common in roles.
```yaml
# handlers/main.yml
- name: restart app
  service:
    name: app
    state: restarted
```
```yaml
# tasks/main.yml
- name: Update config
  template:
    src: config.j2
    dest: /etc/app/config
  notify: restart app
```
Gotcha: When using include_tasks (dynamic inclusion), the included handlers are resolved at runtime. If the same handler name is defined in more than one file, the last one loaded wins and the earlier definitions are silently shadowed.
Best practice: Use import_tasks (static) for handlers so that their names and ordering are fixed at parse time. Duplicate names are not reported as errors with either form, so keep handler names unique across every file you load.
Production pattern: In roles, always define handlers in handlers/main.yml and use import_tasks in the playbook if you need to include role handlers in a specific order.
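A minimal sketch of static handler imports at the play level. The file names handlers/database.yml and handlers/app.yml are hypothetical, and handlers/app.yml is assumed to define 'restart app'.

```yaml
# playbook.yml
- hosts: all
  tasks:
    - name: Update config
      ansible.builtin.template:
        src: config.j2
        dest: /etc/app/config
      notify: restart app

  handlers:
    # Static imports: handlers load at parse time, in the order listed,
    # so database handlers are defined (and run) before app handlers.
    - name: Import database handlers
      ansible.builtin.import_tasks: handlers/database.yml
    - name: Import app handlers
      ansible.builtin.import_tasks: handlers/app.yml
```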
Use import_tasks for handlers to avoid runtime surprises with duplicate names.
Debugging Handlers: Verbose Output and List Commands
When handlers misbehave, use these commands to debug:
- List the tasks a playbook would run. ansible-playbook offers --list-tasks, --list-tags, and --list-hosts but no --list-handlers option, so read handler order from the handlers block or handlers/main.yml:
```bash
ansible-playbook playbook.yml --list-tasks
```
- Run with increased verbosity to see handler notifications:
```bash
ansible-playbook playbook.yml -vv
```
- Look for lines like:
```
NOTIFIED HANDLER restart nginx
```
- Run with step mode to confirm order:
```bash
ansible-playbook playbook.yml --step
```
- Check if a handler was skipped due to failure (the -i flag matters because Ansible prints RUNNING HANDLER in upper case):
```bash
ansible-playbook playbook.yml -v 2>&1 | grep -iE '(failed|skipping|handler)'
```
Gotcha: In Ansible 2.x, handlers that are skipped due to a failed task do not appear in the output at all. You only see the failure. This is why force_handlers: true is important.
Advanced Patterns: Conditional Handlers and Loops
Handlers can include conditionals (when) and can be used with loops, but there are nuances.
Conditional handlers:
```yaml
handlers:
  - name: restart app
    service:
      name: app
      state: restarted
    when: ansible_facts.os_family == "Debian"
```
Handler with loop:
```yaml
handlers:
  - name: restart services
    service:
      name: "{{ item }}"
      state: restarted
    loop:
      - nginx
      - app
```
Gotcha: If you use listen with a loop, the handler runs once per loop item, so notifying the listen topic acts on every item in the list.
Production pattern: Use listen with a loop to restart multiple services from a single notification. For example:
```yaml
handlers:
  - name: restart web stack
    listen: "restart web"
    service:
      name: "{{ item }}"
      state: restarted
    loop:
      - nginx
      - php-fpm
```
Then any task can notify 'restart web' and both services restart.
Common Pitfalls and How to Avoid Them
Here are the most frequent mistakes I've seen with handlers:
- Assuming handlers run immediately: They don't. Use flush_handlers if you need immediate effect.
- Not using force_handlers: Leads to skipped restarts on failure.
- Duplicate handler names: Silently overwritten. Use unique names, and prefer import_tasks so handler files load statically.
- Wrong ordering: Define handlers in dependency order.
- Not checking idempotency: Tasks that always report 'changed' cause unnecessary handler runs.
- Using listen with variables that might be undefined: Use a literal string for the listen topic.
- Forgetting that handlers are skipped on unreachable hosts: If a host becomes unreachable during the play, handlers for that host are skipped.
Pre-deployment checklist:
- Always add force_handlers: true to plays with service restarts.
- Review the handler definitions and run --syntax-check before deployment.
- Use -vv to verify handler notifications in CI/CD.
- Test failure scenarios in staging.
The Handler That Never Ran
We added force_handlers: true to the play and used meta: flush_handlers before the critical migration task.
- Always assume handlers might not run if any task fails.
- Use force_handlers: true or flush_handlers to ensure critical actions like restarts happen regardless.
Quick fixes: add force_handlers: true to the play (or run ansible-playbook playbook.yml --force-handlers), use meta: flush_handlers before the dependent task, and run ansible-playbook playbook.yml -v | grep -iE '(changed|failed|handler)' to see handler execution status.
Key takeaways
- Use listen to group multiple handlers under a common topic for decoupled notifications.
- Use changed_when to avoid unnecessary handler triggers.
Common mistakes to avoid
- Not using force_handlers on critical plays. Fix: add force_handlers: true to the play.
- Duplicate handler names in included files. Fix: use import_tasks and unique handler names.
- Assuming handlers run immediately after notify. Fix: use meta: flush_handlers before dependent tasks.
- Using command/shell without changed_when. Fix: set changed_when (changed_when: false for read-only checks) or use creates/removes.
- Defining handlers in wrong order. Fix: define handlers in dependency order.
- Using listen with a variable that might be undefined. Fix: use a literal string for the topic.
Interview Questions on This Topic
What is the default behavior of Ansible handlers when a task fails?
By default, pending handlers are skipped on the host where the task failed. To run them anyway, set force_handlers: true at the play level.