Advanced 12 min · 2026-06-21

Ansible Inventory Management: Static, Dynamic, and Variable Precedence Pitfalls

Master Ansible inventory management: static/dynamic files, INI vs YAML, groups, host_vars, group_vars, inventory plugins, and variable precedence.

Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Lessons pulled from things that broke in production.

June 21, 2026
last updated
Quick Answer

Use ansible-inventory --list to verify inventory structure before running playbooks. Static inventory files (INI or YAML) are fine for small environments; use dynamic inventory plugins for cloud or CMDB sources. YAML is preferred over INI for complex inventories because it is more readable and nests naturally. Host variables in host_vars/<hostname>.yaml override group variables from group_vars/<group>.yaml. Group variable precedence, lowest to highest: group_vars/all → parent groups → child groups; among groups at the same depth, the one that sorts last alphabetically wins unless ansible_group_priority says otherwise. Dynamic inventory scripts must output JSON in the format Ansible expects, ideally with a _meta key carrying hostvars. In short: host_vars > group_vars of the host's groups > group_vars/all. Common gotcha: ansible-playbook does not report which source supplied a variable's final value; inspect it with ansible-inventory --host <hostname> or a debug task.

What is Ansible Inventory Management?

Ansible inventory is a collection of hosts (servers, network devices, etc.) that Ansible manages. It defines the target nodes for playbooks and ad-hoc commands. The inventory can be static (a file) or dynamic (generated by a script or plugin). Static inventory files are typically written in INI or YAML format and stored in a directory structure with host_vars and group_vars subdirectories.

Dynamic inventory pulls host data from external sources like AWS EC2, Azure, or a CMDB.

Inventory is the first thing Ansible reads when you run ansible or ansible-playbook. It determines which hosts are available, how they are grouped, and what variables are associated with them. Variables defined in inventory are merged with variables from other sources (playbooks, roles, command line) following a strict precedence order.

Understanding this order is crucial for predictable behavior.

The problem inventory solves is fundamental: how to express infrastructure topology and configuration data in a way that is both human-readable and machine-parsable. Without inventory, you'd have to hardcode hostnames and variables in every playbook, leading to duplication and maintenance hell. Inventory provides a central, declarative way to define your infrastructure.

Plain-English First

Imagine you're planning a large family reunion. You have a list of all relatives (the inventory), and you need to send each person a personalized invitation. Some details apply to everyone (like the date and location), some apply to entire families (like dietary preferences for the Smiths), and some are specific to individuals (like a special note for Aunt Carol). Ansible inventory is that list: you define hosts (people), groups (families), and variables (details). Static inventory is like a paper address book — you write everything down manually. Dynamic inventory is like a live phonebook that automatically updates when someone moves. Variable precedence is the rule that decides which detail wins if there's a conflict: a note written directly on a person's page beats a note on the family page, which beats a general note on the front cover.

I'll never forget the Tuesday morning when our production deployment started failing with cryptic SSH key errors. Our playbooks had run fine for months, but suddenly Ansible couldn't connect to a batch of new servers. The error was 'Permission denied (publickey)'. My first assumption was a key rotation gone wrong. But after an hour of head-scratching, I discovered the real culprit: a misconfigured dynamic inventory script that was returning an empty ansible_user for those hosts, causing Ansible to fall back to the current user (which had no access). That incident taught me the hard way how inventory management — especially variable precedence and inventory sources — can silently break deployments.

Ansible inventory is the backbone of configuration management. It defines which hosts to manage, how to group them, and what variables apply. Without a solid understanding of inventory, you'll run into mysterious failures, inconsistent variable resolution, and maintenance nightmares. This article covers everything a beginner needs: static and dynamic inventory files, INI vs YAML formats, groups and children, host_vars and group_vars directories, inventory plugins, and the critical rules of variable precedence across inventory.

By the end, you'll know how to structure your inventory for production, debug common issues, and avoid the pitfalls that tripped me up. We'll use real commands and production-grade examples throughout.

Static Inventory Files: INI vs YAML Format

Static inventory files are the simplest way to define your infrastructure. Ansible supports two formats: INI and YAML. INI is the legacy format, but YAML is now recommended for complex inventories due to its nesting capabilities and readability.

INI Format Example (inventory.ini):
```ini
[webservers]
web1.example.com ansible_user=deploy
web2.example.com

[dbservers]
db1.example.com

[production:children]
webservers
dbservers

[webservers:vars]
http_port=80
```

YAML Format Example (inventory.yaml):
```yaml
all:
  children:
    webservers:
      hosts:
        web1.example.com:
          ansible_user: deploy
        web2.example.com:
      vars:
        http_port: 80
    dbservers:
      hosts:
        db1.example.com:
    production:
      children:
        webservers:
        dbservers:
```

Production Insight: INI format can lead to subtle bugs when the :vars and :children suffixes are used incorrectly. For example, forgetting the :children suffix on a group that contains other groups makes Ansible treat the listed group names as hostnames, so the intended members never join the parent. YAML makes the nesting explicit. In a recent migration, I found a stale INI inventory where a group was misspelled in :children, causing 20 servers to be excluded from deployments for months.

Key Takeaway: For any inventory with more than 10 hosts or nested groups, use YAML format to avoid parsing errors and improve maintainability.

INI Format Pitfall
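When using INI, ensure group names in :children exactly match existing group names. Depending on the Ansible version, an undefined child group is either rejected when the file is parsed or ends up as an empty group, so confirm the result with ansible-inventory --graph instead of assuming the membership took effect.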
Production Insight
We once had a production outage because a junior engineer added a new group in INI format but used [newgroup] instead of [newgroup:children] for a parent group. The hosts were listed but never included in the parent. The fix was to switch to YAML and add a CI linting step with ansible-inventory --list.
Key Takeaway
Prefer YAML for static inventories; it provides clear nesting and reduces configuration errors.

Groups, Children, and Group Hierarchy

Groups are the primary way to organize hosts in Ansible. A group can contain hosts, other groups (via children), or both. Group hierarchy allows you to apply variables to a set of hosts efficiently.

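Defining Groups and Children:
```yaml
all:
  children:
    us_east:
      children:
        webservers_east:
          hosts:
            web-east-1:
            web-east-2:
        dbservers_east:
          hosts:
            db-east-1:
    us_west:
      children:
        webservers_west:
          hosts:
            web-west-1:
        dbservers_west:
          hosts:
            db-west-1:
```

Group names are global: reusing one child name such as webservers under both regions would define a single group that is a child of both, so every web host would land in us_east and us_west. Region-qualified child names keep the two branches separate.

Group Resolution: Ansible flattens the group hierarchy at runtime. A host belongs to all groups it is directly or indirectly a member of. For example, web-east-1 belongs to webservers_east, us_east, and implicitly all.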
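To make the flattening rule concrete, here is a minimal Python sketch that computes every group a host belongs to, directly or through parent groups. The `GROUPS` mapping and its group names are hypothetical stand-ins for a parsed inventory, not Ansible's internal data model.

```python
# Hypothetical inventory: group -> {"hosts": [...], "children": [...]}.
GROUPS = {
    "all": {"hosts": [], "children": ["us_east", "us_west"]},
    "us_east": {"hosts": [], "children": ["webservers_east", "dbservers_east"]},
    "us_west": {"hosts": [], "children": ["webservers_west"]},
    "webservers_east": {"hosts": ["web-east-1", "web-east-2"], "children": []},
    "dbservers_east": {"hosts": ["db-east-1"], "children": []},
    "webservers_west": {"hosts": ["web-west-1"], "children": []},
}


def groups_for_host(host, groups):
    """Return every group that contains `host` directly or via child groups."""

    def contains(group, seen):
        if group in seen:  # guard against accidental cycles in the sample data
            return False
        seen = seen | {group}
        entry = groups.get(group, {})
        if host in entry.get("hosts", []):
            return True
        return any(contains(child, seen) for child in entry.get("children", []))

    return sorted(name for name in groups if contains(name, frozenset()))


if __name__ == "__main__":
    print(groups_for_host("web-east-1", GROUPS))
```

Comparing this kind of computed membership with what you intended is a quick way to spot a host that inherits variables from a group you did not expect.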

Variable Inheritance: Variables defined on a group apply to all hosts in that group and its children. If a variable is defined on several groups in the hierarchy, child groups override their parents; among groups at the same depth, the one that sorts last alphabetically wins unless ansible_group_priority is set. This is often confusing; see the variable precedence section.

Production Insight: In a multi-region deployment, we used nested groups for regions and tiers. The problem was that we defined ntp_server on both us_east and webservers groups. Hosts in us_east got the us_east value, but hosts in us_west got the webservers value because us_west didn't define it. The fix was to define region-specific variables only on region groups and tier-specific variables only on tier groups, avoiding overlap.

Key Takeaway: Design group hierarchy to minimize variable conflicts. Use group_vars/all for truly global defaults, and override only at the most specific level needed.

Use ansible-inventory --graph
Run ansible-inventory -i inventory --graph to visualize the group hierarchy. This helps debug unexpected group membership.
Production Insight
During a data center migration, we had groups named dc1 and dc2 as children of all. A host was moved from dc1 to dc2 but the old group_vars file for dc1 was not cleaned up. The host still picked up variables from dc1 because it was still listed in that group. The fix was to remove the host from the old group in the inventory file.
Key Takeaway
Use ansible-inventory --graph to verify group membership after any inventory change.

host_vars and group_vars Directories

Ansible automatically loads variables from host_vars/ and group_vars/ directories located relative to the inventory file or playbook directory. These directories contain YAML files named after the host or group.

Directory Structure Example:
```
production/
  inventory.yaml
  host_vars/
    web1.example.com.yaml
    db1.example.com.yaml
  group_vars/
    all.yaml
    webservers.yaml
    dbservers.yaml
    us_east.yaml
```

File Naming: The file name must match the hostname (for host_vars) or group name (for group_vars) exactly, including domain suffix. For example, web1.example.com.yaml for host web1.example.com.

Variable Loading Order: For a given host, Ansible loads variables in this order:
1. group_vars/all
2. group_vars of parent groups
3. group_vars of the host's immediate (child) groups
4. host_vars/<hostname>

Later files override earlier ones. This means host_vars always wins over any group_vars. Within group_vars, deeper groups load after shallower ones, and groups at the same depth load alphabetically (or by ansible_group_priority when it is set), so the last one loaded wins.

Production Insight: We once had a variable app_port defined in group_vars/all.yaml as 8080, in group_vars/webservers.yaml as 80, and in host_vars/web1.example.com.yaml as 3000. The host web1.example.com got 3000, as expected, and web2.example.com (no host_vars) got 80. We also had a group_vars/production.yaml that defined app_port: 9090, and production was a parent group of webservers. Several of us expected 9090 to win somewhere, but parent groups load before their children, so webservers still overrode production and web2 kept 80. Alphabetical order only breaks ties between groups at the same depth. This is a common point of confusion.

Key Takeaway: To avoid confusion, use group_vars/all for defaults, and override in specific group_vars or host_vars. Avoid defining the same variable in multiple group_vars at the same level.

Variable Precedence Order
An abridged variable precedence (from lowest to highest): role defaults → group_vars/all → group_vars of parent groups → group_vars of child groups → host variables (inventory host_vars, then playbook host_vars) → play, role, block, and task vars → extra vars. At each group level, inventory group_vars load before playbook group_vars. See the Ansible docs for the complete list.
Production Insight
I once debugged a case where a host was getting the wrong ansible_host IP. The IP was defined in group_vars/all but overridden in group_vars/datacenter_a. However, the host belonged to both datacenter_a and datacenter_b groups. Because datacenter_b came alphabetically after datacenter_a, its group_vars loaded last and overrode the IP. The fix was to ensure only one group defined that variable for the host.
Key Takeaway
Among groups at the same depth, group variable loading order is alphabetical; be aware that group names affect variable resolution.

Dynamic Inventory: Scripts and Plugins

Dynamic inventory sources generate host lists at runtime from external systems like cloud providers, CMDBs, or custom databases. Ansible supports two approaches: inventory scripts (executable files that output JSON) and inventory plugins (Python modules). Plugins are the modern, preferred method.

Inventory Scripts: A script must be executable and accept --list (return all hosts) and --host <hostname> (return variables for a specific host). The --list output should include a _meta key with hostvars for all hosts; without it, Ansible calls the script again with --host for every host, which is slow.

Example minimal script output:
```json
{
  "_meta": {
    "hostvars": {
      "web1": {
        "ansible_host": "10.0.0.1",
        "ansible_user": "deploy"
      }
    }
  },
  "webservers": {
    "hosts": ["web1"]
  }
}
```
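A script producing that shape can be quite small. The following is a minimal sketch, not a production script: the `HOSTS` table is hard-coded and hypothetical, where a real script would query a cloud API or CMDB.

```python
#!/usr/bin/env python3
"""Minimal dynamic inventory script sketch (host data is hard-coded for illustration)."""
import argparse
import json

# Hypothetical host data; a real script would fetch this from an external system.
HOSTS = {
    "web1": {"ansible_host": "10.0.0.1", "ansible_user": "deploy", "group": "webservers"},
    "db1": {"ansible_host": "10.0.0.2", "ansible_user": "deploy", "group": "dbservers"},
}


def build_inventory(hosts):
    """Build the --list structure, including _meta.hostvars for every host."""
    inventory = {"_meta": {"hostvars": {}}}
    for name, data in hosts.items():
        # Everything except the helper "group" key becomes a host variable.
        inventory["_meta"]["hostvars"][name] = {k: v for k, v in data.items() if k != "group"}
        inventory.setdefault(data["group"], {"hosts": []})["hosts"].append(name)
    return inventory


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--list", action="store_true")
    parser.add_argument("--host")
    # parse_known_args ignores any unrelated arguments instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    inventory = build_inventory(HOSTS)
    if args.host:
        # With _meta present Ansible does not need --host, but answer it anyway.
        print(json.dumps(inventory["_meta"]["hostvars"].get(args.host, {})))
    else:
        print(json.dumps(inventory, indent=2))


if __name__ == "__main__":
    main()
```

Saved as an executable file, it can be passed to Ansible with -i like any other inventory source.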

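Inventory Plugins: Plugins exist for AWS (aws_ec2), GCP (gcp_compute), Azure (azure_rm), and many more; the cloud plugins ship in their providers' collections (for example amazon.aws), not in ansible-core itself. They are configured in YAML files and are more efficient than scripts.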

Example `aws_ec2.yaml`:
```yaml
plugin: aws_ec2
regions:
  - us-east-1
filters:
  tag:Environment: production
hostnames:
  - tag:Name
keyed_groups:
  - key: tags.Environment
    prefix: env
  - key: tags.Role
    prefix: role
compose:
  ansible_host: public_ip_address
```

Production Insight: In the incident I mentioned earlier, our custom Python script had a bug: it didn't include ansible_user for hosts in a specific AZ. The script passed --list validation but returned empty strings. The fix was to add a validation step in CI that runs ansible-inventory --list and checks for required variables using jq.

Key Takeaway: Prefer inventory plugins over custom scripts for cloud providers. They are maintained in the providers' Ansible collections and include built-in caching and error handling.

Cache Dynamic Inventory
Enable caching for dynamic inventory to speed up playbook runs: set cache: true, cache_plugin: jsonfile, cache_connection (the directory that holds the cache files), and cache_timeout in the plugin configuration, or set the equivalent options in ansible.cfg.
Production Insight
We had a custom inventory script that queried a CMDB API. The API had a rate limit, and during peak hours, the script would timeout, causing Ansible to fail with 'Failed to parse inventory'. We switched to using the constructed plugin with a static inventory file that was periodically refreshed via a cron job, avoiding real-time API calls.
Key Takeaway
Always test dynamic inventory scripts with ansible-inventory --list and validate the output format, especially the _meta key.
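The CI validation described above can be sketched in a few lines of Python as an alternative to jq. This is an illustrative check, assuming the JSON comes from ansible-inventory --list; the `SAMPLE` data and the default list of required variables are hypothetical.

```python
import json

# Sample standing in for the JSON printed by `ansible-inventory --list`.
SAMPLE = json.loads("""
{
  "_meta": {"hostvars": {
    "web1": {"ansible_host": "10.0.0.1", "ansible_user": "deploy"},
    "web2": {"ansible_host": "10.0.0.2", "ansible_user": ""}
  }},
  "webservers": {"hosts": ["web1", "web2", "web3"]}
}
""")


def all_hosts(inventory):
    """Collect host names from _meta.hostvars and from every group's host list."""
    hosts = set(inventory.get("_meta", {}).get("hostvars", {}))
    for name, group in inventory.items():
        if name != "_meta" and isinstance(group, dict):
            hosts.update(group.get("hosts", []))
    return hosts


def find_missing_vars(inventory, required=("ansible_user",)):
    """Return {host: [names]} for required variables that are absent or empty."""
    hostvars = inventory.get("_meta", {}).get("hostvars", {})
    problems = {}
    for host in sorted(all_hosts(inventory)):
        variables = hostvars.get(host, {})
        missing = [name for name in required if variables.get(name) in (None, "")]
        if missing:
            problems[host] = missing
    return problems


if __name__ == "__main__":
    print(find_missing_vars(SAMPLE))
```

Hosts are gathered from the group lists as well as from `_meta.hostvars`, so a host that has no variables at all is still reported. In a pipeline you would load the JSON from the command's output and fail the job when the returned dictionary is not empty.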

Variable Precedence Across Inventory

Understanding variable precedence is critical to avoid surprises. Ansible merges variables from multiple sources in a specific order. For inventory-related sources, the order from lowest to highest priority is:
1. group_vars/all
2. group_vars of parent groups
3. group_vars of child groups
4. Host variables set in the inventory file or returned by an inventory script
5. host_vars/<hostname> next to the inventory, then host_vars/<hostname> next to the playbook
6. Extra vars (-e), which override everything

At each group level, the group_vars directory next to the inventory loads before the one next to the playbook. Both are picked up automatically; vars_files and include_vars are separate, higher-precedence sources.

Key Rule: Host variables override group variables. Among group variables, depth decides first: parent groups load before child groups, so the child wins. Among groups at the same depth, Ansible orders by ansible_group_priority (default 1) and then alphabetically, so the last name wins when priorities are equal.

Example: Host web1 belongs to groups webservers and us_east. us_east is a parent of webservers. The order of loading:
- group_vars/all
- group_vars/us_east (parent)
- group_vars/webservers (child)
- host_vars/web1

If group_vars/us_east and group_vars/webservers both define the same variable, webservers wins because it's a child and loaded later.
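The ordering can be modeled in a few lines. This is a simplified sketch of the documented behavior (sort by depth, then ansible_group_priority, then name, with later groups replacing earlier values), not Ansible's own code; the `Group` class and all values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    depth: int         # 0 for "all", parent depth + 1 for each nested level
    priority: int = 1  # mirrors ansible_group_priority (default 1)
    variables: dict = field(default_factory=dict)


def merge_group_vars(groups):
    """Merge lowest precedence first: by depth, then priority, then name."""
    merged, origin = {}, {}
    for group in sorted(groups, key=lambda g: (g.depth, g.priority, g.name)):
        for key, value in group.variables.items():
            merged[key] = value       # a later group replaces the earlier value
            origin[key] = group.name  # remember which group supplied it
    return merged, origin


if __name__ == "__main__":
    host_groups = [
        Group("all", 0, variables={"ntp_server": "ntp.example.com", "http_port": 8080}),
        Group("us_east", 1, variables={"ntp_server": "ntp.east.example.com"}),
        Group("webservers", 2, variables={"http_port": 80}),
    ]
    merged, origin = merge_group_vars(host_groups)
    for key in sorted(merged):
        print(f"{key} = {merged[key]!r} (from {origin[key]})")
```

Tracking the origin alongside the value is the useful part: it answers "which group set this?", which is the question most precedence bugs come down to.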

Production Insight: I once saw a team define ansible_user in group_vars/all as centos and in group_vars/webservers as ec2-user. For ordinary web hosts this worked as intended: webservers is more specific than all, so ec2-user won. The surprise came from a host that belonged to both webservers and dbservers, two groups at the same depth. Ties at the same depth are broken alphabetically, so whichever of the two sorts last (webservers here) supplied ansible_user whenever both defined it. This caused confusion. The fix was to use host_vars for hosts that needed a specific user.

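Key Takeaway: To predict variable values, trace the group hierarchy and alphabetical order. Use ansible-inventory --host <hostname> to see the final resolved variables; leave off --export, which is intended for --list and produces an export-oriented view instead of the merged result Ansible actually uses.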

Full Precedence Order
The complete variable precedence list (from lowest to highest) is documented at https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#variable-precedence. Note that group_vars/all is lower than group_vars of any named group.
Production Insight
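We had a variable app_version defined in group_vars/all as 1.0, in group_vars/production as 2.0, and in group_vars/canary as 3.0, with hosts that sat in both production and canary. Both groups were at the same depth, and production sorts after canary, so alphabetical ordering alone resolves app_version to 2.0. Making canary win with 3.0, which is what canary testing needs, takes an explicit ansible_group_priority on the canary group in the inventory source. A new engineer who assumed the outcome followed from the group names alone was confused.
Key Takeaway
Set ansible_group_priority in the inventory source to control precedence between sibling groups instead of relying on alphabetical names, or avoid defining the same variable in multiple group_vars at the same hierarchy level.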

Inventory Plugins: Built-in and Custom

Ansible provides over 20 inventory plugins for various sources: cloud providers, databases, file systems, and more. They are configured in YAML files and are more reliable than custom scripts.

Common Plugins:
- aws_ec2: AWS EC2 instances
- azure_rm: Azure VMs
- gcp_compute: Google Compute Engine
- vmware_vm_inventory: VMware VMs
- constructed: Build groups and variables from existing inventory
- ini: Parse INI format (static)
- yaml: Parse YAML format (static)

The constructed, ini, and yaml plugins are part of ansible-core; the cloud and VMware plugins are installed with their collections.

Using a Plugin: Create a YAML file (e.g., aws_ec2.yaml) with the plugin configuration and use it as the inventory source:
```bash
ansible-inventory -i aws_ec2.yaml --list
```

Custom Inventory Plugins: You can write your own plugin by subclassing BaseInventoryPlugin and implementing parse(). This is advanced but gives full control.

Production Insight: We migrated from a custom script to the aws_ec2 plugin and saw immediate benefits: built-in caching, better error messages, and automatic handling of pagination. However, we had to adjust our group naming because the plugin uses keyed_groups which creates groups with prefixes like env_production. Our playbooks expected group names like production. We added a custom keyed_groups mapping to match.

Key Takeaway: Use inventory plugins for cloud sources. They are well-tested and reduce maintenance burden.

Plugin Documentation
List available plugins: ansible-doc -t inventory -l. Get plugin-specific help: ansible-doc -t inventory aws_ec2.
Production Insight
We once used the constructed plugin to create groups based on hostname patterns. For example, all hosts starting with 'web' were added to a webservers group. This eliminated the need to maintain a static list. The configuration was simple:
```yaml
plugin: constructed
strict: false
groups:
  # Jinja2 expression evaluated per host; a true result adds the host to the group
  webservers: inventory_hostname.startswith('web')
```
Key Takeaway
The constructed plugin is powerful for dynamically assigning groups based on host variables or names.

Best Practices for Inventory File Organization

Organizing inventory files for production is about scalability, security, and maintainability. Here are patterns I've used in large deployments.

1. Separate Environments: Use different inventory directories for dev, staging, production.
```
inventories/
  production/
    inventory.yaml
    host_vars/
    group_vars/
  staging/
    inventory.yaml
    host_vars/
    group_vars/
```

2. Use group_vars/all for Global Defaults: Put common variables like ntp_server, dns_server, and ansible_user here. Override in specific groups only when necessary.

3. Keep Secrets Out of Inventory: Use Ansible Vault for sensitive variables. Store encrypted files in group_vars/all/vault.yaml or use ansible-vault encrypt on individual files.

4. Use Dynamic Inventory for Cloud: For cloud environments, use the appropriate plugin. For on-prem, use static YAML inventory with version control.

5. Validate Inventory in CI: Add a CI step that runs ansible-inventory --list and checks for required variables using a script or jq.

Production Insight: In one project, we had a single monolithic inventory file with hundreds of hosts. It became unmanageable. We split it into environment-specific directories and used ansible-inventory --export to generate a combined view for debugging. The change reduced deployment errors by 40%.

Key Takeaway: Organize inventory by environment and use group_vars/all for defaults. Validate inventory in CI.

Never Commit Secrets in Plain Text
Use ansible-vault to encrypt any sensitive variables in inventory files. Store the vault password securely (e.g., in a password manager or CI secret).
Production Insight
We had a security incident where a developer accidentally committed a plaintext AWS secret key in group_vars/all. The fix was to use Vault and add a pre-commit hook that scans for potential secrets using git-secrets.
Key Takeaway
Always encrypt sensitive variables with Ansible Vault and enforce this with pre-commit hooks.
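A pre-commit check for plaintext secrets in variable files can be approximated with a short line-based scan. This is a rough sketch, not a replacement for git-secrets: the key-name pattern is an assumption to tune per project, and it only understands simple `key: value` lines.

```python
import re

# Key names that usually hold secrets; this pattern is an assumption.
SECRET_KEY = re.compile(r"(password|passwd|secret|token|api_key|private_key)", re.I)
# Acceptable values: inline vault blobs or Jinja references to other variables.
SAFE_VALUE = re.compile(r"""^(!vault\b|['"]?\{\{)""")
KEY_VALUE = re.compile(r"^\s*([A-Za-z_][\w-]*)\s*:\s*(\S.*)$")


def find_plaintext_secrets(text):
    """Return (line_number, key) pairs for secret-looking keys with literal values."""
    if text.lstrip().startswith("$ANSIBLE_VAULT;"):
        return []  # the whole file is vault-encrypted
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_VALUE.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if SECRET_KEY.search(key) and not SAFE_VALUE.match(value):
            findings.append((number, key))
    return findings


if __name__ == "__main__":
    sample = (
        "ansible_user: deploy\n"
        "db_password: hunter2\n"
        'api_token: "{{ vault_api_token }}"\n'
        "aws_secret_key: !vault |\n"
    )
    print(find_plaintext_secrets(sample))
```

Run against every file under group_vars/ and host_vars/, a non-empty result is a reason to block the commit and move the value into a vaulted file.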

Using ansible-inventory Command for Debugging

The ansible-inventory command is your best friend for debugging inventory issues. It can list, graph, export, and validate inventory.

Common Usage:
```bash
# List all hosts with groups
ansible-inventory -i inventory.yaml --list

# Graph group hierarchy
ansible-inventory -i inventory.yaml --graph

# Show the merged variables for a specific host
ansible-inventory -i inventory.yaml --host web1.example.com

# Validate inventory (check for syntax errors)
ansible-inventory -i inventory.yaml --list > /dev/null
```

Example Output of --graph:
```
@all:
  |--@ungrouped:
  |--@us_east:
  |  |--@webservers_east:
  |  |  |--web-east-1
  |  |  |--web-east-2
  |  |--@dbservers_east:
  |  |  |--db-east-1
  |--@us_west:
  |  |--@webservers_west:
  |  |  |--web-west-1
  |  |--@dbservers_west:
  |  |  |--db-west-1
```

Production Insight: When debugging the SSH user incident, I ran ansible-inventory --host newserver --export and saw ansible_user: "". That immediately pointed to the dynamic inventory script. Without this command, I would have wasted hours checking SSH keys.

Key Takeaway: Use ansible-inventory --list and --host to inspect resolved inventory before running playbooks.

Output Format
The --list output includes _meta with hostvars. Use jq to filter: ansible-inventory -i inventory.yaml --list | jq '._meta.hostvars'.
Production Insight
I often pipe ansible-inventory --list to jq to quickly check variables across hosts. For example: ansible-inventory -i prod --list | jq '._meta.hostvars | to_entries[] | {host: .key, user: .value.ansible_user}'.
Key Takeaway
Master ansible-inventory commands; they are essential for troubleshooting.

Common Inventory Gotchas with INI Format

Despite being simpler, INI format has several pitfalls that can cause silent failures.

Gotcha 1: Missing :children suffix
```ini
[production]
webservers
dbservers
```
This creates a group production with hosts named webservers and dbservers, not the groups themselves. To include groups, you need:
```ini
[production:children]
webservers
dbservers
```

Gotcha 2: Whitespace in hostnames Trailing spaces after hostnames can cause Ansible to fail to connect. Always trim whitespace.

Gotcha 3: Case sensitivity Group names are case-sensitive. [Webservers] and [webservers] are different groups.

Gotcha 4: Variables with spaces In INI, inline host variable values cannot contain spaces unless they are quoted: ansible_user=my user fails, while ansible_user="my user" works.

Production Insight: We once had a host that was intermittently unreachable. The cause was a trailing space after the hostname in the INI file. Ansible parsed the hostname as web1 (with space) and tried to resolve that hostname, which failed. The fix was to add a linting rule to check for trailing whitespace.

Key Takeaway: If you must use INI, validate with ansible-inventory --list and use a linter like ansible-lint.
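Gotcha 1 is easy to detect mechanically. The sketch below is a deliberately small checker, not a full INI inventory parser: it flags plain `[group]` sections whose entries are the names of other groups, which usually means the :children suffix was forgotten.

```python
import re

SECTION = re.compile(r"^\[([^:\]\s]+)(?::(\w+))?\]\s*$")


def find_groups_listed_as_hosts(ini_text):
    """Return {group: [entries]} where a plain [group] lists other group names."""
    sections = []  # (name, kind, entries) in file order
    current = None
    for raw in ini_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = SECTION.match(line)
        if match:
            current = (match.group(1), match.group(2) or "hosts", [])
            sections.append(current)
        elif current is not None:
            current[2].append(line.split()[0])  # first token is the host or group name
    group_names = {name for name, _, _ in sections}
    suspects = {}
    for name, kind, entries in sections:
        hits = [e for e in entries if kind == "hosts" and e in group_names and e != name]
        if hits:
            suspects[name] = hits
    return suspects


if __name__ == "__main__":
    broken = "[webservers]\nweb1\n\n[dbservers]\ndb1\n\n[production]\nwebservers\ndbservers\n"
    print(find_groups_listed_as_hosts(broken))
```

A host that legitimately shares its name with a group would be a false positive, so treat the result as a prompt to look, not as a verdict.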

INI Deprecation
INI format is not deprecated, and no removal of the ini plugin has been announced, but YAML is recommended for new projects.
Production Insight
A colleague once spent a day debugging why a group [db:children] didn't work. He had written [db:child] (singular). Ansible silently ignored it. The fix was to use YAML where such errors are caught at parse time.
Key Takeaway
Prefer YAML over INI to avoid subtle syntax errors.

Variable Precedence: Inventory vs Other Sources

Inventory variables are just one source. The precedence from lowest to highest (abridged) is:
1. Role defaults (roles/role/defaults/main.yml)
2. Inventory vars (group_vars/all, group_vars/, host_vars/)
3. Host facts and cached set_fact values
4. Play vars (vars: in play), then vars_prompt and vars_files
5. Role vars (roles/role/vars/main.yml)
6. Block vars (only for tasks in block)
7. Task vars (only for that task)
8. include_vars
9. set_fact and registered variables
10. Role and include parameters
11. Extra vars (-e)

Key Point: Inventory variables are relatively low in precedence. They can be overridden by playbook vars and extra vars. This is intentional: playbooks should be able to override inventory defaults for a specific run.
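The layering can be pictured as a stack of dictionaries applied from lowest to highest precedence. The sketch below is an abridged illustration with hypothetical values, not a model of every source Ansible knows about.

```python
# Sources ordered from lowest to highest precedence (abridged, hypothetical values).
LAYERS = [
    ("role defaults", {"app_port": 8000, "ansible_user": "root"}),
    ("inventory group_vars", {"app_port": 3000, "ansible_user": "deploy"}),
    ("inventory host_vars", {"app_port": 3000}),
    ("play vars", {"app_port": 8080}),
    ("extra vars (-e)", {"ansible_user": "tempuser"}),
]


def resolve(layers):
    """Return {variable: (value, winning_source)}; later layers override earlier ones."""
    resolved = {}
    for source, variables in layers:
        for name, value in variables.items():
            resolved[name] = (value, source)
    return resolved


if __name__ == "__main__":
    for name, (value, source) in sorted(resolve(LAYERS).items()):
        print(f"{name} = {value!r}  <- {source}")
```

Here the play-level app_port beats the inventory value and the -e user beats everything, which are exactly the two surprises described in this section.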

Production Insight: We had a scenario where we wanted to run a playbook with a different ansible_user temporarily. Using -e ansible_user=tempuser worked because extra vars have highest precedence. However, we forgot to remove the -e flag in a CI pipeline, and it overrode the inventory variable for all subsequent runs. The fix was to use --extra-vars only in ad-hoc commands, not in CI.

Key Takeaway: Be aware that extra vars override inventory vars. Use them sparingly in automation.

Precedence Diagram
See the official Ansible documentation for a visual diagram of variable precedence: https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#variable-precedence
Production Insight
We once had a playbook that defined app_port: 8080 in its vars section. The inventory had app_port: 3000 for that host. The playbook's var took precedence, causing the application to start on the wrong port. The fix was to remove the var from the playbook and rely on inventory.
Key Takeaway
Inventory variables are not the highest precedence; playbook vars and extra vars override them.

Using Host Variables for Overrides

Host variables are the most specific inventory-level variables. They are defined in host_vars/<hostname>.yaml and override any group variables.

When to Use Host Variables:
- Per-host secrets (e.g., ansible_become_password)
- Unique configuration (e.g., a specific http_port for a load balancer)
- Overriding group defaults for a specific host

Example host_vars/web1.example.com.yaml:
```yaml
ansible_host: 10.0.0.1
ansible_user: deploy
http_port: 3000
```

Best Practices:
- Keep host variables minimal. Prefer group variables for shared config.
- Use host variables only when a host truly differs from its group.
- Consider using dictionary variables to organize multiple overrides.

Production Insight: We had a host that needed a different SSH port (2222) because of firewall restrictions. We set ansible_port: 2222 in host_vars. However, we also had a group variable ansible_port: 22 in group_vars/all. The host variable correctly overrode it. The mistake was that we forgot to update the host variable when the firewall changed, causing a deployment failure. We now use a dynamic inventory plugin that reads the port from a CMDB.

Key Takeaway: Use host variables sparingly and document why each override exists.

Host Variable Naming
The host variable file name must match the inventory hostname exactly, including domain. For example, web1.example.com.yaml for host web1.example.com.
Production Insight
We once had a host named web-01 in inventory but the file was named web01.yaml. Ansible silently ignored it. The fix was to ensure file names match exactly, and we added a CI check that compares hostnames in inventory to files in host_vars/.
Key Takeaway
Host variable files must be named exactly as the host appears in inventory.
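The CI check mentioned above, comparing inventory hostnames with the files in host_vars/, could look roughly like this. It is a sketch under assumptions: the host list is passed in by the caller (for example, extracted from ansible-inventory --list output), and only the .yml, .yaml, and .json extensions are stripped.

```python
from pathlib import Path

VAR_SUFFIXES = {".yml", ".yaml", ".json"}


def orphaned_host_vars(host_vars_dir, inventory_hosts):
    """Return host_vars entries whose name matches no inventory host."""
    hosts = set(inventory_hosts)
    orphans = []
    for entry in sorted(Path(host_vars_dir).iterdir()):
        # An entry is a file (<host>, <host>.yml, ...) or a directory named <host>.
        name = entry.name
        if entry.is_file() and entry.suffix in VAR_SUFFIXES:
            name = entry.name[: -len(entry.suffix)]
        if name not in hosts:
            orphans.append(entry.name)
    return orphans
```

With a host named web-01 and a file named web01.yaml, the function returns ["web01.yaml"], which is the mismatch that otherwise goes unnoticed.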

Common Mistakes and How to Avoid Them

Based on years of production experience, here are the most common inventory mistakes.

1. Forgetting to update inventory after infrastructure changes: When servers are added or removed, the inventory becomes stale. Use dynamic inventory to avoid this.

2. Overusing group_vars/all: Putting all variables there defeats the purpose of group-specific overrides. Use it only for true global defaults.

3. Not using ansible-inventory to validate: Always run ansible-inventory --list after changes.

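4. Mixing INI and YAML in the same inventory: Ansible can load both formats from one inventory directory, but mixed formats make group definitions harder to trace and review. Stick to one format per inventory directory.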

5. Ignoring variable precedence: Assuming that inventory variables always take effect without considering playbook vars or extra vars.

6. Hardcoding credentials in inventory: Use Ansible Vault or environment variables.

Key Takeaway: Validate your inventory with ansible-inventory and follow the principle of least privilege for variables.

Stale Inventory = Silent Failures
Ansible will not warn you if a host in inventory no longer exists. Use dynamic inventory or regularly audit your static files.
Production Insight
We had a monthly cleanup job that removed old EC2 instances. Our static inventory still listed them, causing playbooks to fail with connection timeouts. The fix was to switch to the aws_ec2 plugin, which automatically reflects the current state.
Key Takeaway
Use dynamic inventory for cloud environments to avoid stale host entries.
Production incident · Post-mortem · Severity: high

The Case of the Missing SSH User

Symptom
Ansible failed with 'Permission denied (publickey)' on a subset of hosts. Other hosts worked fine. The error appeared only after adding new servers to the cloud.
Assumption
We assumed the SSH key was not deployed to the new servers, or that the key rotation script had failed.
Root cause
The custom dynamic inventory script (Python) had a bug: it did not populate ansible_user for hosts in a specific availability zone. The script output JSON with an empty string for ansible_user for those hosts. Because host variables returned by an inventory script outrank group variables, the empty string overrode any group-level default, and SSH fell back to the local user running Ansible.
Fix
Fixed the dynamic inventory script to always include ansible_user for all hosts. Added validation in CI to run ansible-inventory --list and check for missing required variables. Also added a default ansible_user in group_vars/all as a fallback for hosts where the script omits the variable entirely.
Key lesson
  • Always validate dynamic inventory output before using it.
  • Use ansible-inventory --list to inspect the resolved inventory, and set sensible defaults in group_vars/all for critical variables like ansible_user.
Production debug guide: Symptom → Root cause → Fix (4 entries)
Symptom · 01
'ansible-inventory --list' shows unexpected host count or missing groups
Fix
Run ansible-inventory --graph to visualize group hierarchy. Check inventory file paths and ensure dynamic scripts are executable. Use -i with explicit path.
Symptom · 02
Variable doesn't take effect as expected (e.g., wrong port or user)
Fix
Run ansible-inventory --host <hostname> to see resolved variables for that host. Check variable precedence: host_vars > group_vars (child groups over parents; last group alphabetically at the same depth) > group_vars/all. Use the debug module in a playbook to print the variable.
Symptom · 03
Dynamic inventory script returns error 'Failed to parse'
Fix
Run the script manually to check its output format. It must return valid JSON with _meta key for hostvars. Use ansible-inventory -i script.py --list to see parsing errors.
Symptom · 04
Playbook runs on wrong hosts or skips hosts
Fix
Check --limit flag and inventory group membership. Run ansible all -i inventory --list-hosts to see which hosts are matched. Verify group names in inventory file.
Ansible Inventory Management Quick Reference
Inventory not found
Immediate action
Check file path
Commands
ansible-inventory -i /path/to/inventory --list
ls -la /path/to/inventory
Fix now
Set correct path in ansible.cfg or use -i flag
Host not in expected group
Immediate action
Verify group membership
Commands
ansible-inventory -i inventory --graph
ansible all -i inventory --list-hosts
Fix now
Update inventory file to assign host to correct group
Variable override not working
Immediate action
Check precedence
Commands
ansible-inventory -i inventory --host <hostname>
ansible <hostname> -i inventory -m debug -a 'var=hostvars[inventory_hostname]'
Fix now
Move variable to higher precedence location (host_vars over group_vars)
Dynamic inventory script fails
Immediate action
Test script manually
Commands
./inventory_script.py --list
ansible-inventory -i ./inventory_script.py --list
Fix now
Fix script output format; ensure JSON with _meta key
Playbook uses wrong ansible_user
Immediate action
Inspect resolved vars
Commands
ansible-inventory -i inventory --host <hostname> | grep ansible_user
ansible <hostname> -m ping -u <correct_user>
Fix now
Set ansible_user in host_vars or correct group_vars
INI vs YAML Inventory Format Comparison
Feature | INI | YAML | Recommendation
Readability | Simple for flat structures | More readable for nested groups | YAML for complex
Group nesting | Uses :children and :vars suffixes | Native nesting with children key | YAML
Variable definition | Inline with host or [group:vars] | Under vars key | YAML
Error detection | Silent on missing groups | Parse errors on malformed YAML | YAML (catches errors early)
Tooling support | Basic | Wide (YAML linters, editors) | YAML
Performance | Slightly faster parsing | Slightly slower but negligible | Either acceptable
Future support | May be deprecated | Actively developed | YAML

Key takeaways

1
Use YAML format for all but the simplest static inventories to avoid parsing errors.
2
Organize inventory by environment (dev/staging/prod) with separate directories.
3
Use ansible-inventory --list and --graph to validate inventory before running playbooks.
4
Understand variable precedence: host_vars > group_vars (child groups override parents; among groups at the same depth, the alphabetically last wins) > group_vars/all.
5
Prefer dynamic inventory plugins (e.g., aws_ec2) over custom scripts for cloud sources.
6
Keep secrets out of plain text inventory files; use Ansible Vault.
7
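Groups at the same depth merge their variables in alphabetical order, so group names affect which value wins; set ansible_group_priority in the inventory source when the order needs to be explicit.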
8
Extra vars (-e) override inventory vars; use them carefully in automation.
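The precedence rules in takeaways 4, 7, and 8 can be modeled in a few lines. This is a deliberately simplified sketch, not Ansible's implementation: it assumes groups are merged in order of depth, then ansible_group_priority, then name, with later values replacing earlier ones, and it ignores play vars, role vars, and facts. The Group class and resolve_inventory_vars are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    depth: int                       # 0 for "all"; larger means more deeply nested
    variables: dict = field(default_factory=dict)
    priority: int = 1                # stands in for ansible_group_priority (default 1)


def resolve_inventory_vars(groups, host_vars, extra_vars=None):
    """Merge variables from lowest to highest precedence (simplified model)."""
    # Parents before children, then lower priority first, then alphabetical.
    ordered = sorted(groups, key=lambda g: (g.depth, g.priority, g.name))
    resolved = {}
    for group in ordered:
        resolved.update(group.variables)   # later groups replace earlier values
    resolved.update(host_vars)             # host_vars beat every group
    resolved.update(extra_vars or {})      # -e beats inventory variables
    return resolved


groups = [
    Group("all", 0, {"ansible_user": "ops", "http_port": 80}),
    Group("webservers", 1, {"http_port": 8080}),
    Group("prod", 1, {"http_port": 443}),
]
# "prod" and "webservers" sit at the same depth, so "webservers" is merged last.
print(resolve_inventory_vars(groups, {})["http_port"])
```

Giving prod a higher priority in this model makes its value win despite the alphabetical order, which is the effect ansible_group_priority is meant to have.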

Common mistakes to avoid

6 patterns
×

Using INI format without :children suffix for group membership

Symptom
Hosts not included in parent group
Fix
Use [parent:children] syntax or switch to YAML
×

Putting all variables in group_vars/all

Symptom
Every group gets the same values, so any difference has to be pushed down into per-host host_vars
Fix
Use group-specific group_vars for shared config; host_vars for overrides
×

Not validating inventory with ansible-inventory

Symptom
Unexpected host lists or variables
Fix
Run ansible-inventory --list after every change
×

Hardcoding secrets in inventory files

Symptom
Secrets committed to version control
Fix
Use Ansible Vault to encrypt sensitive variables
×

Assuming alphabetical order of groups doesn't matter

Symptom
Variable overrides behave unpredictably
Fix
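Be aware that group_vars for groups at the same depth load alphabetically; avoid defining the same variable in sibling groups, or set ansible_group_priority in the inventory source to make the winner explicit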
×

Using a dynamic inventory script that doesn't output _meta key

Symptom
Ansible calls --host for each host, causing performance issues
Fix
Include _meta.hostvars in script output
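A script that includes _meta.hostvars might be structured like the sketch below. The SAMPLE_HOSTS data, its name/group/user schema, and the default_user fallback are hypothetical stand-ins for whatever a real script would fetch from a cloud API or CMDB.

```python
import json

# Hypothetical source data; a real script would query an external system.
SAMPLE_HOSTS = [
    {"name": "web1", "group": "web", "user": "ubuntu"},
    {"name": "web2", "group": "web", "user": ""},   # empty value from the source
    {"name": "db1", "group": "db"},                  # no user recorded at all
]


def build_inventory(hosts, default_user="deploy"):
    """Build the --list structure, with every host's variables under _meta.hostvars."""
    inventory = {"_meta": {"hostvars": {}}}
    for host in hosts:
        group = inventory.setdefault(host["group"], {"hosts": []})
        group["hosts"].append(host["name"])
        # Never emit an empty ansible_user: fall back to the default instead.
        inventory["_meta"]["hostvars"][host["name"]] = {
            "ansible_user": host.get("user") or default_user
        }
    return inventory


def main(argv):
    """Return the JSON text the script would print for the given arguments."""
    if "--list" in argv:
        return json.dumps(build_inventory(SAMPLE_HOSTS))
    # With _meta present, Ansible has no need for --host; answer with an empty object.
    return json.dumps({})
```

An executable script would finish with print(main(sys.argv[1:])). Because all host variables arrive in the single --list call, Ansible does not have to invoke the script once per host.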
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What is the difference between INI and YAML inventory formats? When woul...
Q02SENIOR
How does Ansible resolve variable precedence for a host that belongs to ...
Q03SENIOR
What is the purpose of the `_meta` key in dynamic inventory script outpu...
Q04JUNIOR
How can you debug which variables a specific host will receive from inve...
Q05SENIOR
Explain how to use the `aws_ec2` inventory plugin to group EC2 instances...
Q06SENIOR
What is the variable precedence order from lowest to highest? Name at le...
Q07JUNIOR
How do you manage secrets in inventory files?
Q08SENIOR
What happens if a dynamic inventory script returns an empty `ansible_use...
Q01 of 08JUNIOR

What is the difference between INI and YAML inventory formats? When would you use each?

ANSWER
INI is the older format, using sections like [group], [group:children], and [group:vars]. YAML is more expressive: nested groups and variables are written as keys under each group. Prefer YAML once an inventory has nested groups or more than a handful of hosts (around 10 is a reasonable rule of thumb). INI is acceptable for very simple flat inventories but becomes error-prone as the structure grows.
FAQ · 8 QUESTIONS

Frequently Asked Questions

01
Can I mix INI and YAML inventory files in the same inventory directory?
02
How do I set a variable for all hosts in inventory?
03
What is the difference between `host_vars` and `group_vars`?
04
How do I use a dynamic inventory script?
05
Why does my host get a variable value I didn't expect?
06
Can I use environment variables in inventory files?
07
How do I create a group that contains other groups in YAML?
08
What is the `ungrouped` group?