AutoSys Alarms and Notifications — Alerting When Jobs Fail
AutoSys has a built-in alarm system that lets you define exactly what events should trigger alerts and who should be notified. Without alarms, your batch jobs could silently fail overnight and nobody would know until users start reporting missing reports in the morning. With well-configured alarms, your team knows within minutes.
alarm_if_fail — the basic failure alert
The simplest alarm is alarm_if_fail. Set it to 1 on any job, and AutoSys raises an alarm in the Event Server when that job fails. You can then configure alarm actions to send an email, page someone, or invoke a script.
    insert_job: critical_eod_job
    job_type: CMD
    command: /scripts/critical_eod.sh
    machine: prod-server-01
    owner: batchuser
    date_conditions: 1
    days_of_week: all
    start_times: "22:00"
    alarm_if_fail: 1          /* raise alarm if job fails */
    max_run_alarm: 60         /* also alarm if still running after 60 minutes */
    min_run_alarm: 5          /* alarm if completes in less than 5 minutes (suspicious) */
    alarm_if_terminated: 1    /* alarm if job is killed/terminated */
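Because failure alarms are switched on per job, a small check over exported JIL can catch jobs that lack one. The following is a minimal sketch, not an AutoSys tool: it assumes the JIL text has one attribute per line, treats only the literal value `1` as "enabled", and uses a hypothetical second job, `nightly_cleanup`, for illustration.

```python
import re

# Matches a JIL "key: value" pair at the start of a line (assumes one attribute per line).
ATTR = re.compile(r"^\s*([A-Za-z_]+)\s*:\s*(.*)$")
# Matches /* ... */ comments so they do not end up inside attribute values.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def parse_jil(text):
    """Split JIL text into one dict of attributes per insert_job block."""
    jobs = []
    for line in COMMENT.sub("", text).splitlines():
        match = ATTR.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "insert_job":
            jobs.append({"insert_job": value})
        elif jobs:
            jobs[-1][key] = value
    return jobs


def jobs_missing_failure_alarm(text):
    """Return the names of jobs that do not set alarm_if_fail to 1."""
    return [job["insert_job"] for job in parse_jil(text)
            if job.get("alarm_if_fail") != "1"]


# nightly_cleanup is a hypothetical job added only to show a missing alarm.
SAMPLE = """
insert_job: critical_eod_job
job_type: CMD
command: /scripts/critical_eod.sh
alarm_if_fail: 1  /* raise alarm if job fails */

insert_job: nightly_cleanup
job_type: CMD
command: /scripts/cleanup.sh
"""

print(jobs_missing_failure_alarm(SAMPLE))  # ['nightly_cleanup']
```

A check like this could run in a review step before definitions are loaded, so the missing attribute is caught before the job ever fails silently.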
Notification attributes — email on failure
For direct email notification without configuring a separate alarm action, use the notification attributes. These send an email when the job fails, and the _on_success variants send one when the job completes successfully.
    insert_job: payroll_run
    job_type: CMD
    command: /scripts/payroll.sh
    machine: finance-server
    owner: finuser
    date_conditions: 1
    days_of_week: fri
    start_times: "18:00"
    alarm_if_fail: 1
    /* Email notification on failure */
    notification_emailaddress: batch-ops@company.com,payroll-lead@company.com
    notification_emailaddress_on_success: payroll-mgr@company.com
    notification_msg: "ALERT: AutoSys job %s FAILED on %m at %t. Check log: /logs/autosys/payroll_run.err"
    notification_msg_on_success: "INFO: Payroll run %s completed successfully at %t"
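To see what an on-call engineer would actually read, it can help to preview a message template before deploying it. The sketch below is a local preview helper only, not AutoSys's own substitution logic. It takes the placeholder meanings from this article (%s job name, %m machine name, %t timestamp, %x exit code), and the function name and timestamp value are hypothetical.

```python
import re


def render_notification(template, job, machine, timestamp, exit_code=""):
    """Expand %s, %m, %t and %x using the meanings described in this article."""
    values = {"%s": job, "%m": machine, "%t": timestamp, "%x": str(exit_code)}
    # A single regex pass avoids re-expanding tokens that appear inside a value.
    return re.sub(r"%[smtx]", lambda match: values[match.group(0)], template)


template = ("ALERT: AutoSys job %s FAILED on %m at %t. "
            "Check log: /logs/autosys/payroll_run.err")

# The timestamp is a made-up value for illustration.
print(render_notification(template, "payroll_run", "finance-server", "2024-01-05 18:07"))
# ALERT: AutoSys job payroll_run FAILED on finance-server at 2024-01-05 18:07. Check log: /logs/autosys/payroll_run.err
```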
Machine and system alarms
Beyond job-level alarms, AutoSys can alarm on machine events — when an agent goes MISSING or when the Event Processor has issues.
    /* Configure machine-level alarms */
    update_machine: prod-server-01
    max_load: 100
    alarm_on_missing: 1    /* alarm when agent goes offline */

    /* View active alarms */
    # autorep -a                 /* show all active alarms */
    # sendevent -E ALARM_ACK     /* acknowledge an alarm */
| Alarm type | Attribute | Triggers when |
|---|---|---|
| Job failure alarm | alarm_if_fail: 1 | Job ends in FAILURE status (typically a non-zero exit code) |
| Long run alarm | max_run_alarm: N | Job still running after N minutes |
| Short run alarm | min_run_alarm: N | Job completes in less than N minutes |
| Termination alarm | alarm_if_terminated: 1 | Job is killed (KILLJOB or term_run_time) |
| Machine offline alarm | alarm_on_missing: 1 | Agent machine stops responding |
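The two runtime bounds in the table can be read as a simple rule over elapsed minutes. The sketch below models that rule as described here. It is an illustration rather than the scheduler's implementation, and the function name and the labels `long_run` and `short_run` are hypothetical.

```python
def runtime_alarms(elapsed_minutes, finished, min_run_alarm=None, max_run_alarm=None):
    """Return which runtime alarms would apply to a job, per the table above.

    elapsed_minutes: minutes since the job started (total runtime if finished).
    finished: True once the job has completed.
    """
    alarms = []
    # Long run: the job was still running once it passed max_run_alarm minutes.
    if max_run_alarm is not None and elapsed_minutes > max_run_alarm:
        alarms.append("long_run")
    # Short run: the job completed in fewer than min_run_alarm minutes.
    if min_run_alarm is not None and finished and elapsed_minutes < min_run_alarm:
        alarms.append("short_run")
    return alarms


# Bounds taken from the critical_eod_job example: min 5 minutes, max 60 minutes.
print(runtime_alarms(3, True, min_run_alarm=5, max_run_alarm=60))    # ['short_run']
print(runtime_alarms(75, False, min_run_alarm=5, max_run_alarm=60))  # ['long_run']
print(runtime_alarms(30, True, min_run_alarm=5, max_run_alarm=60))   # []
```

Neither bound stops the job. Both only raise an alarm, which is why a separate attribute such as term_run_time is needed to kill a runaway job.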
🎯 Key Takeaways
- Set alarm_if_fail: 1 on all critical jobs — the default is 0 (no alarm)
- Use notification_emailaddress for direct email alerts; include log file paths in notification_msg
- max_run_alarm and min_run_alarm provide bounds-based alerting for jobs running too long or suspiciously fast
- Alarms need a response process — sending to a shared mailbox nobody monitors defeats the purpose
⚠ Common Mistakes to Avoid
- ✕ Not setting alarm_if_fail: 1 on critical jobs — the default is 0, meaning no alert
- ✕ Not including %x (exit code) and the log path in notification_msg — on-call engineers need to know where to look immediately
- ✕ Setting max_run_alarm without investigating when it fires — alarm fatigue sets in if it triggers every day without action
- ✕ Sending failure emails to a shared mailbox nobody actively monitors — alarms need a response process
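The second mistake, a notification_msg that lacks the exit code or a log path, is easy to check for mechanically. Here is a rough sketch under stated assumptions: the path pattern is a simple heuristic for absolute Unix paths, and the function name is hypothetical.

```python
import re

# Heuristic only: an absolute Unix path with at least two components,
# such as /logs/autosys/payroll_run.err.
LOG_PATH = re.compile(r"(?<!\S)/[\w.\-]+(?:/[\w.\-]+)+")


def notification_msg_gaps(msg):
    """List what a failure message is missing for an on-call engineer."""
    gaps = []
    if "%x" not in msg:
        gaps.append("exit code placeholder %x")
    if not LOG_PATH.search(msg):
        gaps.append("log file path")
    return gaps


msg = ("ALERT: AutoSys job %s FAILED on %m at %t. "
       "Check log: /logs/autosys/payroll_run.err")
print(notification_msg_gaps(msg))  # ['exit code placeholder %x']
```

Applied to the payroll_run message shown earlier, the check reports that the exit code placeholder is absent, while the log path is present.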
Interview Questions on This Topic
- Q: How do you configure AutoSys to send an email when a job fails?
- Q: What does max_run_alarm do?
- Q: What variables can you use in AutoSys notification_msg?
- Q: What does alarm_if_fail: 0 mean (the default)?
- Q: How do you acknowledge an AutoSys alarm?
Frequently Asked Questions
How do I get notified when an AutoSys job fails?
Set alarm_if_fail: 1 and add notification_emailaddress: your-team@company.com to the job definition. Include a notification_msg with the log file path so on-call engineers know where to look.
What is max_run_alarm in AutoSys?
max_run_alarm specifies a runtime threshold in minutes. If the job is still running after that many minutes, AutoSys raises an alarm. It does not kill the job (that is what term_run_time does); it only alerts the team that the job is taking longer than expected.
What are the notification message variables in AutoSys?
AutoSys supports: %s (job name), %m (machine name), %t (timestamp), %x (exit code). Use these in notification_msg and notification_msg_on_success to make alert emails immediately informative.
What is the default value of alarm_if_fail?
The default is 0, which means no alarm is raised on failure. You must explicitly set alarm_if_fail: 1 on jobs where you want failure alerts. Many teams make this a required attribute in their job definition standards.
How do I acknowledge an AutoSys alarm?
Use sendevent -E ALARM_ACK or acknowledge through the WCC interface. Unacknowledged alarms accumulate in the alarm list. Establishing an alarm acknowledgement process is important for keeping the alarm list meaningful.