AutoSys Architecture: The 1 Component That Stops All Jobs
- AutoSys has four core components: Event Server (database), Event Processor (scheduler), Remote Agents (job executors), and client tools.
- The Event Server is the single source of truth — every job definition, event, and status lives there.
- The Event Processor continuously evaluates job conditions and triggers agents — it never executes jobs directly.
- AutoSys components: Event Server (database of all jobs/events), Event Processor (scheduler daemon), Remote Agent (job executor), client tools (jil, autorep, sendevent)
- Event Server stores all job definitions, event history, machine definitions; source of truth for entire AutoSys environment
- Event Processor runs continuously, evaluates conditions, triggers jobs on Remote Agents — only ONE per AutoSys instance
- Performance: the Event Processor polls the Event Server at roughly 30-second intervals, so allow for that delay in any real-time alerting
- Production trap: Remote Agent machine runs out of disk space — agent crashes, all jobs on that machine go PEND_MACH (stuck), no auto-recovery
- Biggest mistake: Running multiple Event Processors — corrupts job state, leads to duplicate job execution
AutoSys Component Debug Cheat Sheet
- Jobs stuck in PEND_MACH: all jobs on one machine
  - ps -ef | grep -i autosys_agent
  - df -h /var /tmp /opt
- No jobs starting anywhere: Event Processor likely down
  - autoping
  - ps -ef | grep eventor
- Jobs stuck in RUNNING but log shows they completed
  - tail -100 /var/log/autosys/agent.log | grep -i error
  - telnet EVENT_SERVER_HOST 7777 (7777 is unconfirmed as the default; check the agent config for the actual Event Server port)
- Job status inconsistent: duplicates, missing events
  - ps -ef | grep eventor | grep -v grep | wc -l
  - autorep -J % -q | grep -i duplicate
- High Event Server CPU: slow job scheduling
  - echo "SELECT table_name, num_rows FROM user_tables WHERE table_name LIKE 'AE%';" | sqlplus autosys_user@autosys_db
  - ls -lh $AUTOSYS/log/event_processor.log
Production Incident
[...] /var/log/autosys/agent.log by default. A misconfigured application job generated 50 GB of debug output overnight, filling /var. When the disk reached 100%, the agent service tried to write to the log, failed, and crashed. The Event Processor sent a heartbeat check to the agent, got no response, and marked all jobs on that machine as PEND_MACH (pending machine). No new jobs could start on that machine. Jobs that were already running kept running, but the crashed agent could not report their completion, so they stayed stuck in RUNNING until a human intervened.
[...] logrotate with compression and 7-day retention.
3. Set disk_check_interval in agent config to check free space before writing.
4. For the offending job, limited log output to 100MB and added log rotation in the script.
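5. Added a cron job that restarts the AutoSys agent if it's down: if ! ps -ef | grep -q '[a]utosys_agent'; then /etc/init.d/autosys_agent start; fi. The [a] bracket stops grep from matching its own process, which would otherwise make the check always report the agent as running.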
6. Documented PEND_MACH resolution steps: check disk space, restart agent, then sendevent -E FORCE_STARTJOB for stuck jobs.
Production Debug Guide
Symptom → Action mapping for common AutoSys architecture failures.
- [...] ps -ef | grep autosys_agent. Check disk space on the agent machine: df -h /var. Check the agent log: /var/log/autosys/agent.log. Restart the agent: /etc/init.d/autosys_agent restart. Then force-start stuck jobs: sendevent -E FORCE_STARTJOB -J job_name.
- [...] autoping. If down, restart: eventor. Also check Event Server database connectivity: isql -S autosys (Sybase). Check whether the Event Processor process exists: ps -ef | grep eventor.
- [...] $AUTOSYS/log/event_processor.log. Look for 'lost event' or 'queue overflow'. Increase the Event Server buffer size or purge old events.
- [...] ps -ef | grep eventor | grep -v grep | wc -l should be 1. If it is greater than 1, kill the duplicate processes. Prevent this by using flock on the eventor lock file.
- [...] max_threads in configuration.

Before you write a single line of JIL or schedule your first job, it helps to understand how AutoSys actually works under the hood. The architecture is straightforward, but knowing what each component does, and why, will save you a lot of head-scratching when things go wrong in production.
AutoSys has four major components that work together: the Event Server, the Event Processor, Remote Agents, and client tools. Each has a clear job, and understanding the flow between them makes debugging much easier.
By the end you'll know exactly how job definitions flow from JIL to Event Server to Event Processor to Remote Agent and back. You'll understand what PEND_MACH means and why it's the most common production issue. And you'll know the component that, when it fails, stops all job scheduling.
The Event Server — the source of truth
The Event Server is a relational database (typically Sybase or Oracle) that stores everything AutoSys needs to operate. This includes all job definitions (what to run, when, where, under which conditions), all events that have occurred (job started, job succeeded, job failed), global variable values, machine definitions, calendar definitions, and monitor and report definitions.
When a job finishes and reports its status, that status goes into the Event Server. When the Event Processor needs to know whether a dependent job's condition is met, it queries the Event Server. It's the single source of truth for the entire AutoSys environment.
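To make "what to run, when, where, under which conditions" concrete, here is a minimal sketch of the kind of job definition that ends up in the Event Server. It only generates JIL text. The script path, owner, and the upstream job load_data are hypothetical placeholders, and the attributes shown are a small illustrative subset rather than a complete definition.

```shell
#!/bin/sh
# Sketch: print a JIL definition for a simple command job.
# Hypothetical values: command path, owner, and the upstream job "load_data".
make_job_jil() {
    job_name=$1   # what the job is called
    machine=$2    # where it runs
    start=$3      # when it runs (HH:MM)
    cat <<EOF
insert_job: ${job_name}   job_type: CMD
command: /opt/reports/run_daily.sh
machine: ${machine}
owner: batch
date_conditions: 1
days_of_week: all
start_times: "${start}"
condition: s(load_data)
EOF
}

# Print the definition; on a host with the client tools installed,
# the same output could be piped into the jil command instead.
make_job_jil daily_report prod-server-01 02:00
```

Each attribute maps to one of the questions above: command is the what, start_times the when, machine the where, and condition the dependency that must be satisfied first.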
[...] db_purge_events) to keep query performance acceptable.
The Event Processor — the brain
The Event Processor (also called the scheduler or the event daemon) is the most important component. It runs continuously, polling the Event Server for events. When it detects that a job's starting conditions are met — the right time has arrived, dependent jobs have succeeded, the machine is available — it triggers the job to run on the appropriate agent.
The Event Processor also handles time-based scheduling, evaluates job condition logic, and manages the overall state machine for each job. On Unix/Linux it's started with the eventor command. There is only ever one Event Processor running per AutoSys instance.
# Start the AutoSys event processor (UNIX only)
eventor
# Check if AutoSys components are up
autoping
# Check AutoSys flags and system status
autoflags -a
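Because only one Event Processor should run per instance, a monitoring check can count matching processes and alert on anything other than one. The sketch below assumes the scheduler appears in the process table under the command name eventor, as the ps commands in this article imply; the function names are illustrative. It reads bare command names on stdin, which avoids the classic problem of grep counting its own process.

```shell
#!/bin/sh
# Sketch: count processes whose command name is exactly "eventor".
# Input: one command name per line, e.g. from `ps -eo comm=`.
count_eventor() {
    awk '$1 == "eventor" { n++ } END { print n + 0 }'
}

# Report the state and return non-zero unless exactly one is running.
check_eventor() {
    n=$(count_eventor)
    case $n in
        1) echo "OK: one Event Processor running" ;;
        0) echo "ALERT: no Event Processor running"; return 1 ;;
        *) echo "ALERT: $n Event Processors running"; return 2 ;;
    esac
}

# Usage on a live host:
#   ps -eo comm= | check_eventor
```

Distinct return codes for "none" and "more than one" let a wrapper page differently for an outage than for a duplicate scheduler.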
[...] if [ "$(ps -ef | grep eventor | grep -v grep | wc -l)" -ne 1 ]; then alert; fi.
Remote Agents — where the work actually happens
Remote Agents run on every machine where AutoSys needs to execute jobs. When the Event Processor decides a job should run, it sends a message to the Remote Agent on the target machine. The agent starts the process, monitors it, captures the exit code, and reports the result back to the Event Server.
Agents can be extended with plugins for specific integrations — SAP, Oracle E-Business, PeopleSoft, and others. If an agent goes down, jobs that are supposed to run on that machine go into PEND_MACH status and wait until the agent comes back up.
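Since a full filesystem is one of the ways an agent machine ends up with jobs in PEND_MACH, a small disk check on each agent host can raise the alarm early. This is a sketch under stated assumptions: the 90% threshold and the function names are illustrative, and /var is used as the default path only because this article's agent log lives there.

```shell
#!/bin/sh
# Sketch: warn before a full filesystem takes the agent down.
# Parses `df -P` output: line 2, column 5 is the capacity (e.g. "87%").
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# $1 = usage percent, $2 = threshold percent (illustrative default: 90)
disk_ok() {
    [ "$1" -lt "${2:-90}" ]
}

# $1 = path to check (default /var), $2 = threshold percent
check_agent_disk() {
    path=${1:-/var}
    pct=$(df -P "$path" | usage_pct)
    if disk_ok "$pct" "${2:-90}"; then
        echo "OK: $path at ${pct}%"
    else
        echo "ALERT: $path at ${pct}% - agent log writes may fail"
        return 1
    fi
}

# Usage on an agent host:
#   check_agent_disk /var 90
```

Using df -P keeps each filesystem on a single line, so the column position stays stable even with long device names.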
Client tools — how you interact with AutoSys
Client tools are the interfaces you use to define, manage, and monitor jobs. The main ones are: jil — the command-line JIL processor for creating and modifying job definitions; autorep — reports job status and definitions; sendevent — manually triggers events like starting a job or putting it on hold; autostatus — checks the current status of a specific job; and the WCC Web UI — a browser-based dashboard for monitoring job flows visually.
Most experienced AutoSys administrators work primarily from the command line using jil, autorep, and sendevent. The GUI is useful for monitoring and for people less comfortable with CLI.
# Check status of a specific job
autostatus -J daily_report
# Get a detailed report on a job
autorep -J daily_report -d
# List all jobs in a box
autorep -J box_name%
# Check machine status
autorep -M prod-server-01
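The same client tools can be scripted. The sketch below lists which of a given set of jobs are currently in PEND_MACH. It assumes that autostatus -J prints a single status word such as SUCCESS, RUNNING, or PEND_MACH; verify that against the output on your own installation before relying on it. The function name and job names are placeholders.

```shell
#!/bin/sh
# Sketch: print the names of the given jobs that are in PEND_MACH.
# Assumption: `autostatus -J <job>` prints one status word on stdout.
pend_mach_jobs() {
    for job in "$@"; do
        status=$(autostatus -J "$job" 2>/dev/null)
        if [ "$status" = "PEND_MACH" ]; then
            echo "$job"
        fi
    done
}

# Usage (job names are placeholders):
#   pend_mach_jobs daily_report nightly_load
```

Passing job names explicitly keeps the sketch independent of any particular report layout; a real script would probably build the list from a naming convention or a box.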
[...] autorep -J % -d for monitoring, sendevent -E FORCE_STARTJOB for recovery. Avoid manual GUI actions in automated recovery procedures.

| Component | Type | Runs On | Key Responsibility | Failure Impact | How to Monitor |
|---|---|---|---|---|---|
| Event Server | Database (Sybase/Oracle) | Dedicated server | Stores all job definitions, events, state | Catastrophic — no job definitions can be read, no status updates | Check database connectivity, table sizes, I/O latency, dual server sync |
| Event Processor | Daemon/Service | AutoSys server | Evaluates conditions, triggers jobs | Severe: no new jobs start, running jobs continue | autoping, ps -ef \| grep eventor, check log for errors |
| Remote Agent | Service | Every target machine | Executes jobs, reports results | Local — jobs on that machine go PEND_MACH, other machines unaffected | Check process exists, disk space, network connectivity to Event Server |
| jil | CLI client | Any client machine | Define/modify job definitions | None (if jil fails, use another client) | N/A (tool exits with non-zero code on error) |
| autorep | CLI client | Any client machine | Report job status and definitions | None (use another client) | N/A |
| sendevent | CLI client | Any client machine | Manually trigger events (START, HOLD, etc.) | None (use another client) | N/A |
| WCC (Web UI) | Browser GUI | Browser | Visual monitoring and management | None (use CLI if GUI down) | Check WCC service status, HTTP response |
🎯 Key Takeaways
- AutoSys has four core components: Event Server (database), Event Processor (scheduler), Remote Agents (job executors), and client tools.
- The Event Server is the single source of truth — every job definition, event, and status lives there.
- The Event Processor continuously evaluates job conditions and triggers agents — it never executes jobs directly.
- Remote Agents run on target machines and execute the actual work, reporting results back to the Event Server.
- PEND_MACH is one of the most common production issues and is caused by agent machines going offline or running out of disk space.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- Q: What are the four main components of AutoSys architecture? (Junior)
- Q: What does the Event Processor do and how does it interact with the Event Server? (Mid-level)
- Q: What happens to jobs when a Remote Agent machine goes down? (Mid-level)
- Q: What is PEND_MACH status and what causes it? (Junior)
- Q: Can you run multiple Event Processors for the same AutoSys instance? (Senior)
Frequently Asked Questions
What database does AutoSys use for the Event Server?
AutoSys traditionally used Sybase as its backend database. Newer versions also support Oracle. The Event Server stores all job definitions, events, and system state.
What happens if the Event Processor crashes?
If the Event Processor goes down, no new jobs will be triggered. Jobs that are already running will continue until they complete. AutoSys supports a shadow scheduler that can take over if the primary Event Processor fails, providing high availability.
Can I run AutoSys jobs on Windows machines?
Yes. AutoSys Remote Agents are available for both Unix/Linux and Windows. You can schedule jobs to run on Windows machines the same way as Unix machines, though some command-line tools like eventor are Unix-only.
What is the difference between the Event Server and the Event Processor?
The Event Server is the database that stores all data. The Event Processor is the daemon that reads from the Event Server, evaluates job conditions, and triggers jobs. One is data storage; the other is the decision engine.
How do I recover from PEND_MACH status?
First, check the agent machine: disk space (df -h), agent process (ps -ef | grep autosys_agent), and agent logs (/var/log/autosys/agent.log). Restart the agent if needed: /etc/init.d/autosys_agent restart. Then force-start the stuck jobs: sendevent -E FORCE_STARTJOB -J job_name. The jobs then run immediately instead of sitting in PEND_MACH.
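The final step can be wrapped in a small helper so that a recovery runbook previews what it is about to do. This sketch uses the sendevent -E FORCE_STARTJOB -J form shown in this article's debug guide. The DRY_RUN variable and the function name are illustrative, and dry-run is the default here so that nothing is triggered by accident.

```shell
#!/bin/sh
# Sketch: force-start a list of stuck jobs once the agent is healthy again.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
force_start_jobs() {
    for job in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sendevent -E FORCE_STARTJOB -J $job"
        else
            sendevent -E FORCE_STARTJOB -J "$job"
        fi
    done
}

# Preview first, then run for real (job names are placeholders):
#   force_start_jobs daily_report nightly_load
#   DRY_RUN=0 force_start_jobs daily_report nightly_load
```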
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.