Senior 3 min · March 19, 2026

AutoSys Interview Questions — 50 Bank & Insurance Q&As

ON_HOLD runs immediately on release; ON_ICE waits for next cycle.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • ON_HOLD: releasing starts job immediately if conditions met. ON_ICE: releasing waits for next scheduling cycle. Most common wrong answer.
  • PEND_MACH = agent unreachable. First check: disk space on agent (df -h). 90% of cases.
  • date_conditions defaults to 0 (time scheduling disabled). Most people assume it's 1.
  • FORCE_STARTJOB bypasses ALL conditions (time AND dependencies). STARTJOB respects them.
  • box_terminator: 1 stops entire box when job fails. Use on validation jobs only.
  • Global variables: SET_GLOBAL writes, autostatus -G reads, value() in JIL conditions.
Plain-English First

This article collects the AutoSys questions that actually come up in interviews at banks, insurance companies, and enterprise IT shops — and gives you the complete, correct answers, not the vague half-answers you'll find elsewhere.

AutoSys interviews are specific. Interviewers know the tool. Vague answers about 'scheduling jobs' fail.

This guide assumes you've worked through the other articles in this track. It's your review. The questions are organised from foundational to advanced. The answers are complete, not truncated.

The most common wrong answer? ON_HOLD vs ON_ICE. That question appears in almost every interview. Get it right.

Architecture and concepts

These questions test whether you understand what AutoSys actually is and how it works internally. They're usually early in the interview to establish baseline knowledge.

architecture_qa.txt
Q: What is AutoSys and what problem does it solve?
A: AutoSys is Broadcom's enterprise workload automation platform for scheduling,
   monitoring, and orchestrating batch jobs across multiple servers. It solves the
   scalability problems of cron: dependency management, centralised visibility, alerting,
   audit trails, and multi-server coordination.

Q: What are the main components of AutoSys architecture?
A: Event Server (database storing all definitions and events), Event Processor
   (scheduling daemon that evaluates conditions and triggers agents), Remote Agents
   (lightweight processes on each target machine), and Clients (CLI tools + WCC web UI).

Q: What happens when the Event Processor goes down?
A: Job triggering stops. Jobs that are currently RUNNING continue to completion (the
   agent handles execution independently), but no new jobs will be triggered until
   the Event Processor is restarted.
Interview tip — Event Processor vs Event Server
Interviewers often ask 'what's the difference?' The Event Server is the database (storage). The Event Processor is the daemon (evaluation). One stores state, the other triggers jobs.
Production Insight
A candidate answered 'The Event Processor writes to the Event Server.' That's backwards. The Event Processor reads from the Event Server. The Event Server is written to by agents and sendevent commands. The processor is stateless.
The interviewer asked a follow-up: 'If the Event Server goes down, do running jobs continue?' The candidate didn't know. Answer: Yes — the agent runs jobs independently. But job completion status cannot be written back.
Rule: Know which component does what. If you confuse direction, you fail the architecture section.
Key Takeaway
Event Server = database (storage). Event Processor = daemon (evaluation).
Event Processor down = no new jobs. Running jobs continue.
Agent down = jobs on that machine go to PEND_MACH. Other agents fine.
Know the failure modes: silent, not sudden.
Component failure: what happens to jobs?
  • Event Processor crashes: running jobs continue. No new jobs start. Status updates queue.
  • Event Server unreachable: running jobs continue. Completion status can't be saved. The agent may retry.
  • Remote Agent on a machine is down: jobs for that machine go to PEND_MACH. Other machines are unaffected.
  • Network between server and agent is down: jobs for that machine go to PEND_MACH. The agent can't start jobs or report status.
[Figure: AutoSys Interview Topic Map, grouped by category (know these cold). Architecture: Event Server vs Processor, component roles, HA / shadow server, PEND_MACH causes. JIL Commands: insert vs update, delete vs delete_box, autorep -q backup, override_job. Job Types: CMD / BOX / FW differences, box_name attribute, box_terminator use, FW minimum file size. Status Codes: all abbreviations, ON_HOLD vs ON_ICE, ACTIVATED meaning, TERMINATED causes. Scheduling: date_conditions gate, run_window purpose, run_calendar setup, timezone handling. Troubleshooting: failure workflow, PEND_MACH → disk check, restart procedure, CHANGE_STATUS use.]

JIL and job operations

These test practical JIL knowledge — what interviewers really want to know is whether you've actually used the tool, not just read about it.

jil_operations_qa.txt
Q: What is the difference between insert_job and update_job?
A: insert_job creates a new job definition — fails if job already exists.
   update_job modifies an existing job (partial update, only changed attributes).
   Fails if job doesn't exist.

Q: What is the difference between delete_job and delete_box?
A: delete_job on a box removes only the box, leaving inner jobs as standalone.
   delete_box removes the box AND all its inner jobs.

Q: How do you back up AutoSys job definitions?
A: autorep -J % -q > backup_$(date +%Y%m%d).jil
   This dumps all job definitions in JIL format to a file.

Q: How do you view the JIL definition of an existing job?
A: autorep -J jobname -q

Q: What does FORCE_STARTJOB do differently from STARTJOB?
A: FORCE_STARTJOB starts the job immediately bypassing all conditions
   (date_conditions, start_times, condition attribute). STARTJOB only triggers
   if conditions are currently met.
Most missed JIL question: delete_job vs delete_box
On a box: delete_job removes the box container. Inner jobs become standalone. delete_box removes box AND inner jobs. This is a common trick question — if you say 'delete_job removes the box and its jobs', you're wrong.
Production Insight
An operations engineer used delete_job on a production box thinking it would remove all inner jobs. It didn't. The box vanished. All inner jobs became orphaned standalone jobs. They continued running on their own schedules, independent of dependencies.
A trading settlement job ran 4 hours early because its parent box was gone. The box had enforced a start time. Without the box, the job ran at its own start time — which was 2 PM, not 6 PM.
Recovery: regenerate box definition from backup (autorep -J boxname -q had been saved). Reinsert box. Reassociate inner jobs with box_name attributes.
Rule: Always have current JIL backups. autorep -J % -q weekly. Delete box? Use delete_box or expect orphaned jobs.
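The backup rule above could be scripted roughly as follows. This is a minimal sketch, not a hardened backup job: the helper name backup_file is made up for illustration, the dump lands in the current directory, and the autorep call is guarded so the script still runs on a machine without the AutoSys client installed.

```shell
#!/usr/bin/env bash
# Sketch: date-stamped JIL backup using the autorep command from the Q&A above.
# backup_file is a hypothetical helper name, not part of AutoSys.

backup_file() {
  # Prints a name such as backup_20260319.jil
  printf 'backup_%s.jil\n' "$(date +%Y%m%d)"
}

target=$(backup_file)

if command -v autorep >/dev/null 2>&1; then
  # Dump every job definition in JIL format.
  autorep -J % -q > "$target"
  # An empty dump is not a backup: warn and remove it.
  if [ ! -s "$target" ]; then
    echo "empty backup, removing: $target" >&2
    rm -f "$target"
  fi
else
  echo "autorep not found; would have written $target" >&2
fi
```

Scheduling this weekly (for example as an AutoSys job of its own) and keeping the files outside the AutoSys host is a design choice left to the environment.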
Key Takeaway
insert_job vs update_job: create vs modify. delete_job vs delete_box: box-only vs box+children.
Backups: autorep -J % -q > backup.jil — do this weekly.
FORCE_STARTJOB bypasses ALL conditions. STARTJOB respects them.
JIL is case-sensitive on Linux. JOB vs Job are different.
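To make the insert_job vs update_job distinction concrete, the sketch below writes two small JIL files without loading them. The job name DEMO_EXTRACT, the machine name agent01, and the /opt/demo path are hypothetical; in a real environment each file would be loaded with the jil command (jil < file).

```shell
#!/usr/bin/env bash
# Sketch only: writes JIL to files for review; nothing is sent to AutoSys.
# DEMO_EXTRACT, agent01 and /opt/demo/extract.sh are hypothetical names.
set -euo pipefail

workdir=$(mktemp -d)

# insert_job: a full definition. Loading it fails if DEMO_EXTRACT already exists.
cat > "$workdir/insert_demo.jil" <<'EOF'
insert_job: DEMO_EXTRACT
job_type: CMD
command: /opt/demo/extract.sh
machine: agent01
date_conditions: 1
days_of_week: all
start_times: "18:00"
std_err_file: /tmp/DEMO_EXTRACT.err
EOF

# update_job: only the attributes that change. Loading it fails if the job is missing.
cat > "$workdir/update_demo.jil" <<'EOF'
update_job: DEMO_EXTRACT
start_times: "19:30"
EOF

echo "JIL written to $workdir"
```

Note that date_conditions is set to 1 explicitly: with the default of 0, the start_times line would be ignored.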

Status codes and troubleshooting

These test operational knowledge — have you actually been on-call for an AutoSys environment? Interviewers love status code questions because they separate theory from practice.

status_trouble_qa.txt
Q: What does PEND_MACH mean and what usually causes it?
A: PEND_MACH (PE) means the Remote Agent on the target machine is unavailable.
   Most common cause: the agent machine's filesystem is 100% full, stopping the
   agent service. Check disk space first: ssh machine01 'df -h'

Q: What is the difference between ON_HOLD and ON_ICE?
A: ON_HOLD: releasing (OFF_HOLD) starts the job immediately if conditions are currently met.
   ON_ICE: releasing (OFF_ICE) makes the job wait for conditions to reoccur in the
   next scheduling cycle — it does not start immediately.

Q: A job was failing every night for a week. What's your troubleshooting approach?
A: 1. Check std_err_file for the error pattern
   2. Check if it's always the same exit code (consistent root cause)
   3. Check autorep -J jobname -run 7 to compare recent runs
   4. Check if it correlates with system events (deployments, maintenance)
   5. Engage the application team who owns the script

Q: How do you unblock downstream jobs after manually fixing a failed job?
A: sendevent -E CHANGE_STATUS -J fixed_job -s SUCCESS
   This marks the job SUCCESS so all downstream success() conditions are met.
ON_HOLD vs ON_ICE — the most common wrong answer
Most candidates say 'they're the same'. They're not. OFF_HOLD starts immediately. OFF_ICE waits for next schedule. If you get this wrong, you fail the status section. Know it cold.
Production Insight
A candidate correctly defined ON_HOLD vs ON_ICE. Then the interviewer asked: 'You have a job that runs at midnight. At 2 PM, you put it ON_ICE. At 3 PM, you release it. When does it run?'
The candidate thought: immediately. Wrong. ON_ICE release waits for the next scheduling cycle — midnight. The job ran at midnight, not 3 PM.
The candidate would have failed the real scenario. Operational experience matters more than definitions.
Rule: ON_HOLD = manual overrides during the day. ON_ICE = permanent schedule changes or avoiding out-of-cycle runs.
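The rule above can be turned into a small helper that prints the matching pair of sendevent commands. It is a dry-run sketch: pause_commands is a made-up function name, NIGHTLY_RECON is a hypothetical job, and nothing is sent to AutoSys. JOB_ON_HOLD, JOB_OFF_HOLD, JOB_ON_ICE and JOB_OFF_ICE are the sendevent event names for these actions.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) the pause/release commands for a job.
# pause_commands and NIGHTLY_RECON are hypothetical names.

pause_commands() {
  local job=$1 resume=$2   # resume: "immediate" or "next_cycle"
  case "$resume" in
    immediate)
      # ON_HOLD: on release the job starts if its conditions are met.
      echo "sendevent -E JOB_ON_HOLD -J $job"
      echo "sendevent -E JOB_OFF_HOLD -J $job"
      ;;
    next_cycle)
      # ON_ICE: on release the job waits for its conditions to recur.
      echo "sendevent -E JOB_ON_ICE -J $job"
      echo "sendevent -E JOB_OFF_ICE -J $job"
      ;;
    *)
      echo "usage: pause_commands JOB immediate|next_cycle" >&2
      return 2
      ;;
  esac
}

pause_commands NIGHTLY_RECON immediate
```

Printing instead of executing keeps the choice reviewable, which is useful when the release will happen hours later and by a different operator.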
Key Takeaway
PEND_MACH = agent unreachable. First check: disk space.
ON_HOLD = immediate resume. ON_ICE = next scheduled cycle. Learn it.
CHANGE_STATUS -s SUCCESS unblocks downstream after manual fix.
Troubleshooting = logs + trends + correlation + escalation.
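The "first check: disk space" step can be sketched as a filter over df output. The function name full_filesystems and the 95 percent default are assumptions for illustration; the filter expects the POSIX df -P column layout and does not handle mount points containing spaces.

```shell
#!/usr/bin/env bash
# Sketch: list filesystems at or above a usage threshold.
# full_filesystems is a hypothetical helper; it reads `df -P` output on stdin.

full_filesystems() {
  local threshold=${1:-95}
  # Column 5 is the capacity (e.g. 97%), column 6 the mount point.
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= t) print $6 " " $5
    }'
}

# Against the agent from the Q&A above, this would be:
#   ssh machine01 'df -P' | full_filesystems 95
# Here it runs against the local machine.
df -P | full_filesystems 95
```

An empty result means no filesystem crossed the threshold, in which case the next checks are the agent process and the network path.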

Advanced and scenario questions

These test whether you can reason about AutoSys in complex real-world situations. Senior-level interviews focus heavily on this section.

advanced_qa.txt
Q: Design an AutoSys workflow for end-of-day batch processing.
A: Use a 3-level hierarchy: master box (overall schedule) → section boxes
   (logical groupings: extract, transform, load, report) → CMD jobs inside each
   section box. Include a pre-check job as box_terminator, n_retrys on I/O jobs,
   alarm_if_fail on all critical jobs, and a post-check job to validate output.

Q: What is box_terminator and when would you use it?
A: box_terminator: 1 on a job means if that job fails, the entire parent box
   immediately moves to FAILURE and all remaining inner jobs are skipped.
   Use it on validation/pre-check jobs whose failure makes all downstream
   processing pointless.

Q: How do you handle a scenario where an upstream file sometimes arrives late?
A: Use a File Watcher job (job_type: FW) with a run_window covering the expected
   arrival period and an appropriate watch_file_min_size. The downstream jobs condition
   on success(file_watcher_job). This way processing starts as soon as the file
   arrives rather than at a fixed time that may be too early.

Q: How do you pass data between AutoSys jobs?
A: Using global variables: the upstream script runs sendevent -E SET_GLOBAL
   -G "VAR_NAME=value". Downstream jobs read it via autostatus -G VAR_NAME or
   reference it in JIL conditions with value(VAR_NAME).
Senior interview tip — mention trade-offs and alternatives
When asked 'how would you design X', don't just give one answer. Say 'Option A is a box with a File Watcher. Option B is a scheduled job with polling. Option A is better because...' Show you can compare approaches.
Production Insight
A senior candidate was asked 'How would you handle a file that arrives in multiple chunks?'
Junior answer: 'Use a File Watcher.'
Senior answer: 'Use a manifest file. Upstream writes one .ready file after all chunks are complete. File Watcher watches .ready. This prevents triggering on partial data. Alternatively, use watch_file_min_size set to the expected final size, but a manifest is more reliable because chunk order is unpredictable.'
The senior answer showed consideration of edge cases, alternatives, and trade-offs. That's what gets the offer.
Rule: At senior level, every answer should include 'it depends' and then explain the trade-offs.
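The manifest approach from the senior answer might look like this on the upstream side. It is a sketch under assumptions: the landing directory is a temp directory standing in for the real one, and the file names trades_part*.dat and trades.ready are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch: upstream writes every chunk first, then publishes one .ready manifest.
# The directory and the trades_* file names are hypothetical.
set -euo pipefail

landing=$(mktemp -d)   # stands in for the real landing directory

# 1. Write all chunks. Order and timing do not matter to the watcher.
for i in 1 2 3; do
  printf 'chunk %s\n' "$i" > "$landing/trades_part$i.dat"
done

# 2. Build the manifest under a temporary name, then rename it.
#    The rename is the single moment the File Watcher can see the file,
#    so it never fires on a half-written manifest.
( cd "$landing" && printf '%s\n' trades_part*.dat ) > "$landing/.manifest.tmp"
mv "$landing/.manifest.tmp" "$landing/trades.ready"

echo "published $landing/trades.ready"
```

The File Watcher job would then point watch_file at the .ready path, and the downstream job can read the manifest to confirm that every listed chunk is present.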
Key Takeaway
EOD workflow: hierarchical boxes + pre-check terminator + post-check validation.
box_terminator on validation jobs only. Optional jobs should never be terminators.
File Watcher for unpredictable arrival times. Must have watch_file_min_size and run_window.
Global variables pass data. Use workflow prefixes to avoid collisions.
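The EOD pattern summarised above can be sketched as JIL for a master box and one section box. All job names, the machine agent01, and the /opt/eod paths are hypothetical, and the script only writes the JIL to a file for review rather than loading it. The transform, load and report sections would follow the same shape.

```shell
#!/usr/bin/env bash
# Sketch: master box -> section box -> CMD jobs, written to a file, not loaded.
# EOD_* names, agent01 and /opt/eod/* are hypothetical.
set -euo pipefail

out=$(mktemp)

cat > "$out" <<'EOF'
/* Master box: owns the overall schedule */
insert_job: EOD_MASTER
job_type: BOX
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "18:00"

/* Section box inside the master box */
insert_job: EOD_EXTRACT_BOX
job_type: BOX
box_name: EOD_MASTER

/* Pre-check: if it fails, the section box stops */
insert_job: EOD_EXTRACT_PRECHECK
job_type: CMD
box_name: EOD_EXTRACT_BOX
command: /opt/eod/precheck.sh
machine: agent01
box_terminator: 1
alarm_if_fail: 1

/* I/O job: waits for the pre-check, retried on failure */
insert_job: EOD_EXTRACT_LOAD
job_type: CMD
box_name: EOD_EXTRACT_BOX
command: /opt/eod/extract.sh
machine: agent01
condition: success(EOD_EXTRACT_PRECHECK)
n_retrys: 2
alarm_if_fail: 1
EOF

echo "JIL written to $out"
```

Only the pre-check carries box_terminator: 1, which matches the takeaway that terminators belong on validation jobs and not on optional work.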
● Production incident · POST-MORTEM · severity: high

The Interview Answer That Didn't Match Production

Symptom
Asked which status to use for a midnight job during a daytime database migration, the candidate answered: 'ON_ICE, because I want the job to wait until the next cycle after the migration.' That's technically correct. But the interviewer wanted to hear 'ON_HOLD, because after the migration finishes, we want the job to run immediately, not wait until midnight.'
Assumption
The candidate memorised definitions but never applied them to real operations. They didn't understand the operational consequence of the difference.
Root cause
ON_HOLD: release triggers an immediate start if conditions are currently true. ON_ICE: release requires time conditions to reoccur in the next scheduling cycle. During a database migration at 2 PM, a job that normally runs at midnight is held. After the migration completes at 4 PM:
  • If ON_HOLD: release runs the job at 4 PM (good: you want validation now).
  • If ON_ICE: release does nothing until midnight (bad: you wait 8 hours to validate).
Fix
The candidate learned the rule: ON_HOLD for temporary pauses during business hours where you want immediate resume. ON_ICE for permanent schedule changes or when you don't want out-of-cycle runs. Interview tip: Always follow definition with 'In production, I would use ON_HOLD when... and ON_ICE when...'
Key lesson
  • Memorised definitions are not enough. Apply them to real scenarios.
  • ON_HOLD = immediate resume. ON_ICE = next scheduled cycle.
  • Database migrations: ON_HOLD. Schedule changes: ON_ICE.
  • Interviewers probe with 'when would you use this?' — always have an example.
Production debug guide: the 'walk me through how you'd fix this' questions (4 entries)
Symptom · 01
Job in PEND_MACH status at 2 AM
Fix
Step 1: SSH to agent machine. df -h (full disk is #1 cause). Step 2: ps -ef | grep autosys (agent running?). Step 3: Check network: telnet server 7520. Answer: Most likely full disk stopping agent.
Symptom · 02
Job shows SUCCESS but data didn't update
Fix
Look for sqlplus without error checking. Check std_out_file for ORA- errors. Answer: sqlplus returns 0 on SQL errors. Always wrap in script that greps for ORA-.
Symptom · 03
File Watcher triggered on empty file
Fix
Check watch_file_min_size. Default is 0. Increase to 1024 bytes or more. Answer: Upstream wrote lock file first.
Symptom · 04
SAP job stuck PENDING, no error
Fix
XBP user password expired or account locked. Check with Basis team. Answer: AutoSys can't see SAP auth failures.
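For the "SUCCESS but data didn't update" entry above, the wrapper idea could be sketched like this. The function name check_sql_log, the DB_CONN variable, and load.sql are assumptions for illustration; the sqlplus call is left as a comment so the sketch runs without an Oracle client.

```shell
#!/usr/bin/env bash
# Sketch: fail the AutoSys job when the SQL log contains ORA- errors.
# check_sql_log, DB_CONN and load.sql are hypothetical names.

check_sql_log() {
  local log=$1
  if grep -Eq 'ORA-[0-9]+' "$log"; then
    echo "SQL errors found in $log:" >&2
    grep -E 'ORA-[0-9]+' "$log" >&2
    return 1
  fi
  return 0
}

# Intended use inside the job's command script:
#   log=/tmp/load_$$.log
#   sqlplus -s "$DB_CONN" @load.sql > "$log" 2>&1
#   check_sql_log "$log" || exit 1
```

A non-zero exit from the wrapper is what lets AutoSys mark the job FAILURE instead of SUCCESS, so downstream success() conditions stay blocked.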
★ Interview Command Recall: Must-Know Syntax
You will be asked these exact commands. Know them cold.
Back up all job definitions
Immediate action
Use autorep with -q flag
Commands
autorep -J % -q > backup_$(date +%Y%m%d).jil
Fix now
This is a complete backup in JIL format
Check why job isn't starting
Immediate action
View JIL definition and status
Commands
autorep -J JOBNAME -q
autorep -J JOBNAME -d
Fix now
-q shows conditions, -d shows status detail
Force-start a job
Immediate action
Use sendevent FORCE_STARTJOB
Commands
sendevent -E FORCE_STARTJOB -J JOBNAME
sendevent -E CHANGE_STATUS -J JOBNAME -s SUCCESS (to unblock downstream)
Fix now
FORCE_STARTJOB bypasses ALL conditions
Set a global variable
Immediate action
Use sendevent SET_GLOBAL
Commands
sendevent -E SET_GLOBAL -G "COUNT=100"
autostatus -G COUNT
Fix now
No spaces around =
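Passing a record count between jobs could be sketched with a helper that builds the SET_GLOBAL command, so the "no spaces around =" rule holds by construction. The function name set_global_cmd and the EOD prefix are hypothetical; the helper only prints the command instead of running sendevent.

```shell
#!/usr/bin/env bash
# Sketch: build (not run) a SET_GLOBAL command with a workflow prefix.
# set_global_cmd and the EOD prefix are hypothetical names.

set_global_cmd() {
  local prefix=$1 name=$2 value=$3
  # Reject names with spaces or other unexpected characters.
  case "${prefix}_${name}" in
    *[!A-Za-z0-9_]*)
      echo "invalid variable name: ${prefix}_${name}" >&2
      return 1
      ;;
  esac
  printf 'sendevent -E SET_GLOBAL -G "%s_%s=%s"\n' "$prefix" "$name" "$value"
}

# Upstream job: count records, then publish the count.
count=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')
set_global_cmd EOD REC_COUNT "$count"
# prints: sendevent -E SET_GLOBAL -G "EOD_REC_COUNT=3"
```

The downstream job would read the value with autostatus -G EOD_REC_COUNT. The prefix keeps two workflows from overwriting each other's REC_COUNT.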
AutoSys Interview Topics — What to Expect by Level
Topic area | Junior expected depth | Mid-level expected depth | Senior expected depth
Architecture | Name the components | Explain what each does, failure modes | Design HA, predict failure cascades
JIL commands | Basic insert/update/delete syntax | autorep flags, backup strategies | Complex JIL with conditions, variables
Status codes | Recognise SU/FA/RU/IN | PEND_MACH causes, ON_HOLD vs ON_ICE | Recovery procedures for each status
Scheduling | date_conditions, start_times | run_window, run_calendar | Complex calendars, timezone handling
Fault tolerance | n_retrys definition | box_terminator, term_run_time | HA design, recovery strategy
Troubleshooting | Check logs command | Systematic diagnosis workflow | Root cause analysis, prevention

Key takeaways

1. ON_HOLD vs ON_ICE: OFF_HOLD starts immediately. OFF_ICE waits for next schedule. This is tested in almost every interview.
2. autorep flags: default (status), -d (detail), -q (JIL dump), -s (summary, the default view), -run (last N runs). Know them cold.
3. PEND_MACH = agent unreachable. First check: disk space (df -h). 90% of cases.
4. date_conditions defaults to 0 (disabled). Most people assume it's 1. That's the trap.
5. FORCE_STARTJOB bypasses ALL conditions (time AND dependencies). STARTJOB respects them.
6. box_terminator: 1 on validation jobs only. Never on optional jobs.
7. Senior answers include trade-offs: 'it depends' + comparison of approaches.
8. Have a real example ready for every concept: 'I used ON_HOLD when...'

Common mistakes to avoid

Memorising answers without understanding the reasoning

Symptom
Candidate defines ON_HOLD vs ON_ICE perfectly. When asked 'which would you use during a database migration?', they guess wrong. Interviewer probes deeper and realises lack of operational experience.
Fix
For every definition, think of a production scenario where you would use it. Practice explaining both ON_HOLD and ON_ICE with real examples.

Not knowing the autorep flags

Symptom
Candidate says 'I would check the job status' but can't name autorep flags. Interviewer asks 'what's the difference between autorep -d and -q?' Candidate doesn't know.
Fix
Memorise: autorep alone = status table. -d = detail (start/end times). -q = JIL dump (definition). -s = summary (the same status table autorep prints by default). -run = last N runs.

Confusing ON_HOLD and ON_ICE

Symptom
Candidate says 'they're the same.' Most common wrong answer. Immediate negative signal.
Fix
Repeat: OFF_HOLD starts immediately if conditions met. OFF_ICE waits for next scheduling cycle. If you can't articulate the difference, you haven't operated AutoSys.

Being vague about troubleshooting — 'I would check the logs'

Symptom
Candidate says 'I would check the logs' without specifying which logs or what to look for. Interviewer hears 'I've never actually done this'.
Fix
Specific answers: 'First I check $AUTOUSER/out/event_demon.$AUTOSERV for condition evaluation. Then I check the job's std_err_file. Then I ssh to the agent and check the application log.'

Not having an end-of-day workflow design ready

Symptom
Interviewer asks 'design an EOD batch workflow'. Candidate stalls or gives a flat list of jobs without hierarchy, error handling, or validation.
Fix
Have a pattern ready: master box → section boxes (extract/transform/load/report) → CMD jobs. Include pre-check validation as box_terminator. Include post-check verification. Mention n_retrys on network I/O jobs. This shows you've built real workflows.

Interview Questions on This Topic

Q01 · JUNIOR: What is AutoSys and what makes it better than cron for enterprise batch ...
Q02 · SENIOR: Explain the AutoSys architecture and the role of each component.
Q03 · SENIOR: What is the difference between ON_HOLD and ON_ICE? What happens when you...
Q04 · SENIOR: A job is in PEND_MACH status. Walk me through how you diagnose and fix i...
Q05 · JUNIOR: What does date_conditions do and what is its default value?
Q06 · SENIOR: What is box_terminator and when would you use it?
Q07 · SENIOR: How do you design an AutoSys workflow for a complex end-of-day batch run...
Q08 · SENIOR: What is the difference between FORCE_STARTJOB and STARTJOB?
Q09 · SENIOR: How would you pass a record count from one AutoSys job to the next?
Q10 · SENIOR: Walk me through how you recover from a BOX job that went to FAILURE at 3...

Q01 of 10 · JUNIOR

What is AutoSys and what makes it better than cron for enterprise batch processing?

ANSWER
AutoSys is an enterprise workload automation platform. Better than cron because: cross-server dependencies (cron can't make job B wait for job A on another server), centralised monitoring (cron logs are per-server), alerting (cron only emails errors), audit trails (who changed what and when), retry logic (n_retrys), file-watching (event-driven), and global variables (cross-job data passing). Cron is fine for single-server, independent jobs. AutoSys is for multi-server, dependent workflows with SLAs and compliance requirements.

Frequently Asked Questions

01. What AutoSys questions come up most in interviews?
02. What is the most commonly asked AutoSys interview question?
03. Do I need hands-on AutoSys experience to pass the interview?
04. What is the PEND_MACH answer in AutoSys interviews?
05. How do I explain AutoSys to a non-technical interviewer?