
AutoSys Box Jobs: The 1 Container That Controls Your Batch Pipeline

📅 March 19, 2026 ⏱ 3 min read 🎯 Intermediate

AutoSys BOX jobs explained with real production failures.

⚙️ Intermediate — basic DevOps knowledge assumed

In this tutorial, you'll learn

BOX jobs are containers — they don't execute commands, they control when child jobs can start.
A box enters SUCCESS only when ALL inner jobs have succeeded; by default it enters FAILURE once every inner job has run and at least one has failed.
You can nest boxes inside boxes to build hierarchical batch workflows.

⚡Quick Answer

BOX job = container for grouping child jobs — runs no command itself, controls when children can start (box must be RUNNING first)
Key components: box_name (child attribute linking to parent), condition (dependency between child jobs), nested boxes (boxes inside boxes)
Performance: each nesting level adds status bookkeeping for the Event Processor, so keep hierarchies shallow (3-4 levels as a rule of thumb; more than 5 becomes hard to trace)
Production trap: Child job has date_conditions: 1 with start_times inside box — waits for box AND specific time, may never run if box starts after that time
Biggest mistake: Box shows RUNNING but children not starting — child has unmet condition, ON_HOLD, or machine offline; not an error, just unmet prerequisite

🚨 START HERE

AutoSys Box Debug Cheat Sheet

Fast diagnostics for box issues in production AutoSys environments.

🟡

Box RUNNING, children INACTIVE

Immediate Action: Check the child's own start_times and conditions

Commands

autorep -J child_name -q | grep -E 'start_times|condition'

autostatus -J child_name

Fix Now: Remove the child's start_times if the box controls timing. Make sure the child's condition dependencies are met (check upstream job status). Take the child off hold if it is ON_HOLD.

🟡

Box never starts — INACTIVE after scheduled time

Immediate Action: Check the box's own start_times and dependencies

Commands

autorep -J box_name -q | grep -E 'start_times|condition'

autorep -J box_name -s

Fix Now: Verify date_conditions: 1 with the correct days_of_week. Make sure condition dependencies are met. Take the box off ON_HOLD/ON_ICE. If the box uses run_calendar, confirm that the calendar contains the expected run dates.

🟡

Box stuck RUNNING — never completes

Immediate Action: Find the RUNNING child jobs inside the box

Commands

autorep -J box_name | grep -w RU

autorep -J stuck_job -d

Fix Now: Check the stuck job's logs. If it is hung, use `sendevent -E KILLJOB -J stuck_job`. Set `term_run_time` on the child so it is terminated automatically after a timeout.

🟡

Box FAILED but need it to continue despite child failure

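Immediate Action: Check whether the box defines its own success or failure rule

Commands

autorep -J box_name -q | grep -E 'box_success|box_failure'

autorep -J box_name

Fix Now: Check your release's JIL reference before relying on a child-level `ignore_failure` attribute. The documented mechanism is `box_success` (or `box_failure`) on the box, listing only the jobs whose outcome should decide the box status. Use it sparingly, otherwise you mask errors.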
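One documented way to let a box complete despite a non-critical child failing is to override the box's default completion rule with box_success. The fragment below is a minimal sketch, assuming a hypothetical box eod_box whose children job_a and job_b are critical while a third child, job_cleanup, is not; none of these names come from a real environment.

```jil
/* Hypothetical names: eod_box, job_a, job_b (critical), job_cleanup (non-critical) */
update_job: eod_box
box_success: success(job_a) & success(job_b)   /* job_cleanup's result no longer decides the box */
```

Once the expression evaluates to true the box is marked SUCCESS, so any job that must finish before the box completes should appear in the expression.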

🟡

Nested box — parent box RUNNING but child box INACTIVE

Immediate Action: Check the child box's own conditions (same as for any child job)

Commands

autorep -J child_box -q | grep -E 'condition|start_times'

autorep -J child_box -s

Fix Now: The child box may have its own dependency that is not yet met. Make sure the child box's box_name points to the parent box. Remove its own start_times if they are not needed.

Production Incident

The Box That Started at 6 PM While Its Children Waited for 10 PM

A BOX job started at 6pm, but its child jobs never ran. The box showed RUNNING. The operator spent 2 hours checking machines, permissions, and scripts. The child job had `start_times: "22:00"` — it was waiting for 10pm, even though the box started at 6pm.

Symptom: The box started successfully at 18:00 (6pm). autorep showed the box as RUNNING. Child jobs showed INACTIVE (IN), not STARTING. There were no errors in the logs. The operator assumed the box was misconfigured or the Event Processor was stuck, so they restarted the Event Processor, rebooted the agent machine, and checked disk space, all without success.

Assumption: The operator assumed that when a box is RUNNING, all child jobs start immediately or follow only their own condition dependencies. They didn't know that a child job with date_conditions: 1 and start_times: "22:00" waits until 10pm regardless of the box's state. They also didn't know that autorep -J child -q prints a child job's definition.

Root cause: The child job was defined with date_conditions: 1 and start_times: "22:00". The ETL box started at 18:00. The child job was waiting for its own start time (22:00), not just for the box to be RUNNING. Both the box's start condition and the child's start condition had to be satisfied. The operator never checked the child's JIL definition. Four hours later, at 22:00, the child jobs started and succeeded, but the operator had already wasted 2 hours of troubleshooting time.

Fix:
1. Removed date_conditions: 1 and start_times from all child jobs inside the box. The box's schedule should control overall timing.
2. For jobs that genuinely need to start at a specific time within a larger window, documented the behaviour in the team runbook.
3. Added a pre-flight check: autorep -J child -q | grep -E 'start_times|date_conditions' to flag child jobs that carry their own schedules.
4. Added a monitoring alert: if the box has been RUNNING for more than 10 minutes and no children are STARTING, check the child jobs' conditions and start_times.
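Step 1 of the fix can be applied without redefining the job. The fragment below is a minimal sketch, assuming a hypothetical child job named etl_load: setting date_conditions to 0 tells the scheduler to stop evaluating the child's date and time attributes, so a leftover start_times value no longer delays it.

```jil
/* Hypothetical child job name; the box's own schedule is left unchanged */
update_job: etl_load
date_conditions: 0   /* stop evaluating start_times on this child */
```

After loading the change, autorep -J etl_load -q should show date_conditions: 0 for the child.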

Key Lesson

A RUNNING box only makes its child jobs eligible to start; they still have their own conditions (start_times, dependencies). Both must be satisfied.
Always check a child job's definition with autorep -J job -q when it doesn't start. Look for date_conditions: 1 and start_times.
Generally, child jobs inside a box should not have their own start_times. Let the box control timing.
When a box is RUNNING but children are INACTIVE, the problem is not the box: it's the children's prerequisites.

Production Debug Guide

Symptom → Action mapping for common box failures in production AutoSys environments.

Box RUNNING but children not starting (all INACTIVE)→Check whether the child jobs have their own start_times: autorep -J child -q | grep start_times. Also check the child's condition: it may have condition: success(other_job) that is not met yet. Check whether the child is ON_HOLD or ON_ICE: autostatus -J child.

Box FAILED but some children succeeded (expected partial failure)→By default the box is marked FAILURE once all children have run and at least one has failed; a child defined with box_terminator: 1 ends the box as soon as it fails. Run autorep -J box_name to list the children and find the one in FA. There is no need to send CHANGE_STATUS yourself, because the box status is derived automatically. To let the box succeed despite a non-critical child failing, define box_success on the box so that only the critical jobs count (use sparingly).

Nested box shows SUCCESS but jobs inside the child box failed→With default settings this should not happen: if an inner job of a child box fails, the child box fails, and so does the parent box. Verify with autorep -J child_box. If the child box is SUCCESS while inner jobs are FA, check whether the box defines box_success, or whether its status was changed manually with sendevent -E CHANGE_STATUS.

Box never starts (stays INACTIVE after its scheduled time)→Check that the box has date_conditions: 1 and start_times set correctly. Check whether the box has condition: success(other_box) that is not met. Check whether the box is ON_HOLD or ON_ICE. If the box uses run_calendar, check that the calendar contains the expected run dates.

Box stays RUNNING forever (never completes)→One or more child jobs are stuck in RUNNING, or a child never became eligible to run, for example because it waits on success() of a job that failed. Run autorep -J box_name | grep -w RU to list running jobs inside the box. A stuck job may have hung (infinite loop, waiting for I/O, deadlock). Use term_run_time to kill hung jobs automatically.
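A runtime limit on each long-running child keeps one hung job from holding the box in RUNNING indefinitely. The fragment below is a minimal sketch, assuming a hypothetical child job nightly_extract that normally finishes in about 30 minutes; both limits are in minutes and are illustrative values only.

```jil
/* Hypothetical job name and limits; tune them to the job's normal runtime */
update_job: nightly_extract
max_run_alarm: 45    /* raise an alarm after 45 minutes of runtime */
term_run_time: 60    /* terminate the job after 60 minutes of runtime */
```

The alarm threshold is set lower than the termination threshold so that operators get a warning before the job is killed.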

BOX jobs are central to well-organised AutoSys environments. Without boxes, you'd have hundreds of independent jobs with no logical grouping, no shared scheduling, and no way to see the end-to-end status of a business process at a glance. With boxes, you can group related jobs, control their collective schedule, and see in seconds whether your end-of-day run succeeded.

But boxes are dangerous when misused. A child job with its own start_times inside a box may never run because the box starts after that time window. A box shows RUNNING but no children start — operators assume the jobs are broken, but the condition is unmet. And a Super Box (top-level box) can mask failures: if a child box fails, the parent box fails, but you lose visibility of which specific job caused the failure.

By the end you'll know exactly how boxes control child jobs, when boxes succeed or fail, how to nest boxes, and the specific debug steps when inner jobs won't start.

How BOX jobs control child job execution

The box controls three things for its child jobs:
1. When they can start: child jobs can only start when the box is in RUNNING state.
2. The execution environment: all child jobs inherit the box's scheduling context.
3. Collective status: the box reports SUCCESS only when ALL child jobs succeed.

A child job's own conditions (start_times, conditions) still apply within the box. If Job B has condition: success(Job A), it still waits for Job A even though the box is running.

io/thecodeforge/autosys/box_control.jil · JIL

/* Box starts at 10 PM on weeknights */
insert_job: eod_box
job_type: BOX
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "22:00"
alarm_if_fail: 1

/* Job A: starts as soon as box is RUNNING */
insert_job: job_a
job_type: CMD
box_name: eod_box
command: /scripts/step_a.sh
machine: server01
owner: batch

/* Job B: waits for Job A, but still inside the box */
insert_job: job_b
job_type: CMD
box_name: eod_box
command: /scripts/step_b.sh
machine: server01
owner: batch
condition: success(job_a)

📊 Production Insight

A box in RUNNING status does NOT mean children are running. It means the box's conditions (time, dependencies) are satisfied, so children are eligible to start.

Children still need their own conditions (dependencies, start_times, machine availability, not ON_HOLD) to be met before they actually start.

Rule: When debugging 'box running, children not starting', first check the children's individual conditions, not the box.

🎯 Key Takeaway

BOX jobs are containers — they don't execute commands, they control when child jobs can start.

A box succeeds only when ALL inner jobs succeed; by default it fails once every inner job has run and at least one has failed.

Rule: Child jobs inside a box should generally not have their own start_times — let the box control timing.
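If the box should stop as soon as one particular child fails, rather than waiting for its other children, the job-level box_terminator attribute is the documented way to request that. The fragment below is a minimal sketch, assuming a hypothetical validation job inside eod_box; the job name and script path are illustrative.

```jil
/* Hypothetical critical child inside eod_box */
insert_job: job_validate
job_type: CMD
box_name: eod_box
command: /scripts/validate.sh      /* hypothetical script path */
machine: server01
owner: batch
box_terminator: 1                  /* if this job fails, terminate eod_box */
```

Sibling jobs that should also stop when the box ends this way can set job_terminator: 1; check the behaviour in your release before relying on it.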

Nested boxes — boxes inside boxes

You can place a BOX job inside another BOX job. This creates a hierarchy that lets you organise complex batch flows into logical sub-processes.

In a nested setup, the parent box must be RUNNING before the child box can start. The child box must be RUNNING before its own child jobs can start. The parent box succeeds only when ALL child boxes (and their contents) have succeeded.

io/thecodeforge/autosys/nested_boxes.jil · JIL

/* Parent box — the overall EOD run */
insert_job: master_eod_box
job_type: BOX
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "21:00"

/* Child box 1 — ETL processing */
insert_job: etl_box
job_type: BOX
box_name: master_eod_box       /* inside master box */

/* Child box 2 — reporting (runs after ETL) */
insert_job: reporting_box
job_type: BOX
box_name: master_eod_box
condition: success(etl_box)    /* waits for ETL box to complete */

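/* Jobs inside etl_box */
insert_job: etl_extract
job_type: CMD
box_name: etl_box
command: /scripts/extract.sh
machine: etl01
owner: batch

/* Job inside reporting_box: a box with no jobs never completes.
   The job name and script path are illustrative. */
insert_job: report_generate
job_type: CMD
box_name: reporting_box
command: /scripts/report.sh
machine: etl01
owner: batch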

🔥The topmost parent box is called Super Box

In AutoSys terminology, the highest-level BOX that contains all other boxes and jobs is sometimes called the Super Box. It's the single point of control for an entire batch workflow.

📊 Production Insight

Nested boxes add debugging complexity. A failure deep in the hierarchy causes the parent box to fail, and the top-level status alone doesn't tell you which job caused it.

To find the root cause, run autorep -J master_box | grep -w FA to list the failed boxes and jobs in the hierarchy, then work down to the innermost failure.

Rule: Limit nesting depth to 3-4 levels. Deeper nesting makes it harder to trace failures and increases Event Processor load.

🎯 Key Takeaway

You can nest boxes inside boxes to build hierarchical batch workflows.

Parent box must be RUNNING before child box can start; child box must be RUNNING before its children.

Rule: The Super Box (top-level) gives you single-point visibility, but debugging failures requires drilling down each level.

Debugging: box is running but inner jobs aren't starting

This is one of the most common issues you'll encounter. The box is in RUNNING state, but the jobs inside it stay INACTIVE or never start.

Most common causes:
1. The child job's own starting conditions aren't met (check the condition attribute).
2. The child job has date_conditions: 1 with start_times set, so it waits for that specific time even though the box is running.
3. The child job is ON_HOLD or ON_ICE.
4. The child job's machine is offline (PEND_MACH).
5. The box_name attribute on the child job has a typo.

io/thecodeforge/autosys/debug_box.sh · BASH

# Check status of box and all inner jobs
autorep -J eod_box -s

# Print a specific inner job's definition (attributes) in JIL format
autorep -J job_b -q

# Check if an inner job is on hold or on ice
autostatus -J job_b

# Check if the machine for an inner job is active
autorep -M server01

📊 Production Insight

The most overlooked cause: child job has date_conditions: 1 and start_times that are later than the box start time. The box runs, but the child waits for the clock.

Another common cause: child job has a condition: success(other_job) where other_job is not in the box and never runs.

Rule: When box RUNNING and children INACTIVE, the problem is NEVER the box. Check children conditions, start_times, ON_HOLD, and machine status.

🎯 Key Takeaway

Box running but inner jobs not starting? Check child's own conditions: start_times, condition dependencies, ON_HOLD, machine offline.

Use autorep -J job -q to see the child's full definition. Look for date_conditions: 1 and start_times.

Rule: Generally, child jobs inside a box should not have their own start_times — let the box control timing.

🗂 AutoSys Box Job Statuses

What each box status means for job execution and child job behaviour

BOX Job State	What It Means	Can Inner Jobs Run?	How Box Enters This State
RUNNING	Box is active, scheduling conditions are met	Yes — if their own conditions are also met	Start time reached, dependencies satisfied
SUCCESS	All inner jobs and child boxes succeeded	No — box has completed	All children SUCCESS
FAILURE	At least one inner job or child box failed	No, the box has completed	By default: all children have run and at least one is FAILURE
INACTIVE	Box hasn't been triggered yet (default state)	No	Initial state; never started
ON_HOLD	Box manually placed on hold	No — box won't start even if conditions met	Manual: ON_HOLD
ON_ICE	Box is taken out of scheduling until it is taken off ice; jobs that depend on it are not blocked	No, the box won't start	Manual: ON_ICE
TERMINATED	Box was killed while running (manual or term_run_time)	No — child jobs also terminated	Manual KILLJOB or term_run_time exceeded

🎯 Key Takeaways

BOX jobs are containers — they don't execute commands, they control when child jobs can start.
A box enters SUCCESS only when ALL inner jobs have succeeded; by default it enters FAILURE once every inner job has run and at least one has failed.
You can nest boxes inside boxes to build hierarchical batch workflows.
Child jobs inside a box should generally not have their own start_times — let the box control timing.
When a box is RUNNING but children are INACTIVE, the problem is the children's conditions, not the box.

⚠ Common Mistakes to Avoid

✕Setting date_conditions: 1 and start_times on a child job inside a box

Symptom

Box starts at 6pm, but child jobs don't start until 10pm (or never if box starts after the time window). Operator sees box RUNNING and assumes children should run immediately.

Fix

Remove date_conditions and start_times from child jobs inside boxes. The box's schedule controls timing. If a child must start at a specific time, create a separate box for that timeframe.
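The "separate box" option might look like the fragment below. It is a minimal sketch, assuming a hypothetical late_window_box scheduled for 22:00 and a single time-sensitive job inside it; the box, job, and script names are illustrative.

```jil
/* Hypothetical box that owns the 22:00 window */
insert_job: late_window_box
job_type: BOX
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "22:00"

/* The time-sensitive job carries no schedule of its own */
insert_job: late_job
job_type: CMD
box_name: late_window_box
command: /scripts/late_step.sh     /* hypothetical script path */
machine: server01
owner: batch
```

Because only the box has a start time, a RUNNING box again means its child is eligible to start straight away.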

✕Putting a machine attribute on a BOX job

Symptom

No immediate failure, but BOX job's machine attribute is ignored. Operator may think box runs on a specific machine and be confused when it doesn't.

Fix

BOX jobs never execute commands, so machine attribute is meaningless. Remove it to avoid confusion. Only child jobs need machine attributes.

✕Omitting conditions between jobs that must run in sequence, so they run in parallel unexpectedly

Symptom

Inside a box, Job A, Job B, and Job C are independent. Operator expects A → B → C, but all run in parallel because no conditions are set.

Fix

Add condition: success(previous_job) to each job to enforce ordering. If parallel execution is desired, no condition needed, but be explicit in comments.
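For the A → B → C case, the ordering could be enforced as in the fragment below. It is a minimal sketch, assuming the three jobs already exist inside the same box under the hypothetical names job_a, job_b, and job_c.

```jil
/* job_a has no condition: it starts as soon as the box is RUNNING */

update_job: job_b
condition: success(job_a)   /* B waits for A */

update_job: job_c
condition: success(job_b)   /* C waits for B */
```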

✕Confusing box state with child job state

Symptom

Operator sees box RUNNING and assumes children are also RUNNING. They don't check child status and miss that a child failed or is ON_HOLD.

Fix

Always check child status separately: autorep -J box_name lists the box together with all of its children and their statuses. Never assume children are RUNNING just because the box is.

✕Nesting boxes too deeply (>5 levels)

Symptom

Event Processor performance degrades. autorep queries for box status take minutes. Debugging failures requires traversing 5 levels of nesting.

Fix

Limit nesting depth to 3-4 levels. Break complex workflows into separate top-level boxes connected by conditions instead of deep nesting. Each level adds Event Server query overhead.
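Replacing one level of nesting with two top-level boxes linked by a condition could look like the fragment below. It is a minimal sketch with hypothetical box names; each box would still need its own child jobs, which are omitted here.

```jil
/* First top-level box: owns the schedule */
insert_job: etl_top_box
job_type: BOX
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "21:00"

/* Second top-level box: no schedule, starts when the first box succeeds */
insert_job: reporting_top_box
job_type: BOX
condition: success(etl_top_box)
```

The trade-off is that there is no longer a single parent status for the whole run; the status of the last box in the chain serves as the end-to-end indicator.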

Interview Questions on This Topic

Q: What is a BOX job in AutoSys and what does it do? (Junior)
A BOX job is a container for grouping other jobs (command jobs, file watchers, other boxes). It doesn't execute any command itself. It controls when child jobs can run: children can only start when the box is in RUNNING state. The box's own start conditions (time, dependencies) determine when it becomes RUNNING. By default the box succeeds only when all its child jobs have succeeded, and it is marked FAILURE once all children have run and at least one has failed. Boxes are used to organise complex batch workflows into logical groups.
Q: When is a BOX job in SUCCESS state? (Junior)
By default, a BOX job reaches SUCCESS only when ALL of its immediate child jobs have completed successfully (status SUCCESS). This includes child jobs that are themselves boxes. If a child fails, the box ends in FAILURE once the remaining children have run, or right away if the failed child has box_terminator: 1; children that depend on the failed job's success never start. Note that a box with no children (empty) will never transition to SUCCESS; it stays RUNNING.
Q: If a BOX job is RUNNING but its child jobs aren't starting, what would you check? (Mid-level)
I would check the child jobs' individual conditions, not the box. Steps: (1) Run autorep -J child -q to see the child's definition and look for date_conditions: 1 with start_times (time-based waiting). (2) Check the child's condition dependencies: is it waiting for another job that hasn't succeeded? (3) Check whether the child is ON_HOLD or ON_ICE with autostatus -J child. (4) Check the child's machine status with autorep -M machine_name: is the agent down (PEND_MACH)? (5) Check the spelling of the box_name attribute on the child job. The box's RUNNING status only makes children eligible; their own conditions must also be satisfied.
Q: What is a Super Box in AutoSys? (Mid-level)
Super Box is informal terminology (not an official AutoSys attribute) for the highest-level BOX job that contains all other boxes and jobs for a given batch workflow. It provides a single point of control and visibility: start the Super Box to run the entire workflow; monitor its status to know overall success or failure. However, a Super Box failure shows only that something failed, so you still need to drill down into child boxes to find the root cause. In production environments, the Super Box is often scheduled with start_times and days_of_week to trigger daily batch runs.
Q: Can a BOX job contain another BOX job? (Junior)
Yes, you can nest boxes inside boxes. This creates hierarchical batch workflows. The parent box must be RUNNING before the child box can start. The child box must be RUNNING before its own child jobs can start. The parent box only succeeds when ALL child boxes (and their contents) have succeeded. Limitations: nesting beyond about 5 levels can cause Event Processor performance issues, and debugging failures requires traversing each nesting level to find the failed job.

Frequently Asked Questions

What is a BOX job in AutoSys?

A BOX job is a container for other jobs. It doesn't execute any command itself — it groups jobs together and controls when they can run. Child jobs inside a box can only start when the box is in RUNNING state.

When does a BOX job move to SUCCESS?

A BOX job moves to SUCCESS only when ALL of its inner jobs have completed successfully. If an inner job fails, the box by default moves to FAILURE once the remaining inner jobs have run; inner jobs that depend on the failed job's success will not start.

My BOX job is RUNNING but inner jobs aren't starting — why?

Common causes: the inner job has date_conditions: 1 and is waiting for a specific start_times; the inner job has a condition that isn't met yet; the inner job is ON_HOLD or ON_ICE; the inner job's machine is offline; or the box_name attribute has a typo.

What is a Super Box in AutoSys?

Super Box is informal terminology for the highest-level BOX job that contains all other boxes and jobs for a given batch workflow. It provides a single point of control and visibility for the entire process.

Should child jobs inside a box have their own start_times?

Generally no. When a job is inside a box, the box controls the overall timing. Setting start_times on a child job means it will wait until that specific time even if the box is already running, which often causes confusion. Child jobs inside boxes typically rely on conditions (success of previous jobs) rather than clock times.

What's the difference between ON_HOLD and ON_ICE for a box?

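Both prevent the box from starting. ON_HOLD is a temporary pause: release it with sendevent -E JOB_OFF_HOLD, and if the box's starting conditions are already met it starts right away. While the box is on hold, jobs that depend on it keep waiting. ON_ICE takes the box out of the schedule: jobs that depend on it behave as though its condition were satisfied, and after sendevent -E JOB_OFF_ICE the box does not start until its starting conditions next occur. Use ON_ICE to skip a box without blocking downstream jobs. Use ON_HOLD to pause the box together with everything that depends on it.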

Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
