AutoSys Real-World Patterns and Best Practices
Working well with AutoSys means understanding not just the syntax but also the patterns that experienced architects use to build batch workflows that run reliably for years. These are the practices that separate a well-run AutoSys environment from one where every incident is a fire drill.
Naming conventions — the difference between sane and unmanageable
In a large AutoSys environment with thousands of jobs, naming conventions are everything. A consistent, searchable naming convention means you can find any job in seconds and understand its purpose without documentation.
Recommended pattern: <ENVIRONMENT>_<SYSTEM>_<FUNCTION>_<FREQUENCY>

    /* Good naming examples */
    PRD_TRADING_EXTRACT_DAILY     /* production, trading system, extract, runs daily */
    PRD_TRADING_TRANSFORM_DAILY
    PRD_TRADING_LOAD_DAILY
    PRD_PAYROLL_RUN_WEEKLY        /* production, payroll, weekly */
    PRD_RISK_REPORT_EOD           /* production, risk, report, end-of-day */

    /* Box jobs: suffix with _BOX */
    PRD_TRADING_EOD_BOX
    PRD_PAYROLL_BOX

    /* File Watchers: suffix with _FW or _WATCH */
    PRD_TRADING_SETTLE_FW
    PRD_FEEDS_MARKET_DATA_FW

    /* BAD naming (don't do this) */
    job1
    my_script
    test_final_v2_FINAL

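A convention only helps if it is enforced. As a minimal sketch, assuming the environment prefixes are PRD, QAT, and DEV and that every name carries at least two further upper-case tokens, a small Python check could reject non-conforming names before they reach AutoSys. The function name and the exact pattern are illustrative assumptions, not part of AutoSys.

```python
import re

# Assumed environment prefixes; PRD/QAT/DEV are the ones this article mentions.
ENVIRONMENTS = ("PRD", "QAT", "DEV")

# Environment prefix followed by two or more upper-case alphanumeric tokens,
# e.g. PRD_TRADING_EXTRACT_DAILY or PRD_PAYROLL_BOX.
NAME_PATTERN = re.compile(
    r"(?P<env>" + "|".join(ENVIRONMENTS) + r")(?:_[A-Z0-9]+){2,}"
)


def check_job_name(name):
    """Return the environment prefix if the name follows the convention, else None."""
    match = NAME_PATTERN.fullmatch(name)
    return match.group("env") if match else None


if __name__ == "__main__":
    candidates = (
        "PRD_TRADING_EXTRACT_DAILY",
        "PRD_PAYROLL_BOX",
        "job1",
        "test_final_v2_FINAL",
    )
    for candidate in candidates:
        verdict = "ok" if check_job_name(candidate) else "REJECTED"
        print(f"{candidate}: {verdict}")
```

The pattern is deliberately loose: it checks the prefix and the overall shape, not whether the last token is a known frequency or suffix. A stricter list of allowed tokens could be added once a team has agreed on one.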
When you run autorep -J PRD_%, you know you're looking at production. This simple prefix prevents cross-environment incidents.
The standard EOD orchestration pattern
The standard pattern for end-of-day batch is a three-level hierarchy: master box → section boxes → job chains. This gives you visibility at multiple levels and makes partial failure recovery clean.

    /* Level 1: Master box, the overall EOD coordinator */
    insert_job: PRD_EOD_MASTER_BOX   job_type: BOX
    date_conditions: 1
    days_of_week: mo,tu,we,th,fr
    start_times: "21:00"
    alarm_if_fail: 1

    /* Level 2: Section boxes (logical groupings) */
    insert_job: PRD_EOD_EXTRACT_BOX   job_type: BOX
    box_name: PRD_EOD_MASTER_BOX
    /* start only after the pre-check job (PRD_EOD_PRE_CHECK) succeeds */
    condition: success(PRD_EOD_PRE_CHECK)

    insert_job: PRD_EOD_TRANSFORM_BOX   job_type: BOX
    box_name: PRD_EOD_MASTER_BOX
    condition: success(PRD_EOD_EXTRACT_BOX)

    insert_job: PRD_EOD_REPORT_BOX   job_type: BOX
    box_name: PRD_EOD_MASTER_BOX
    condition: success(PRD_EOD_TRANSFORM_BOX)

    /* Level 3: Actual CMD jobs inside section boxes */
    insert_job: PRD_TRADE_EXTRACT_DAILY   job_type: CMD
    box_name: PRD_EOD_EXTRACT_BOX
    command: /scripts/extract_trades.sh
    machine: etl-server-01
    owner: batchuser
    alarm_if_fail: 1
    n_retrys: 1
    std_out_file: /logs/autosys/PRD_TRADE_EXTRACT_DAILY.out
    std_err_file: /logs/autosys/PRD_TRADE_EXTRACT_DAILY.err

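Because the section boxes follow a regular shape (each one sits in the master box and waits on the previous section), the Level 2 definitions can be generated rather than typed by hand. The sketch below is a hypothetical Python helper, not an AutoSys tool; it emits only the attributes used in the example above, and the `start_after` parameter is an assumption added for illustration.

```python
def render_section_boxes(master_box, sections, start_after=None):
    """Return JIL text for section boxes that run in sequence inside master_box.

    start_after is an optional job that the first section waits on
    (for example a pre-check job); it is an assumption of this sketch.
    """
    blocks = []
    previous = start_after
    for name in sections:
        lines = [
            f"insert_job: {name}   job_type: BOX",
            f"box_name: {master_box}",
        ]
        # Every section after the first waits for the one before it.
        if previous is not None:
            lines.append(f"condition: success({previous})")
        blocks.append("\n".join(lines))
        previous = name
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_section_boxes(
        "PRD_EOD_MASTER_BOX",
        ["PRD_EOD_EXTRACT_BOX", "PRD_EOD_TRANSFORM_BOX", "PRD_EOD_REPORT_BOX"],
    ))
```

Generating the chain from an ordered list keeps the dependency order in one place, so inserting a new section means editing one list rather than two conditions.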
Always include a pre-check and post-check job
Professional batch workflows include a pre-check job (validates environment/inputs before starting) and a post-check job (validates outputs after completion). These save enormous debugging time.

    /* Pre-check: validates disk space, DB connectivity, input files */
    insert_job: PRD_EOD_PRE_CHECK   job_type: CMD
    box_name: PRD_EOD_MASTER_BOX
    command: /scripts/eod_pre_check.sh
    machine: etl-server-01
    owner: batchuser
    /* if pre-check fails, kill the entire box */
    box_terminator: 1
    alarm_if_fail: 1

    /* Post-check: validates output record counts, checksums, file presence */
    insert_job: PRD_EOD_POST_CHECK   job_type: CMD
    box_name: PRD_EOD_MASTER_BOX
    command: /scripts/eod_post_check.sh
    machine: etl-server-01
    owner: batchuser
    condition: success(PRD_EOD_REPORT_BOX)
    alarm_if_fail: 1

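The article does not show what the pre-check script itself contains. As a rough sketch of the idea, here is a hypothetical Python equivalent that checks free disk space and input file presence and exits non-zero on any problem, which is what lets AutoSys mark the job as failed. The command-line arguments and function names are assumptions, and the database connectivity check is left out because it depends on the driver in use.

```python
import os
import shutil
import sys


def precheck(data_dir, required_files, min_free_bytes):
    """Return a list of problems; an empty list means every check passed."""
    if not os.path.isdir(data_dir):
        return [f"data directory missing: {data_dir}"]
    problems = []
    free = shutil.disk_usage(data_dir).free
    if free < min_free_bytes:
        problems.append(
            f"only {free} bytes free in {data_dir}, need {min_free_bytes}"
        )
    for name in required_files:
        path = os.path.join(data_dir, name)
        # A zero-byte input is treated the same as a missing one.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(f"input file missing or empty: {path}")
    return problems


def main(argv):
    # Hypothetical usage: eod_pre_check.py <data_dir> <min_free_gb> [<file> ...]
    if len(argv) < 2:
        print("usage: eod_pre_check.py <data_dir> <min_free_gb> [<file> ...]",
              file=sys.stderr)
        return 2
    data_dir, min_free_gb, *files = argv
    problems = precheck(data_dir, files, int(min_free_gb) * 1024 ** 3)
    for problem in problems:
        print(f"PRE-CHECK FAILED: {problem}", file=sys.stderr)
    # A non-zero exit code is what makes the scheduler treat the job as failed.
    return 1 if problems else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Writing each failure reason to standard error means it would land in the job's std_err_file if one is configured, so the operator sees why the run was stopped without opening the script.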
| Pattern | Benefit | When to apply |
|---|---|---|
| 3-level box hierarchy | Visibility at multiple levels, clean partial recovery | All complex EOD/batch workflows |
| Pre/post check jobs | Catch environmental issues early, validate output | Any workflow with external dependencies |
| box_terminator on validation | Stop the whole box on critical pre-condition failure | Input validation, pre-requisite checks |
| n_retrys: 1 or 2 on I/O jobs | Handle transient network/DB blips automatically | Jobs calling external services or DBs |
| Environment prefix in names | Prevent cross-environment accidents | All environments, always |
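For the "Pre/post check jobs" row, the output-validation half can be sketched in the same spirit. This hypothetical Python example counts records in a delimited output file and optionally compares a SHA-256 checksum. It assumes one record per non-blank line with a single header row, and it assumes the expected checksum comes from somewhere trustworthy such as an upstream manifest; neither detail is specified in the article.

```python
import hashlib


def count_records(path, has_header=True):
    """Count non-blank lines, excluding one header row if has_header is set."""
    with open(path, "r", encoding="utf-8") as handle:
        lines = sum(1 for line in handle if line.strip())
    return max(lines - 1, 0) if has_header else lines


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def postcheck(path, expected_min_records, expected_sha256=None):
    """Return a list of problems; an empty list means the output looks valid."""
    try:
        records = count_records(path)
    except OSError as exc:
        return [f"cannot read output file {path}: {exc}"]
    problems = []
    if records < expected_min_records:
        problems.append(
            f"{path} has {records} records, expected at least {expected_min_records}"
        )
    if expected_sha256 is not None and sha256_of(path) != expected_sha256:
        problems.append(f"{path} checksum does not match the expected value")
    return problems
```

As with the pre-check, the calling script would exit non-zero when the returned list is not empty so that alarm_if_fail fires.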
🎯 Key Takeaways
- Use a consistent naming convention that includes environment, system, function, and frequency
- The 3-level hierarchy (master box → section boxes → job chains) is the standard pattern for complex batch
- Pre-check jobs with box_terminator stop wasted time on doomed runs; post-check jobs validate success
- Version-control your JIL scripts — every change tracked, every rollback possible
⚠ Common Mistakes to Avoid
- ✕ Building flat job lists with hundreds of conditions instead of using a box hierarchy: impossible to maintain
- ✕ Skipping pre-check jobs to save time: the first time a 2-hour batch run fails at step 50 because the disk was full, you'll add the pre-check
- ✕ Inconsistent naming that makes searching impossible: establish conventions before the environment grows large
- ✕ Not version-controlling JIL scripts: when someone changes a job and it breaks, git blame is invaluable
Interview Questions on This Topic
- Q: What naming convention would you use for AutoSys jobs?
- Q: Describe the 3-level box hierarchy pattern for EOD batch orchestration.
- Q: Why would you use a pre-check job with box_terminator?
- Q: How do you make an AutoSys environment version-controlled?
- Q: What's the difference between a well-designed AutoSys environment and a poorly-designed one?
Frequently Asked Questions
What naming convention should I use for AutoSys jobs?
A common and effective pattern is ENVIRONMENT_SYSTEM_FUNCTION_FREQUENCY — for example, PRD_TRADING_EXTRACT_DAILY. This makes jobs self-documenting and searchable. Always prefix with the environment (PRD/QAT/DEV) to prevent accidental cross-environment mistakes.
What is the 3-level box hierarchy pattern in AutoSys?
The 3-level pattern is: a master box that controls the overall run schedule, section boxes (grouped by logical function like EXTRACT, TRANSFORM, REPORT), and CMD jobs inside each section box. This gives you visibility at multiple levels and clean partial recovery.
Should I version control my JIL scripts?
Yes, absolutely. Store all JIL definitions in Git (or your corporate SCM). Every change is tracked with who made it and why. When a schedule change breaks something, git log tells you exactly what changed. Many teams require a peer review on JIL changes before they're applied to production.
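Once JIL lives in a repository, simple automated checks can run before a reviewer ever looks at a change. The following is a minimal sketch, assuming definitions use `/* ... */` comments and `insert_job:` lines as in the examples in this article; it lists the job names in a JIL file and reports any that are defined more than once. The function names are hypothetical, and this is a text-level check, not a replacement for AutoSys's own validation.

```python
import re
from collections import Counter

# Strip /* ... */ comments first so commented-out definitions are ignored.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
INSERT_JOB = re.compile(r"^[ \t]*insert_job:[ \t]*(\S+)", re.MULTILINE)


def job_names(jil_text):
    """Return every job name introduced by an insert_job line, in file order."""
    return INSERT_JOB.findall(COMMENT.sub("", jil_text))


def duplicate_jobs(jil_text):
    """Return a sorted list of job names that are defined more than once."""
    counts = Counter(job_names(jil_text))
    return sorted(name for name, seen in counts.items() if seen > 1)
```

A check like this could run as a pre-commit hook or a CI step, failing the build when `duplicate_jobs` returns anything, so that copy-paste mistakes are caught during review rather than at load time.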
What is a pre-check job in AutoSys?
A pre-check job runs at the start of a box, before any real processing, and validates that all preconditions are met: sufficient disk space, database connectivity, input files present, dependent systems available. It's marked as box_terminator: 1 so a failed pre-check immediately stops the entire box rather than wasting hours of processing on a doomed run.
How many jobs is too many for one AutoSys box?
There's no hard limit, but more than 20-30 jobs in a single box starts to become hard to manage visually and operationally. When a box grows large, refactor it into a parent box with child section boxes. The 3-level hierarchy scales to hundreds of jobs while remaining manageable.
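To spot boxes that are due for refactoring, the direct children of each box can be counted from the JIL source. This is a hypothetical sketch that assumes each job definition has at most one `box_name:` attribute and that comments use `/* ... */`; the default limit of 30 simply reflects the rule of thumb above and is not an AutoSys limit.

```python
import re
from collections import Counter

COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
BOX_NAME = re.compile(r"\bbox_name:[ \t]*(\S+)")


def jobs_per_box(jil_text):
    """Count direct children per box (a nested box counts as one child)."""
    return Counter(BOX_NAME.findall(COMMENT.sub("", jil_text)))


def oversized_boxes(jil_text, limit=30):
    """Return {box: child_count} for boxes with more than `limit` direct children."""
    return {box: n for box, n in jobs_per_box(jil_text).items() if n > limit}
```

Because only direct children are counted, a master box with three section boxes reports three children no matter how many jobs sit underneath, which matches how the hierarchy is meant to be read.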
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.