AutoSys FORCE_STARTJOB — Condition Bypass Corrupts Data
FORCE_STARTJOB bypassed validate_ledger and extract_transactions conditions.
- FORCE_STARTJOB: runs job immediately, bypasses ALL conditions (time + dependencies). Use when you need output now, but know what you're skipping.
- KILLJOB: terminates running job → TERMINATED status. Downstream success() conditions won't fire. Use CHANGE_STATUS SUCCESS afterward to unblock.
- RESTART: retry a FAILED or TERMINATED job. Cleaner than FORCE_STARTJOB for retries — audit logs show intent.
- STARTJOB: respects conditions. Job starts only if its time + dependency gates are open. Practically useless in emergencies.
- Production rule: FORCE_STARTJOB without understanding dependencies corrupts data. KILLJOB without CHANGE_STATUS blocks workflows for hours.
Force starting a job is like overriding the traffic light and going anyway. Killing a job is like hitting the emergency stop button. Restarting is like pressing the retry button after a failure. These are your emergency controls for when the normal flow needs intervention.
Production AutoSys environments need manual intervention. Jobs hang. Downstream dependencies get stuck. A fix deploys and you need to rerun a failed job at 2 AM.
Knowing the exact sendevent command is table stakes. Knowing what happens after — that's the senior engineer difference.
FORCE_STARTJOB bypasses conditions. KILLJOB leaves downstream jobs waiting. RESTART is for retries, not first runs. This article covers the side effects that incident post-mortems reveal.
FORCE_STARTJOB — bypassing all conditions
FORCE_STARTJOB immediately starts a job regardless of its date_conditions, start_times, or condition dependencies. It's the 'run it now, no questions asked' command.
Critical nuance: FORCE_STARTJOB bypasses EVERYTHING. Not just the schedule. Not just the time gates. Also any condition: success(other_job) dependencies. The job runs even if its upstream dependencies never ran or failed.
This is the most dangerous sendevent command. Use it only when you fully understand what conditions exist on the job.
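One way to enforce that check is a small pre-flight wrapper. The sketch below is an illustration, not an AutoSys feature: it assumes you have already captured the text printed by autorep -J jobname -q, and the sample job definition (daily_report and its command path) is hypothetical. It only builds the sendevent argument list and does not execute anything.

```python
import re
from typing import List


def find_conditions(autorep_q_output: str) -> List[str]:
    """Return condition expressions found in captured `autorep -J <job> -q` text."""
    conditions = []
    for line in autorep_q_output.splitlines():
        # Anchored match, so `date_conditions:` is not mistaken for `condition:`.
        match = re.match(r"\s*condition:\s*(.+)", line)
        if match:
            conditions.append(match.group(1).strip())
    return conditions


def force_start_command(job: str, autorep_q_output: str,
                        acknowledged: bool = False) -> List[str]:
    """Build the sendevent argv; refuse if conditions exist and were not acknowledged."""
    conditions = find_conditions(autorep_q_output)
    if conditions and not acknowledged:
        raise RuntimeError(
            f"{job} has conditions FORCE_STARTJOB would bypass: {conditions}"
        )
    return ["sendevent", "-E", "FORCE_STARTJOB", "-J", job]


# Hypothetical job definition, shaped like JIL attribute lines.
SAMPLE = """\
insert_job: daily_report   job_type: CMD
command: /opt/reports/run.sh
date_conditions: 1
condition: success(validate_ledger) AND success(extract_transactions)
"""

if __name__ == "__main__":
    print(find_conditions(SAMPLE))
```

Requiring an explicit acknowledged=True forces the operator to read the dependency list before the force-start goes out.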
KILLJOB — terminating a running job
KILLJOB sends a termination signal to the process running on the agent machine. The job moves to TERMINATED status. Any downstream jobs waiting on success() of this job will not start.
Critical nuance: TERMINATED is NOT FAILURE. It's a separate status. A job that's killed doesn't trigger success() conditions, but it also doesn't trigger failure() conditions unless you explicitly check for TERMINATED.
After KILLJOB, the process receives SIGTERM. Well-behaved processes can catch this and clean up. Hung processes may need SIGKILL (AutoSys handles this escalation after a timeout, typically 30 seconds).
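What "well-behaved" can look like in a job script is sketched below. It assumes the job is a Python batch loop; the function names and the batch structure are hypothetical. The handler only records the request, and the loop stops at the next safe point so that no batch is left half-written.

```python
import signal

# Shared flag: set by the handler, read by the work loop.
state = {"stop_requested": False}


def handle_sigterm(signum, frame):
    """Record the termination request; the loop exits at a safe point."""
    state["stop_requested"] = True


def install_handler():
    """Register the handler (call once, from the main thread, at startup)."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def process_batches(batches, on_batch):
    """Process batches until finished or until SIGTERM was seen; return count done."""
    completed = 0
    for batch in batches:
        if state["stop_requested"]:
            break  # stop between batches, never in the middle of one
        on_batch(batch)
        completed += 1
    return completed


if __name__ == "__main__":
    install_handler()
    done = process_batches(range(3), lambda b: print("processed batch", b))
    print("completed", done, "batches")
```

A job that finishes its current batch and exits promptly should not need the SIGKILL escalation described above.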
RESTART — retrying a failed job
The RESTART event tells AutoSys to rerun a job that is in FAILURE or TERMINATED status. It's cleaner than FORCE_STARTJOB for rerunning failed jobs because it signals intent as a retry in audit logs.
Key difference from FORCE_STARTJOB: RESTART works only on FAILURE or TERMINATED jobs. FORCE_STARTJOB works on any non-running state. RESTART also respects that this is a retry — some AutoSys configurations treat retries differently for alerting purposes.
RESTART does NOT bypass conditions. The job still needs its start conditions satisfied (unless they were the reason it failed — then you have a cycle).
CHANGE_STATUS — manual status override
CHANGE_STATUS is the nuclear option. It manually sets a job's status in the Event Server without running anything. This is how you unblock workflows when a job can't or shouldn't be rerun.
Most common use: A job was killed (KILLJOB) but the work was already done. No point rerunning. Set its status to SUCCESS to fire downstream success() conditions.
Risks: CHANGE_STATUS bypasses ALL validation. No agent communication. No process verification. You're telling AutoSys 'trust me, this job is in this state'. If you're wrong, downstream processes run on incorrect assumptions.
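A toy model makes the effect easier to see. This is not AutoSys code: the dictionary stands in for the Event Server, the job names are hypothetical, and success() and failure() here only mimic how the article describes condition evaluation.

```python
# Toy stand-in for the Event Server's status table (hypothetical job names).
statuses = {"load_positions": "TERMINATED", "publish_positions": "INACTIVE"}


def success(job: str) -> bool:
    """Mimics success(job): true only for SUCCESS."""
    return statuses[job] == "SUCCESS"


def failure(job: str) -> bool:
    """Mimics failure(job): true only for FAILURE, so TERMINATED does not count."""
    return statuses[job] == "FAILURE"


def change_status(job: str, new_status: str) -> None:
    """Overwrite the recorded status. Nothing runs and nothing is verified."""
    statuses[job] = new_status


if __name__ == "__main__":
    # After KILLJOB neither condition is satisfied, so dependents just wait.
    print(success("load_positions"), failure("load_positions"))
    change_status("load_positions", "SUCCESS")
    # The override alone is enough to satisfy success(load_positions).
    print(success("load_positions"))
```

The override changes only the record. Whether the killed job's work actually finished is something the operator has to confirm separately.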
The FORCE_STARTJOB That Corrupted the Ledger
The reporting job was defined with condition: success(validate_ledger) AND success(extract_transactions). The validate_ledger job had failed earlier. The on-call engineer saw the reporting job in INACTIVE status and used FORCE_STARTJOB.
FORCE_STARTJOB bypassed BOTH conditions. The job ran without validated ledger data. The output was based on incomplete extracts. No alarm fired because the job succeeded.
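The gap between the two events in this incident can be reduced to a few lines. The sketch below is a simplified model, not AutoSys logic: it ignores time gates, and the status shown for extract_transactions is an assumption, since the incident only says its extracts were incomplete.

```python
from typing import Dict, List


def dependencies_met(required: List[str], statuses: Dict[str, str]) -> bool:
    """True only when every upstream job is recorded as SUCCESS."""
    return all(statuses.get(job) == "SUCCESS" for job in required)


def job_will_run(event: str, required: List[str], statuses: Dict[str, str]) -> bool:
    """STARTJOB honours the dependency gate; FORCE_STARTJOB ignores it."""
    if event == "FORCE_STARTJOB":
        return True
    if event == "STARTJOB":
        return dependencies_met(required, statuses)
    raise ValueError(f"unsupported event: {event}")


REQUIRED = ["validate_ledger", "extract_transactions"]
# validate_ledger = FAILURE comes from the incident; RUNNING is assumed.
STATUSES = {"validate_ledger": "FAILURE", "extract_transactions": "RUNNING"}

if __name__ == "__main__":
    for event in ("STARTJOB", "FORCE_STARTJOB"):
        print(event, job_will_run(event, REQUIRED, STATUSES))
```

With these statuses STARTJOB would have left the reporting job waiting, while FORCE_STARTJOB runs it against unvalidated data.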
The team didn't know about the condition dependency because they only skimmed the autorep -q output, where the condition line is easy to miss.
1. Before force-starting, run autorep -J jobname -q | grep condition first.
2. For failed dependencies, fix and restart the dependency chain, not the leaf job.
3. Add box_terminator on validation jobs so the box stops before leaf jobs can be force-started externally.
4. Create an audit script that logs all FORCE_STARTJOB events and flags any that bypassed conditions.
- FORCE_STARTJOB bypasses ALL conditions: time AND dependencies.
- Always check autorep -q | grep condition before force-starting.
- Fixing a leaf job without fixing its dependencies propagates corruption.
- Force-start is for schedule overrides, not dependency bypasses.
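The audit script from remediation item 4 could start from something like the sketch below. The event records and the job-to-condition mapping are hypothetical inputs; how you collect them (scheduler logs, a wrapper around sendevent, or something else) depends on your environment and is not shown.

```python
from typing import Dict, List


def flag_force_starts(events: List[Dict[str, str]],
                      conditions_by_job: Dict[str, str]) -> List[Dict[str, str]]:
    """Return FORCE_STARTJOB events whose target job has a condition defined."""
    flagged = []
    for event in events:
        if event["event"] != "FORCE_STARTJOB":
            continue
        condition = conditions_by_job.get(event["job"], "")
        if condition:
            # Keep the original record and attach what was bypassed.
            flagged.append({**event, "bypassed": condition})
    return flagged


# Hypothetical inputs for illustration only.
EVENTS = [
    {"event": "FORCE_STARTJOB", "job": "daily_report", "user": "oncall"},
    {"event": "KILLJOB", "job": "extract_transactions", "user": "oncall"},
    {"event": "FORCE_STARTJOB", "job": "adhoc_cleanup", "user": "oncall"},
]
CONDITIONS = {
    "daily_report": "success(validate_ledger) AND success(extract_transactions)",
}

if __name__ == "__main__":
    for hit in flag_force_starts(EVENTS, CONDITIONS):
        print(hit["job"], "bypassed:", hit["bypassed"])
```

This version flags any force-start on a job that has conditions at all. Checking whether those conditions were actually unsatisfied at the time would need the upstream statuses as well.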
Before KILLJOB, check whether downstream jobs depend on the job's success(). If yes, and the work was already done, use CHANGE_STATUS -s SUCCESS after the kill. Verify with autorep -d.
CHANGE_STATUS to SUCCESS fires downstream success() conditions without rerunning the job. Verify autorep shows SUCCESS before downstream runs.
Key takeaways
- After KILLJOB the job is TERMINATED, so downstream success() won't fire. Use CHANGE_STATUS to unblock if work was done.
Common mistakes to avoid
Force-starting a job without checking conditions first
Run autorep -J JOB -q | grep condition before FORCE_STARTJOB. If conditions exist, fix the dependency chain instead. Use RESTART on failed dependencies, then let the original job start naturally.
Killing a job but not unblocking downstream dependencies
Downstream jobs stay blocked on success(). The team manually reruns downstream jobs but misses the killed job's work.
Run job_depends -c -J killed_job to list its dependents. If downstream jobs are waiting, either:
- RESTART the killed job (if work wasn't done)
- CHANGE_STATUS to SUCCESS (if work was done elsewhere)
Using FORCE_STARTJOB when RESTART would be appropriate
Assuming CHANGE_STATUS triggers dependency re-evaluation
Using FORCE_STARTJOB on a child job inside a non-running BOX
Interview Questions on This Topic
What is the difference between FORCE_STARTJOB and STARTJOB?
Frequently Asked Questions