sendevent: 4 Commands That Fix AutoSys
- FORCE_STARTJOB: runs job immediately, bypasses ALL conditions (time + dependencies). Use when you need output now, but know what you're skipping.
- KILLJOB: terminates running job → TERMINATED status. Downstream success() conditions won't fire. Use CHANGE_STATUS SUCCESS afterward to unblock.
- RESTART: retry a FAILED or TERMINATED job. Cleaner than FORCE_STARTJOB for retries — audit logs show intent.
- STARTJOB: respects conditions. Job starts only if its time + dependency gates are open. Practically useless in emergencies.
- Production rule: FORCE_STARTJOB without understanding dependencies corrupts data. KILLJOB without CHANGE_STATUS blocks workflows for hours.
sendevent — 60-Second Emergency Reference
Job won't start: need output now regardless of conditions
    autorep -J JOBNAME -q | grep -E 'condition|date_conditions|start_times'
    sendevent -E FORCE_STARTJOB -J JOBNAME

Job hung: needs termination
    sendevent -E KILLJOB -J JOBNAME
    sendevent -E CHANGE_STATUS -J JOBNAME -s SUCCESS

Job failed after transient error: need retry
    autorep -J JOBNAME -d | grep FAILURE
    sendevent -E RESTART -J JOBNAME

Downstream blocked on job that won't rerun
    autorep -J UPSTREAM_JOB
    sendevent -E CHANGE_STATUS -J UPSTREAM_JOB -s SUCCESS

Production Incident
The reporting job was defined with condition: success(validate_ledger) AND success(extract_transactions). The validate_ledger job had failed earlier. The on-call engineer saw the reporting job in INACTIVE status and used FORCE_STARTJOB.
FORCE_STARTJOB bypassed BOTH conditions. The job ran without validated ledger data, and its output was based on incomplete extracts. No alarm fired because the job succeeded.
The team missed the condition dependency because they only skimmed the autorep -q output, where conditions are listed but easy to overlook.
1. Run autorep -J jobname -q | grep condition first.
2. For failed dependencies, fix and restart the dependency chain, not the leaf job.
3. Add box_terminator on validation jobs so the box stops before leaf jobs can be force-started externally.
4. Create an audit script that logs all FORCE_STARTJOB events and flags any that bypassed conditions.

- Run autorep -J jobname -q | grep condition before force-starting.
- Fixing a leaf job without fixing its dependencies propagates corruption.
- Force-start is for schedule overrides, not dependency bypasses.

Production Debug Guide
What to run when jobs won't start or won't die
- Check whether downstream jobs depend on success(). If yes, use CHANGE_STATUS -s SUCCESS after kill. Verify with autorep -d.
- Use CHANGE_STATUS to satisfy downstream success() conditions without rerunning the job. Verify autorep shows SUCCESS before downstream runs.

Production AutoSys environments need manual intervention. Jobs hang. Downstream dependencies get stuck. A fix deploys and you need to rerun a failed job at 2 AM.
Knowing the exact sendevent command is table stakes. Knowing what happens after — that's the senior engineer difference.
FORCE_STARTJOB bypasses conditions. KILLJOB leaves downstream jobs waiting. RESTART is for retries, not first runs. This article covers the side effects that incident post-mortems reveal.
FORCE_STARTJOB — bypassing all conditions
FORCE_STARTJOB immediately starts a job regardless of its date_conditions, start_times, or condition dependencies. It's the 'run it now, no questions asked' command.
Critical nuance: FORCE_STARTJOB bypasses EVERYTHING. Not just the schedule. Not just the time gates. Also any condition: success(other_job) dependencies. The job runs even if its upstream dependencies never ran or failed.
This is the most dangerous sendevent command. Use it only when you fully understand what conditions exist on the job.
    # ALWAYS check what conditions you're about to bypass first
    autorep -J daily_report -q | grep condition

    # Force start a single job immediately
    sendevent -E FORCE_STARTJOB -J daily_report

    # Force start a BOX (starts the box, inner jobs still follow their conditions)
    sendevent -E FORCE_STARTJOB -J eod_processing_box

    # Queue the force start for a specific time
    # (-T takes "MM/DD/YYYY HH:MM"; -q sets queue priority, not a run date)
    sendevent -E FORCE_STARTJOB -J daily_report -T "03/19/2026 02:00"

    # Check current status
    autorep -J daily_report
condition: success(extract_trades) AND success(validate_positions)
/* Event sent: FORCE_STARTJOB for daily_report */
/* Job daily_report: STARTING → RUNNING (02:00:01) */
/* Both dependencies were skipped entirely */
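One way to make the "check conditions first" habit hard to skip is to put it in a small wrapper. The sketch below is hypothetical: guarded_force_start and its --ack flag are names invented here, and it assumes autorep -J JOB -q prints the JIL definition with the condition attribute on its own line.

```shell
# Hypothetical helper (not part of AutoSys): refuse to force-start a job
# that has a dependency condition unless the operator passes --ack.
# Assumes `autorep -J <job> -q` prints JIL with "condition:" on its own line.
guarded_force_start() {
    gfs_job=$1
    gfs_ack=${2:-}

    # Collect the dependency condition that FORCE_STARTJOB would skip.
    gfs_cond=$(autorep -J "$gfs_job" -q | grep -E '^[[:space:]]*condition:' || true)

    if [ -n "$gfs_cond" ] && [ "$gfs_ack" != "--ack" ]; then
        echo "Refusing to force-start $gfs_job; it would bypass:" >&2
        echo "$gfs_cond" >&2
        echo "Re-run with --ack once each dependency is verified." >&2
        return 1
    fi

    sendevent -E FORCE_STARTJOB -J "$gfs_job"
}
```

Calling guarded_force_start daily_report prints the condition and stops. Adding --ack as the second argument sends the event. Jobs with no condition line are started without the extra step.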
KILLJOB — terminating a running job
KILLJOB sends a termination signal to the process running on the agent machine. The job moves to TERMINATED status. Any downstream jobs waiting on success() of this job will not start.
Critical nuance: TERMINATED is NOT FAILURE. It's a separate status. A killed job satisfies neither success() nor failure() conditions; a downstream job reacts to a kill only if its condition tests terminated() or done() explicitly.
After KILLJOB, the process receives SIGTERM. Well-behaved processes can catch this and clean up. Hung processes may need SIGKILL (AutoSys handles this escalation after a timeout, typically 30 seconds).
    # Kill a running job
    sendevent -E KILLJOB -J hung_etl_job

    # After killing, check status
    autorep -J hung_etl_job

    # Check if any downstream jobs are blocked (current condition status)
    job_depends -c -J hung_etl_job

    # If downstream jobs must proceed and the work was actually done,
    # manually mark the killed job as success
    sendevent -E CHANGE_STATUS -J hung_etl_job -s SUCCESS

    # Check downstream jobs are now unblocked
    autorep -J downstream_job
hung_etl_job TE -- <- TERMINATED after KILLJOB
Dependents:
downstream_job: waiting on success(hung_etl_job) → condition false
After CHANGE_STATUS SUCCESS:
downstream_job: condition met → job will start normally
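Before killing a job it helps to know which jobs reference it. As a rough sketch, the filter below scans JIL text for condition lines that mention a given job. jobs_depending_on is a hypothetical name, and the input is assumed to look like autorep -J % -q output: each definition starts with "insert_job: NAME" and the condition sits on its own "condition:" line.

```shell
# Hypothetical helper: read JIL on stdin and print every job whose
# condition mentions the target job, either as "(target)" or as
# "(target, lookback)". It matches any condition type, not only success().
jobs_depending_on() {
    awk -v target="$1" '
        $1 == "insert_job:" { current = $2 }
        $1 == "condition:" &&
            (index($0, "(" target ")") > 0 || index($0, "(" target ",") > 0) {
            print current
        }
    '
}
```

Typical use would be autorep -J % -q | jobs_depending_on hung_etl_job. The match is a plain substring test, so it does not say whether the dependency is currently satisfied, only that one exists.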
RESTART — retrying a failed job
The RESTART event tells AutoSys to rerun a job that is in FAILURE or TERMINATED status. It's cleaner than FORCE_STARTJOB for rerunning failed jobs because it signals intent as a retry in audit logs.
Key difference from FORCE_STARTJOB: RESTART works only on FAILURE or TERMINATED jobs. FORCE_STARTJOB works on any non-running state. RESTART also respects that this is a retry — some AutoSys configurations treat retries differently for alerting purposes.
RESTART does NOT bypass conditions. The job still needs its start conditions satisfied (unless they were the reason it failed — then you have a cycle).
    # Before restarting, check WHY it failed
    autorep -J failed_extract_job -d

    # Look up the job's command so you can run it by hand on the agent
    # and confirm the fix works
    autorep -J failed_extract_job -q | grep command

    # Restart the failed job after fixing the root cause
    sendevent -E RESTART -J failed_extract_job

    # Pattern: find failed jobs and restart them (for transient failures only).
    # Assumes the default summary layout, where the status code is column 6.
    autorep -J % | awk '$6 == "FA" {print $1}' | while read -r job; do
        echo "Restarting failed job: $job"
        sendevent -E RESTART -J "$job"
    done

    # For TERMINATED jobs (killed), RESTART also works
    sendevent -E RESTART -J killed_job
Restarting failed job: load_positions
/* Job extract_trades: FAILURE → STARTING → RUNNING */
/* Restart treated as a new run, not a continuation */
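A bulk restart loop is only safe when the failures really are transient. If half the schedule has failed, the cause is probably not transient. The sketch below adds a cap to the loop: job names arrive on stdin, and nothing is sent when the batch is larger than the limit. restart_capped and its default cap of 5 are assumptions made for illustration, and the RESTART event name is used as given in this article.

```shell
# Hypothetical guard around a restart loop. Reads job names from stdin
# (one per line) and sends nothing if there are more jobs than the cap.
restart_capped() {
    rc_max=${1:-5}
    rc_jobs=$(grep -v '^[[:space:]]*$' || true)
    rc_count=$(printf '%s\n' "$rc_jobs" | grep -c . || true)

    if [ "$rc_count" -gt "$rc_max" ]; then
        echo "$rc_count failed jobs exceeds cap of $rc_max; investigate before restarting" >&2
        return 1
    fi

    printf '%s\n' "$rc_jobs" | while IFS= read -r rc_job; do
        [ -n "$rc_job" ] || continue
        echo "Restarting failed job: $rc_job"
        sendevent -E RESTART -J "$rc_job" < /dev/null
    done
}
```

Feed it whatever list of failed job names you trust, for example the output of a status filter over autorep, followed by | restart_capped 3.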
CHANGE_STATUS — manual status override
CHANGE_STATUS is the nuclear option. It manually sets a job's status in the Event Server without running anything. This is how you unblock workflows when a job can't or shouldn't be rerun.
Most common use: A job was killed (KILLJOB) but the work was already done. No point rerunning. Set its status to SUCCESS to fire downstream success() conditions.
Risks: CHANGE_STATUS bypasses ALL validation. No agent communication. No process verification. You're telling AutoSys 'trust me, this job is in this state'. If you're wrong, downstream processes run on incorrect assumptions.
    # After a KILLJOB, mark it as SUCCESS to unblock downstream
    sendevent -E CHANGE_STATUS -J hung_job -s SUCCESS

    # Verify the status change took effect (the ST column should show SU)
    autorep -J hung_job

    # Manually mark a job as FAILURE (e.g., if you see it's going to fail)
    sendevent -E CHANGE_STATUS -J running_job -s FAILURE

    # Reset a job to INACTIVE (use JOB_ON_HOLD if the goal is to stop it from running)
    sendevent -E CHANGE_STATUS -J job_on_hold -s INACTIVE

    # Dangerous: mark a job as SUCCESS that never ran.
    # Only do this if you have verified the work was done elsewhere.
    sendevent -E CHANGE_STATUS -J never_ran_job -s SUCCESS
/* Downstream jobs with success(hung_job) now evaluate as true */
/* No actual process ran — you are asserting correctness */
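Because a status override is an assertion rather than a run, it is worth recording who made it and why. A minimal sketch, assuming a local append-only log is acceptable: audited_change_status, the AUDIT_LOG variable, and the default log path are all hypothetical names chosen for this example.

```shell
# Hypothetical wrapper: refuse a status override without a written reason,
# and append who/when/why to a log file before sending the event.
# AUDIT_LOG and its default path are illustrative assumptions.
audited_change_status() {
    acs_job=$1
    acs_status=$2
    acs_reason=${3:-}
    acs_log=${AUDIT_LOG:-./sendevent_audit.log}

    if [ -z "$acs_reason" ]; then
        echo "usage: audited_change_status JOB STATUS \"reason\"" >&2
        return 2
    fi

    # Write the audit line first; if logging fails, do not send the event.
    printf '%s user=%s job=%s status=%s reason=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(id -un)" \
        "$acs_job" "$acs_status" "$acs_reason" >> "$acs_log" || return 1

    sendevent -E CHANGE_STATUS -J "$acs_job" -s "$acs_status"
}
```

For example: audited_change_status hung_job SUCCESS "output file verified by hand, ticket open". Writing the log line before sending means a failed write blocks the override instead of letting it go unrecorded.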
| Event | Respects conditions? | Works on status | What actually happens | Typical use |
|---|---|---|---|---|
| FORCE_STARTJOB | No — bypasses all | Any non-running state | Job starts immediately, no condition checks | Run job outside its schedule (fixing missed window) |
| STARTJOB | Yes | INACTIVE / ACTIVATED | Job starts only if time and conditions are met | Trigger after fixing a condition (rarely used) |
| RESTART | Yes | FAILURE / TERMINATED | Reruns the failed job as a new run, recorded as a retry | Rerun after fixing a transient failure |
| KILLJOB | N/A | RUNNING only | SIGTERM to agent process → TERMINATED status | Terminate hung or infinite-loop job |
| CHANGE_STATUS + SUCCESS | N/A | Any | Manually sets status without running anything | Unblock downstream after manual verification |
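
The "Works on status" column can be turned into a lookup for runbooks or wrapper scripts. The function below only restates the table above and adds no AutoSys behavior of its own. events_for_status is a hypothetical name, and it expects the full status word rather than the two-letter code.

```shell
# Hypothetical lookup that mirrors the table above: given a job status,
# print the events the table lists as applicable to it.
events_for_status() {
    case $1 in
        RUNNING)            echo "KILLJOB CHANGE_STATUS" ;;
        FAILURE|TERMINATED) echo "RESTART FORCE_STARTJOB CHANGE_STATUS" ;;
        INACTIVE|ACTIVATED) echo "STARTJOB FORCE_STARTJOB CHANGE_STATUS" ;;
        SUCCESS)            echo "FORCE_STARTJOB CHANGE_STATUS" ;;
        *)
            echo "unknown status: $1" >&2
            return 1
            ;;
    esac
}
```

Running events_for_status TERMINATED prints RESTART FORCE_STARTJOB CHANGE_STATUS. Which of those is appropriate still depends on whether the work finished and what the job's conditions are.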
🎯 Key Takeaways
- FORCE_STARTJOB bypasses ALL conditions — time AND dependencies. Fix dependencies first.
- KILLJOB → TERMINATED. Downstream success() won't fire. Use CHANGE_STATUS to unblock if work was done.
- RESTART is for retrying FAILED/TERMINATED jobs after transient failures. Check failure reason first.
- CHANGE_STATUS is a manual override that doesn't trigger dependency re-evaluation automatically.
- Always check conditions before force-starting: autorep -J JOB -q | grep condition.
- Document every manual intervention. If you're using these commands weekly, your automation is broken.
Interview Questions on This Topic
- Q: What is the difference between FORCE_STARTJOB and STARTJOB? (Junior)
- Q: What happens to a job's status after KILLJOB? (Junior)
- Q: What is the difference between RESTART and FORCE_STARTJOB for a failed job? (Mid-level)
- Q: How would you unblock dependent jobs after a job was killed? (Mid-level)
- Q: If term_run_time terminates a job, what status does it move to? (Junior)
- Q: What are the risks of using CHANGE_STATUS to mark a job as SUCCESS when it didn't actually run? (Senior)
Frequently Asked Questions
How do I force start an AutoSys job from the command line?
Use sendevent -E FORCE_STARTJOB -J jobname. This starts the job immediately, bypassing all starting conditions including date_conditions, start_times, and condition dependencies.
Before force-starting, always check dependencies: autorep -J jobname -q | grep condition. If the job has dependencies, fix them first — force-starting a job that depends on missing data produces corrupt output.
What happens after KILLJOB in AutoSys?
The job moves to TERMINATED status. Any downstream jobs with condition: success(killed_job) will not run because success was never declared.
If you need downstream jobs to proceed and the work was actually done:
1. sendevent -E KILLJOB -J jobname
2. sendevent -E CHANGE_STATUS -J jobname -s SUCCESS
If the work wasn't done, fix the root cause then sendevent -E RESTART -J jobname.
What is the difference between RESTART and FORCE_STARTJOB?
Both rerun a job, but they signal different intent:
- RESTART: Works only on FAILURE or TERMINATED jobs. Semantically a retry after failure. Cleaner in audit logs.
- FORCE_STARTJOB: Works on any non-running state. Bypasses all conditions. Use for schedule overrides, not routine retries.
For a failed job, prefer RESTART. For a job that never ran (INACTIVE) that you need to run outside its schedule, use FORCE_STARTJOB.
Can I force start a job that is inside a BOX?
Yes, but it's usually the wrong approach. Force-starting a child job inside a non-running BOX leads to inconsistent state — the BOX may still show INACTIVE while the child runs.
Better approach: Force-start the BOX itself: sendevent -E FORCE_STARTJOB -J BOXNAME. The child job will start naturally if its conditions are met.
If you must run only the child, either move it outside the BOX temporarily or accept that the BOX state will be inconsistent.
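A quick way to find out whether a job lives inside a BOX before force-starting it is to read its definition. The helper below is a sketch. parent_box is a hypothetical name, and it assumes the JIL printed by autorep -J JOB -q carries a box_name: attribute for boxed jobs.

```shell
# Hypothetical helper: print the parent box of a job, or nothing if the
# job is standalone. Scans the JIL for a "box_name:" attribute.
parent_box() {
    autorep -J "$1" -q | awk '
        {
            for (i = 1; i < NF; i++)
                if ($i == "box_name:") { print $(i + 1); exit }
        }
    '
}
```

If parent_box daily_report prints a box name, consider force-starting that box instead of the child.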
What status does a job have after term_run_time kills it?
TERMINATED (TE) — the same status as KILLJOB. The exit code will be -1 or a signal number. Check autorep -J jobname -d to see the exact exit code and confirm it was term_run_time that caused it.
Downstream success() conditions will NOT trigger. You need to either RESTART the job (if it needs to rerun) or CHANGE_STATUS to SUCCESS (if the work was already done before timeout).
Does CHANGE_STATUS automatically trigger downstream jobs?
No. CHANGE_STATUS updates the stored status in the Event Server but does NOT automatically trigger a full dependency re-evaluation. This is a common misconception.
If downstream jobs stay blocked after the status change:
- Send a dummy event to trigger re-evaluation (e.g., sendevent -E JOB_STATUS_CHANGED)
- Force-start the downstream job directly
- For critical paths, the upstream job may need to actually rerun
Test your specific case in a dev environment before relying on CHANGE_STATUS to unblock workflows.
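When testing this in a dev environment, it is easier to poll the downstream job's status than to watch the console. The sketch below is hypothetical throughout. job_status assumes the default autorep summary layout, where the two-letter status code is the first two-capital-letter token after the job name. wait_for_status is an illustrative name and its defaults are arbitrary.

```shell
# Hypothetical helper: print the two-letter status code (SU, FA, TE, RU, ...)
# from the autorep summary line for one job.
job_status() {
    autorep -J "$1" | awk -v job="$1" '
        $1 == job {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^[A-Z][A-Z]$/) { print $i; exit }
        }
    '
}

# Hypothetical helper: poll until the job reaches the wanted status code.
# Args: job, wanted code, number of tries (default 10), seconds between
# tries (default 30). Returns 0 on a match, 1 if the tries run out.
wait_for_status() {
    wfs_job=$1
    wfs_want=$2
    wfs_tries=${3:-10}
    wfs_sleep=${4:-30}

    while [ "$wfs_tries" -gt 0 ]; do
        [ "$(job_status "$wfs_job")" = "$wfs_want" ] && return 0
        wfs_tries=$((wfs_tries - 1))
        [ "$wfs_tries" -gt 0 ] && sleep "$wfs_sleep"
    done
    return 1
}
```

After sending CHANGE_STATUS on the upstream job, something like wait_for_status downstream_job RU 10 30 || echo "downstream did not start" shows whether the override actually released the dependency.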
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.