Senior 3 min · March 19, 2026

AutoSys File Watcher — min_file_size:0 Empty File Trap

Empty lock files trigger File Watchers before real data arrives.

Naren · Founder
Quick Answer
  • File Watcher (job_type: FW) triggers downstream jobs the moment a file arrives — event-driven, not time-based
  • watch_file supports wildcards. Be specific: /data/*.csv triggers on every CSV, including partial writes
  • min_file_size prevents empty or partial file triggers. Set to 1KB minimum — never leave at 0
  • run_window restricts active hours. Without it, stale files from last week trigger immediately on restart
  • Production trap: upstream writes temp file then renames. Your wildcard matches the temp file. Trigger on incomplete data.
Plain-English First

A File Watcher job is like a security guard sitting at the loading dock. When the truck arrives and the package (the file) is fully unloaded (reaches the minimum size), the guard calls the warehouse (triggers the next job) to start processing.

You can't schedule a job at a fixed time when you don't control when the data arrives. That's the problem File Watcher solves.

A bank sends trade files anytime between 8 AM and 6 PM. You want processing to start the moment the file lands — not poll every 5 minutes and definitely not guess a start time.

But here's what bites people: empty files trigger the watcher. Stale files from yesterday trigger the watcher. The wrong wildcard matches a temp file mid-write. This article covers the production traps that monitoring won't catch.

Defining a File Watcher job

A File Watcher job has job_type: FW (or just 'f'). The key attribute is watch_file — the full path of the file to watch for. The job completes with SUCCESS the moment it finds a file matching the pattern AND the file meets the min_file_size requirement.

Here's the non-obvious part: the File Watcher doesn't 'consume' the file. It just detects it. The file remains on disk. Downstream jobs are responsible for reading, moving, or deleting it. Nothing stops two watchers from watching the same file, but if the first one's downstream job moves or deletes it, the second may never see it, so give each file pattern a single watcher.

file_watcher.jil
insert_job: watch_settlement_file
job_type: FW
watch_file: /data/inbound/settlement_*.csv
watch_interval: 60                 /* check every 60 seconds */
min_file_size: 512                 /* at least 512 bytes (not empty) */
machine: file-landing-server
owner: batchuser
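date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "07:00"               /* assumed start; date_conditions: 1 needs start_times or start_mins */
run_window: "07:00 - 23:00"        /* only watch during this window */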
alarm_if_fail: 1
description: "Watches for daily settlement file from clearing house"

insert_job: process_settlement
job_type: CMD
command: /opt/scripts/process_settlement.sh
machine: processing-server-01
owner: batchuser
condition: success(watch_settlement_file)
Output
/* 09:32:07 — watch_settlement_file found: /data/inbound/settlement_20260319.csv (87431 bytes)
09:32:07 — watch_settlement_file: SUCCESS
09:32:08 — process_settlement: STARTING */
The Security Guard Model
  • Watcher doesn't delete, move, or read the file. It just reports existence.
  • Multiple watchers on the same file cause a race. First trigger wins.
  • The file is still there after the watcher succeeds. Your downstream job must handle it.
  • If downstream fails, the file remains. The watcher has already succeeded and stays that way until its next run, when the same file will match again.
Production Insight
A team had two File Watchers watching the same directory pattern. One triggered on file arrival, moved the file to /processed, and succeeded. The second watcher was still RUNNING, waiting for the same file pattern. The file was already moved. Second watcher stayed RUNNING forever until manually terminated.
Diagnosis: autorep showed second watcher as RUNNING. The file didn't exist anymore. No error, just stuck.
Fix: Each watcher needs a unique watch pattern or the files must be distinct. Or use a box with a single watcher and fan-out dependencies.
Rule: One watcher per file pattern. If you need multiple downstream jobs, use condition dependencies from the single watcher.
Key Takeaway
FW job = file arrival detector, not file processor.
Downstream jobs must handle file consumption — read, move, or delete.
One watcher per file pattern. Multiple watchers cause race conditions.
Watcher succeeds once per run. After success, the file stays on disk.
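
Because the watcher only detects the file, the downstream job has to take it out of the watched directory itself. Below is a minimal sketch of one way to do that, assuming a hypothetical helper named consume_file and a processor command that accepts the file path as its last argument (neither comes from AutoSys). It archives the file only after processing succeeds, so a failed run can be retried on the same file.

```shell
#!/usr/bin/env bash
# Hypothetical downstream wrapper: process one inbound file, then archive it
# so a later watcher run cannot match the same file again.

# consume_file <file> <archive_dir> <processor command...>
consume_file() {
    local file="$1" archive_dir="$2"
    shift 2

    # Run the processing step first; on failure the file stays where it is.
    "$@" "$file" || return 1

    # Move only after success, so a failed run can be retried on the same file.
    mkdir -p "$archive_dir" || return 1
    mv -- "$file" "$archive_dir/"
}
```

A call might look like consume_file /data/inbound/settlement_20260319.csv /data/processed /opt/scripts/process_settlement.sh, on the assumption that the script takes the file path as an argument. The archive directory should sit outside the watched pattern.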
Should you use a File Watcher or a scheduled CMD job with file check?
If: File arrival time is unpredictable (±4 hour window)
Use: File Watcher. Scheduled polling wastes cycles and adds latency.
If: File arrives within a predictable 30-minute window
Use: Either works. A scheduled job with a file check is simpler to debug.
If: You need to check file content before triggering
Use: CMD job with full validation. The watcher only checks size and existence.
If: Files arrive multiple times per day, same pattern
Use: File Watcher triggers once per file. For multiple files, use a box with running conditions.
Figure: File Watcher Trigger Flow (event-driven processing on file arrival). Upstream system writes file (/data/inbound/trades_*.csv) → File Watcher polls every 30s, checking the file path on the agent machine → min_file_size check passes (file ≥ configured minimum bytes) → File Watcher reports SUCCESS (condition satisfied automatically) → downstream CMD job starts (condition: success(watch_job)).

Understanding watch_file wildcards

watch_file supports the * wildcard, which matches any sequence of characters in a filename. The File Watcher triggers as soon as any file matching the pattern appears. This is especially useful when upstream systems include a date in the filename.

The wildcard matches only the filename, not subdirectories. /data/*/file.csv doesn't work. Use separate watchers for different subdirectories.

Critical nuance: The watcher triggers on the FIRST file matching the pattern. If multiple files arrive simultaneously, the watcher triggers on one, succeeds, and the other files are never detected by that watcher instance. Use atomic file naming to control which file triggers.

wildcard_examples.jil
/* Match any CSV file starting with 'trades_' */
watch_file: /data/inbound/trades_*.csv

/* Match any file in the directory (dangerous — matches everything) */
watch_file: /data/inbound/*

/* Match files with a date-stamped name pattern */
watch_file: /data/feeds/FEED_*_DONE.txt

/* Be specific to avoid triggering on partial/temp files */
watch_file: /data/inbound/FINAL_trades_*.csv    /* not temp_trades_*.csv */

/* Safe pattern for atomic writes: write to .tmp, rename to .ready */
/* Watcher watches .ready files only */
watch_file: /data/inbound/*.ready
Wildcards trigger on the first match only
If watch_file matches multiple files (e.g., /data/*.csv and three CSVs arrive at once), the watcher triggers on whichever file the filesystem returns first. The other files are not detected. Use unique patterns or single-file-per-trigger designs.
Production Insight
A data pipeline dropped 15 partition files into a directory simultaneously. One File Watcher with pattern /data/partition_*.parquet triggered on partition_03.parquet, succeeded, and processing began. The other 14 files sat unprocessed until the next day's batch.
The team assumed the watcher would queue multiple triggers. It doesn't.
Fix: Changed upstream to write a single manifest file after all partitions are ready. The watcher watches the manifest. Downstream reads all partition files.
Alternative: Use a box with a watcher and a command job that loops through all matching files.
Rule: File Watcher = one trigger per run. If you need multiple files detected, redesign the file pattern or upstream write pattern.
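
The "command job that loops through all matching files" alternative can be sketched as below. This is an illustration under assumptions, not an AutoSys feature: process_all is a hypothetical helper, and the processor command is assumed to take one file path as its last argument and to write nothing to standard output.

```shell
#!/usr/bin/env bash
# Hypothetical downstream helper: handle every file matching a glob,
# not just the one file that happened to satisfy the watcher.

# process_all <dir> <glob> <processor command...>
# Prints the number of files processed; stops at the first failure.
process_all() {
    local dir="$1" pattern="$2" file count=0
    shift 2

    # $pattern is left unquoted on purpose so the shell expands the glob.
    for file in "$dir"/$pattern; do
        [ -f "$file" ] || continue   # an unmatched glob stays literal; skip it
        "$@" "$file" || return 1
        count=$((count + 1))
    done

    echo "$count"
}
```

In the partition example, the downstream job would call something like process_all /data 'partition_*.parquet' followed by its own loader command. This only helps once all files have landed, which is why the manifest approach remains the safer trigger.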
Key Takeaway
* matches any characters in filename, not subdirectories.
First matching file wins. Later files are ignored by that watcher run.
Use atomic file naming: write to .tmp, rename to .ready.
For multiple files, use a manifest file pattern.
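
On the upstream side, the write-then-rename idea looks roughly like this. It is a sketch, assuming a hypothetical helper publish_atomically, that the temporary and final names live on the same filesystem (so the rename is atomic), and that the watcher pattern does not match names ending in .tmp.

```shell
#!/usr/bin/env bash
# Hypothetical upstream helper: write data under a temporary name, then
# rename it into place so a watcher never sees a half-written file.

# publish_atomically <final_path>   (data is read from stdin)
publish_atomically() {
    local final="$1"
    local tmp="${final}.tmp"

    # Write the full payload to the temporary name first.
    cat > "$tmp" || { rm -f -- "$tmp"; return 1; }

    # A rename within one filesystem is atomic: the final name appears
    # only once the content is complete.
    mv -- "$tmp" "$final"
}
```

For example, generate_trades | publish_atomically /data/inbound/trades_20260319.ready pairs with watch_file: /data/inbound/*.ready (generate_trades is a placeholder for whatever produces the data). The temporary name ends in .ready.tmp, so that pattern does not match it.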

run_window — limiting when the watcher is active

Without run_window, a File Watcher with date_conditions: 0 keeps watching around the clock once it starts. That's usually not what you want: if a stale file from last week is still in the directory when the watcher starts, it triggers immediately on the old file.

run_window restricts the hours during which the File Watcher will detect the file. Outside the window, it won't trigger even if the file is there. When the window opens again, the watcher checks for files — if an old file is still present, it will trigger at window open.

Critical: run_window does NOT prevent the watcher from seeing old files when the window opens. It only restricts when detection is active. File age is not considered.

run_window.jil
/* Only watch for the file between 6 AM and 8 PM */
insert_job: watch_eod_file
job_type: FW
watch_file: /data/eod/eod_positions_*.dat
watch_interval: 30
min_file_size: 10240
machine: data-server-01
owner: batchuser
run_window: "06:00 - 20:00"   /* won't trigger outside these hours */
alarm_if_fail: 1

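/* To avoid stale files at window open, put the date in the watched name.      */
/* JIL does not run shell command substitution, so the line below is           */
/* pseudocode: the date must come from a variable your site resolves each day. */
/*   watch_file: /data/eod/eod_positions_$(date +%Y%m%d).dat                   */
/* Or use a wrapper CMD job that checks file timestamp before processing       */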
run_window doesn't filter by file age
When the window opens (06:00 in the example above), if a file from yesterday still exists, the watcher triggers immediately on the stale file. Use date-stamped file names or a downstream age check to prevent this.
Production Insight
A Friday night file arrived at 11 PM, just before the weekend. The File Watcher had run_window: "09:00 - 17:00" (business hours only). The file sat in the directory all weekend. Monday at 9:00 AM, the watcher started, saw the file, and triggered immediately. Downstream jobs processed Friday's data on Monday morning, overwriting Monday's expected data.
Recovery: restore from backup, reprocess Friday correctly, mark Monday as errored.
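Fix: Use date-stamped watch_file paths such as /data/feed_YYYYMMDD.csv with today's date filled in, so the watcher only matches today's file and ignores stale files from other dates. The shorthand watch_file: /data/feed_$(date +%Y%m%d).csv is pseudocode: JIL does not run shell command substitution, so the date has to come from something your site resolves before the agent checks the path (for example a variable refreshed daily).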
Alternative: Downstream job checks file modification time. If > 12 hours old, fails with alert instead of processing stale data.
Rule: run_window controls WHEN detection happens. Date-stamped paths or age checks control WHICH files are valid.
Key Takeaway
run_window restricts active hours — no detection outside window.
At window open, the watcher sees ALL matching files, including stale ones.
Use date-stamped watch paths or downstream age checks for freshness.
run_window alone doesn't prevent stale-file triggers.
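
The downstream age check can be as small as the sketch below. It assumes a hypothetical helper is_fresh and a find that supports -mmin (GNU and BSD find both do, though it is not in POSIX). The 720-minute threshold mirrors the 12-hour example above and is a policy choice, not an AutoSys setting.

```shell
#!/usr/bin/env bash
# Hypothetical downstream guard: refuse to process a file that is too old.

# is_fresh <file> <max_age_minutes>
# Returns 0 if the file was modified within the last <max_age_minutes>,
# 1 if it is older, 2 if it does not exist.
is_fresh() {
    local file="$1" max_age_min="$2"
    [ -f "$file" ] || return 2

    # find prints the path only when the mtime is newer than the cutoff.
    [ -n "$(find "$file" -mmin "-$max_age_min" -print)" ]
}
```

A processing script could start with is_fresh "$file" 720 || exit 1 so that a stale file fails the job and raises an alarm instead of overwriting current data.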

Troubleshooting File Watcher jobs

Problem: File Watcher stays RUNNING after file arrived
Check: Is the file smaller than min_file_size? Is the file on the machine specified in the machine attribute? Is the agent on that machine running? Is current time inside run_window?

Problem: File Watcher triggered on wrong file
Check: Is your wildcard pattern too broad? Did an old file match? Did a temp file match? Use autorep -J FW_JOB -q to see the exact watch_file pattern.

Problem: File Watcher didn't trigger at all
Check: Is the current time within run_window? Is date_conditions set correctly (must be 1 if using start_times)? Is the file in the exact path specified (case-sensitive on Linux)?

Problem: File Watcher triggers but downstream fails saying file is empty
Check: min_file_size likely too low. Upstream may have written a zero-byte lock file first.

fw_troubleshoot.sh
# Check File Watcher job status
autorep -J watch_settlement_file -d

# Check the exact watch_file attribute set
autorep -J watch_settlement_file -q

# Verify the file actually exists on the target machine
ssh file-landing-server 'ls -la /data/inbound/settlement_*.csv'

# Check agent is running on the watcher machine
autoping -m file-landing-server

# Check the clock and time zone on the agent machine (run_window is time-based)
ssh file-landing-server 'date'

# Check if file is below min_file_size
ssh file-landing-server 'wc -c /data/inbound/settlement_*.csv'
Output
Job Name: watch_settlement_file
Status: RU <- still running — file not yet found
watch_file: /data/inbound/settlement_*.csv
min_file_size: 512
run_window: 07:00 - 23:00
File on agent: /data/inbound/settlement_20260319.csv (0 bytes) <- problem: empty file, min_file_size 512
The autorep -q command shows the exact watch_file pattern
autorep -J FW_JOB -q prints the job definition in JIL form, exactly as stored, including the literal watch_file pattern. Always check this before debugging, because the pattern may not be what you think.
Production Insight
A team spent 4 hours debugging a File Watcher that wouldn't trigger. The watch_file was /data/inbound/*_feed.csv. A file named ABC_feed.csv existed in the directory. Everything looked correct.
The problem: The file was on a different machine than the machine attribute. The agent on machineA was checking its local /data/inbound directory. The file was on machineB.
Diagnosis would have taken 30 seconds: ssh machineA 'ls -la /data/inbound/*_feed.csv' returned nothing.
Rule: Always verify file existence on the EXACT machine specified in the job's machine attribute. 'ls' on your workstation proves nothing.
Key Takeaway
Stuck RUNNING? Check min_file_size and file existence on the right machine.
Wrong trigger? Wildcard too broad or stale file in directory.
No trigger? Check run_window and date_conditions.
Always verify file path on the agent machine — not your local workstation.
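
To make the "file smaller than min_file_size" check repeatable, the size comparison can be wrapped in a small helper and run on the agent host. This is a sketch with a hypothetical function name, meets_min_size. It only approximates the byte comparison and says nothing about any other checks the agent performs.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: compare a file's size with a min_file_size value.

# meets_min_size <file> <min_bytes>
# Returns 0 if the file has at least <min_bytes> bytes, 1 if it is smaller,
# 2 if it does not exist.
meets_min_size() {
    local file="$1" min_bytes="$2" size
    [ -f "$file" ] || return 2

    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge "$min_bytes" ]
}
```

Run against the example above, meets_min_size /data/inbound/settlement_20260319.csv 512 would fail for the 0-byte file, which matches the watcher staying in RUNNING.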
Production incident · Post-mortem · Severity: high

The Empty File That Crashed the Data Warehouse

Symptom
File Watcher job shows SUCCESS. Downstream processing job fails immediately with 'no data found' or 'file empty' errors. The actual data file exists in the directory with correct size and timestamp, created seconds after the watcher triggered.
Assumption
The team assumed min_file_size: 0 was safe — 'the file will have data when it arrives'. They didn't know the upstream system wrote a 0-byte placeholder before the real file.
Root cause
min_file_size: 0 (or omitted) means ANY file, including empty ones, triggers the watcher. Upstream systems often write lock files, temp files, or zero-byte placeholders before writing actual data. The File Watcher can't distinguish intent — if a file matches watch_file and meets min_file_size (0 bytes qualifies), it triggers immediately. The fix seems obvious in hindsight, but teams discover it only after the first empty-file incident.
Fix
1. Set min_file_size: 1024 (1KB) minimum. Real data files are rarely smaller.
2. For CSV/JSON files, watch out for header-only files, which can pass a small threshold. Use min_file_size: 10240 (10KB) when real data files are always larger than that.
3. Add downstream validation that checks file content, not just existence.
4. Work with upstream teams to use atomic file moves: write to .tmp, then rename to .csv.
Key lesson
  • Never set min_file_size: 0. 1KB minimum for text files, 10KB for data files.
  • Upstream write patterns matter. Ask: does your source write temp/lock files?
  • Atomic file moves (write .tmp, rename) prevent partial-trigger issues.
  • Downstream jobs must validate content, not assume the watcher got it right.
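
As one example of downstream content validation, the sketch below rejects empty and header-only CSV files before processing starts. The helper name has_data_rows is hypothetical, and the rule it applies (one header line plus at least one data line) is an assumption about the feed format.

```shell
#!/usr/bin/env bash
# Hypothetical downstream guard: require a header plus at least one data row.

# has_data_rows <csv_file>
# Returns 0 when the file is non-empty and has two or more lines, else 1.
has_data_rows() {
    local file="$1" lines
    [ -s "$file" ] || return 1          # missing or zero bytes

    # grep -c '' counts lines, including a final line with no newline.
    lines=$(grep -c '' "$file")
    [ "$lines" -ge 2 ]
}
```

Calling has_data_rows "$file" || exit 1 at the top of the processing script turns an empty placeholder into a clear job failure, rather than a confusing 'no data found' error further down the pipeline.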
Production debug guide: when your watcher doesn't watch, or watches too much
Symptom · 01
File Watcher stays RUNNING even though file exists
Fix
Check min_file_size. If file is 512 bytes and min_file_size is 1024, watcher won't trigger. Check file is on correct machine (ls on agent host). Check run_window — current time must be inside window.
Symptom · 02
File Watcher triggered on empty or partial file
Fix
Check min_file_size. If 0 or omitted, any file triggers. Set to at least 1024. Also check if upstream writes temp files first — adjust wildcard to exclude *_tmp patterns.
Symptom · 03
File Watcher triggered on old file from yesterday
Fix
run_window not set or too wide. Watcher triggers on ANY matching file when it starts, regardless of age. Set run_window to expected arrival window. Add date-specific subdirectory if possible.
Symptom · 04
File Watcher triggers on wrong file (multiple matches)
Fix
Wildcard too broad. watch_file: /data/*.csv triggers on every CSV. Use specific pattern: /data/FEED_*.csv or include date: /data/20260319_*.csv.
File Watcher: 60-Second Diagnosis. When files exist but the watcher doesn't trigger, or triggers on the wrong file
Watcher stuck RUNNING, file exists
Immediate action
Check file size vs min_file_size
Commands
ls -la /path/to/watch_file/*.csv
autorep -J FW_JOB -q | grep min_file_size
Fix now
update_job: FW_JOB min_file_size: 1024
Watcher triggered on empty file
Immediate action
Find the empty file in watch directory
Commands
find /path/to/watch_dir -size 0 -name '*.csv'
autorep -J FW_JOB -q | grep min_file_size
Fix now
update_job: FW_JOB min_file_size: 1024
Watcher triggered on stale file
Immediate action
Check run_window and file timestamps
Commands
autorep -J FW_JOB -q | grep run_window
ls -la /path/to/watch_dir/ | grep 'Mar 18'
Fix now
update_job: FW_JOB run_window: "08:00 - 20:00"
Watcher not triggering: no file found
Immediate action
Verify file path on correct machine
Commands
ssh agent_host 'ls -la /path/to/watch_file/'
autoping -m agent_host
Fix now
Check JIL machine attribute matches where file actually lands
File Watcher Attributes — Quick Reference
Attribute | What it does | Default if omitted | Production recommendation
watch_file | File path pattern to watch for | Required, no default | Use specific patterns with date stamps; avoid /* alone
watch_interval | How often to check (seconds) | 60 seconds | 30-60 seconds for most cases; lower means more agent load
min_file_size | Minimum file size before success | 0 (any size, including empty) | 1024 bytes minimum, 10240 for data files; never 0
run_window | Hours during which watcher is active | No restriction (24/7) | Match expected file arrival window + 2 hours buffer
machine | Agent machine where file is located | Required, no default | Must match the server where file physically lands

Key takeaways

1. File Watcher = event-driven trigger on file arrival: perfect for unpredictable upstream schedules.
2. min_file_size default 0 is dangerous. Set to 1024 bytes minimum. Never leave at 0 in production.
3. run_window restricts active hours but doesn't filter by file age. Stale files trigger at window open.
4. Wildcards trigger on first match only. Multiple simultaneous files? Use a manifest pattern.
5. Watcher detects only: downstream must consume (move/delete). Otherwise the same file triggers again on the next run.
6. Always verify file existence on the agent machine: ssh agent_host 'ls -la'. Your workstation means nothing.

Common mistakes to avoid


Setting min_file_size: 0 or omitting it

Symptom
File Watcher triggers on empty or partially-written files. Downstream jobs fail with 'file empty' or 'no data' errors. The real file arrives seconds later, but the watcher has already succeeded and won't trigger again.
Fix
Always set min_file_size at least 1024 (1KB). For data files expected to be large, set 10240 (10KB) or higher. Never leave it at 0.

Not setting run_window

Symptom
File Watcher triggers on stale files from previous days when it starts up. A file from Friday triggers on Monday morning, causing downstream jobs to process outdated data.
Fix
Set run_window to match expected file arrival window. Use date-stamped watch_file paths to filter by date. Add downstream file age validation.

Using overly broad wildcards (e.g., /data/*)

Symptom
Watcher triggers on temp files, lock files, or unrelated files. Multiple files arrive simultaneously; watcher picks one and ignores others.
Fix
Be specific. Use /data/FEED_*.csv not /data/*.csv. Work with upstream to use atomic file naming (.tmp → .ready).

Pointing watch_file to a different machine than where the file lands

Symptom
File exists, watcher stays RUNNING forever. Teams waste hours checking file permissions, sizes, and wildcards. The file is on the wrong server entirely.
Fix
Always verify file existence with ssh agent_host 'ls -la /path/to/file'. The agent on that machine does the local check. Your workstation's filesystem is irrelevant.

Assuming File Watcher deletes or moves the file

Symptom
Watcher succeeds. Downstream runs. Next day, the same file is still in the directory. The watcher triggers again on the same file, causing duplicate processing.
Fix
Downstream jobs must move, rename, or delete the file after processing. The watcher only detects — it doesn't consume.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (Junior): What is an AutoSys File Watcher job and when would you use one?
Q02 (Junior): What does the min_file_size attribute do on a File Watcher?
Q03 (Senior): What does run_window control on a File Watcher?
Q04 (Senior): Your File Watcher triggered prematurely on an incomplete file, what are...
Q05 (Senior): Can a File Watcher job detect files on a remote server?
Q06 (Senior): How does a File Watcher handle multiple files arriving simultaneously?
Q01 of 06 (Junior): What is an AutoSys File Watcher job and when would you use one?

Answer
A File Watcher (job_type: FW) monitors a file path pattern on an agent machine. When a matching file appears and meets min_file_size, the job completes with SUCCESS and triggers dependent jobs.

Use cases:
  • Processing files from external systems with unpredictable arrival times (trading files, bank feeds)
  • Event-driven pipelines where polling would be inefficient
  • Orchestrating workflows that start when a file is ready

Avoid File Watchers when: files arrive at predictable times (use scheduled jobs), you need to check file content before triggering (use CMD job), or multiple files need to be batched together (use manifest pattern).
Frequently Asked Questions

01. What is a File Watcher job in AutoSys?
02. What is min_file_size in a File Watcher?
03. Why is my File Watcher triggering on old files?
04. Can a File Watcher watch for a file on a remote machine?
05. What happens if the file never arrives within the run_window?
06. Does a File Watcher delete or move the file after triggering?