File Watcher (job_type: FW) triggers downstream jobs the moment a file arrives — event-driven, not time-based
watch_file supports wildcards. Be specific: /data/*.csv triggers on every CSV, including partial writes
min_file_size prevents empty or partial file triggers. Set to 1KB minimum — never leave at 0
run_window restricts active hours. Without it, stale files from last week trigger immediately on restart
Production trap: upstream writes temp file then renames. Your wildcard matches the temp file. Trigger on incomplete data.
Plain-English First
A File Watcher job is like a security guard sitting at the loading dock. When the truck arrives and the package (the file) is fully unloaded (reaches the minimum size), the guard calls the warehouse (triggers the next job) to start processing.
You can't schedule a job at a fixed time when you don't control when the data arrives. That's the problem File Watcher solves.
A bank sends trade files anytime between 8 AM and 6 PM. You want processing to start the moment the file lands — not poll every 5 minutes and definitely not guess a start time.
But here's what bites people: empty files trigger the watcher. Stale files from yesterday trigger the watcher. The wrong wildcard matches a temp file mid-write. This article covers the production traps that monitoring won't catch.
Defining a File Watcher job
A File Watcher job has job_type: FW (or just 'f'). The key attribute is watch_file — the full path of the file to watch for. The job completes with SUCCESS the moment it finds a file matching the pattern AND the file meets the min_file_size requirement.
Here's the non-obvious part: the File Watcher doesn't 'consume' the file. It just detects it. The file remains on disk. Downstream jobs are responsible for reading, moving, or deleting it. Pointing multiple watchers at the same file sets up a race: whichever fires first gets its downstream job to the file, and the others can be left waiting.
file_watcher.jil
insert_job: watch_settlement_file
job_type: FW
watch_file: /data/inbound/settlement_*.csv
watch_interval: 60 /* check every 60 seconds */
min_file_size: 512 /* at least 512 bytes (not empty) */
machine: file-landing-server
owner: batchuser
date_conditions: 1
days_of_week: mon-fri
run_window: "07:00 - 23:00" /* only watch during this window */
alarm_if_fail: 1
description: "Watches for daily settlement file from clearing house"
insert_job: process_settlement
job_type: CMD
command: /opt/scripts/process_settlement.sh
machine: processing-server-01
owner: batchuser
condition: success(watch_settlement_file)
Watcher doesn't delete, move, or read the file. It just reports existence.
Multiple watchers on the same file cause a race. First trigger wins.
The file is still there after the watcher succeeds. Your downstream job must handle it.
If downstream fails, the file remains. The watcher has already succeeded for this run and won't fire again until its next run, at which point the same file will match again.
Production Insight
A team had two File Watchers watching the same directory pattern. One triggered on file arrival and succeeded, and its downstream job moved the file to /processed. The second watcher was still RUNNING, waiting for the same file pattern, but the file had already been moved. It stayed RUNNING until it was manually terminated.
Diagnosis: autorep showed second watcher as RUNNING. The file didn't exist anymore. No error, just stuck.
Fix: Each watcher needs a unique watch pattern or the files must be distinct. Or use a box with a single watcher and fan-out dependencies.
Rule: One watcher per file pattern. If you need multiple downstream jobs, use condition dependencies from the single watcher.
Key Takeaway
FW job = file arrival detector, not file processor.
Downstream jobs must handle file consumption — read, move, or delete.
One watcher per file pattern. Multiple watchers cause race conditions.
Watcher succeeds once per file. After success, the file stays.
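The consumption step can be sketched as a small shell function for the downstream CMD job. This is a minimal sketch, not the article's actual script: process_one is a hypothetical placeholder for the real processing logic, and the processed directory is an assumed location.

```shell
#!/usr/bin/env bash
# Sketch: consume a file after the watcher has fired.

# Hypothetical placeholder for the real processing logic.
process_one() {
    wc -l < "$1" > /dev/null
}

# consume_file <file> <processed_dir>
# Processes the file, then moves it out of the watched directory so the
# next watcher run cannot match it again. On failure the file stays put.
consume_file() {
    local file="$1" processed_dir="$2"
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    mkdir -p "$processed_dir" || return 1
    process_one "$file" || { echo "processing failed: $file" >&2; return 1; }
    mv -- "$file" "$processed_dir/" || return 1
}
```

Moving the file only after processing succeeds means a failed run leaves the input in place for a rerun.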
Should you use a File Watcher or a scheduled CMD job with file check?
If file arrival time is unpredictable (±4 hour window) → use a File Watcher. Scheduled polling wastes cycles and adds latency.
If the file arrives within a predictable 30-minute window → either works. A scheduled job with a file check is simpler to debug.
If you need to check file content before triggering → use a CMD job with full validation. The watcher only checks size and existence.
If files arrive multiple times per day with the same pattern → a File Watcher triggers once per file. For multiple files, use a box with running conditions.
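For comparison, the "scheduled CMD job with file check" option could look roughly like the following. It is a sketch under assumptions: wait_for_file is a hypothetical helper, it takes a literal path rather than a wildcard, and it treats any non-empty file as ready.

```shell
#!/usr/bin/env bash
# Sketch: poll for a file from a scheduled CMD job.

# wait_for_file <path> <timeout_seconds> <interval_seconds>
# Returns 0 as soon as <path> exists and is non-empty, 1 on timeout.
wait_for_file() {
    local path="$1" timeout="$2" interval="$3" waited=0
    while [ "$waited" -le "$timeout" ]; do
        [ -s "$path" ] && return 0
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}
```

A call such as wait_for_file /data/inbound/feed.csv 1800 60 would cover a 30-minute arrival window, at the cost of up to one interval of latency.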
[Figure: File Watcher Trigger Flow]
Understanding watch_file wildcards
watch_file supports the * wildcard, which matches any sequence of characters in a filename. The File Watcher triggers as soon as any file matching the pattern appears. This is especially useful when upstream systems include a date in the filename.
The wildcard matches only the filename, not subdirectories. /data/*/file.csv doesn't work. Use separate watchers for different subdirectories.
Critical nuance: The watcher triggers on the FIRST file matching the pattern. If multiple files arrive simultaneously, the watcher triggers on one, succeeds, and the other files are never detected by that watcher instance. Use atomic file naming to control which file triggers.
wildcard_examples.jil
/* Match any CSV file starting with 'trades_' */
watch_file: /data/inbound/trades_*.csv
/* Match any file in the directory (dangerous — matches everything) */
watch_file: /data/inbound/*
/* Match files with a date-stamped name pattern */
watch_file: /data/feeds/FEED_*_DONE.txt
/* Be specific to avoid triggering on partial/temp files */
watch_file: /data/inbound/FINAL_trades_*.csv /* not temp_trades_*.csv */
/* Safe pattern for atomic writes: write to .tmp, rename to .ready */
/* Watcher watches .ready files only */
watch_file: /data/inbound/*.ready
Wildcards trigger on the first match only
If watch_file matches multiple files (e.g., /data/*.csv and three CSVs arrive at once), the watcher triggers on whichever file the filesystem returns first. The other files are not detected. Use unique patterns or single-file-per-trigger designs.
Production Insight
A data pipeline dropped 15 partition files into a directory simultaneously. One File Watcher with pattern /data/partition_*.parquet triggered on partition_03.parquet, succeeded, and processing began. The other 14 files sat unprocessed until the next day's batch.
The team assumed the watcher would queue multiple triggers. It doesn't.
Fix: Changed upstream to write a single manifest file after all partitions are ready. The watcher watches the manifest. Downstream reads all partition files.
Alternative: Use a box with a watcher and a command job that loops through all matching files.
Rule: File Watcher = one trigger per run. If you need multiple files detected, redesign the file pattern or upstream write pattern.
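The manifest fix from this incident could be backed by a downstream check like the one below. It is a sketch under an assumed manifest format (one data filename per line); the function name and the format are illustrative, not something the original team is known to have used.

```shell
#!/usr/bin/env bash
# Sketch: verify every file listed in a manifest before processing.

# manifest_complete <manifest_file> <data_dir>
# Assumes the manifest lists one data filename per line.
# Returns 0 if every listed file exists and is non-empty, 1 otherwise,
# and 2 if the manifest itself is missing.
manifest_complete() {
    local manifest="$1" data_dir="$2" name missing=0
    [ -f "$manifest" ] || return 2
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue   # skip blank lines
        if [ ! -s "$data_dir/$name" ]; then
            echo "missing or empty: $name" >&2
            missing=$((missing + 1))
        fi
    done < "$manifest"
    [ "$missing" -eq 0 ]
}
```

Running this first in the downstream job turns a half-delivered batch into an explicit failure instead of a silent partial load.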
Key Takeaway
* matches any characters in filename, not subdirectories.
First matching file wins. Later files are ignored by that watcher run.
Use atomic file naming: write to .tmp, rename to .ready.
For multiple files, use a manifest file pattern.
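On the upstream side, the .tmp-then-rename pattern can be sketched as follows. The function name is hypothetical, and the sketch assumes the temp file and the final file live on the same filesystem, which is what makes the rename atomic.

```shell
#!/usr/bin/env bash
# Sketch: publish a file so a watcher on *.ready never sees a partial write.

# publish_atomically <final_path>
# Reads data from stdin into <final_path>.tmp, then renames it to <final_path>.
# A watcher on *.ready does not match the intermediate *.ready.tmp name.
publish_atomically() {
    local final="$1" tmp="$1.tmp"
    cat > "$tmp" || { rm -f -- "$tmp"; return 1; }
    mv -- "$tmp" "$final"
}
```

For example, an export command piped into publish_atomically /data/inbound/trades.ready makes the .ready name appear only once the full content is on disk.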
run_window — limiting when the watcher is active
Without run_window, a File Watcher with date_conditions: 0 runs continuously 24/7. That's usually not what you want — if a stale file from last week is still in the directory when the watcher starts, it triggers immediately on the old file.
run_window restricts the hours during which the File Watcher will detect the file. Outside the window, it won't trigger even if the file is there. When the window opens again, the watcher checks for files — if an old file is still present, it will trigger at window open.
Critical: run_window does NOT prevent the watcher from seeing old files when the window opens. It only restricts when detection is active. File age is not considered.
run_window.jil
/* Only watch for the file between 6AM and 8PM */
insert_job: watch_eod_file
job_type: FW
watch_file: /data/eod/eod_positions_*.dat
watch_interval: 30
min_file_size: 10240
machine: data-server-01
owner: batchuser
run_window: "06:00 - 20:00" /* won't trigger outside these hours */
alarm_if_fail: 1
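/* To avoid stale files at window open, date-stamp the watch path. */
/* This line replaces the watch_file above; it is not a second attribute. */
/* Confirm how your AutoSys release expands the date token before relying on it. */
watch_file: /data/eod/eod_positions_$(date +%Y%m%d).dat
/* Or use a wrapper CMD job that checks the file timestamp before processing */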
run_window doesn't filter by file age
When the window opens at 8 AM, if a file from yesterday still exists, the watcher triggers immediately on the stale file. Use date-stamped file names or a downstream age check to prevent this.
Production Insight
A Friday night file arrived at 11 PM, just before the weekend. The File Watcher had run_window: "09:00 - 17:00" (business hours only). The file sat in the directory all weekend. Monday at 9:00 AM, the watcher started, saw the file, and triggered immediately. Downstream jobs processed Friday's data on Monday morning, overwriting Monday's expected data.
Recovery: restore from backup, reprocess Friday correctly, mark Monday as errored.
Fix: Use date-stamped watch_file paths: watch_file: /data/feed_$(date +%Y%m%d).csv. The watcher only matches today's date. Stale files from other dates are ignored.
Alternative: Downstream job checks file modification time. If > 12 hours old, fails with alert instead of processing stale data.
Rule: run_window controls WHEN detection happens. Date-stamped paths or age checks control WHICH files are valid.
Key Takeaway
run_window restricts active hours — no detection outside window.
At window open, the watcher sees ALL matching files, including stale ones.
Use date-stamped watch paths or downstream age checks for freshness.
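A downstream age check might look like the sketch below. The helper name is hypothetical, and it assumes a find implementation that supports -mmin (GNU and BSD find do, though the option is not in POSIX).

```shell
#!/usr/bin/env bash
# Sketch: refuse to process a file that is older than expected.

# is_fresh <file> <max_age_minutes>
# Returns 0 if the file was modified less than <max_age_minutes> ago,
# 1 if it is older, 2 if it does not exist.
is_fresh() {
    local file="$1" max_age_min="$2"
    [ -f "$file" ] || return 2
    # find prints the path only when the mtime falls inside the window.
    [ -n "$(find "$file" -mmin "-$max_age_min" -print)" ]
}
```

With a 12-hour limit, is_fresh "$file" 720 || exit 1 at the top of the downstream script would have turned the Monday-morning incident into an alert instead of a bad load.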
Common File Watcher problems and how to diagnose them:
Problem: File Watcher stays RUNNING after file arrived Check: Is the file smaller than min_file_size? Is the file on the machine specified in the machine attribute? Is the agent on that machine running? Is current time inside run_window?
Problem: File Watcher triggered on wrong file Check: Is your wildcard pattern too broad? Did an old file match? Did a temp file match? Use autorep -J FW_JOB -q to see the exact watch_file pattern.
Problem: File Watcher didn't trigger at all Check: Is the current time within run_window? Is date_conditions set correctly (must be 1 if using start_times)? Is the file in the exact path specified (case-sensitive on Linux)?
Problem: File Watcher triggers but downstream fails saying file is empty Check: min_file_size likely too low. Upstream may have written a zero-byte lock file first.
fw_troubleshoot.sh
# Check File Watcher job status
autorep -J watch_settlement_file -d
# Check the exact watch_file attribute set
autorep -J watch_settlement_file -q
# Verify the file actually exists on the target machine
ssh file-landing-server 'ls -la /data/inbound/settlement_*.csv'
# Check agent is running on the watcher machine
autoping -m file-landing-server
# Check current autosys time (in case timezone matters)
autoflags -a | grep -i time
# Check if file is below min_file_size
ssh file-landing-server 'wc -c /data/inbound/settlement_*.csv'
The autorep -q command shows the exact watch_file pattern
autorep -J FW_JOB -q prints the job definition as stored in AutoSys, including the exact watch_file value. Always check this before debugging; the pattern may not be what you think.
Production Insight
A team spent 4 hours debugging a File Watcher that wouldn't trigger. The watch_file was /data/inbound/*_feed.csv. A file named ABC_feed.csv existed in the directory. Everything looked correct.
The problem: The file was on a different machine than the machine attribute. The agent on machineA was checking its local /data/inbound directory. The file was on machineB.
Diagnosis would have taken 30 seconds: ssh machineA 'ls -la /data/inbound/*_feed.csv' returned nothing.
Rule: Always verify file existence on the EXACT machine specified in the job's machine attribute. 'ls' on your workstation proves nothing.
Key Takeaway
Stuck RUNNING? Check min_file_size and file existence on the right machine.
Wrong trigger? Wildcard too broad or stale file in directory.
No trigger? Check run_window and date_conditions.
Always verify file path on the agent machine — not your local workstation.
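The size comparison behind a stuck RUNNING watcher can be reproduced by hand on the agent host. The helper below is a hypothetical sketch that applies the same rule as min_file_size, as a size-in-bytes threshold; it is not part of AutoSys.

```shell
#!/usr/bin/env bash
# Sketch: check whether a file would satisfy a given min_file_size.

# meets_min_size <file> <min_bytes>
# Returns 0 if the file is at least <min_bytes> bytes, 1 if smaller,
# 2 if the file does not exist.
meets_min_size() {
    local file="$1" min_bytes="$2" size
    [ -f "$file" ] || return 2
    # Arithmetic expansion strips any padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge "$min_bytes" ]
}
```

Run it over ssh on the machine named in the job's machine attribute, since that is the filesystem the agent actually inspects.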
Production incident · Post-mortem · Severity: high
The Empty File That Crashed the Data Warehouse
Symptom
File Watcher job shows SUCCESS. Downstream processing job fails immediately with 'no data found' or 'file empty' errors. The actual data file exists in the directory with correct size and timestamp, created seconds after the watcher triggered.
Assumption
The team assumed min_file_size: 0 was safe — 'the file will have data when it arrives'. They didn't know the upstream system wrote a 0-byte placeholder before the real file.
Root cause
min_file_size: 0 (or omitted) means ANY file, including empty ones, triggers the watcher. Upstream systems often write lock files, temp files, or zero-byte placeholders before writing actual data. The File Watcher can't distinguish intent — if a file matches watch_file and meets min_file_size (0 bytes qualifies), it triggers immediately.
The fix seems obvious in hindsight, but teams discover it only after the first empty-file incident.
Fix
1. Set min_file_size: 1024 (1KB) minimum. Real data files are rarely smaller.
2. For CSV/JSON files, watch out for header-only files. Set min_file_size above the size of a header-only file, for example 10240 (10KB) when headers are small.
3. Add downstream validation that checks file content, not just existence.
4. Work with upstream teams to use atomic file moves: write to .tmp, then rename to .csv.
Key lesson
Never set min_file_size: 0. 1KB minimum for text files, 10KB for data files.
Upstream write patterns matter. Ask: does your source write temp/lock files?
Downstream jobs must validate content, not assume the watcher got it right.
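Content validation in the downstream job could start with something as small as the sketch below. The function name and the expected-header argument are assumptions for illustration; the check only covers a header line plus at least one data row and does not handle CRLF line endings.

```shell
#!/usr/bin/env bash
# Sketch: validate a CSV before processing it.

# validate_csv <file> <expected_header>
# Returns 0 only if the first line equals <expected_header> and at least
# one data row follows it.
validate_csv() {
    local file="$1" expected_header="$2" header rows
    [ -s "$file" ] || { echo "empty or missing: $file" >&2; return 1; }
    IFS= read -r header < "$file"
    [ "$header" = "$expected_header" ] || { echo "unexpected header" >&2; return 1; }
    rows=$(( $(wc -l < "$file") - 1 ))
    [ "$rows" -ge 1 ] || { echo "header only, no data rows" >&2; return 1; }
}
```

A zero-byte placeholder fails the first check, and a header-only file fails the last one, so both cases from this incident stop before any data is loaded.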
Production debug guide: when your watcher doesn't watch, or watches too much
Symptom 01: File Watcher stays RUNNING even though file exists
Fix: Check min_file_size. If file is 512 bytes and min_file_size is 1024, watcher won't trigger. Check file is on correct machine (ls on agent host). Check run_window: current time must be inside window.
Symptom 02: File Watcher triggered on empty or partial file
Fix: Check min_file_size. If 0 or omitted, any file triggers. Set to at least 1024. Also check if upstream writes temp files first, and adjust wildcard to exclude *_tmp patterns.
Symptom 03: File Watcher triggered on old file from yesterday
Fix: run_window not set or too wide. Watcher triggers on ANY matching file when it starts, regardless of age. Set run_window to expected arrival window. Add date-specific subdirectory if possible.
Symptom 04: File Watcher triggers on wrong file (multiple matches)
Fix: Wildcard too broad. watch_file: /data/*.csv triggers on every CSV. Use specific pattern: /data/FEED_*.csv or include date: /data/20260319_*.csv.
File Watcher: 60-Second Diagnosis
When files exist but the watcher doesn't trigger, or triggers on the wrong file
Watcher stuck RUNNING, file exists
Immediate action: Check file size vs min_file_size
Commands:
ls -la /path/to/watch_file/*.csv
autorep -J FW_JOB -q | grep min_file_size
Fix now:
update_job: FW_JOB
min_file_size: 1024
Watcher triggered on empty file
Immediate action: Find the empty file in watch directory
Commands:
find /path/to/watch_dir -size 0 -name '*.csv'
autorep -J FW_JOB -q | grep min_file_size
Fix now:
update_job: FW_JOB
min_file_size: 1024
Watcher triggered on stale file
Immediate action: Check run_window and file timestamps
Commands:
autorep -J FW_JOB -q | grep run_window
ls -la /path/to/watch_dir/ | grep 'Mar 18'
Fix now:
update_job: FW_JOB
run_window: "08:00 - 20:00"
Watcher not triggering, no file found
Immediate action: Verify file path on correct machine
Commands:
ssh agent_host 'ls -la /path/to/watch_file/'
autoping -m agent_host
Fix now:
Check JIL machine attribute matches where file actually lands
File Watcher Attributes — Quick Reference
| Attribute | What it does | Default if omitted | Production recommendation |
| --- | --- | --- | --- |
| watch_file | File path pattern to watch for | Required, no default | Use specific patterns with date stamps, avoid /* alone |
| watch_interval | How often to check (seconds) | 60 seconds | 30-60 seconds for most cases; lower = more agent load |
| min_file_size | Minimum file size before success | 0 (any size, including empty) | 1024 bytes minimum, 10240 for data files; never 0 |
| run_window | Hours during which watcher is active | No restriction (24/7) | Match expected file arrival window + 2 hours buffer |
| machine | Agent machine where file is located | Required, no default | Must match the server where file physically lands |
Key takeaways
1. File Watcher = event-driven trigger on file arrival: perfect for unpredictable upstream schedules.
2. min_file_size default 0 is dangerous. Set to 1024 bytes minimum. Never leave at 0 in production.
3. run_window restricts active hours but doesn't filter by file age. Stale files trigger at window open.
4. Wildcards trigger on first match only. Multiple simultaneous files? Use a manifest pattern.
5. Watcher detects only: downstream must consume (move/delete). Otherwise same file triggers repeatedly.
6. Always verify file existence on the agent machine: ssh agent_host 'ls -la'. Your workstation means nothing.
Common mistakes to avoid
1. Setting min_file_size: 0 or omitting it
Symptom: File Watcher triggers on empty or partially-written files. Downstream jobs fail with 'file empty' or 'no data' errors. The real file arrives seconds later, but the watcher has already succeeded and won't trigger again.
Fix: Always set min_file_size at least 1024 (1KB). For data files expected to be large, set 10240 (10KB) or higher. Never leave it at 0.
2. Not setting run_window
Symptom: File Watcher triggers on stale files from previous days when it starts up. A file from Friday triggers on Monday morning, causing downstream jobs to process outdated data.
Fix: Set run_window to match expected file arrival window. Use date-stamped watch_file paths to filter by date. Add downstream file age validation.
3. Using overly broad wildcards (e.g., /data/*)
Symptom: Watcher triggers on temp files, lock files, or unrelated files. Multiple files arrive simultaneously; watcher picks one and ignores others.
Fix: Be specific. Use /data/FEED_*.csv not /data/*.csv. Work with upstream to use atomic file naming (.tmp → .ready).
4. Pointing watch_file to a different machine than where the file lands
Symptom: File exists, watcher stays RUNNING forever. Teams waste hours checking file permissions, sizes, and wildcards. The file is on the wrong server entirely.
Fix: Always verify file existence with ssh agent_host 'ls -la /path/to/file'. The agent on that machine does the local check. Your workstation's filesystem is irrelevant.
5. Assuming File Watcher deletes or moves the file
Symptom: Watcher succeeds. Downstream runs. Next day, the same file is still in the directory. The watcher triggers again on the same file, causing duplicate processing.
Fix: Downstream jobs must move, rename, or delete the file after processing. The watcher only detects; it doesn't consume.
Interview Questions on This Topic
Q01 of 06 · Junior
What is an AutoSys File Watcher job and when would you use one?
ANSWER
A File Watcher (job_type: FW) monitors a file path pattern on an agent machine. When a matching file appears and meets min_file_size, the job completes with SUCCESS and triggers dependent jobs.
Use cases:
- Processing files from external systems with unpredictable arrival times (trading files, bank feeds)
- Event-driven pipelines where polling would be inefficient
- Orchestrating workflows that start when a file is ready
Avoid File Watchers when: files arrive at predictable times (use scheduled jobs), you need to check file content before triggering (use CMD job), or multiple files need to be batched together (use manifest pattern).
Q02 of 06 · Junior
What does the min_file_size attribute do on a File Watcher?
ANSWER
min_file_size specifies the minimum file size in bytes that a file must reach before the File Watcher declares SUCCESS. Default is 0 (any file, including empty).
Why it matters: Upstream systems often write lock files, temp files, or zero-byte placeholders before writing the real data. min_file_size: 0 would trigger on these empty files, causing downstream jobs to run on incomplete or missing data.
Recommendation: Set min_file_size to at least 1024 (1KB) for text files, 10240 (10KB) for data files. Never leave it at 0 in production.
Q03 of 06 · Senior
What does run_window control on a File Watcher?
ANSWER
run_window restricts the hours during which the File Watcher actively checks for files. Outside the window, the watcher does not detect files even if they exist. When the window opens again, the watcher checks for files and triggers on any matching files present.
Critical nuance: run_window does NOT filter by file age. If a stale file from yesterday is still present when the window opens, the watcher triggers immediately on the stale file.
Best practice: Combine run_window with date-stamped watch_file paths (e.g., /data/feed_$(date +%Y%m%d).csv) or downstream file age validation to prevent stale-file triggers.
Q04 of 06 · Senior
Your File Watcher triggered prematurely on an incomplete file — what are the possible causes and fixes?
ANSWER
Causes:
1. min_file_size set to 0 or too low — watcher triggers on empty or partial file
2. Upstream writes temp/lock files before real data — wildcard matches temp file pattern
3. Upstream doesn't use atomic file moves — file becomes visible before write completes
Fixes:
- Set min_file_size to at least 1024 bytes for text files, 10240 for data files
- Adjust wildcard to exclude temp patterns: watch_file: /data/FINAL_*.csv instead of /data/*.csv
- Work with upstream to use atomic writes: write to .tmp, rename to .ready when complete
- Add downstream validation that checks file content (row count, header) before processing
Prevention: Test with actual upstream write patterns. Ask: does your source write lock files? Do they use rename or direct write?
Q05 of 06 · Senior
Can a File Watcher job detect files on a remote server?
ANSWER
The File Watcher checks for files locally on the machine specified in the machine attribute. That machine must have an AutoSys agent installed. So effectively, yes — the watcher runs on that remote machine via its agent.
If the file is on serverA and your File Watcher has machine: serverB, it will never find the file. The agent on serverB checks serverB's local filesystem.
Debugging: Always verify file existence with 'ssh agent_host ls -la /path/to/file' before checking the watcher configuration.
Q06 of 06 · Senior
How does a File Watcher handle multiple files arriving simultaneously?
ANSWER
The File Watcher triggers on the FIRST file matching watch_file that the filesystem returns during its scan. It does NOT queue or batch multiple files. After triggering, the watcher succeeds and stops. The remaining files are NOT detected by that watcher instance.
Solutions:
- Use a manifest file: Upstream writes one 'ready' file after all data files are present. Watcher triggers on manifest. Downstream reads all data files.
- Use a box with a watcher and a loop: Watcher triggers once, then a CMD job processes all matching files in a loop.
- Use date-stamped unique filenames: Each file has a unique timestamp. Watcher triggers per file, but you need separate watcher runs (requires restart or dateless boxes).
Common mistake: Assuming watcher will trigger multiple times. It won't. Design your file layout accordingly.
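The "watcher plus loop" option can be sketched as the CMD job below, which runs once after the watcher fires and handles every file matching the pattern. Both function names are hypothetical, handle_one stands in for the real per-file work, and the pattern is assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch: process every matching file after a single watcher trigger.

# Hypothetical placeholder for the real per-file processing.
handle_one() {
    printf 'processing %s\n' "$1" >&2
}

# process_all_matching <dir> <glob_pattern>
# Handles each regular file matching the pattern and prints how many
# files were processed. Stops at the first failure.
process_all_matching() {
    local dir="$1" pattern="$2" count=0 f
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for f in "$dir"/$pattern; do
        [ -f "$f" ] || continue   # skips the literal pattern when nothing matches
        handle_one "$f" || return 1
        count=$((count + 1))
    done
    echo "$count"
}
```

This keeps the watcher as a single trigger while the loop takes responsibility for the files the watcher never reported.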
Frequently Asked Questions
01. What is a File Watcher job in AutoSys?
A File Watcher job (job_type: FW) monitors a specified file path pattern on an agent machine. When a matching file appears and meets the minimum size requirement, the job completes with SUCCESS and triggers any dependent jobs.
It's event-driven: processing starts when the file arrives, not at a fixed schedule time.
02. What is min_file_size in a File Watcher?
min_file_size specifies the minimum file size in bytes that a file must reach before the File Watcher declares SUCCESS. This prevents the watcher from triggering on empty or partially-written files.
Default is 0 (any file). Always set a value larger than what an empty or header-only file would produce — 1024 bytes minimum, 10240 for data files.
03. Why is my File Watcher triggering on old files?
If a file from a previous day matches your watch_file pattern and run_window isn't set, or the window just opened, the watcher triggers on any matching file regardless of age.
Fix: Use date-stamped watch_file paths (watch_file: /data/feed_$(date +%Y%m%d).csv). Or add downstream file age validation that fails if the file is older than expected.
04. Can a File Watcher watch for a file on a remote machine?
The File Watcher checks for files locally on the machine specified in the machine attribute. That machine must have an AutoSys agent installed. So effectively, yes — the watcher runs on that remote machine via its agent.
Common mistake: The file lands on serverA but the watcher has machine: serverB. The agent on serverB checks its local filesystem and never finds the file.
05. What happens if the file never arrives within the run_window?
If the run_window ends without the file arriving, the File Watcher job moves to FAILURE status (if date_conditions: 1 and no other start conditions). This triggers any alarm_if_fail alert.
The downstream dependent jobs will not run until the next scheduling cycle when the watcher is re-triggered (next day, or manually via sendevent).
06. Does a File Watcher delete or move the file after triggering?
No. The File Watcher only detects the file's existence. It does NOT delete, move, rename, or read the file. The file remains on disk exactly as it was.
Your downstream CMD job must handle file consumption — reading the data, moving the file to /processed, or deleting it after successful processing. Otherwise, the same file will trigger the watcher again on the next run (if the watcher restarts).