Cron Jobs in Linux — Why Your Backup Script Ran Twice
Cron fires jobs even if previous runs are still active.
- Cron is a time-based job scheduler that runs commands at specified intervals using a daemon (crond).
- Five-field syntax: minute hour day-of-month month day-of-week — each field accepts numbers, ranges, steps, or wildcards.
- Crontab files hold job definitions: user crontabs (crontab -e) and system crontabs (/etc/crontab, with an extra user field).
- Key pitfall: cron runs jobs with a minimal environment; never assume PATH or other environment variables are set unless you export them explicitly.
- Performance insight: cron wakes every minute to check schedules, so the scheduling overhead per job is negligible, but overlapping runs can corrupt data.
- Production insight: silent failures dominate; without logging redirection (>> log 2>&1), a crashed job produces zero feedback.
Imagine you set a reminder on your phone to water your plants every Monday at 8am. You don't have to remember it — your phone just does it automatically, every single week, even while you're asleep. Cron is exactly that, but for your Linux server. You tell it 'run this script at 2am every night', and it does it forever without you lifting a finger. That's it. A tireless, never-forgetting robot assistant built into every Linux system.
Every production system has a graveyard of tasks that someone used to do manually — rotating log files, sending weekly reports, backing up databases, clearing temp folders. Done by hand, these tasks get forgotten, delayed, or skipped on holidays. Done wrong, they take down services. Cron is the Linux answer to this problem: a built-in scheduler that's been running quietly on Unix systems since 1975, and still powers millions of automated workflows today.
The core problem cron solves is reliability. Humans forget. Cron doesn't. If you need something to happen at a predictable time — daily, hourly, every 15 minutes, or at 3:47am on the last day of the month — cron handles it without a process manager, without a paid SaaS tool, and without a single line of application code. It lives at the OS level, which means it works regardless of what language your app is written in or whether your app is even running.
By the end of this article you'll be able to write cron expressions confidently, manage crontab files without breaking things, debug jobs that silently fail, and apply the real-world patterns that DevOps engineers actually use in production. You'll also know the three mistakes that catch almost everyone the first time they use cron — and exactly how to avoid them.
How Cron Actually Works — The Daemon, the Crontab, and the Schedule
Cron is a daemon — a background process that starts when your system boots and never stops. Its name comes from 'chronos', the Greek word for time. Every minute, the cron daemon wakes up, checks all the crontab files on the system, and asks: 'Is there anything I should run right now?' If yes, it fires the job. Then it goes back to sleep until the next minute.
A crontab (cron table) is just a plain text file that lists scheduled jobs. Each line in a crontab is one job: five time fields followed by the command to run. You never edit this file directly — you use the crontab command, which validates the format before saving, protecting you from syntax errors that would silently break everything.
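To make the five-field layout concrete, here is a small sketch that writes a sample crontab to a scratch file for review. The script paths under /opt/scripts are placeholders invented for this illustration, not files that exist on your system.

```shell
# Write a sample crontab to a scratch file so it can be reviewed before installing.
# Every /opt/scripts/... path below is a placeholder.
CRON_FILE=$(mktemp)
cat > "$CRON_FILE" <<'EOF'
# minute hour day-of-month month day-of-week  command
0 2 * * *      /opt/scripts/nightly_backup.sh
*/15 * * * *   /opt/scripts/health_check.sh
30 8 * * 1     /opt/scripts/weekly_report.sh
0 9-17 * * 1-5 /opt/scripts/business_hours_task.sh
EOF

# Print the file for review. Installing it would be: crontab "$CRON_FILE"
cat "$CRON_FILE"
```

The four entries read as: 2:00am every day, every 15 minutes, 8:30am on Mondays, and on the hour from 9am to 5pm on weekdays. Be aware that installing a file with the crontab command replaces the user's existing crontab rather than appending to it.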
There are two types of crontabs you'll work with. The first is user crontabs — each Linux user has their own, and jobs run as that user. The second is the system crontab at /etc/crontab and files dropped into /etc/cron.d/, which include an extra field specifying which user to run the job as. For most application-level automation, user crontabs are the right choice. For system-level jobs like log rotation, the system crontab is used.
The key mental model: cron doesn't track whether a previous job finished. If your job takes longer than its schedule interval, you can end up with two copies running at the same time. That's one of the most dangerous production gotchas, and we'll cover how to handle it.
- The @reboot nickname is handy, but it can run before network interfaces are ready; add a sleep 10 if your job depends on external services.
- Run crontab -l after editing; one stray space can shift your schedule by an entire day.
- crontab -e: simpler, no need to specify a user.
- /etc/crontab: system-wide jobs, with the extra user field.
- /etc/cron.daily/: drop in a script, no crontab editing needed.
Writing Production-Ready Cron Jobs — Logging, Environments, and Locking
Here's where most tutorials stop — and where most real problems start. A cron job that runs date will work fine. A cron job that runs your Python script will almost certainly fail silently the first time, and here's why: cron runs with a minimal environment. It doesn't load your .bashrc, .bash_profile, or any of the environment variables you set in your shell session. That means PATH, PYTHONPATH, NODE_ENV, database credentials, API keys — none of it is there unless you explicitly provide it.
The second production concern is logging. By default, cron tries to mail a job's output to its owner; on a server with no mail setup, that output is simply lost. If your script crashes, you'll never know unless you've set up logging. The fix is simple: redirect both stdout and stderr to a log file on every single cron job.
The third concern is job overlap. If your backup script takes 90 minutes but runs every hour, you'll eventually have two copies fighting over the same files. The solution is a lock file — a mechanism where the script checks if another copy of itself is already running and exits gracefully if so. The flock utility makes this one line.
These three patterns — explicit environment, output logging, and job locking — are what separate a toy cron job from a production one. The example below shows all three working together in a realistic database backup script.
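A minimal sketch of those three patterns, written as a reusable wrapper and not as a complete backup script. The helper name run_locked and all paths are assumptions made for this illustration, and the sketch assumes the flock utility from util-linux is installed.

```shell
#!/bin/sh
# 1. Explicit environment: cron will not load your shell profile, so set PATH here.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# run_locked LOCK_FILE LOG_FILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND while holding an exclusive lock and appends its stdout and
# stderr to LOG_FILE. If another copy holds the lock, it logs that and returns 1.
run_locked() {
  lock_file=$1
  log_file=$2
  shift 2
  (
    # 3. Locking: -n makes flock fail immediately instead of waiting.
    if ! flock -n 9; then
      echo "$(date '+%Y-%m-%d %H:%M:%S') previous run still active, skipping" >> "$log_file"
      exit 1
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') starting (pid $$): $*" >> "$log_file"
    # 2. Logging: capture both stdout and stderr of the job.
    "$@" >> "$log_file" 2>&1
    status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') finished with exit status $status" >> "$log_file"
    exit "$status"
  ) 9> "$lock_file"
}
```

A backup script built on this helper might end with a call such as run_locked /var/lock/db_backup.lock /var/log/db_backup.log pg_dump -Fc -f /backups/mydb.dump mydb, where the lock path, log path, dump path, and database name are all placeholders. Because the lock is tied to an open file descriptor, it is released automatically when the job exits, even if it crashes.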
- Add >> /var/log/myjob.log 2>&1 to every cron line as your absolute minimum. Better yet, build logging directly into the script itself; that way you get it whether the job is triggered by cron or run manually.
- A missing $? check after pg_dump can silently produce empty backups for weeks.
- Surface failures with MAILTO or external monitoring.
- Redirect output on every job: >> /var/log/job.log 2>&1.
- Use flock locking; never write a job that can overlap.
Special Schedules, System Crontabs, and When to Use Alternatives
Once you're comfortable with the five-field syntax, there are shorthand strings that make common schedules far more readable. Instead of 0 0 * * * you can write @daily. These are called cron nicknames, and most modern cron daemons (Vixie cron and its descendants) support them. They're self-documenting and much harder to misread.
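As a quick reference, the nicknames map onto five-field expressions as sketched below. The function name expand_nickname is made up for this illustration; cron does the expansion internally and ships no such command.

```shell
# expand_nickname NICKNAME   (hypothetical helper)
# Prints the five-field expression that a cron nickname stands for.
expand_nickname() {
  case "$1" in
    @yearly|@annually) echo '0 0 1 1 *' ;;
    @monthly)          echo '0 0 1 * *' ;;
    @weekly)           echo '0 0 * * 0' ;;
    @daily|@midnight)  echo '0 0 * * *' ;;
    @hourly)           echo '0 * * * *' ;;
    @reboot)           echo 'runs once when the cron daemon starts; no five-field equivalent' ;;
    *)                 echo "unknown nickname: $1" >&2; return 1 ;;
  esac
}

expand_nickname @daily    # prints: 0 0 * * *
expand_nickname @weekly   # prints: 0 0 * * 0
```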
Beyond user crontabs, Linux ships with a set of system-managed cron directories. Drop an executable script into /etc/cron.daily/ and it will run once a day — no crontab editing required. The actual run times are controlled by the run-parts entries in /etc/crontab. These are perfect for system maintenance tasks packaged by software installers.
That said, cron isn't always the right tool. It has real limitations: it has no dependency management (it can't wait for Job A to finish before starting Job B), it doesn't retry on failure, it has no built-in alerting, and it doesn't scale across multiple machines. For anything more complex — multi-step pipelines, distributed systems, or jobs that need retry logic — tools like systemd timers, Apache Airflow, or cloud-native schedulers (AWS EventBridge, GCP Cloud Scheduler) are better fits. Knowing when NOT to use cron is as important as knowing how to use it.
- When you drop a script into /etc/cron.daily/, it runs at the system's configured time (often 6:25am via anacron or run-parts). If the daily jobs together take longer than 24 hours, runs will pile up and can eventually exhaust system resources.
- Systemd timers use OnCalendar=, which is more precise than cron's minute granularity, and they can catch up missed runs with Persistent=true.
- Prefer @daily over 0 0 * * *.
- For job dependencies, use systemd After= or a workflow engine like Airflow.
- For retries on failure, use systemd Restart=on-failure.
Debugging Cron Jobs — Diagnosing Silent Failures and Common Pitfalls
When a cron job fails silently, there's no error message, no email, no stack trace — just an application that starts degrading while nobody notices. This is the number one reason cron gets a bad reputation in production. The fix is a repeatable debug workflow.
Start with the cron daemon logs. On systemd-based systems, journalctl -u cron (the unit is named crond on RHEL-family distributions) shows each job execution: the command, the timestamp, and the user. On older systems, /var/log/syslog or /var/log/cron contains the same information. Search for your command name with grep.
If the job ran but had no effect, the problem is usually the environment. Simulate cron's minimal environment with env -i HOME=/root PATH=/usr/bin:/bin /bin/bash your_script.sh. This stripped-down shell will reproduce the exact conditions under which cron runs your script. Any errors you see here are the errors cron sees — they just go unlogged.
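The same idea can be wrapped in a small helper. This is a sketch under assumptions: the name run_like_cron is invented here, and the variable list (HOME, LOGNAME, PATH, SHELL) mirrors what a typical cron daemon sets, which may differ slightly on your distribution.

```shell
# run_like_cron 'COMMAND LINE'   (hypothetical helper)
# Runs the command the way cron would: through /bin/sh -c, with only the
# handful of variables a typical cron daemon sets.
run_like_cron() {
  env -i \
    HOME="${HOME:-/}" \
    LOGNAME="$(id -un)" \
    PATH=/usr/bin:/bin \
    SHELL=/bin/sh \
    /bin/sh -c "$1"
}

# A variable exported in your interactive shell is invisible to the job.
export DEMO_API_KEY=secret
run_like_cron 'echo "DEMO_API_KEY is ${DEMO_API_KEY:-not set}"'
# prints: DEMO_API_KEY is not set
```

Passing the command as a single string matches how cron hands each crontab line to the shell, so quoting mistakes show up here as well.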
Another common pitfall: the script has a shebang pointing to the wrong interpreter. If your script starts with #!/usr/bin/python3 but Python 3 is installed at /usr/local/bin/python3, cron will fail with a misleading error. Using #!/usr/bin/env python3 is more portable, but env finds python3 by searching PATH, so make sure the PATH your job runs with includes the interpreter's directory.
Finally, remember that cron jobs inherit the umask of their parent process — typically 022. If your job creates files that need specific permissions, set umask explicitly in the script.
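One way to set the umask without affecting the rest of the script is to scope it to a subshell. The helper name create_private_file and the file name below are illustrative only.

```shell
# create_private_file PATH   (hypothetical helper)
# Creates an empty file readable and writable only by its owner, whatever
# umask the job inherited. The subshell keeps the umask change local.
create_private_file() {
  ( umask 077 && : > "$1" )
}

demo_dir=$(mktemp -d)
create_private_file "$demo_dir/backup.dump"
ls -l "$demo_dir/backup.dump"
```

The listing should show the mode -rw------- for the new file. If the file already exists, its permissions are left unchanged, so this only protects files the job creates itself.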
- Every cron job runs in a fresh, isolated shell session with a minimal environment.
- Any customisation you rely on must be re-declared inside the script.
- The env -i command reproduces that isolation; use it to see what cron sees.
- When a job works manually but fails under cron, the first suspect is always the environment.
- os.getenv('ENV') returns nothing because cron doesn't set it. The script exits 0 but does nothing.
- Add echo "PATH=$PATH" >> $LOG_FILE at the top of every script.
- If the script works under env -i, it's not a cron environment problem; check for race conditions or resource limits.
- env -i closely replicates the cron environment; use it to expose missing variables.
Real-World Cron Patterns — Database Backups, Cache Warming, and Health Checks
In production, cron jobs fall into a handful of common patterns. Getting these right means the difference between a reliable automation pipeline and a pager at 3am.
Database Backups — The most critical cron job. Use the production-ready pattern from section 2: explicit env, logging, locking. Always validate the backup: after pg_dump, run pg_restore --list on the dump file to catch corruption early (this works on custom-, directory-, or tar-format archives such as those made with pg_dump -Fc, not on plain SQL dumps). Store backups off-server (S3, NFS) and retain multiple copies.
Cache Warming — Many apps rely on a cache that expires at midnight. Instead of serving slow responses to the first users, run a cron job just before peak traffic (e.g., 0 5 * * *) to pre-compute and populate the cache. Use flock to prevent overlap if the warming takes longer than the interval.
Health Checks — These run every 5 or 15 minutes and report system health to monitoring (e.g., check disk space, process up, API endpoint response). The script should output metrics in a parseable format (JSON) so Prometheus or a custom collector can ingest them. Unlike backups, health checks should NOT use locking — you want to run even if the previous check is still going (but alert if that happens).
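A disk-space check that emits one JSON line per run might look like the sketch below. The function name disk_health_json and the JSON field names are assumptions for this example, not a format any particular collector requires, and the mount point is assumed to contain no spaces or quote characters.

```shell
# disk_health_json [MOUNT_POINT]   (hypothetical helper)
# Prints one JSON object with the used-space percentage of a filesystem.
disk_health_json() {
  mount_point=${1:-/}
  # df -P guarantees one line per filesystem; field 5 is the capacity, e.g. "42%".
  used_pct=$(df -P "$mount_point" | awk 'NR == 2 { gsub("%", "", $5); print $5 }')
  printf '{"check":"disk","mount":"%s","used_percent":%s,"timestamp":"%s"}\n' \
    "$mount_point" "$used_pct" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

disk_health_json /
```

Appending each line to a log file, or posting it to a collector, keeps the check itself short enough that it should never still be running when the next one starts.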
Log Rotation — Cron handles log rotation via /etc/cron.daily/logrotate. When building custom rotation for app logs, never use rm -rf — use logrotate with compression and date-stamped filenames. Accidentally removing a log file that a running process is writing to can cause process hang or data loss.
Data Syncing — For ETL or replication between systems, schedule the sync during off-peak hours. Use idempotent scripts (sync only what changed) and monitor for latency. If a sync fails, cron won't retry — wrap it in a retry loop with exponential backoff.
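A retry loop with exponential backoff can be sketched in a few lines of shell. The function name retry_with_backoff and its argument order are choices made for this illustration, and the sync script path in the usage note is a placeholder.

```shell
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# (hypothetical helper) Runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the wait between attempts.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

A sync job could then call retry_with_backoff 5 30 /opt/scripts/sync_data.sh, which waits 30, 60, 120, and 240 seconds between its five attempts. Keep the total worst-case wait shorter than the schedule interval, or combine it with flock so a slow retry cannot overlap the next run.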
- Validate every dump with pg_restore --list; it quickly flags unreadable or badly truncated archives. For MySQL, use mysqlcheck --databases $DB_NAME after dumping.
- Apply flock to cache warmers too; a stale cache is better than a runaway refresh loop.
- Bound health checks with a timeout, for example timeout 10 curl ...
- Give every script a dry-run mode (--dry-run) that prints what it would do; run it after every code change.
When Database Backup Overlaps Destroyed a Week of Transactions
The job ran pg_dump, which wrote to a temporary file; two concurrent dumps wrote to the same temp file, corrupting it.
The fix: wrap the job in flock so a second copy exits instead of overlapping. Also switch the backup interval to once daily (the job runtime was well under 24h after optimisation) and set the cron expression to 30 2 * * * so it runs once per day at 2:30am, when system load is lowest.
- Cron will never wait for a previous run to finish; you must implement locking yourself.
- Always verify backup integrity with automated restoration tests (e.g., restore to a staging DB and run a checksum).
- Log the PID at the start of each run so you can identify overlapping executions retroactively.
- Run env -i HOME=$HOME PATH=/usr/bin:/bin /bin/bash your_script.sh and check for errors. Compare the env output with the cron log.
- Run grep your_command /var/log/syslog or journalctl -u cron --since today. Ensure redirection is present: >> /var/log/myjob.log 2>&1.
- Check the shebang (#!/bin/bash or #!/usr/bin/python3). If the shebang is missing, cron assumes a shell script and may fail. Also verify the script is executable (chmod +x).
- Add >> /var/log/cron_output.log 2>&1 to the crontab line and re-run the job manually.
Key takeaways
- Redirect output on every job (>> /path/to/job.log 2>&1).
- Use flock to create a lock file when your job's runtime could exceed its schedule interval.
- Use pg_restore --list or similar to catch corruption immediately after a dump.
Common mistakes to avoid
Relying on your shell's PATH
Use absolute paths (/usr/bin/python3, not python3) and set PATH explicitly at the top of your script, or source your env file.
Forgetting to redirect stderr
Append >> /var/log/myjob.log 2>&1 to your crontab line, where 2>&1 merges stderr into stdout so both go to the same log file.
Using `crontab -r` when you meant `crontab -e`
Before editing, run crontab -l > ~/crontab_backup.txt to save a copy. Some teams commit their crontab to version control via a provisioning script specifically to prevent this.
Interview Questions on This Topic
What happens to a cron job if the server is rebooted exactly when the job was supposed to run? How would you handle that scenario in production?
Cron itself does not catch up on missed runs: if the server is down or rebooting at the scheduled minute, that run is skipped. For jobs that must not be missed, use a systemd timer with Persistent=true, which will run the job immediately after boot if the timer was missed. Alternatively, use @reboot to trigger the job on startup, or design your job to be idempotent and run frequently so missing one run doesn't matter. For critical jobs like database backups, consider a monitoring layer that alerts if a backup file hasn't been created in the last 26 hours.
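The 26-hour freshness check mentioned in that answer can be sketched with find. The function name backup_is_fresh and the /backups directory in the usage note are assumptions for this example; the -mmin test is available in GNU, BSD, and BusyBox find but is not part of POSIX.

```shell
# backup_is_fresh BACKUP_DIR [MAX_AGE_MINUTES]   (hypothetical helper)
# Succeeds if BACKUP_DIR contains at least one file modified within the
# last MAX_AGE_MINUTES (default 1560 minutes, i.e. 26 hours).
backup_is_fresh() {
  backup_dir=$1
  max_age_minutes=${2:-1560}
  [ -n "$(find "$backup_dir" -type f -mmin "-$max_age_minutes" 2>/dev/null | head -n 1)" ]
}
```

A monitoring job scheduled separately from the backup could run backup_is_fresh /backups || echo "no recent backup found" and route that message to whatever alerting channel you use. Running the check from a different machine also covers the case where the backup host itself is down.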