Python Requests — Missing Timeout Blocks 200 Threads
A missing timeout blocked all 200 threads in under 2 minutes, causing a silent outage.
- Requests is the industry-standard Python HTTP library — one readable line to make GET, POST, and other HTTP calls
- Always call response.raise_for_status() — requests.get() silently returns 404/500 without raising an exception
- Use json=payload not data=json.dumps(payload) — json= handles serialisation and Content-Type header automatically
- Use requests.Session() for multiple calls to the same host — reuses TCP connections, persists headers and auth
- Always set timeout=(connect, read) — without it, requests.get() waits forever and can hang your entire app
- Biggest production trap: processing error responses as valid data because you never checked the status code
Imagine you walk into a restaurant. You — your Python script — tell the waiter what you want. The waiter is the Requests library. It walks to the kitchen, which is some server out there on the internet, picks up your order, and brings it back to your table. You never see the kitchen. You never deal with how food gets plated or how tickets get routed. You just get your meal. That's exactly what Requests does for HTTP. It handles the back-and-forth of talking to remote servers — the connection negotiation, the headers, the encoding — so your code stays clean and readable. The waiter analogy breaks down in one place though: a real waiter will tell you if the kitchen is on fire. Requests, by default, will not. It will hand you a plate with an error message on it and smile. That's why raise_for_status() exists.
Every modern application talks to the internet. Whether it's pulling live weather data, posting a support ticket, authenticating a user through OAuth, scraping pricing data, or integrating with a payment gateway — your Python code needs a reliable, human-friendly way to make HTTP requests. The Requests library has been that solution for over a decade. It's downloaded more than 300 million times a month, it ships as a dependency in more Python projects than almost anything else, and it's the first thing most engineers reach for when they need to talk to an API.
But Requests has a deceptive learning curve. The basics look trivially simple — one line and you have a response. That simplicity is the trap. Most tutorials stop at requests.get() and show you a happy path. Production systems don't get happy paths. They get slow gateways, flaky upstreams, expired tokens, malformed JSON in 200 responses, thread pools exhausted by a single hanging call, and credential leaks from keys hardcoded three years ago by someone who no longer works there.
This guide covers all of it. The correct patterns, the failure modes that actually bite teams in production, the debugging steps when things go wrong at 2am, and the mental models that help you make the right call without having to re-read the docs every time.
GET Requests — Asking the Internet for Data
A GET request is the most fundamental HTTP action. When you type a URL into your browser, that is a GET request. The browser is saying: 'Hey server, give me this resource.' With the Requests library, you replicate that in a single readable line of Python.
But a response is more than just the data you asked for. It's a complete package — a status code that tells you whether the request succeeded, headers that carry metadata about the response, and a body that contains the actual content. Requests gives you clean access to all three.
The .json() method deserves special mention. It parses the JSON response body into the corresponding Python object, usually a dict or a list. It is shorter and cleaner than calling json.loads(response.text) yourself, and if the body is not valid JSON it raises a JSONDecodeError, which is a ValueError subclass, rather than returning garbage.
Here is the thing that catches every beginner: requests.get() does not raise an exception when the server returns a 404 or 500. It returns that error response just as cheerfully as it returns a 200. If you skip the status check and call .json() on a 404 HTML error page, you get a ValueError. If the server returns a 500 with a JSON error object and you treat it as real data, you get subtle corruption downstream that surfaces an hour later in a completely different part of your system. Always check the status before trusting the body.
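A minimal sketch of that rule, assuming a JSON API. The helper names fetch_json and parse_json_response and the (3, 10) timeout are illustrative choices, not part of Requests.

```python
import requests


def parse_json_response(response):
    """Return the parsed JSON body, but only after the status check passes."""
    # Raises requests.exceptions.HTTPError for any 4xx or 5xx status.
    response.raise_for_status()
    # Raises a ValueError subclass if the body is not valid JSON.
    return response.json()


def fetch_json(url, timeout=(3, 10)):
    """GET a URL (hypothetical helper) and return its JSON body or raise."""
    response = requests.get(url, timeout=timeout)
    return parse_json_response(response)
```

With this shape, a 503 that carries a JSON error object raises HTTPError at the call site instead of flowing downstream as if it were data.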
If you skip raise_for_status() and go straight to .json(), you will either get a ValueError when parsing an HTML error page, or worse, you will silently process a JSON error envelope as if it were real data. Always call raise_for_status() before accessing the body. Every time. No exceptions to this rule.
Never call response.json() without first checking response.status_code or calling raise_for_status(). The failure mode is invisible during development, where the happy path works perfectly, and only surfaces in production when the upstream API is under stress and starts returning 503 responses with JSON error bodies that your code processes as real data, corrupting your database or triggering downstream failures with no clear error trail. Always gate .json() access behind raise_for_status().
Use response.json() for JSON bodies, response.text for raw text, and response.content for binary data like images or files.
POST Requests and Sending Data — How You Talk Back to a Server
A GET request fetches data. A POST request sends data and asks the server to do something with it — create a record, trigger an action, authenticate a user, process a payment. The distinction matters because GET requests should be safe to repeat with no side effects. POST requests are not — sending the same POST twice might create two records, charge a card twice, or fire a webhook twice.
There are two common ways to send data with a POST: form-encoded (the data= parameter) or JSON (the json= parameter). This distinction causes more production bugs than almost anything else in Requests. When you use json=, Requests automatically serialises your Python dict to a JSON string, sets Content-Type to application/json, and encodes the body correctly. When you use data= with a dict, it sends the data as HTML form fields — application/x-www-form-urlencoded — which is what browsers send when you submit a form. Most modern REST APIs expect JSON. Sending form data to a JSON API typically results in a 400 Bad Request with a vague error message, because the server is trying to parse form-encoded bytes as JSON and failing.
The subtle trap is data=json.dumps(payload). The body bytes are correct JSON, but the Content-Type header is missing: when data= receives a string, Requests sends it as-is and does not set Content-Type to application/json or to anything else. The server receives JSON bytes with no declared content type and, if it is strict about it, rejects the request. The body looks identical when you print(response.request.body), but the request fails because of a header you cannot see without explicitly printing response.request.headers.
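The difference can be inspected without sending anything by preparing the requests locally. This is a small sketch; the URL and payload are placeholders. In the Requests versions I am aware of, a string passed to data= gets no Content-Type header at all, while a dict passed to data= is form-encoded.

```python
import json

import requests

payload = {"title": "Printer on fire", "priority": "high"}
url = "https://api.example.com/tickets"  # placeholder endpoint, never contacted

# .prepare() builds the outgoing request without opening a connection.
with_json = requests.Request("POST", url, json=payload).prepare()
with_data_str = requests.Request("POST", url, data=json.dumps(payload)).prepare()
with_data_dict = requests.Request("POST", url, data=payload).prepare()

print(with_json.headers.get("Content-Type"))       # application/json
print(with_data_str.headers.get("Content-Type"))   # None
print(with_data_dict.headers.get("Content-Type"))  # application/x-www-form-urlencoded
```

The first two bodies contain equivalent JSON; only the header differs, which is why the bug is easy to miss when you inspect the body alone.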
Query parameters are separate from the request body. They appear in the URL after a ? — things like ?state=open&per_page=25. Use the params= argument in Requests to add them. Never concatenate them into URL strings yourself — Requests handles URL encoding correctly, including special characters and spaces, which manual string formatting often gets wrong.
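As a quick illustration of params= doing the encoding, the sketch below prepares a request locally and decodes the resulting query string. The endpoint and the q parameter are made up for the example.

```python
from urllib.parse import parse_qs, urlsplit

import requests

prepared = requests.Request(
    "GET",
    "https://api.example.com/issues",  # placeholder endpoint, never contacted
    params={"state": "open", "per_page": 25, "q": "timeout error & retry"},
).prepare()

# The space and the ampersand inside q are percent-encoded for us.
print(prepared.url)

# Decoding the query string recovers the original values intact.
decoded = parse_qs(urlsplit(prepared.url).query)
print(decoded["q"])
```

Building the same URL with string formatting would leave the raw ampersand in place and split q into two parameters.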
Sessions, Timeouts and Retries — Writing Production-Grade Code
Here's what most tutorials skip: using bare requests.get() and requests.post() in production service code is an anti-pattern. Not because it's broken — it works fine for one-off scripts and local testing. But because it opens a new TCP connection for every single call. When you're hitting the same API 50 times to paginate through results, or making 10 concurrent calls in a thread pool, those repeated connection setups and TLS handshakes add up to real latency and real resource consumption.
A Session object solves this by maintaining a connection pool. Connections to the same host are reused across requests, which eliminates the per-call handshake overhead and is measurably faster under any real workload.
But connection reuse is not the main reason senior engineers reach for Sessions. The main reason is that Sessions give you a single place to configure everything that applies to all your requests — auth headers, base headers, cookies, retry strategies, TLS settings — and then all of those settings are automatically applied to every request you make through that session. You stop repeating yourself. And you stop the class of bugs where you added the auth header to seven out of eight calls and the eighth one silently fails with a 401.
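A small sketch of the configure-once idea. The build_session helper, the header values, and the URL are assumptions made for illustration; prepare_request is used only so the merged headers can be inspected without a network call.

```python
import requests


def build_session(user_agent="example-client/1.0"):
    """Create a Session whose default headers apply to every request."""
    session = requests.Session()
    session.headers.update({
        "Accept": "application/json",
        "User-Agent": user_agent,
    })
    return session


session = build_session()

# Per-request headers are merged on top of the session defaults.
prepared = session.prepare_request(
    requests.Request(
        "GET",
        "https://api.example.com/users",  # placeholder endpoint
        headers={"X-Request-Id": "abc123"},
    )
)
print(dict(prepared.headers))
```

In real use you would call session.get(...) or session.post(...) directly, with a timeout on each call, and the same merge happens behind the scenes.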
Timeouts deserve their own paragraph because they are the most commonly omitted configuration and the most reliably catastrophic when missing. requests.get() has no default timeout. It will block the calling thread indefinitely. In a concurrent application — a Flask or FastAPI service running with multiple workers, or any thread pool — one slow API call that takes 90 seconds instead of 500ms will hold its thread for 90 seconds. If you get enough slow calls in parallel, you exhaust your thread pool. Your service appears alive — the process is running, memory looks fine — but it is processing nothing. This is exactly the failure mode from the production incident at the top of this guide.
The fix is always setting timeout=(connect_timeout, read_timeout) as a two-element tuple. The connect timeout is how long to wait for the initial TCP connection to be established. The read timeout is how long to wait between bytes of the response being received. For most APIs, (3, 10) is a reasonable starting point. For large file downloads, increase the read timeout. For fast internal services, tighten it.
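Requests has no session-wide timeout option, so the argument has to reach every call somehow. One common workaround, sketched here under that assumption, is a thin Session subclass that fills in a default when the caller does not pass one. TimeoutSession is a hypothetical name, not a Requests class.

```python
import requests


class TimeoutSession(requests.Session):
    """Session that applies a default (connect, read) timeout to every call."""

    def __init__(self, timeout=(3, 10)):
        super().__init__()
        self.timeout = timeout

    def request(self, method, url, **kwargs):
        # Only fills the gap when no timeout key was passed at all.
        # An explicit timeout=None still means "wait forever".
        kwargs.setdefault("timeout", self.timeout)
        return super().request(method, url, **kwargs)


# Usage (not executed here, the URLs are placeholders):
#   session = TimeoutSession(timeout=(3, 10))
#   session.get("https://api.example.com/orders")             # uses (3, 10)
#   session.get("https://api.example.com/export", timeout=(3, 120))
```

The subclass keeps the per-call override for slow endpoints such as large downloads while removing the chance of forgetting the argument on the eighth call.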
For resilience against transient failures, pair a Session with the HTTPAdapter from requests.adapters and urllib3's Retry strategy. This replaces the retry loop you would otherwise write manually, implements exponential backoff, and lets you name exactly which status codes are retryable. The key insight: only retry on status codes that indicate transient infrastructure failures, such as 429, 502, 503, and 504. Never retry on 400, 401, 403, 404, or 422. Those are permanent errors that will fail identically on every attempt, and retrying them just wastes time and generates noise in logs.
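The retry setup described above might look like the following sketch. The function name, retry count, backoff factor, and pool sizes are illustrative values to tune, and the allowed_methods parameter assumes urllib3 1.26 or newer. Limiting retries to idempotent methods is my own conservative choice here, so a POST is never silently repeated.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total=3, backoff_factor=0.5,
                           pool_connections=10, pool_maxsize=10):
    """Return a Session that retries transient failures with backoff."""
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,         # scales the exponential backoff delay
        status_forcelist=(429, 502, 503, 504),  # transient statuses only
        allowed_methods=frozenset({"GET", "HEAD", "OPTIONS"}),
    )
    adapter = HTTPAdapter(
        max_retries=retry,
        pool_connections=pool_connections,
        pool_maxsize=pool_maxsize,
    )
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

Retries do not replace timeouts: each attempt still needs timeout=(connect, read) on the call, otherwise a single hung attempt blocks before the retry logic ever runs.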
- Session reuses TCP connections via connection pooling — one TLS handshake per host, not one per request
- Session persists headers and auth across all requests — set once, inherited everywhere, no repetition
- HTTPAdapter + Retry handles transient failures automatically — no manual retry loops, correct exponential backoff
- Timeout must still be set per-call — it is not a session-level setting. Every requests.get() and requests.post() call needs its own timeout argument
- Bare requests.get() is for one-off scripts — anything running in a service under concurrent load should use Session with explicit pool limits
requests.get() waits forever by default, and a missing timeout is a production outage waiting to happen.
For a single call, requests.get() with a timeout is fine; no Session overhead is needed.
For repeated calls to the same host, use Session(): it reuses TCP connections and eliminates repeated header configuration.
Authentication Patterns — Basic Auth, Tokens, and OAuth in the Real World
Almost every API that does anything useful requires authentication. Understanding which auth pattern to use, why it exists, and how to implement it correctly is what separates an engineer who can follow an API quickstart from one who can build a production integration that stays secure and maintainable.
Basic Auth is the oldest pattern. Your username and password are combined as 'username:password', base64-encoded, and sent in every request as the Authorization header value. Requests handles this encoding for you when you pass auth=HTTPBasicAuth(username, password) or the equivalent tuple shorthand auth=(username, password). Basic Auth is simple and widely supported. Its weakness is that the password is transmitted with every request — if you're ever on HTTP instead of HTTPS, those credentials are readable by anyone on the network. Only use Basic Auth over HTTPS, and treat it as a legacy pattern for internal tools or simple APIs. Most serious APIs have moved away from it.
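To make the encoding step concrete, this sketch prepares a request locally and decodes the header that Requests generates. The username, password, and URL are placeholders.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

prepared = requests.Request(
    "GET",
    "https://api.example.com/reports",  # placeholder endpoint, never contacted
    auth=HTTPBasicAuth("alice", "s3cret"),
).prepare()

header = prepared.headers["Authorization"]
scheme, encoded = header.split(" ", 1)

# base64 is an encoding, not encryption: anyone who sees the header
# can recover the credentials with one call.
decoded = base64.b64decode(encoded).decode("utf-8")
print(scheme, decoded)
```

Because the header is trivially reversible, the only protection for the password is the TLS layer underneath it.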
Bearer token authentication is the modern standard. You obtain a token once — usually at login or from an API key dashboard — and attach it to every subsequent request as 'Authorization: Bearer your_token_here'. The server validates the token without ever seeing your password again. The Session pattern makes this elegant: set the Authorization header once on the session and every request inherits it automatically. This is the pattern used by GitHub, Stripe, OpenAI, and most REST APIs built in the last several years.
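A sketch of the set-once pattern, assuming the token lives in an environment variable. The variable name EXAMPLE_API_TOKEN and the helper name are hypothetical.

```python
import os

import requests


def build_authenticated_session(env_var="EXAMPLE_API_TOKEN"):
    """Return a Session that sends a Bearer token on every request."""
    token = os.environ.get(env_var)
    if not token:
        # Fail at startup rather than sending unauthenticated requests later.
        raise RuntimeError(f"{env_var} is not set")
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {token}"
    return session
```

Every session.get(...) or session.post(...) made through the returned object carries the header, so there is no eighth call that forgets it.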
OAuth 2.0 is the framework for apps acting on behalf of users. Instead of handling user passwords, your application redirects the user to the provider's login page, the user authenticates there, and the provider sends your app a token with specific scopes. Your app never sees the password. The requests-oauthlib library extends Requests with full OAuth 2.0 flow support. This is what you need for 'Sign in with Google', 'Connect with GitHub', Spotify API access, or any integration where users authorise your app to act on their accounts.
API key authentication varies more than the others. Some APIs want the key as a query parameter (?api_key=...). Others want it as a custom header (X-API-Key: ...). A few use the standard Authorization header with a custom scheme (ApiKey your_key_here). Always check the API documentation. As a general principle, prefer header-based API keys over query parameter keys — query parameters appear in server access logs, CDN logs, and browser history, which makes them a persistent credential exposure risk that is hard to detect and remediate.
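The exposure difference is easy to see by preparing both variants locally. The key value, the api_key parameter name, and the X-API-Key header name are placeholders; a real API documents its own.

```python
import requests

API_KEY = "demo-key-123"  # placeholder, never hardcode a real key
url = "https://api.example.com/v1/items"  # placeholder endpoint

in_query = requests.Request("GET", url, params={"api_key": API_KEY}).prepare()
in_header = requests.Request("GET", url, headers={"X-API-Key": API_KEY}).prepare()

# The query variant embeds the secret in the URL, which is what
# access logs and proxies typically record.
print(API_KEY in in_query.url)   # True
print(API_KEY in in_header.url)  # False
```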
Load credentials from environment variables with os.environ.get() or from a .env file using python-dotenv. Add .env to your .gitignore before your first commit, not after. If a token was ever hardcoded and committed, even for one commit, even in a 'private' repo, rotate it immediately. Git history is permanent, and full-history searches for credentials in leaked repos are automated.
In short: read secrets with os.environ.get() or python-dotenv, and add .env to .gitignore before your first commit.
Missing Timeout Cascades Into Full Service Outage
No timeout had been set on the requests.post() call because it had never been needed before.
Each requests.post() call in the service blocked its thread for the full 90 seconds, waiting for a response that was crawling back. The service ran with 200 worker threads. In under two minutes, all 200 were blocked mid-request. No threads remained to process incoming payments, respond to health check endpoints, or handle Kubernetes liveness probes. The pods were restarted but immediately consumed their thread pools on the backlog of queued requests. The service appeared alive (process running, no crashes, no exceptions) and was completely dead.
- requests.get() and requests.post() wait forever by default — a missing timeout is a production time bomb, not a minor oversight
- One slow upstream can consume your entire thread pool in seconds — connection pool limits are a required safeguard, not a premature optimisation
- A process that is running but doing nothing is significantly harder to diagnose than a process that has crashed — add thread-level health metrics and alert on thread pool saturation, not just process liveness
- Circuit breakers prevent cascade failures — once you know an upstream is failing, stop calling it immediately and fail fast rather than queuing more blocked threads
- Load test with latency injection, not just load — a dependency that responds in 200ms under normal conditions and 90 seconds under stress will not reveal this class of failure in a happy-path load test
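The circuit breaker idea from the list above can be sketched in a few lines. This is a deliberately minimal illustration, not a production library: the class names, the threshold, and the cool-down are all assumptions, and after the cool-down it lets any number of trial calls through rather than exactly one.

```python
import threading
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._lock = threading.Lock()
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        with self._lock:
            still_open = (
                self._opened_at is not None
                and self._clock() - self._opened_at < self.reset_after
            )
        if still_open:
            # Fail fast instead of parking another thread on a dead upstream.
            raise CircuitOpenError("upstream marked unhealthy")
        try:
            result = func(*args, **kwargs)
        except Exception:
            with self._lock:
                self._failures += 1
                if self._failures >= self.failure_threshold:
                    self._opened_at = self._clock()
            raise
        with self._lock:
            # Any success closes the breaker and clears the failure count.
            self._failures = 0
            self._opened_at = None
        return result
```

A typical call site would wrap a function that performs the request with a timeout and calls raise_for_status(), so that both timeouts and 5xx responses count as failures.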
requests.get() and requests.post() wait forever without a timeout; there is no default. If this is happening in production, also check whether your thread pool or worker count is saturated, because one hanging call usually means many.
Call raise_for_status() before every .json() call.
Check whether the code calls bare requests.get() in a loop or thread pool. Each call opens a new TCP connection and a new file descriptor. Switch to requests.Session() and configure HTTPAdapter with explicit pool_connections and pool_maxsize values. Count the process's open TCP sockets with lsof -p <pid> | grep TCP | wc -l.
Key takeaways
- Call response.raise_for_status() before accessing response data
- Use Session() for any production code making multiple requests to the same host. Bare requests.get() is for one-off scripts.
Common mistakes to avoid
- Not setting a timeout on HTTP requests
- Using data=json.dumps(payload) instead of json=payload for JSON APIs
- Trusting response data without checking the HTTP status code
  Fix: call response.raise_for_status() immediately after every request and before accessing the body. This raises requests.exceptions.HTTPError with the status code and URL for any 4xx or 5xx response. Catch it explicitly where you can handle it meaningfully. Do not suppress it silently with a bare except.
- Hardcoding API tokens, passwords, or secrets directly in source code
  Fix: load credentials from environment variables with os.environ.get() or from a .env file using python-dotenv. Add .env to .gitignore before the first commit. If any credential was ever hardcoded in a commit, even once, even in a 'private' repository, rotate it immediately. Treat it as compromised regardless of whether you see evidence of misuse.
- Using bare requests.get() in production code that makes multiple requests
  Symptom: each bare requests.get() call opens a new socket and, for HTTPS, performs a full TLS handshake. In thread pools, this creates a new connection per thread per call, which exhausts ephemeral ports and file descriptor limits on systems under sustained load.
  Fix: use Session() for any production code making more than one request to the same host. Configure HTTPAdapter with explicit pool_connections and pool_maxsize values to cap resource usage. As a rule: if the code runs in a service rather than a script, it should use a Session.
Interview Questions on This Topic
What's the difference between using requests.get() directly and using a requests.Session(), and when would you choose one over the other in a production system?
Bare requests.get() opens a fresh TCP connection for every call, does not persist headers, cookies, or auth between calls, and has no built-in retry support. It is the right tool for a one-off script making a single request. requests.Session() maintains a connection pool: it reuses TCP connections to the same host, which eliminates repeated TLS handshakes and is measurably faster for any code making multiple requests. Sessions also persist headers, cookies, and auth across all calls, so you configure them once and stop repeating yourself. And Sessions support mounting HTTPAdapter with a Retry strategy, which handles transient failures automatically without you writing retry loops. In a production service, meaning anything running as a long-lived process under load, I would always use a Session. The performance improvement is real, the code is cleaner because auth is configured in one place, and the retry support is the difference between a service that handles transient upstream failures gracefully and one that needs a support ticket every time an upstream has a bad minute.
Frequently Asked Questions