Streamlit re-runs your entire Python script on every widget interaction — no callbacks, no event loop
@st.cache_data caches serializable returns (DataFrames, API responses) — @st.cache_resource caches non-serializable objects (DB connections, ML models)
st.session_state persists data across re-runs within a single browser session — but is lost on page refresh
Use st.form() to batch widget inputs and prevent a re-run on every widget change during data entry
The #1 production mistake: loading data outside a cache decorator — every slider move re-queries the database
Biggest misconception: Streamlit is only for prototypes. With proper caching and Docker deployment, it handles internal tools at enterprise scale
What is Streamlit for Data Apps?
Streamlit is a Python framework that turns data scripts into interactive web apps with minimal boilerplate. The core problem it solves is the impedance mismatch between data exploration (notebooks, scripts) and production data applications (dashboards, internal tools).
Instead of managing request/response cycles, you write a linear Python script that re-executes top-to-bottom on every user interaction or page load. This execution model is the root cause of the DB pool exhaustion issue: every widget change, button click, or rerun triggers a fresh database query unless you explicitly cache results.
Streamlit competes with Dash, Panel, and Shiny for R, but its key differentiator is the simplicity of its reactive model — no callbacks, no state management boilerplate. However, that simplicity comes at a cost: naive apps hammer databases, and you must use @st.cache_data or @st.cache_resource to prevent connection leaks.
For production use cases with multiple concurrent users, you'll also need connection pooling via SQLAlchemy or psycopg2.pool and careful session management. Streamlit is ideal for internal analytics tools, ML model demos, and quick data exploration UIs, but it's not suited for high-traffic public-facing apps or complex multi-user workflows where you need fine-grained control over server resources.
Plain-English First
Imagine you've baked an amazing cake — your data analysis — but it's sitting in your kitchen where nobody can see it. Streamlit is like a pop-up bakery window. It takes your Python code and instantly gives it a front door, a menu, and a way for customers to interact with what you made. You don't need to know how to build a shop; you just focus on the cake. That's Streamlit: a way to share your data work with the world without learning web development.
Most data insights die in Jupyter notebooks. A data scientist builds a forecasting model, but only someone who can run Python can actually see it. Streamlit fixes this — it turns any Python script into a live web app with zero frontend code.
The core trade-off: Streamlit re-runs your entire script on every interaction. This makes the programming model dead simple — your code stays linear, no callback wiring. But it also means un-cached operations like database queries, model loading, and file reads fire on every slider move. Without disciplined caching, your app grinds to a halt after the second click.
Productionizing Streamlit requires three things: caching decorators on every expensive operation, st.session_state for cross-interaction state, and a deployment strategy (Docker, Streamlit Community Cloud, or Kubernetes). Miss any of these and you have a prototype that breaks under real usage.
Why Streamlit Data Apps Exhaust DB Pools
Streamlit is a Python framework that turns data scripts into interactive web apps. Its core mechanic: every user interaction or widget change triggers a full top-to-bottom re-execution of the script. If the script opens a connection and runs a query at the top level, each rerun repeats both. Without caching, every slider drag or button click fires fresh SQL queries, each consuming a connection from the pool. In practice, a single user rapidly clicking through filters can consume 10–20 connections per minute. With 50 concurrent users, that's 500–1000 connection requests per minute, enough to exhaust a typical 100-connection pool within seconds if connections are slow to close or are never returned. The result: connection timeouts, app hangs, and cascading failures across services sharing the same database. Use @st.cache_resource to create the connection or pool once, and @st.cache_data to cache query results. Set TTLs that match your data freshness needs: seconds for real-time dashboards, minutes for daily reports. Never let a UI event become a direct database query.
Caching Is Not Optional
Without caching, each rerun opens new connections; a few dozen users clicking filters can exhaust a 100-connection pool in under a minute.
Production Insight
A finance dashboard with 30 concurrent users hit 100% connection pool usage within 2 minutes of launch, causing all downstream services to timeout.
Symptom: psycopg2.OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser connections.
Rule: Always set @st.cache_data(ttl=60) on any query that doesn't need real-time freshness — even for small datasets.
Key Takeaway
Streamlit reruns the entire script on every interaction — treat it like a fresh request, not a persistent session.
Cache every database query with @st.cache_data and a TTL; uncached queries are the #1 cause of pool exhaustion.
Monitor connection pool usage in production — a sudden spike often points to a missing cache decorator on a frequently triggered widget.
[Figure: Streamlit App: Uncached Queries Exhaust DB Pool]
How Streamlit's Execution Model Actually Works (This Changes Everything)
Before you write a single widget, you need to understand Streamlit's most important — and most surprising — design decision: every time a user interacts with your app, Streamlit re-runs your entire Python script from top to bottom. Every. Single. Time.
This is completely different from how most web frameworks operate. There is no event loop, no callbacks, no onclick handler wiring. When a user moves a slider, Streamlit re-executes your script with the new slider value baked in as the widget's return value. It sounds expensive, and it can be — but it is also what makes Streamlit so easy to reason about.
The upside: your app logic stays linear and readable, exactly like a regular Python script. The downside: if you are loading a 2GB CSV or running a complex SQL query on every re-run, your app will be unusably slow within seconds. That is why caching is not optional — it is the single most critical design decision in any Streamlit app.
execution_model_demo.py (Python)
import datetime

import streamlit as st

# io.thecodeforge: Tracking the execution lifecycle
# This line runs EVERY time the user interacts with anything in the app.
st.write(f"Script last ran at: {datetime.datetime.now().strftime('%H:%M:%S')}")

st.title("Understanding Streamlit's Re-Run Model")

# When the user moves this slider, the ENTIRE script above AND below re-executes.
temperature_celsius = st.slider(
    label="Set temperature (°C)",
    min_value=-20,
    max_value=50,
    value=22,
)

temperature_fahrenheit = (temperature_celsius * 9 / 5) + 32

st.metric(
    label="Temperature in Fahrenheit",
    value=f"{temperature_fahrenheit:.1f} °F",
    delta=f"{temperature_celsius} °C input",
)

if temperature_celsius > 35:
    st.warning("That's dangerously hot. Stay hydrated and limit outdoor exposure.")
elif temperature_celsius < 0:
    st.info("Below freezing; roads may be icy. Check local transport advisories.")
else:
    st.success("Comfortable temperature range.")
Output
Script last ran at: 14:32:07
[Title: Understanding Streamlit's Re-Run Model]
[Slider rendered at 22°C by default]
Temperature in Fahrenheit: 71.6 °F (+22 °C input)
[Green success box]: Comfortable temperature range.
Streamlit as a Replay Engine
No callbacks, no event loop — your code runs top-to-bottom on every interaction
Widget return values change between re-runs, but the code structure stays identical
This makes the mental model dead simple: write a script, add widgets, done
The cost: every uncached operation re-executes — this is why caching is mandatory, not optional
Think of it as replaying your script with new inputs each time, not patching specific components
Production Insight
Every widget interaction triggers a full script re-run.
Uncached operations — DB queries, file reads, model inference — fire on every slider move.
Rule: if an operation takes more than 100ms, it must live inside a cache decorator.
Key Takeaway
Streamlit re-runs your entire script on every interaction — this is its defining design decision.
The simplicity comes at a real cost: every uncached operation re-executes on every slider move.
Caching is not a performance optimization in Streamlit — it is a correctness requirement.
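The cost of the re-run model can be illustrated without Streamlit at all. The sketch below is a plain-Python simulation, not Streamlit's actual machinery: `script()` stands in for one top-to-bottom rerun, `run_query()` for an expensive database call, and `functools.lru_cache` for a cache decorator. All names are illustrative.

```python
from functools import lru_cache

query_log = []  # records every time the "database" is actually hit


def run_query(region: str) -> str:
    """Stand-in for an expensive database call."""
    query_log.append(region)
    return f"rows for {region}"


@lru_cache(maxsize=None)
def run_query_cached(region: str) -> str:
    """Same call, but repeated arguments are served from memory."""
    return run_query(region)


def script(region: str, use_cache: bool) -> str:
    """One simulated rerun: the whole 'script' executes from the top."""
    fetch = run_query_cached if use_cache else run_query
    return fetch(region)


# Five interactions without caching: five database hits.
for _ in range(5):
    script("EU", use_cache=False)
uncached_calls = len(query_log)

# Five interactions with caching: only the first one reaches the database.
query_log.clear()
for _ in range(5):
    script("EU", use_cache=True)
cached_calls = len(query_log)

print(f"uncached: {uncached_calls} queries, cached: {cached_calls} query")
```

The same ratio holds in a real app: the number of reruns is fixed by user behavior, so the only lever is how much work each rerun repeats.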
When Does the Re-Run Model Break Down?
If: All expensive operations are cached and widgets are lightweight
→ Use: The re-run model as-is. It works great: sub-100ms response times and dead-simple code to maintain
If: Loading large files or running DB queries without caching
→ Use: @st.cache_data around everything expensive, immediately. Otherwise the app becomes unusable after 2 to 3 interactions
If: Need to persist state across re-runs (multi-step wizard, login flow, accumulated user inputs)
→ Use: st.session_state. It is the only mechanism that survives a re-run within the same browser session
If: Need push-based updates (WebSocket data streams, Kafka topics, live sensor feeds)
→ Use: Dash, Panel, or a custom FastAPI plus React frontend. Streamlit is not the right tool for this pattern
Caching and State: Making Your App Fast and Stateful
Streamlit gives you two caching decorators and they solve different problems. @st.cache_data is for functions that return data: CSVs, API responses, processed DataFrames. It stores a pickled copy of the return value and hands every caller a fresh copy, so one session cannot mutate the data another session sees. @st.cache_resource is for non-serializable objects like database connections or ML models. It stores the object reference directly and shares that single object across all sessions.
Then there is st.session_state — a dictionary that persists for the lifetime of a user's session. It disappears on page refresh, but it survives every re-run within that session. It is how you build multi-step forms, track login status, or accumulate user inputs without losing them between interactions.
caching_and_state_demo.py (Python)
import time

import pandas as pd
import streamlit as st


# ── CACHING DATA (io.thecodeforge standard) ───────────────────────────────────
@st.cache_data(ttl=3600)  # Cache expires after 1 hour, which reduces DB load significantly
def load_sales_data(num_records: int) -> pd.DataFrame:
    """
    Simulates a slow data fetch. With @st.cache_data, this runs at most
    once per hour for each distinct num_records, regardless of how many
    times the user interacts with widgets.
    """
    time.sleep(2)  # Simulated network/DB latency
    return pd.DataFrame({
        "record_id": range(num_records),
        "revenue": [i * 1.5 for i in range(num_records)],
    })


# ── CACHING RESOURCES ─────────────────────────────────────────────────────────
@st.cache_resource
def init_connection():
    """
    @st.cache_resource is for non-serializable infrastructure.
    This object is created once and shared across all user sessions.
    Use it for DB pools and ML models, never for DataFrames.
    """
    return {"status": "connected", "provider": "ForgeCloud"}


# ── SESSION STATE ─────────────────────────────────────────────────────────────
# Always guard initialization. Without this check, state resets on every re-run.
if "analysis_count" not in st.session_state:
    st.session_state.analysis_count = 0

st.title("Forge Analytics Dashboard")

col1, col2 = st.columns([3, 1])
with col1:
    record_count = st.slider("Number of records to load", 100, 10000, 1000)
with col2:
    st.metric("Analyses Run", st.session_state.analysis_count)

if st.button("Run Analysis"):
    st.session_state.analysis_count += 1
    df = load_sales_data(record_count)  # Cached: repeat clicks with the same record_count skip the fetch
    st.dataframe(df.head(10))
    st.write(f"Analysis #{st.session_state.analysis_count} complete. Loaded {len(df)} records.")

conn = init_connection()  # Shared resource: created once, reused across sessions
st.caption(f"Connection status: {conn['status']} via {conn['provider']}")
Output
Forge Analytics Dashboard
[Slider: Number of records to load — set at 1000]
[Metric: Analyses Run — 0]
[Button: Run Analysis]
After clicking Run Analysis:
Analysis #1 complete. Loaded 1000 records.
[DataFrame: first 10 rows displayed]
Connection status: connected via ForgeCloud
Watch Out: The Initialization Trap
Never initialize session_state keys unconditionally at the top of your script like st.session_state.count = 0. That line runs on every re-run and silently resets whatever the user has accumulated. Always wrap initialization in: if 'key' not in st.session_state: — that is the correct, safe pattern.
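The guarded pattern can be wrapped in a small helper. This is a sketch under one assumption: the helper, named `init_state` here purely for illustration, accepts any mutable mapping, so a plain dict stands in for st.session_state in the demonstration below.

```python
from collections.abc import MutableMapping
from typing import Any


def init_state(state: MutableMapping, defaults: dict[str, Any]) -> None:
    """Set each default only if the key is missing, so reruns keep user data."""
    for key, value in defaults.items():
        if key not in state:
            state[key] = value


# A plain dict stands in for st.session_state in this demonstration.
session = {}

init_state(session, {"count": 0})   # first run: key is created
session["count"] += 3               # the user interacts three times
init_state(session, {"count": 0})   # next rerun: existing value is left alone

print(session["count"])
```

In an app the call would be `init_state(st.session_state, {"count": 0})` near the top of the script. An unconditional `session["count"] = 0` in the same position would wipe the accumulated value on every rerun.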
Production Insight
@st.cache_data serializes returns — it will raise a serialization error on open file handles or live DB connections.
@st.cache_resource does NOT serialize — it stores the object reference directly and shares it across all sessions.
Rule: use @st.cache_data for data, @st.cache_resource for infrastructure objects. Mixing them up causes either silent bugs or outright exceptions.
Key Takeaway
@st.cache_data is for serializable data — DataFrames, API responses, computed aggregates.
@st.cache_resource is for non-serializable infrastructure — DB pools, ML models, file handles.
Choosing the wrong decorator causes either pickle serialization errors or silent cache misses that are extremely difficult to debug.
Choosing the Right Cache Decorator
If: Function returns a DataFrame, list, dict, or any primitive value
→ Use: @st.cache_data. It serializes the return value and gives each caller its own safe copy
If: Function returns a DB connection pool, ML model loaded into memory, or file handle
→ Use: @st.cache_resource. It stores the reference without serialization and shares it across all sessions
If: Cached data becomes stale after a database update or pipeline run
→ Use: ttl=300 on @st.cache_data for automatic expiry, or call st.cache_data.clear() from a refresh button for manual invalidation
If: Function takes a DataFrame as an argument
→ Use: Small immutable parameters instead, such as column names, date strings, or filter values. Streamlit has to hash the whole DataFrame on every call to build the cache key, which is slow for large frames and produces a miss whenever the content changes
Building a Real Multi-Page Data App with Layout and Forms
Real data apps require navigation, structured layouts, and forms that do not re-run the entire script on every character the user types. Streamlit handles multi-page navigation through a pages/ directory — any Python file placed there is automatically discovered and shown in the sidebar. Layout primitives like st.columns() and st.tabs() handle visual organization within a page.
Forms are particularly important for production apps. Without st.form(), every committed widget change triggers a full script re-run: each slider release, each checkbox toggle, and each text input the user confirms with Enter or by clicking away. Every one of those re-runs repeats the cache key check for your data loading function, re-renders the chart, and redraws the page. With st.form(), all widget changes inside the form are buffered in the browser and a single re-run fires only when the user explicitly clicks the submit button.
professional_dashboard.py (Python)
import numpy as np
import pandas as pd
import streamlit as st

# io.thecodeforge: Professional dashboard layout
# st.set_page_config MUST be the first Streamlit call; anything before it raises StreamlitAPIException.
st.set_page_config(
    page_title="Forge Intelligence Hub",
    page_icon="🔬",
    layout="wide",
    initial_sidebar_state="expanded",
)

# ── SIDEBAR CONTROLS ──────────────────────────────────────────────────────────
with st.sidebar:
    st.header("Control Panel")
    st.divider()
    mode = st.radio(
        "Analysis Mode",
        ["Standard", "Advanced"],
        help="Advanced mode enables cohort segmentation and confidence intervals.",
    )
    date_range = st.date_input("Reporting Period", [])

# ── KPI HEADER ROW ────────────────────────────────────────────────────────────
st.title("Forge Intelligence Hub")

kpi1, kpi2, kpi3, kpi4 = st.columns(4)
kpi1.metric("Revenue", "$1.24M", delta="+12.3%")
kpi2.metric("Active Users", "8,412", delta="+340")
kpi3.metric("Churn Rate", "2.4%", delta="-0.5%", delta_color="inverse")
kpi4.metric("Model Accuracy", "94.1%", delta="+1.2%")

st.divider()

# ── TABBED CONTENT ────────────────────────────────────────────────────────────
overview_tab, forecast_tab, settings_tab = st.tabs(["Overview", "Forecast", "Settings"])

with overview_tab:
    chart_data = pd.DataFrame(
        np.random.randn(30, 3),
        columns=["Revenue", "Cost", "Profit"],
    )
    st.line_chart(chart_data)

with forecast_tab:
    # st.form() buffers ALL widget interactions; a re-run only fires on submit.
    # Without this, every slider nudge or committed text edit triggers a full re-run.
    with st.form("forecast_parameters"):
        st.subheader("Configure Forecast")
        col_a, col_b = st.columns(2)
        with col_a:
            horizon = st.slider("Forecast horizon (days)", 7, 90, 30)
            confidence = st.selectbox("Confidence interval", ["80%", "90%", "95%"])
        with col_b:
            target_metric = st.text_input("Target metric", placeholder="e.g. daily_revenue")
            include_weekends = st.checkbox("Include weekends", value=True)
        submitted = st.form_submit_button("Run Forecast", type="primary")

    if submitted:
        if not target_metric:
            st.error("Target metric is required before running the forecast.")
        else:
            with st.spinner(f"Running {horizon}-day forecast for '{target_metric}'..."):
                # In production this would call your ML backend API
                weekend_mode = "weekends included" if include_weekends else "weekdays only"
                st.success(f"Forecast complete: {horizon} days, {confidence} CI, {weekend_mode}.")

with settings_tab:
    st.info("Settings are persisted per session. Changes here reset on page refresh.")
    theme = st.selectbox("Dashboard theme", ["Light", "Dark", "System"])
    refresh_interval = st.number_input("Auto-refresh interval (seconds)", min_value=30, value=300)
Pro Tip: st.set_page_config() Must Come First — No Exceptions
If you call any other Streamlit function before st.set_page_config(), you will get a StreamlitAPIException that halts the entire app. This includes st.write(), st.title(), and even importing a module that calls a Streamlit function at import time. Make st.set_page_config() the absolute first line after your imports.
Production Insight
st.form() buffers all widget changes and triggers a single re-run on explicit submit.
Without forms, every committed widget change fires a full script re-run (a text_input commits on Enter or when it loses focus), which under load means dozens of unnecessary cache checks and re-renders per user per minute.
Rule: always wrap multi-input workflows in st.form() to prevent cascading re-runs during data entry.
Key Takeaway
st.form() is the key to performant multi-input workflows — it batches all interactions into a single re-run on submit.
Multi-page apps use the pages/ directory convention — each .py file is auto-discovered with no configuration required.
Always call st.set_page_config() as the absolute first Streamlit command — any call before it raises an exception that crashes the app on startup.
Layout and Form Strategy
If: Single-page app with a few widgets and one chart
→ Use: st.columns() for side-by-side layout. No multi-page complexity needed
If: App has several distinct views that deserve their own pages
→ Use: The pages/ directory for auto-discovered navigation. Each .py file becomes a separate page in the sidebar
If: User needs to fill multiple fields before triggering any processing
→ Use: st.form(). It holds back re-runs while the user edits and fires once cleanly on submit
If: Need updates as soon as each input changes (live filtering, immediate validation)
→ Use: Individual widgets outside a form, not st.form(), so each committed change triggers the update
Data Persistence: The SQL Backend
Streamlit's st.session_state is ephemeral by design. It lives in server memory, scoped to a single browser session, and disappears the moment the user refreshes the page, closes the tab, or the server restarts. For anything that needs to survive beyond a single session — audit logs, saved analysis results, user preferences, cross-session dashboards — you must write to an external persistent store.
At the enterprise level, a structured SQL backend is the standard approach. The pattern is straightforward: use @st.cache_resource to create a shared database connection pool once, and write session events to an audit table on key user actions. This gives you a complete record of dashboard activity without impacting the read performance of your main queries.
io/thecodeforge/db/app_audit.sql (SQL)
-- io.thecodeforge: Persistence Layer for Streamlit Session Activity
-- This table captures dashboard interactions that must survive beyond a single session.
-- st.session_state cannot be used for this; it is lost on every page refresh.
-- Note: PostgreSQL reads a three-part name as database.schema.table, so this script
-- assumes it runs inside a database named io that contains a schema named thecodeforge.
CREATE TABLE IF NOT EXISTS io.thecodeforge.dashboard_activity (
    id               SERIAL PRIMARY KEY,
    session_id       TEXT NOT NULL,
    user_email       TEXT,
    action_performed TEXT NOT NULL,
    widget_context   JSONB,  -- captures which filters/params were active
    interaction_ts   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Index on session_id for fast per-session retrieval
CREATE INDEX IF NOT EXISTS idx_dashboard_activity_session
    ON io.thecodeforge.dashboard_activity (session_id);

-- Index on interaction_ts for time-range analysis of dashboard usage
CREATE INDEX IF NOT EXISTS idx_dashboard_activity_ts
    ON io.thecodeforge.dashboard_activity (interaction_ts DESC);

-- Example: record a forecast run triggered from the Streamlit UI
INSERT INTO io.thecodeforge.dashboard_activity
    (session_id, user_email, action_performed, widget_context)
VALUES
    (
        'sess_882',
        'editor@thecodeforge.io',
        'run_forecast',
        '{"horizon_days": 30, "confidence": "95%", "target_metric": "daily_revenue"}'
    );

-- Retrieve activity for a specific session (useful for debugging user-reported issues)
SELECT action_performed, widget_context, interaction_ts
FROM io.thecodeforge.dashboard_activity
WHERE session_id = 'sess_882'
ORDER BY interaction_ts DESC;
st.session_state is browser-session scoped — it disappears on refresh, tab close, or any server restart. Any data that must survive beyond a single session must be written to an external store. PostgreSQL handles audit trails and structured data well. Redis is the right choice for ephemeral cross-session state like rate limiting or short-lived user tokens. S3 or object storage works for saving large analysis outputs or exported reports.
Production Insight
st.session_state is lost on every page refresh — it is not a persistence mechanism, it is a re-run coordination mechanism.
For audit trails and cross-session data, write to a database on key user actions.
Rule: session state is for UI coordination only — counters, step tracking, form progress. Persistent data goes to an external store, no exceptions.
Key Takeaway
st.session_state is for ephemeral UI state only — it disappears on refresh and is never shared between users.
Persistent data — audit logs, user preferences, saved results — must go to an external store.
The SQL audit table with a JSONB widget_context column is the production-standard approach for tracing dashboard interactions at enterprise scale.
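On the Python side, writing an audit event could look like the sketch below. It makes two loud assumptions: SQLite replaces PostgreSQL so the example runs anywhere (the JSONB column becomes TEXT holding a JSON string), and the helper names `create_audit_table`, `log_action`, and `actions_for_session` are invented for illustration.

```python
import json
import sqlite3


def create_audit_table(conn: sqlite3.Connection) -> None:
    """SQLite approximation of the dashboard_activity table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dashboard_activity (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL,
            user_email TEXT,
            action_performed TEXT NOT NULL,
            widget_context TEXT,
            interaction_ts TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def log_action(conn, session_id, action, context, user_email=None) -> None:
    """Persist one user action together with the widget values that produced it."""
    conn.execute(
        "INSERT INTO dashboard_activity "
        "(session_id, user_email, action_performed, widget_context) "
        "VALUES (?, ?, ?, ?)",
        (session_id, user_email, action, json.dumps(context)),
    )
    conn.commit()


def actions_for_session(conn, session_id) -> list[tuple[str, dict]]:
    """Return (action, context) pairs for one session, newest first."""
    rows = conn.execute(
        "SELECT action_performed, widget_context FROM dashboard_activity "
        "WHERE session_id = ? ORDER BY id DESC",
        (session_id,),
    ).fetchall()
    return [(action, json.loads(context)) for action, context in rows]


conn = sqlite3.connect(":memory:")
create_audit_table(conn)
log_action(conn, "sess_882", "run_forecast", {"horizon_days": 30, "confidence": "95%"})
history = actions_for_session(conn, "sess_882")
print(history)
```

In a Streamlit app the connection would come from a function decorated with @st.cache_resource, and `log_action` would be called inside the branch that handles a button click or form submit, never at the top level of the script.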
When to Use Session State vs External Store
If: Tracking current page, selected filters, wizard step progress, or form field values mid-entry
→ Use: st.session_state. It is fast, has no external dependencies, and is ideal for transient UI state
If: Saving user preferences, audit logs, analysis results, or anything that must survive a refresh
→ Use: PostgreSQL or Redis. Session state is lost on refresh and cannot be relied on for persistence
If: Sharing state between multiple users viewing the same dashboard simultaneously
→ Use: An external store. st.session_state is strictly per-user and per-session, never shared
If: State must survive server restarts or application deployments
→ Use: An external store; this is mandatory. st.session_state lives entirely in server memory and is wiped on restart
Java Integration: Consuming Dashboards via API
In hybrid infrastructure environments, your Streamlit app often serves as the UI layer for a Java or Go-based compute engine. The pattern is clean: the Java service owns the business logic and heavy computation, exposes it via a REST endpoint, and Streamlit calls that endpoint, caches the response, and handles visualization. This separation of concerns keeps your Streamlit script lightweight and your backend independently testable and deployable.
The critical rule: always cache the API call with @st.cache_data. Without it, Streamlit calls your Java backend on every single re-run, which means every slider move, every checkbox toggle, and every committed text edit fires an HTTP request to your backend service. Under even modest concurrency, this becomes a self-inflicted DDoS.
io/thecodeforge/api/DashboardController.java (Java)
package io.thecodeforge.api;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

import java.time.Instant;

/**
 * io.thecodeforge: REST API consumed by the Streamlit frontend.
 * Streamlit calls this endpoint via requests.get() with @st.cache_data applied.
 * All business logic and heavy computation lives here, not in the dashboard script.
 */
@RestController
@RequestMapping("/api/v1/forge-metrics")
public class DashboardController {

    /**
     * Returns summary metrics for the Streamlit dashboard KPI row.
     * Streamlit caches this response, so with a 5-minute TTL this endpoint
     * typically receives about one request per 5 minutes for each distinct
     * horizonDays value, not one per interaction.
     */
    @GetMapping("/summary")
    public ResponseEntity<MetricResponse> getSummary(
            @RequestParam(defaultValue = "30") int horizonDays) {
        // In production: query your data warehouse or aggregation service here.
        // The compute stays in Java; the visualization stays in Streamlit.
        MetricResponse response = new MetricResponse(
                1_204_847.50,
                "USD",
                0.941,
                horizonDays,
                Instant.now().toString()
        );
        return ResponseEntity.ok(response);
    }

    /**
     * Record used as the JSON response body.
     * Streamlit receives this as a dict after requests.get().json().
     */
    record MetricResponse(
            double revenue,
            String currency,
            double modelAccuracy,
            int forecastHorizonDays,
            String generatedAt
    ) {}
}
Output
Spring Boot application started on port 8080.
GET /api/v1/forge-metrics/summary?horizonDays=30
HTTP 200 OK
{
"revenue": 1204847.50,
"currency": "USD",
"modelAccuracy": 0.941,
"forecastHorizonDays": 30,
"generatedAt": "2026-04-20T14:32:07Z"
}
Streamlit as a Frontend for Microservices
Streamlit excels as a thin UI layer over existing backend APIs. The Java service handles heavy compute, complex business rules, and data access. Streamlit handles layout, visualization, and user interaction. Use @st.cache_data(ttl=300) on every function that calls an external API — without it, you are firing an HTTP request to your backend on every slider nudge and every re-run.
Production Insight
Streamlit calling a Java or Go backend API on every re-run without caching will saturate your backend under real concurrency.
Always cache the API call with @st.cache_data(ttl=...) — the TTL depends on how fresh the data needs to be.
Rule: treat Streamlit as a presentation layer. Business logic, data access, and compute belong in the backend service, not in the dashboard script.
Key Takeaway
Streamlit works best as a thin presentation layer over existing backend services — not as a compute engine.
Cache every API call with @st.cache_data to prevent re-run storms from saturating your backend.
For heavy compute, keep the logic in a dedicated microservice and have Streamlit consume the result — this also makes the backend independently testable.
If: Light compute that fits comfortably inside the Streamlit process
→ Use: Python inside Streamlit for the computation. No external API needed, just cache the result
If: Heavy compute (ML model inference, large-scale ETL, complex multi-join SQL, data warehouse queries)
→ Use: A Java or Go microservice called via REST. Cache the response in Streamlit with an appropriate TTL
If: Real-time data streaming (Kafka consumer, WebSocket feed, live sensor data)
→ Use: st.empty() with a polling loop as a stopgap, or switch to Dash or Panel for a proper streaming UI. Streamlit does not support push-based updates natively
Deploying Your Streamlit App — From Local to Live
For production deployments, Docker is the standard. It guarantees that your runtime environment — including system-level dependencies for libraries like OpenCV, PyTorch, or GeoPandas — is identical from local development to production. It also makes secrets management, health checking, and container orchestration straightforward.
The single most common Docker deployment mistake with Streamlit: forgetting --server.address=0.0.0.0. Without it, Streamlit binds to 127.0.0.1 inside the container. The app starts, the process runs, but no external connection can reach it. You see 'connection refused' in the browser and nothing obviously wrong in the logs.
Dockerfile
# io.thecodeforge: Production Streamlit Container
# Built on python:3.11-slim to minimize image size while retaining pip and venv support.
FROM python:3.11-slim

WORKDIR /app

# Install system-level dependencies.
# curl is required for the HEALTHCHECK command below.
# Add any system packages your Python libraries need here (e.g., libgdal-dev for GeoPandas).
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first to leverage Docker layer caching.
# If requirements.txt does not change, this layer is reused on rebuild.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application source after dependencies to keep the source-change rebuild fast.
COPY . .

# Streamlit's default port. Expose it so orchestrators can route traffic correctly.
EXPOSE 8501

# Health check using Streamlit's built-in health endpoint.
# interval: how often to check. timeout: how long to wait. retries: failures before unhealthy.
HEALTHCHECK \
    --interval=30s \
    --timeout=5s \
    --start-period=10s \
    --retries=3 \
    CMD curl --fail http://localhost:8501/_stcore/health || exit 1

# --server.address=0.0.0.0 is REQUIRED in Docker.
# Without it, Streamlit binds to 127.0.0.1 inside the container
# and external connections get 'connection refused' with no obvious error.
# The exec-form array is kept on one line: a Dockerfile instruction only spans
# several lines when each line ends with a backslash.
ENTRYPOINT ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0", "--server.headless=true"]
Add .streamlit/secrets.toml to your .gitignore before your very first commit — before you write a single credential into it. If you accidentally commit secrets, treat every exposed credential as compromised and rotate immediately. In production, use your platform's native secrets manager: AWS Secrets Manager, GCP Secret Manager, or Kubernetes Secrets mounted as environment variables.
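When secrets arrive as environment variables, a small fail-fast helper keeps a missing credential from surfacing later as an obscure connection error. This is a sketch: `require_secret`, `MissingSecretError`, and the `DATABASE_URL` variable name are illustrative, and the demo passes a plain dict in place of the real environment.

```python
import os
from collections.abc import Mapping


class MissingSecretError(RuntimeError):
    """Raised at startup when a required credential is absent."""


def require_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the secret's value, or fail loudly if it is unset or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"Required secret {name!r} is not set")
    return value


# Demonstration with a stand-in environment; a real app would omit the env argument.
demo_env = {"DATABASE_URL": "postgresql://app@db/analytics"}
database_url = require_secret("DATABASE_URL", env=demo_env)
print("database URL loaded:", bool(database_url))
```

Calling the helper once, inside the function that builds the cached connection, means the app refuses to start serving rather than failing on the first user's query.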
Production Insight
Without --server.address=0.0.0.0, Streamlit binds to localhost inside the container and external connections fail silently.
The HEALTHCHECK endpoint at /_stcore/health is Streamlit's built-in liveness check — wire it to your load balancer's health probe.
Rule: always set 0.0.0.0 in Docker deployments and always configure the health check — these two lines prevent the two most common production deployment failures.
Key Takeaway
Docker is the production-standard deployment for Streamlit — it guarantees environment consistency and makes health checking and secrets management tractable.
Always bind to 0.0.0.0 inside containers — localhost binding is a silent deployment failure that shows nothing in the app logs.
For enterprise deployments, Streamlit sits behind a reverse proxy with an auth layer — it has no built-in authentication and should never be exposed directly to the internet without one.
Deployment Strategy Decision
If: internal tool, fewer than 20 users, no compliance requirements, public GitHub repo
→ Use: Streamlit Community Cloud. Free, zero-config, deploys directly from GitHub on every push.
If: internal tool, 20 to 100 users, custom system dependencies, private repo
→ Use: a Docker container on your own infrastructure, such as AWS ECS Fargate, GCP Cloud Run (always-on), or a single VM behind Nginx.
If: public-facing app, 100+ concurrent users, need auto-scaling
→ Use: Kubernetes with Horizontal Pod Autoscaler. Streamlit does not scale horizontally without sticky sessions at the load balancer layer, so configure session affinity.
If: need authentication, SSO, or enterprise security controls
→ Use: a reverse proxy (Nginx or Caddy) with an authentication layer such as oauth2-proxy or Cloudflare Access. Streamlit has no built-in authentication mechanism.
Your First Streamlit App: The 'Hello, World' That Will Break Production
Installing Streamlit is the easy part; understanding why your first app can exhaust a database connection pool is the lesson. Run pip install streamlit and verify with streamlit --version. The install pulls in Tornado, the async web server Streamlit runs on. Now create a file named app.py. Every time a user interacts with the app (each button click, each slider move), Streamlit executes the entire file from top to bottom. That means every database connection, every API call, and every expensive computation at the top level of the script runs again. Do not put an uncached production database query in the script body: with even a handful of concurrent users it will exhaust your connection pool in minutes. Instead, start simple: import streamlit as st; st.title("DB Killer"). Run it with streamlit run app.py. The app is served at http://localhost:8501. Remember: Streamlit is not Flask. Every rerun is a full execution. Design for that.
app.py (Python)
# io.thecodeforge
import streamlit as st
import psycopg2

# BAD: this block runs on every rerun and opens a new connection each time
conn = psycopg2.connect("dbname=prod")
cursor = conn.cursor()
cursor.execute("SELECT * FROM orders")
data = cursor.fetchall()
st.write(data)

# GOOD: cache the query result so reruns reuse it
@st.cache_data
def fetch_orders():
    conn = psycopg2.connect("dbname=prod")
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT * FROM orders")
        return cursor.fetchall()
    finally:
        conn.close()  # release the connection even if the query fails

st.write(fetch_orders())
Output
App loads fast. DB pool stays alive.
Production Trap:
Never open a database connection in the uncached script body. A single user dragging a slider can open and close several connections per second. Cache query results with @st.cache_data and a TTL, and create the connection itself once through @st.cache_resource or st.connection().
Key Takeaway
Streamlit reruns your whole script on every interaction. Cache everything stateful or watch your connection pool die.
Widgets: The Silent Rerun Trigger Nobody Warned You About
Streamlit widgets look harmless: a slider, a button, a text input. Change one, and the whole script reruns. That is the design, but it is also the reason a dashboard feels sluggish and backing APIs get hammered. Understand the lifecycle: when a widget changes value, Streamlit records the new value, runs your script top to bottom, and redraws only the elements that changed. If a st.slider() is followed by a st.plotly_chart() that recomputes 10 GB of data, you have built a performance trap. The fix: store results in st.session_state, and check it for an existing value before running a heavy operation. Also, never put API calls in the unconditional script body, because it runs on every rerun, not just on the click. For buttons, wrap the action in a conditional: if st.button("Run"): ensures the block executes only on the rerun triggered by that click. Your fellow engineers will thank you.
widget_trap.py (Python)
# io.thecodeforge
import streamlit as st
import time

# BAD: every slider change reruns the script, including the heavy compute below
x = st.slider("X", 0, 100)
st.write(f"You selected {x}")
time.sleep(5)  # Simulates heavy compute
st.write("Finished")

# GOOD: track the slider in session state and update a separate value from a callback
if "x_val" not in st.session_state:
    st.session_state.x_val = 0

def on_change():
    # Runs only when the slider value actually changes, before the script reruns
    st.session_state.x_val = st.session_state.temp

st.slider("X", 0, 100, key="temp", on_change=on_change)
st.write(f"Debounced value: {st.session_state.x_val}")
Output
First approach: lags on every slider move. Second approach: updates only on release.
Senior Engineer Insight:
Use key arguments on widgets to track them in session_state. Combine with on_change callbacks to control when heavy operations fire. This pattern saved us from a 5-second UI freeze on every slider drag.
Key Takeaway
Widgets rerun your entire script. Cache widget changes with session_state and debounce heavy compute.
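The advice to check session state before running a heavy operation could be sketched as follows. The helper works on any mutable mapping, so a plain dict stands in for st.session_state here; the names `get_or_compute` and `slow_square` are hypothetical.

```python
from typing import Callable, MutableMapping


def get_or_compute(
    state: MutableMapping,
    key: str,
    widget_value: int,
    compute: Callable[[int], object],
) -> object:
    """Recompute only when `widget_value` differs from the one stored in `state`."""
    if key in state and state[key]["input"] == widget_value:
        # The rerun was triggered by some other widget: reuse the stored result.
        return state[key]["result"]
    result = compute(widget_value)
    state[key] = {"input": widget_value, "result": result}
    return result


# Demonstration with a plain dict standing in for st.session_state.
calls = []


def slow_square(x: int) -> int:
    calls.append(x)  # record each real computation
    return x * x


fake_state: dict = {}
first = get_or_compute(fake_state, "square", 7, slow_square)   # computes
second = get_or_compute(fake_state, "square", 7, slow_square)  # reuses
third = get_or_compute(fake_state, "square", 8, slow_square)   # recomputes
```

In an app, the call would look like `get_or_compute(st.session_state, "square", x, slow_square)`, so a rerun caused by an unrelated widget skips the heavy step. Unlike @st.cache_data, this stores one result per session rather than sharing it across users.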
Production incident · POST-MORTEM · severity: high
Internal dashboard hammered the production database — uncached query fired on every widget interaction
Symptom
Dashboard loads in 4 seconds. Moving any slider causes another 4-second wait. Database monitoring shows 150+ active connections against a normal baseline of 20. Payment service starts returning 503 errors. The DBA messages the team: 'Who is running SELECT * FROM orders 200 times per minute?'
Assumption
The team assumed Streamlit handled request batching automatically. Nobody on the team had read about the re-run-on-interaction model before deploying. They tested with one developer, one browser tab, and a couple of slider clicks — performance seemed acceptable.
Root cause
The dashboard loaded sales data via a raw SQL query at the top of the script, with no @st.cache_data decorator. Streamlit's execution model means every widget interaction — slider move, checkbox toggle, dropdown selection — triggers a full script re-execution from top to bottom. With 10 concurrent users each making 3 to 5 interactions per minute, the database received over 200 queries per minute against a 50-million-row orders table. The connection pool (max_connections=100) was exhausted in 3 minutes. The query itself made things worse — SELECT * with no pagination, no column projection, no date filter.
Fix
1. Wrapped the data loading function with @st.cache_data(ttl=300). The cache expires every 5 minutes and is shared across sessions, reducing queries from 200 per minute to roughly 1 per 5 minutes per distinct set of query parameters.
2. Replaced SELECT * with a parameterized query filtered by the selected date range. Column projection dropped result size by 80%.
3. Added st.form() around all input widgets so re-runs only happen on explicit submit, not on every keystroke or slider nudge.
4. Moved the dashboard connection to a read replica instead of hammering the primary.
5. Added a Streamlit-specific connection pooler using st.connection with SQLAlchemy, capped at 5 connections.
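The second fix, a parameterized query with column projection and a date filter, might look like the sketch below. The column names and the `created_at` filter column are hypothetical, since the incident report does not list the real schema; the `%s` placeholders follow the psycopg2 convention.

```python
from datetime import date

# Hypothetical column list: project only what the dashboard actually renders.
ORDER_COLUMNS = ("order_id", "created_at", "region", "total")


def build_orders_query(start: date, end: date) -> tuple[str, tuple[date, date]]:
    """Return SQL with placeholders plus the parameters to bind.

    The date range is half-open: start is included, end is excluded.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    sql = (
        f"SELECT {', '.join(ORDER_COLUMNS)} FROM orders "
        "WHERE created_at >= %s AND created_at < %s"
    )
    # Values travel separately from the SQL text, so the driver escapes them.
    return sql, (start, end)


sql, params = build_orders_query(date(2024, 1, 1), date(2024, 2, 1))
```

The pair would then be passed to `cursor.execute(sql, params)` inside the cached loading function, with the two dates as that function's arguments so each date range gets its own cache entry.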
Key lesson
Streamlit re-runs the entire script on every interaction — uncached database queries will destroy your database under any real concurrency.
@st.cache_data is not optional for production apps — it is the single most important performance decision you will make.
Always point analytics dashboards at a read replica — never query the production primary from a UI layer.
st.form() prevents re-runs during data entry — use it for any multi-input workflow where users adjust several controls before committing.
Test with realistic concurrency before you deploy — 10 simultaneous users behave nothing like a single developer clicking through one scenario.
Production debug guide
Common symptoms when a Streamlit app is slow or broken in production.
Symptom · 01
Dashboard takes 5+ seconds to respond to any interaction.
→
Fix
Check whether every data loading function is decorated with @st.cache_data. If any are missing the decorator, wrap them immediately. If the decorator is already present and the app is still slow, check whether the cache key is changing unexpectedly: Streamlit hashes every argument of a cached function, so passing a large DataFrame forces a costly hash on each call, and any argument whose value differs between re-runs (a timestamp, a freshly filtered DataFrame) produces a cache miss.
Symptom · 02
App works fine for one user but crashes or slows dramatically under concurrent access.
→
Fix
Each Streamlit session runs in its own thread. Check for shared mutable state that is not thread-safe — a global dictionary or list that multiple sessions write to will corrupt silently. Use @st.cache_resource for shared objects like database connection pools, which are designed to be shared safely. Review server logs for RuntimeError or threading exceptions.
Symptom · 03
Cached data is stale — users are seeing results that do not reflect recent database changes.
→
Fix
The cache TTL has not expired yet. Either reduce the ttl parameter on @st.cache_data — for example, ttl=60 for data that changes frequently — or call st.cache_data.clear() programmatically. For user-controlled refresh, add a clearly labeled 'Refresh Data' button that calls the clear function and immediately triggers a re-run.
Symptom · 04
App shows a white screen or 'Connection lost' error in the browser.
→
Fix
The Streamlit server process crashed or a Python exception bubbled up and was not caught. Check the terminal or server logs for the full traceback. The most common cause is an unhandled exception in the script body that only surfaces for certain widget value combinations — for example, a division by zero when a slider is at its minimum. Add try/except blocks around critical sections and use st.error() to surface failures gracefully rather than crashing the session.
Symptom · 05
App is slow and you need to identify which specific function is the bottleneck.
→
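Fix
Add timing instrumentation around the suspected functions using time.perf_counter(), or profile the script to get a per-function breakdown. Cache or rewrite the slowest function before touching anything else.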
Symptom · 06
Cache seems to not be working — data reloads on every interaction even though the decorator is present.
→
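Fix
Verify the function is actually decorated and that every argument passed to it is hashable and stable between re-runs. An argument that changes on each call, such as a timestamp or a freshly built object, produces a new cache key every time.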
Symptom · 07
Docker container starts successfully but the browser shows connection refused.
→
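Fix
Verify the server is listening on 0.0.0.0, not 127.0.0.1. Ensure the ENTRYPOINT includes --server.address=0.0.0.0 so the app is reachable from outside the container.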
Symptom · 08
st.session_state values reset unexpectedly between interactions.
→
Fix
Session state is keyed by widget identity. If you dynamically generate widget keys that change between re-runs — for example, keys derived from loop indices or timestamps — state for the old keys is orphaned and new keys start empty. Use stable, descriptive string keys: st.text_input('Name', key='user_name'). Also verify that every initialization follows the correct guard pattern: if 'key' not in st.session_state: — without the guard, the initialization line fires on every re-run and overwrites whatever the user set.
Streamlit Debug Cheat Sheet
Quick commands and checks when a Streamlit app is misbehaving in development or production.
App is slow: need to identify which function is consuming the most execution time.
Immediate action
Add timing instrumentation around the suspected functions using time.perf_counter(), or install streamlit-profiler for per-function execution time breakdown.
Commands
python -m cProfile -s cumtime -m streamlit run app.py 2>&1 | head -30
pip install streamlit-profiler, then wrap the script body in a with Profiler(): block (from streamlit_profiler import Profiler)
Fix now
Profile output shows cumulative time per function — the top entry is your bottleneck. Cache it with @st.cache_data or rewrite the query before touching anything else.
Cache seems to not be working: data reloads on every interaction.
Immediate action
Verify the function is decorated and all arguments passed to it are hashable.
Commands
Add a print() or logging call inside the cached function. The function body runs only on a cache miss, so a log line on every interaction means the cache key is changing.
Check whether any argument to the cached function changes between re-runs: a timestamp, a freshly generated ID, or a DataFrame whose contents differ. Each distinct value creates a new cache entry.
Fix now
Pass stable, minimal arguments (for example, a date range instead of a whole DataFrame), or prefix an argument name with an underscore to exclude it from hashing.
Docker container runs but browser shows connection refused.
Immediate action
Verify the server is listening on 0.0.0.0, not 127.0.0.1.
Fix now
Ensure ENTRYPOINT includes --server.address=0.0.0.0. Without it, Streamlit binds to localhost only and is unreachable from outside the container.
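The timing instrumentation suggested in the first cheat-sheet entry could be sketched with time.perf_counter() and a small context manager. The labels and the simulated workloads are hypothetical stand-ins for real loading and rendering code.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

# Accumulated wall-clock seconds per labelled section.
timings: Dict[str, float] = {}


@contextmanager
def timed(label: str) -> Iterator[None]:
    """Add the elapsed time of the wrapped block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + (time.perf_counter() - start)


with timed("load_data"):
    time.sleep(0.05)  # stand-in for a slow query

with timed("render_chart"):
    sum(range(1000))  # stand-in for cheap rendering work

# The label with the largest accumulated time is the first candidate for caching.
slowest = max(timings, key=timings.get)
```

Writing `timings` to the server log at the end of the script gives a per-section breakdown for every re-run without installing anything.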
Streamlit vs Dash vs Gradio
| Feature / Aspect | Streamlit | Dash (Plotly) | Gradio |
| --- | --- | --- | --- |
| Learning curve | Minimal: pure Python script style, no frontend knowledge required | Moderate: callback-based reactive model requires understanding Input/Output wiring | Minimal, but strongly opinionated toward ML inference interfaces |
| Re-run model | Full script re-runs on every interaction; simple but requires disciplined caching | Targeted callbacks: only the components affected by an Input update | Event-driven per component: each function maps to specific UI elements |
| Best for | Data dashboards, internal tools, rapid prototyping, ML result visualization | Complex production-grade analytics apps where fine-grained update control matters | ML model demos, inference UIs, sharing models with non-technical stakeholders |
| Layout control | Good: columns, tabs, expanders, sidebar. Limited CSS customization without components | Excellent: full CSS and HTML control, Bootstrap integration, arbitrary component placement | Limited: opinionated grid layout, not suitable for complex multi-section dashboards |
| State management | st.session_state dictionary: simple but ephemeral, lost on refresh | Explicit callback Output/Input wiring: more verbose but gives precise control over what updates | Implicit per-function state: simple for single-function interfaces, awkward for multi-step flows |
| Concurrency model | Each session runs in its own thread; shared objects need @st.cache_resource for safety | Async callbacks supported; better suited for high-concurrency production workloads | Single-user focus by default; sharing a model demo link spins up separate instances |
| Production deployment | Docker, Streamlit Community Cloud, Kubernetes with sticky sessions | Docker, Gunicorn/uWSGI, standard WSGI deployment, same as any Flask app | Hugging Face Spaces (native), Docker, or standalone server |
| Custom JavaScript | Supported via st.components.v1, but requires wrapping components manually | Native: arbitrary Dash components can include React and JavaScript | Not supported: Gradio controls the entire frontend |
Key takeaways
1. Streamlit's re-run-on-interaction model is its superpower and its main operational risk: mastering caching is non-negotiable before any production deployment.
2. @st.cache_data is for serializable values like DataFrames and API responses; @st.cache_resource is for shared non-serializable objects like database connection pools and ML models.
3. st.session_state initialization must always be guarded with a 'not in' check: without it, every re-run resets the state and users lose their progress.
4. Use st.form() to batch widget interactions into a single re-run on submit: it is the single most effective way to prevent re-run storms during multi-input data entry.
5. Never commit .streamlit/secrets.toml to version control: manage credentials via your platform's native secrets manager and rotate anything that was ever exposed.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 05 · SENIOR
Describe three concrete strategies for optimizing a slow Streamlit app that has 50 concurrent users and re-runs taking 8+ seconds.
ANSWER
First and most impactful: wrap every data loading function with @st.cache_data. Without caching, every re-run by every user re-executes the full data pipeline. With caching, 50 users sharing the same parameters hit the cache instead of the database, which alone can cut database load by up to 98% (one query instead of fifty). Second: create shared, expensive objects such as database connections and ML models once via @st.cache_resource instead of rebuilding them on every re-run. This is non-negotiable. Third: use st.form() for multi-input workflows so a re-run fires once on submit instead of once per keystroke. Additionally, minimize work in the script body: move expensive setup into cached functions and keep the top-level script as lightweight as possible. Use st.empty() and st.container() for partial UI updates where appropriate, and point the dashboard at a read replica rather than the production primary database.
Q02 of 05 · SENIOR
Explain the difference between @st.cache_data and @st.cache_resource. What happens if you try to cache an open file handle or database connection with @st.cache_data?
ANSWER
@st.cache_data serializes the return value using pickle and stores it per cache key. It is designed for data: DataFrames, lists, dicts, primitives. If you try to cache an open file handle or database connection with @st.cache_data, it will raise a serialization error at runtime because file handles and connection objects cannot be pickled. @st.cache_resource does not serialize: it stores the original object reference directly in memory. It is designed for infrastructure objects: database connection pools, ML models loaded into GPU memory, open file handles. The critical behavioral difference: @st.cache_data returns a fresh copy of the cached value on every call, so one session mutating its result cannot affect another, which makes it safe for multi-user apps. @st.cache_resource shares the same object instance across all sessions, which is what you want for a single connection pool, but it also means the object must be thread-safe.
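The underlying limitation can be reproduced with the standard library alone. This sketch does not call Streamlit; it only illustrates why pickle-based caching rejects live resources, and the helper name `is_picklable` is hypothetical.

```python
import os
import pickle
import sqlite3


def is_picklable(obj: object) -> bool:
    """Return True if pickle can serialize `obj`, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


rows = [{"order_id": 1, "total": 19.99}]  # plain data
connection = sqlite3.connect(":memory:")  # live database connection
handle = open(os.devnull, "rb")           # open file handle

results = {
    "rows": is_picklable(rows),
    "connection": is_picklable(connection),
    "file handle": is_picklable(handle),
}

handle.close()
connection.close()
```

Plain data serializes; the connection and the file handle do not, which is why they belong under @st.cache_resource.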
Q03 of 05 · JUNIOR
How does st.session_state facilitate the creation of multi-step wizards or complex data entry forms?
ANSWER
st.session_state is a dictionary that persists across script re-runs within a single browser session. For a multi-step wizard, you store the current step index and all accumulated user inputs in session state. Each re-run reads the current step from state, renders the appropriate UI for that step, and updates state when the user advances or goes back. The critical implementation detail: always initialize session state keys with a 'if key not in st.session_state' guard. Without this guard, the initialization line executes on every re-run and resets whatever the user entered on previous steps — this is the most common bug in multi-step Streamlit flows. st.form() pairs naturally with this pattern by ensuring a re-run fires once on submit rather than once per keystroke, so partial form data does not trigger premature state updates.
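The wizard logic described in this answer can be sketched as three small functions over a mutable mapping. A plain dict stands in for st.session_state; the step names and function names are hypothetical.

```python
from typing import MutableMapping

STEPS = ("Account", "Preferences", "Review")  # hypothetical step names


def init_wizard(state: MutableMapping) -> None:
    """Guarded initialization: only set keys that do not exist yet."""
    if "step" not in state:
        state["step"] = 0
    if "answers" not in state:
        state["answers"] = {}


def go_next(state: MutableMapping, **answers: str) -> None:
    """Store the inputs of the current step and advance, stopping at the last step."""
    state["answers"].update(answers)
    state["step"] = min(state["step"] + 1, len(STEPS) - 1)


def go_back(state: MutableMapping) -> None:
    """Return to the previous step without discarding stored answers."""
    state["step"] = max(state["step"] - 1, 0)


# Simulated session: each init_wizard call represents the top of a re-run.
session: dict = {}
init_wizard(session)
go_next(session, email="a@example.com")
init_wizard(session)  # a re-run must not reset progress
go_next(session, theme="dark")
go_back(session)
```

In an app, `init_wizard(st.session_state)` would run at the top of the script, and the navigation functions would be called from button handlers.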
Q04 of 05 · SENIOR
What are the security implications of using st.file_uploader in a public-facing dashboard, and how do you mitigate them?
ANSWER
st.file_uploader allows users to upload arbitrary files to your server. The risks are: (1) uploading malicious executables or scripts disguised as data files, where an attacker uploads a .csv that is actually a Python script and hopes the app executes it; (2) denial-of-service via extremely large file uploads that exhaust disk space or server memory; (3) path traversal attacks if the file is saved to a predictable or user-controlled location on disk. Mitigations: validate file extensions and MIME types immediately after upload using the python-magic library, not just the filename extension, which is trivially spoofed. Set a hard file size limit using the server.maxUploadSize config option. Process uploaded files entirely in memory and never write them to a predictable path on disk. If disk persistence is required, write to a sandboxed temporary directory with a randomized name. For public-facing apps, add rate limiting at the reverse proxy layer to prevent upload flooding.
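A simplified validation pass might look like the sketch below. Note the swap: instead of python-magic, it uses a standard-library content check (no NUL bytes, valid UTF-8), which is weaker than real MIME sniffing. The 5 MB cap and all names are assumptions for illustration.

```python
from pathlib import PurePath

ALLOWED_SUFFIXES = {".csv"}
MAX_BYTES = 5 * 1024 * 1024  # hypothetical application-level cap of 5 MB


def validate_csv_upload(filename: str, payload: bytes) -> list[str]:
    """Return a list of problems; an empty list means the upload passed."""
    problems: list[str] = []
    if PurePath(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("unexpected file extension")
    if len(payload) > MAX_BYTES:
        problems.append("file exceeds size limit")
    # Content check, independent of the (easily spoofed) file name.
    if b"\x00" in payload:
        problems.append("binary content detected")
    else:
        try:
            payload.decode("utf-8")
        except UnicodeDecodeError:
            problems.append("not valid UTF-8 text")
    return problems
```

In an app, the object returned by st.file_uploader exposes `name` and `getvalue()`, so the call would be `validate_csv_upload(uploaded.name, uploaded.getvalue())`, with the file kept in memory throughout.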
Q05 of 05 · SENIOR
Can Streamlit run on AWS Lambda or GCP Cloud Run in request-based serverless mode? Explain the architectural constraints.
ANSWER
Streamlit cannot run on AWS Lambda at all. Lambda is a request-response serverless model with a hard execution timeout of 15 minutes and no support for long-lived WebSocket connections. Streamlit's frontend communicates with the server via a persistent WebSocket — this is how widget interactions trigger re-runs without full HTTP round-trips. Lambda terminates connections between requests, which fundamentally breaks the Streamlit communication model. GCP Cloud Run can work, but only in always-allocated CPU mode — not in the default request-based mode. In request-based mode, Cloud Run pauses container CPU between HTTP requests, which drops the WebSocket connection and causes the browser to show 'Connection lost'. With always-allocated CPU and a minimum instance count of 1, Cloud Run keeps the container alive and WebSocket connections stay open. For production Streamlit deployments, the correct targets are ECS Fargate, GKE or Kubernetes, a plain VM behind Nginx, or Streamlit Community Cloud — all of which support long-lived TCP connections without the constraints of pure serverless.
FAQ · 4 QUESTIONS
Frequently Asked Questions
01
Is Streamlit good for production apps or just prototyping?
Streamlit is production-ready for internal tools, data dashboards, and apps with moderate traffic. Teams run it at enterprise scale behind Docker and Kubernetes with proper caching in place. For very high-traffic public apps — thousands of concurrent users — or apps that need fine-grained component-level updates without full script re-runs, Dash or a React plus FastAPI stack may be more appropriate. The limiting factor is not Streamlit's code quality; it is the full-script re-run model, which does not fit every use case.
02
How do I add authentication to a Streamlit app?
For simple internal tools, the streamlit-authenticator library provides username/password flows with hashed credentials. For apps deployed on Streamlit Community Cloud, you can restrict access to specific GitHub accounts or use OAuth2 with Google or GitHub. For enterprise SSO — SAML, OIDC, Active Directory — the standard approach is to deploy Streamlit behind a reverse proxy like Nginx or Caddy with an authentication layer such as oauth2-proxy or Cloudflare Access. Streamlit itself has no built-in authentication mechanism and should not be exposed to the internet without one.
03
Why does my Streamlit app lose all its data when I refresh the page?
Refreshing the browser starts a new session, which clears st.session_state entirely. Session state lives in server memory and is scoped to a single browser session — it was never designed to survive a refresh. For data that must persist across page refreshes, browser sessions, or server restarts, write it to an external store: PostgreSQL for structured data, Redis for short-lived key-value state, or S3 for large result files. Load it back at the start of each session.
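A minimal sketch of writing state to an external store and loading it back at session start. SQLite from the standard library stands in for PostgreSQL or Redis so the example runs anywhere; the table name, the `user_id` key, and the function names are hypothetical.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saved_state ("
        "user_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def save_state(conn: sqlite3.Connection, user_id: str, state: dict) -> None:
    """Persist a JSON-serializable snapshot, replacing any earlier one."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO saved_state (user_id, payload) VALUES (?, ?)",
            (user_id, json.dumps(state)),
        )


def load_state(conn: sqlite3.Connection, user_id: str) -> dict:
    """Return the saved snapshot, or an empty dict for a new user."""
    row = conn.execute(
        "SELECT payload FROM saved_state WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


store = open_store()
save_state(store, "user-42", {"region": "EMEA", "min_total": 50})
restored = load_state(store, "user-42")
```

At the start of a session, the restored dict would be copied into st.session_state behind the usual 'not in' guard. With a file path instead of `:memory:`, the data also survives server restarts.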
04
How can I make my Streamlit app look more professional?
Start with st.set_page_config(layout='wide') to use the full browser width instead of Streamlit's default narrow column. Define a custom theme in .streamlit/config.toml — primary color, background color, and font. Organize content with st.tabs() to reduce vertical scrolling, st.columns() for side-by-side layouts, and st.expander() to hide secondary information behind a toggle. Use st.metric() for KPI cards instead of plain st.write() for numbers. For icons and logos, st.image() accepts URLs and local file paths. If you need more visual control than Streamlit's built-in components allow, st.components.v1.html() lets you inject raw HTML and CSS.