
Streamlit for Data Apps: Build Interactive Dashboards in Pure Python

Streamlit lets you turn Python scripts into live data apps in minutes.
⚙️ Intermediate — basic Python knowledge assumed
In this tutorial, you'll learn
  • Streamlit's re-run-on-interaction model is its superpower and its main operational risk — mastering caching is non-negotiable before any production deployment.
  • @st.cache_data is for serializable values like DataFrames and API responses; @st.cache_resource is for shared non-serializable objects like database connection pools and ML models.
  • st.session_state initialization must always be guarded with a 'not in' check — without it, every re-run resets the state and users lose their progress.
Quick Answer
  • Streamlit re-runs your entire Python script on every widget interaction — no callbacks, no event loop
  • @st.cache_data caches serializable returns (DataFrames, API responses) — @st.cache_resource caches non-serializable objects (DB connections, ML models)
  • st.session_state persists data across re-runs within a single browser session — but is lost on page refresh
  • Use st.form() to batch widget inputs and prevent re-runs on every keystroke during data entry
  • The #1 production mistake: loading data outside a cache decorator — every slider move re-queries the database
  • Biggest misconception: Streamlit is only for prototypes. With proper caching and Docker deployment, it handles internal tools at enterprise scale
🚨 START HERE
Streamlit Debug Cheat Sheet
Quick commands and checks when a Streamlit app is misbehaving in development or production.
🟠 App is slow — need to identify which function is consuming the most execution time.
Immediate Action: Add timing instrumentation around the suspected functions using time.perf_counter(), or install streamlit-profiler for a per-function execution-time breakdown.
Commands
python -m cProfile -s cumtime -m streamlit run app.py 2>&1 | head -30
pip install streamlit-profiler  # then wrap the app body in "with Profiler():" (from streamlit_profiler import Profiler) — it has no standalone CLI
Fix Now: Profile output shows cumulative time per function — the top entry is your bottleneck. Cache it with @st.cache_data or rewrite the query before touching anything else.
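The perf_counter approach can be as simple as a context manager. A minimal sketch — the `timed` helper and label names are illustrative, not part of Streamlit:

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(label: str):
    """Record wall-clock time for the enclosed block into the timings dict."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with timed("load_sales_data"):
    time.sleep(0.05)  # stand-in for the suspect query or transform

# In the app, surface the numbers where you can see them:
# st.sidebar.write(timings)
```

Sprinkle `timed(...)` around each suspect function, then sort the dict — the largest entry is your first caching candidate.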
🟡 Cache doesn't seem to be working — data reloads on every interaction.
Immediate Action: Verify the function is actually decorated and that its arguments hash to the same value across re-runs.
Commands
Add a st.write() or print() call inside the cached function body — if it fires on every interaction, the cache is never being hit.
Check whether any argument's content changes between re-runs — a DataFrame or dict rebuilt on every run hashes to a new key and silently busts the cache on every call.
Fix Now: Pass small, stable parameters — filter values, column names, date strings — instead of large objects, or prefix a parameter with an underscore (e.g. _conn) to exclude it from the cache key.
🟡 Docker container runs but the browser shows connection refused.
Immediate Action: Verify the server is listening on 0.0.0.0, not 127.0.0.1.
Commands
docker exec <container> curl -s http://localhost:8501/_stcore/health
docker logs <container> --tail 50
Fix Now: Ensure ENTRYPOINT includes --server.address=0.0.0.0. Without it, Streamlit binds to localhost only and is unreachable from outside the container.
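A minimal Dockerfile that bakes the correct bind address into the entrypoint — base image, port, and file names are illustrative:

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8501
ENTRYPOINT ["streamlit", "run", "app.py", \
            "--server.address=0.0.0.0", "--server.port=8501"]
```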
Production Incident: Internal dashboard hammered the production database — uncached query fired on every widget interaction
A data team deployed a Streamlit sales dashboard that queried a production PostgreSQL database directly. Every slider adjustment re-ran the script, re-executing a 4-second query. Under 10 concurrent users, the database connection pool was exhausted in under 3 minutes, causing cascading failures across the payment service.
Symptom: Dashboard loads in 4 seconds. Moving any slider causes another 4-second wait. Database monitoring shows 150+ active connections against a normal baseline of 20. The payment service starts returning 503 errors. The DBA messages the team: 'Who is running SELECT * FROM orders 200 times per minute?'
Assumption: The team assumed Streamlit handled request batching automatically. Nobody on the team had read about the re-run-on-interaction model before deploying. They tested with one developer, one browser tab, and a couple of slider clicks — performance seemed acceptable.
Root cause: The dashboard loaded sales data via a raw SQL query at the top of the script, with no @st.cache_data decorator. Streamlit's execution model means every widget interaction — slider move, checkbox toggle, dropdown selection — triggers a full script re-execution from top to bottom. With 10 concurrent users each making 3 to 5 interactions per minute, the database received over 200 queries per minute against a 50-million-row orders table. The connection pool (max_connections=100) was exhausted in 3 minutes. The query itself made things worse — SELECT * with no pagination, no column projection, no date filter.
Fix:
1. Wrapped the data-loading function with @st.cache_data(ttl=300) — the cache expires every 5 minutes, reducing queries from 200 per minute to roughly 1 per 5 minutes per user.
2. Replaced SELECT * with a parameterized query filtered by the selected date range — column projection dropped result size by 80%.
3. Added st.form() around all input widgets so re-runs only happen on explicit submit, not on every keystroke or slider nudge.
4. Moved the dashboard connection to a read replica instead of hammering the primary.
5. Added a Streamlit-specific connection pooler using st.connection with SQLAlchemy, capped at 5 connections.
Key Lesson
  • Streamlit re-runs the entire script on every interaction — uncached database queries will destroy your database under any real concurrency.
  • @st.cache_data is not optional for production apps — it is the single most important performance decision you will make.
  • Always point analytics dashboards at a read replica — never query the production primary from a UI layer.
  • st.form() prevents re-runs during data entry — use it for any multi-input workflow where users adjust several controls before committing.
  • Test with realistic concurrency before you deploy — 10 simultaneous users behave nothing like a single developer clicking through one scenario.
Production Debug Guide
Common symptoms when a Streamlit app is slow or broken in production.
Dashboard takes 5+ seconds to respond to any interaction.
Check whether every data-loading function is decorated with @st.cache_data. If any are missing the decorator, wrap them immediately. If the decorator is present and the app is still slow, check whether the cache is being invalidated unexpectedly — an argument whose content changes between re-runs, such as a DataFrame rebuilt on every run, produces a new hash and busts the cache on every call.
App works fine for one user but crashes or slows dramatically under concurrent access.
Each Streamlit session runs in its own thread. Check for shared mutable state that is not thread-safe — a global dictionary or list that multiple sessions write to will corrupt silently. Use @st.cache_resource for shared objects like database connection pools, which are designed to be shared safely. Review server logs for RuntimeError or threading exceptions.
Cached data is stale — users are seeing results that do not reflect recent database changes.
The cache TTL has not expired yet. Either reduce the ttl parameter on @st.cache_data — for example, ttl=60 for data that changes frequently — or call st.cache_data.clear() programmatically. For user-controlled refresh, add a clearly labeled 'Refresh Data' button that calls the clear function and immediately triggers a re-run.
App shows a white screen or 'Connection lost' error in the browser.
The Streamlit server process crashed or a Python exception bubbled up and was not caught. Check the terminal or server logs for the full traceback. The most common cause is an unhandled exception in the script body that only surfaces for certain widget value combinations — for example, a division by zero when a slider is at its minimum. Add try/except blocks around critical sections and use st.error() to surface failures gracefully rather than crashing the session.
st.session_state values reset unexpectedly between interactions.
Session state is keyed by widget identity. If you dynamically generate widget keys that change between re-runs — for example, keys derived from loop indices or timestamps — state for the old keys is orphaned and new keys start empty. Use stable, descriptive string keys: st.text_input('Name', key='user_name'). Also verify that every initialization follows the guard pattern if 'key' not in st.session_state: — without the guard, the initialization line fires on every re-run and overwrites whatever the user set.

Most data insights die in Jupyter notebooks. A data scientist builds a forecasting model, but only someone who can run Python can actually see it. Streamlit fixes this — it turns any Python script into a live web app with zero frontend code.

The core trade-off: Streamlit re-runs your entire script on every interaction. This makes the programming model dead simple — your code stays linear, no callback wiring. But it also means un-cached operations like database queries, model loading, and file reads fire on every slider move. Without disciplined caching, your app grinds to a halt after the second click.

Productionizing Streamlit requires three things: caching decorators on every expensive operation, st.session_state for cross-interaction state, and a deployment strategy — Docker, Streamlit Community Cloud, or Kubernetes. Miss any of these and you have a prototype that breaks under real usage.

How Streamlit's Execution Model Actually Works (This Changes Everything)

Before you write a single widget, you need to understand Streamlit's most important — and most surprising — design decision: every time a user interacts with your app, Streamlit re-runs your entire Python script from top to bottom. Every. Single. Time.

This is completely different from how most web frameworks operate. There is no event loop, no callbacks, no onclick handler wiring. When a user moves a slider, Streamlit re-executes your script with the new slider value baked in as the widget's return value. It sounds expensive, and it can be — but it is also what makes Streamlit so easy to reason about.

The upside: your app logic stays linear and readable, exactly like a regular Python script. The downside: if you are loading a 2GB CSV or running a complex SQL query on every re-run, your app will be unusably slow within seconds. That is why caching is not optional — it is the single most critical design decision in any Streamlit app.

execution_model_demo.py · PYTHON
import streamlit as st
import datetime

# io.thecodeforge: Tracking the execution lifecycle
# This line runs EVERY time the user interacts with anything in the app.
st.write(f"Script last ran at: {datetime.datetime.now().strftime('%H:%M:%S')}")

st.title("Understanding Streamlit's Re-Run Model")

# When the user moves this slider, the ENTIRE script above AND below re-executes.
temperature_celsius = st.slider(
    label="Set temperature (°C)",
    min_value=-20,
    max_value=50,
    value=22
)

temperature_fahrenheit = (temperature_celsius * 9 / 5) + 32

st.metric(
    label="Temperature in Fahrenheit",
    value=f"{temperature_fahrenheit:.1f} °F",
    delta=f"{temperature_celsius} °C input"
)

if temperature_celsius > 35:
    st.warning("That's dangerously hot. Stay hydrated and limit outdoor exposure.")
elif temperature_celsius < 0:
    st.info("Below freezing — roads may be icy. Check local transport advisories.")
else:
    st.success("Comfortable temperature range.")
▶ Output
Script last ran at: 14:32:07
[Title: Understanding Streamlit's Re-Run Model]
[Slider rendered at 22°C by default]
Temperature in Fahrenheit: 71.6 °F (+22 °C input)
[Green success box]: Comfortable temperature range.
Mental Model
Streamlit as a Replay Engine
Streamlit is not an event-driven framework — it is a replay engine that re-executes your script with updated widget values on every interaction.
  • No callbacks, no event loop — your code runs top-to-bottom on every interaction
  • Widget return values change between re-runs, but the code structure stays identical
  • This makes the mental model dead simple: write a script, add widgets, done
  • The cost: every uncached operation re-executes — this is why caching is mandatory, not optional
  • Think of it as replaying your script with new inputs each time, not patching specific components
📊 Production Insight
Every widget interaction triggers a full script re-run.
Uncached operations — DB queries, file reads, model inference — fire on every slider move.
Rule: if an operation takes more than 100ms, it must live inside a cache decorator.
🎯 Key Takeaway
Streamlit re-runs your entire script on every interaction — this is its defining design decision.
The simplicity comes at a real cost: every uncached operation re-executes on every slider move.
Caching is not a performance optimization in Streamlit — it is a correctness requirement.
When Does the Re-Run Model Break Down?
If: All expensive operations are cached and widgets are lightweight
Use: The re-run model works great — sub-100ms response times, dead-simple code to maintain
If: Loading large files or running DB queries without caching
Use: The app becomes unusable after 2 to 3 interactions — wrap everything expensive in @st.cache_data immediately
If: Need to persist state across re-runs — multi-step wizard, login flow, accumulated user inputs
Use: st.session_state — it is the only mechanism that survives a re-run within the same browser session
If: Need push-based updates — WebSocket data streams, Kafka topics, live sensor feeds
Use: Streamlit is not the right tool for this pattern — use Dash, Panel, or a custom FastAPI plus React frontend

Caching and State: Making Your App Fast and Stateful

Streamlit gives you two caching decorators and they solve different problems. @st.cache_data is for functions that return data — CSVs, API responses, processed DataFrames. It serializes the return value using pickle, which means every user session gets its own copy. @st.cache_resource is for non-serializable objects like database connections or ML models — it stores the object reference directly and shares it across all sessions.

Then there is st.session_state — a dictionary that persists for the lifetime of a user's session. It disappears on page refresh, but it survives every re-run within that session. It is how you build multi-step forms, track login status, or accumulate user inputs without losing them between interactions.

caching_and_state_demo.py · PYTHON
import streamlit as st
import pandas as pd
import time

# ── CACHING DATA (io.thecodeforge standard) ───────────────────────────────────
@st.cache_data(ttl=3600)  # Cache expires after 1 hour — reduces DB load significantly
def load_sales_data(num_records: int) -> pd.DataFrame:
    """
    Simulates a slow data fetch. With @st.cache_data, this only
    runs once per hour regardless of how many times the user
    interacts with widgets.
    """
    time.sleep(2)  # Simulated network/DB latency
    return pd.DataFrame({
        "record_id": range(num_records),
        "revenue": [i * 1.5 for i in range(num_records)]
    })

# ── CACHING RESOURCES ─────────────────────────────────────────────────────────
@st.cache_resource
def init_connection():
    """
    @st.cache_resource is for non-serializable infrastructure.
    This object is created once and shared across all user sessions.
    Use it for DB pools and ML models — never for DataFrames.
    """
    return {"status": "connected", "provider": "ForgeCloud"}

# ── SESSION STATE ─────────────────────────────────────────────────────────────
# Always guard initialization — without this check, state resets on every re-run.
if "analysis_count" not in st.session_state:
    st.session_state.analysis_count = 0

st.title("Forge Analytics Dashboard")

col1, col2 = st.columns([3, 1])
with col1:
    record_count = st.slider("Number of records to load", 100, 10000, 1000)
with col2:
    st.metric("Analyses Run", st.session_state.analysis_count)

if st.button("Run Analysis"):
    st.session_state.analysis_count += 1
    df = load_sales_data(record_count)  # Cached — will not re-query on every click
    st.dataframe(df.head(10))
    st.write(f"Analysis #{st.session_state.analysis_count} complete. Loaded {len(df)} records.")

conn = init_connection()  # Shared resource — created once, reused across sessions
st.caption(f"Connection status: {conn['status']} via {conn['provider']}")
▶ Output
Forge Analytics Dashboard
[Slider: Number of records to load — set at 1000]
[Metric: Analyses Run — 0]
[Button: Run Analysis]

After clicking Run Analysis:
Analysis #1 complete. Loaded 1000 records.
[DataFrame: first 10 rows displayed]
Connection status: connected via ForgeCloud
⚠ Watch Out: The Initialization Trap
Never initialize session_state keys unconditionally at the top of your script like st.session_state.count = 0. That line runs on every re-run and silently resets whatever the user has accumulated. Always wrap initialization in: if 'key' not in st.session_state: — that is the correct, safe pattern.
📊 Production Insight
@st.cache_data serializes returns — it will raise a serialization error on open file handles or live DB connections.
@st.cache_resource does NOT serialize — it stores the object reference directly and shares it across all sessions.
Rule: use @st.cache_data for data, @st.cache_resource for infrastructure objects. Mixing them up causes either silent bugs or outright exceptions.
🎯 Key Takeaway
@st.cache_data is for serializable data — DataFrames, API responses, computed aggregates.
@st.cache_resource is for non-serializable infrastructure — DB pools, ML models, file handles.
Choosing the wrong decorator causes either pickle serialization errors or silent cache misses that are extremely difficult to debug.
Choosing the Right Cache Decorator
If: Function returns a DataFrame, list, dict, or any primitive value
Use: @st.cache_data — it serializes the return and gives each session its own safe copy
If: Function returns a DB connection pool, ML model loaded into memory, or file handle
Use: @st.cache_resource — it stores the reference without serialization and shares it across all sessions
If: Cached data becomes stale after a database update or pipeline run
Use: Add ttl=300 to @st.cache_data for automatic expiry, or call st.cache_data.clear() from a refresh button for manual invalidation
If: Function takes a DataFrame as an argument
Use: A large or frequently rebuilt DataFrame re-keys the cache whenever its content changes, and hashing it is expensive — pass immutable parameters like column names, date strings, or filter values instead

Building a Real Multi-Page Data App with Layout and Forms

Real data apps require navigation, structured layouts, and forms that do not re-run the entire script on every character the user types. Streamlit handles multi-page navigation through a pages/ directory — any Python file placed there is automatically discovered and shown in the sidebar. Layout primitives like st.columns() and st.tabs() handle visual organization within a page.
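A typical multi-page layout, plus a rough mimic of how sidebar labels are derived from filenames (drop the numeric prefix and extension, turn underscores into spaces) — the helper below is an illustration of the documented convention, not Streamlit's own code:

```python
from pathlib import Path

# Typical layout (launch with: streamlit run app.py):
#   app.py              <- main page
#   pages/
#     1_Overview.py     <- sidebar order comes from the numeric prefix
#     2_Sales_Report.py <- shown in the sidebar as "Sales Report"

def page_label(filename: str) -> str:
    """Approximate Streamlit's filename-to-label rule for pages/ files."""
    stem = Path(filename).stem
    prefix, _, rest = stem.partition("_")
    if prefix.isdigit() and rest:
        stem = rest
    return stem.replace("_", " ")

# page_label("2_Sales_Report.py") -> "Sales Report"
```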

Forms are particularly important for production apps. Without st.form(), every keystroke in a text input triggers a full script re-run — which means every keystroke fires your cached data loading function's cache key check, re-renders the entire chart, and redraws the page. With st.form(), all widget changes inside the form are buffered locally and a single re-run fires only when the user explicitly clicks the submit button.

professional_dashboard.py · PYTHON
import streamlit as st
import pandas as pd
import numpy as np

# io.thecodeforge: Professional dashboard layout
# st.set_page_config MUST be the first Streamlit call — anything before it raises StreamlitAPIException.
st.set_page_config(
    page_title="Forge Intelligence Hub",
    page_icon="🔬",
    layout="wide",
    initial_sidebar_state="expanded"
)

# ── SIDEBAR CONTROLS ──────────────────────────────────────────────────────────
with st.sidebar:
    st.header("Control Panel")
    st.divider()
    mode = st.radio(
        "Analysis Mode",
        ["Standard", "Advanced"],
        help="Advanced mode enables cohort segmentation and confidence intervals."
    )
    date_range = st.date_input("Reporting Period", [])

# ── KPI HEADER ROW ────────────────────────────────────────────────────────────
st.title("Forge Intelligence Hub")
kpi1, kpi2, kpi3, kpi4 = st.columns(4)
kpi1.metric("Revenue", "$1.24M", delta="+12.3%")
kpi2.metric("Active Users", "8,412", delta="+340")
kpi3.metric("Churn Rate", "2.4%", delta="-0.5%", delta_color="inverse")
kpi4.metric("Model Accuracy", "94.1%", delta="+1.2%")

st.divider()

# ── TABBED CONTENT ────────────────────────────────────────────────────────────
overview_tab, forecast_tab, settings_tab = st.tabs(["Overview", "Forecast", "Settings"])

with overview_tab:
    chart_data = pd.DataFrame(
        np.random.randn(30, 3),
        columns=["Revenue", "Cost", "Profit"]
    )
    st.line_chart(chart_data)

with forecast_tab:
    # st.form() buffers ALL widget interactions — re-run only fires on submit.
    # Without this, every slider nudge or text keystroke triggers a full re-run.
    with st.form("forecast_parameters"):
        st.subheader("Configure Forecast")
        col_a, col_b = st.columns(2)
        with col_a:
            horizon = st.slider("Forecast horizon (days)", 7, 90, 30)
            confidence = st.selectbox("Confidence interval", ["80%", "90%", "95%"])
        with col_b:
            target_metric = st.text_input("Target metric", placeholder="e.g. daily_revenue")
            include_weekends = st.checkbox("Include weekends", value=True)

        submitted = st.form_submit_button("Run Forecast", type="primary")

    if submitted:
        if not target_metric:
            st.error("Target metric is required before running the forecast.")
        else:
            with st.spinner(f"Running {horizon}-day forecast for '{target_metric}'..."):
                # In production this would call your ML backend API
                st.success(f"Forecast complete: {horizon} days, {confidence} CI, {'weekends included' if include_weekends else 'weekdays only'}.")

with settings_tab:
    st.info("Settings are persisted per session. Changes here reset on page refresh.")
    theme = st.selectbox("Dashboard theme", ["Light", "Dark", "System"])
    refresh_interval = st.number_input("Auto-refresh interval (seconds)", min_value=30, value=300)
▶ Output
Forge Intelligence Hub
[Sidebar: Analysis Mode radio, date range picker]
[KPI row: Revenue $1.24M +12.3%, Active Users 8412 +340, Churn 2.4% -0.5%, Accuracy 94.1% +1.2%]
[Tabs: Overview | Forecast | Settings]
[Overview tab: line chart rendered]
[Forecast tab: form with slider, selectbox, text input, checkbox, Submit button]
[Settings tab: theme selector, refresh interval input]
💡Pro Tip: st.set_page_config() Must Come First — No Exceptions
If you call any other Streamlit function before st.set_page_config(), you will get a StreamlitAPIException that halts the entire app. This includes st.write(), st.title(), and even importing a module that calls a Streamlit function at import time. Make st.set_page_config() the absolute first line after your imports.
📊 Production Insight
st.form() buffers all widget changes and triggers a single re-run on explicit submit.
Without forms, every keystroke in a text_input fires a full script re-run — which under load means dozens of unnecessary cache checks and re-renders per user per minute.
Rule: always wrap multi-input workflows in st.form() to prevent cascading re-runs during data entry.
🎯 Key Takeaway
st.form() is the key to performant multi-input workflows — it batches all interactions into a single re-run on submit.
Multi-page apps use the pages/ directory convention — each .py file is auto-discovered with no configuration required.
Always call st.set_page_config() as the absolute first Streamlit command — any call before it raises an exception that crashes the app on startup.
Layout and Form Strategy
If: Single-page app with a few widgets and one chart
Use: st.columns() for side-by-side layout — no multi-page complexity needed
If: Multiple distinct views — dashboard overview, configuration settings, detailed reports
Use: The pages/ directory for auto-discovered navigation — each .py file becomes a separate page in the sidebar
If: User needs to fill multiple fields before triggering any processing
Use: st.form() — it prevents re-runs on every keystroke and fires once cleanly on submit
If: Need real-time updates as the user types — search-as-you-type filtering, live validation
Use: Do NOT use st.form() — use individual widgets outside a form so each keystroke triggers the update

Data Persistence: The SQL Backend

Streamlit's st.session_state is ephemeral by design. It lives in server memory, scoped to a single browser session, and disappears the moment the user refreshes the page, closes the tab, or the server restarts. For anything that needs to survive beyond a single session — audit logs, saved analysis results, user preferences, cross-session dashboards — you must write to an external persistent store.

At the enterprise level, a structured SQL backend is the standard approach. The pattern is straightforward: use @st.cache_resource to create a shared database connection pool once, and write session events to an audit table on key user actions. This gives you a complete record of dashboard activity without impacting the read performance of your main queries.
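The write side of that pattern, sketched with stdlib sqlite3 standing in for PostgreSQL — in the real app the connection would come from a @st.cache_resource factory and widget_context would be a JSONB column, as in the schema below:

```python
import json
import sqlite3
from datetime import datetime, timezone

def init_audit_store(conn: sqlite3.Connection) -> None:
    conn.execute("""
        CREATE TABLE IF NOT EXISTS dashboard_activity (
            id               INTEGER PRIMARY KEY,
            session_id       TEXT NOT NULL,
            user_email       TEXT,
            action_performed TEXT NOT NULL,
            widget_context   TEXT,          -- JSONB in PostgreSQL
            interaction_ts   TEXT NOT NULL
        )""")

def log_activity(conn, session_id, user_email, action, widget_context):
    """Call on key user actions — e.g. inside an `if submitted:` block."""
    conn.execute(
        "INSERT INTO dashboard_activity "
        "(session_id, user_email, action_performed, widget_context, interaction_ts) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, user_email, action, json.dumps(widget_context),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
init_audit_store(conn)
log_activity(conn, "sess_882", "editor@thecodeforge.io", "run_forecast",
             {"horizon_days": 30, "confidence": "95%"})
```

Because the write happens only on explicit actions (not on every re-run), the audit table adds negligible load next to the dashboard's read queries.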

io/thecodeforge/db/app_audit.sql · SQL
-- io.thecodeforge: Persistence Layer for Streamlit Session Activity
-- This table captures dashboard interactions that must survive beyond a single session.
-- st.session_state cannot be used for this — it is lost on every page refresh.

CREATE TABLE IF NOT EXISTS io.thecodeforge.dashboard_activity (
    id              SERIAL PRIMARY KEY,
    session_id      TEXT        NOT NULL,
    user_email      TEXT,
    action_performed TEXT       NOT NULL,
    widget_context  JSONB,                          -- captures which filters/params were active
    interaction_ts  TIMESTAMP   DEFAULT CURRENT_TIMESTAMP
);

-- Index on session_id for fast per-session retrieval
CREATE INDEX IF NOT EXISTS idx_dashboard_activity_session
    ON io.thecodeforge.dashboard_activity (session_id);

-- Index on interaction_ts for time-range analysis of dashboard usage
CREATE INDEX IF NOT EXISTS idx_dashboard_activity_ts
    ON io.thecodeforge.dashboard_activity (interaction_ts DESC);

-- Example: record a forecast run triggered from the Streamlit UI
INSERT INTO io.thecodeforge.dashboard_activity
    (session_id, user_email, action_performed, widget_context)
VALUES
    (
        'sess_882',
        'editor@thecodeforge.io',
        'run_forecast',
        '{"horizon_days": 30, "confidence": "95%", "target_metric": "daily_revenue"}'
    );

-- Retrieve activity for a specific session (useful for debugging user-reported issues)
SELECT action_performed, widget_context, interaction_ts
FROM   io.thecodeforge.dashboard_activity
WHERE  session_id = 'sess_882'
ORDER  BY interaction_ts DESC;
▶ Output
Query executed: CREATE TABLE
Query executed: CREATE INDEX
Query executed: CREATE INDEX
Query executed: INSERT 1 row affected.

SELECT result:
action_performed | widget_context | interaction_ts
------------------+-------------------------------------------------------------+------------------------
run_forecast | {"horizon_days": 30, "confidence": "95%", "target_metric"...} | 2026-04-20 14:32:07
🔥Why External Persistence Matters
st.session_state is browser-session scoped — it disappears on refresh, tab close, or any server restart. Any data that must survive beyond a single session must be written to an external store. PostgreSQL handles audit trails and structured data well. Redis is the right choice for ephemeral cross-session state like rate limiting or short-lived user tokens. S3 or object storage works for saving large analysis outputs or exported reports.
📊 Production Insight
st.session_state is lost on every page refresh — it is not a persistence mechanism, it is a re-run coordination mechanism.
For audit trails and cross-session data, write to a database on key user actions.
Rule: session state is for UI coordination only — counters, step tracking, form progress. Persistent data goes to an external store, no exceptions.
🎯 Key Takeaway
st.session_state is for ephemeral UI state only — it disappears on refresh and is never shared between users.
Persistent data — audit logs, user preferences, saved results — must go to an external store.
The SQL audit table with a JSONB widget_context column is the production-standard approach for tracing dashboard interactions at enterprise scale.
When to Use Session State vs External Store
If: Tracking current page, selected filters, wizard step progress, or form field values mid-entry
Use: st.session_state — fast, no external dependencies, ideal for transient UI state
If: Saving user preferences, audit logs, analysis results, or anything that must survive a refresh
Use: Write to PostgreSQL or Redis — session state is lost on refresh and cannot be relied on for persistence
If: Sharing state between multiple users viewing the same dashboard simultaneously
Use: An external store — st.session_state is strictly per-user and per-session, never shared
If: State must survive server restarts or application deployments
Use: An external store is mandatory — st.session_state lives entirely in server memory and is wiped on restart

Java Integration: Consuming Dashboards via API

In hybrid infrastructure environments, your Streamlit app often serves as the UI layer for a Java or Go-based compute engine. The pattern is clean: the Java service owns the business logic and heavy computation, exposes it via a REST endpoint, and Streamlit calls that endpoint, caches the response, and handles visualization. This separation of concerns keeps your Streamlit script lightweight and your backend independently testable and deployable.

The critical rule: always cache the API call with @st.cache_data. Without it, Streamlit calls your Java backend on every single re-run — which means every slider move, every checkbox toggle, every character typed fires an HTTP request to your backend service. Under even modest concurrency, this becomes a self-inflicted DDoS.

io/thecodeforge/api/DashboardController.java · JAVA
package io.thecodeforge.api;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

import java.time.Instant;

/**
 * io.thecodeforge: REST API consumed by the Streamlit frontend.
 * Streamlit calls this endpoint via requests.get() with @st.cache_data applied.
 * All business logic and heavy computation lives here — not in the dashboard script.
 */
@RestController
@RequestMapping("/api/v1/forge-metrics")
public class DashboardController {

    /**
     * Returns summary metrics for the Streamlit dashboard KPI row.
     * Streamlit caches this response — this endpoint typically receives
     * 1 request per 5 minutes per user, not 1 per interaction.
     */
    @GetMapping("/summary")
    public ResponseEntity<MetricResponse> getSummary(
            @RequestParam(defaultValue = "30") int horizonDays) {

        // In production: query your data warehouse or aggregation service here.
        // The compute stays in Java; the visualization stays in Streamlit.
        MetricResponse response = new MetricResponse(
            1_204_847.50,
            "USD",
            0.941,
            horizonDays,
            Instant.now().toString()
        );

        return ResponseEntity.ok(response);
    }

    /**
     * Record used as the JSON response body.
     * Streamlit receives this as a dict after requests.get().json().
     */
    record MetricResponse(
        double revenue,
        String currency,
        double modelAccuracy,
        int forecastHorizonDays,
        String generatedAt
    ) {}
}
▶ Output
Spring Boot application started on port 8080.
GET /api/v1/forge-metrics/summary?horizonDays=30
HTTP 200 OK
{
  "revenue": 1204847.50,
  "currency": "USD",
  "modelAccuracy": 0.941,
  "forecastHorizonDays": 30,
  "generatedAt": "2026-04-20T14:32:07Z"
}
💡Streamlit as a Frontend for Microservices
Streamlit excels as a thin UI layer over existing backend APIs. The Java service handles heavy compute, complex business rules, and data access. Streamlit handles layout, visualization, and user interaction. Use @st.cache_data(ttl=300) on every function that calls an external API — without it, you are firing an HTTP request to your backend on every slider nudge and every re-run.
📊 Production Insight
Streamlit calling a Java or Go backend API on every re-run without caching will saturate your backend under real concurrency.
Always cache the API call with @st.cache_data(ttl=...) — the TTL depends on how fresh the data needs to be.
Rule: treat Streamlit as a presentation layer. Business logic, data access, and compute belong in the backend service, not in the dashboard script.
🎯 Key Takeaway
Streamlit works best as a thin presentation layer over existing backend services — not as a compute engine.
Cache every API call with @st.cache_data to prevent re-run storms from saturating your backend.
For heavy compute, keep the logic in a dedicated microservice and have Streamlit consume the result — this also makes the backend independently testable.
Integration Architecture Decision
If: lightweight compute — pandas aggregations, simple statistical summaries, small dataset filtering
Use: do the computation inside Streamlit in Python — no external API needed, just cache the result
If: heavy compute — ML model inference, large-scale ETL, complex multi-join SQL, data warehouse queries
Use: offload to a Java or Go microservice and call via REST — cache the response in Streamlit with an appropriate TTL
If: real-time data streaming — Kafka consumer, WebSocket feed, live sensor data
Use: Streamlit does not support push-based updates natively. Use st.empty() with a polling loop as a stopgap, or switch to Dash or Panel for a proper streaming UI

Deploying Your Streamlit App — From Local to Live

For production deployments, Docker is the standard. It guarantees that your runtime environment — including system-level dependencies for libraries like OpenCV, PyTorch, or GeoPandas — is identical from local development to production. It also makes secrets management, health checking, and container orchestration straightforward.

The single most common Docker deployment mistake with Streamlit: forgetting --server.address=0.0.0.0. Without it, Streamlit binds to 127.0.0.1 inside the container. The app starts, the process runs, but no external connection can reach it. You see 'connection refused' in the browser and nothing obviously wrong in the logs.

Dockerfile · DOCKERFILE
# io.thecodeforge: Production Streamlit Container
# Built on python:3.11-slim to minimize image size while retaining pip and venv support.
FROM python:3.11-slim

WORKDIR /app

# Install system-level dependencies.
# curl is required for the HEALTHCHECK command below.
# Add any system packages your Python libraries need here (e.g., libgdal-dev for GeoPandas).
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first to leverage Docker layer caching.
# If requirements.txt does not change, this layer is reused on rebuild.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application source after dependencies to keep the source-change rebuild fast.
COPY . .

# Streamlit's default port. Expose it so orchestrators can route traffic correctly.
EXPOSE 8501

# Health check using Streamlit's built-in health endpoint.
# interval: how often to check. timeout: how long to wait. retries: failures before unhealthy.
HEALTHCHECK \
    --interval=30s \
    --timeout=5s \
    --start-period=10s \
    --retries=3 \
    CMD curl --fail http://localhost:8501/_stcore/health || exit 1

# --server.address=0.0.0.0 is REQUIRED in Docker.
# Without it, Streamlit binds to 127.0.0.1 inside the container
# and external connections get 'connection refused' with no obvious error.
ENTRYPOINT ["streamlit", "run", "app.py", \
    "--server.port=8501", \
    "--server.address=0.0.0.0", \
    "--server.headless=true"]
▶ Output
Step 1/9 : FROM python:3.11-slim
Step 2/9 : WORKDIR /app
Step 3/9 : RUN apt-get update && apt-get install -y --no-install-recommends curl ...
Step 4/9 : COPY requirements.txt .
Step 5/9 : RUN pip install --no-cache-dir -r requirements.txt
Step 6/9 : COPY . .
Step 7/9 : EXPOSE 8501
Step 8/9 : HEALTHCHECK ...
Step 9/9 : ENTRYPOINT ["streamlit", "run", "app.py", ...]
Successfully built a8f3c9d12e44
Successfully tagged thecodeforge/streamlit-dashboard:latest
⚠ Watch Out: Secrets in Git History
Add .streamlit/secrets.toml to your .gitignore before your very first commit — before you write a single credential into it. If you accidentally commit secrets, treat every exposed credential as compromised and rotate immediately. In production, use your platform's native secrets manager: AWS Secrets Manager, GCP Secret Manager, or Kubernetes Secrets mounted as environment variables.
📊 Production Insight
Without --server.address=0.0.0.0, Streamlit binds to localhost inside the container and external connections fail silently.
The HEALTHCHECK endpoint at /_stcore/health is Streamlit's built-in liveness check — wire it to your load balancer's health probe.
Rule: always set 0.0.0.0 in Docker deployments and always configure the health check — these two lines prevent the two most common production deployment failures.
🎯 Key Takeaway
Docker is the production-standard deployment for Streamlit — it guarantees environment consistency and makes health checking and secrets management tractable.
Always bind to 0.0.0.0 inside containers — localhost binding is a silent deployment failure that shows nothing in the app logs.
For enterprise deployments, Streamlit sits behind a reverse proxy with an auth layer — it has no built-in authentication and should never be exposed directly to the internet without one.
Deployment Strategy Decision
If: internal tool, fewer than 20 users, no compliance requirements, public GitHub repo
Use: Streamlit Community Cloud — free, zero-config, deploys directly from GitHub on every push
If: internal tool, 20 to 100 users, custom system dependencies, private repo
Use: Docker container on your own infrastructure — AWS ECS Fargate, GCP Cloud Run (always-on), or a single VM behind Nginx
If: public-facing app, 100+ concurrent users, need auto-scaling
Use: Kubernetes with Horizontal Pod Autoscaler — Streamlit does not scale horizontally without sticky sessions at the load balancer layer, so configure session affinity
If: need authentication, SSO, or enterprise security controls
Use: deploy behind a reverse proxy — Nginx or Caddy — with an authentication layer such as oauth2-proxy or Cloudflare Access. Streamlit has no built-in authentication mechanism.
🗂 Streamlit vs Dash vs Gradio
Choosing the right Python web framework for your data app
Learning curve
  Streamlit: Minimal — pure Python script style, no frontend knowledge required
  Dash (Plotly): Moderate — callback-based reactive model requires understanding Input/Output wiring
  Gradio: Minimal — but strongly opinionated toward ML inference interfaces
Re-run model
  Streamlit: Full script re-runs on every interaction — simple but requires disciplined caching
  Dash (Plotly): Targeted callbacks — only the components affected by an Input update
  Gradio: Event-driven per component — each function maps to specific UI elements
Best for
  Streamlit: Data dashboards, internal tools, rapid prototyping, ML result visualization
  Dash (Plotly): Complex production-grade analytics apps where fine-grained update control matters
  Gradio: ML model demos, inference UIs, sharing models with non-technical stakeholders
Layout control
  Streamlit: Good — columns, tabs, expanders, sidebar. Limited CSS customization without components
  Dash (Plotly): Excellent — full CSS and HTML control, Bootstrap integration, arbitrary component placement
  Gradio: Limited — opinionated grid layout, not suitable for complex multi-section dashboards
State management
  Streamlit: st.session_state dictionary — simple but ephemeral, lost on refresh
  Dash (Plotly): Explicit callback Output/Input wiring — more verbose but gives precise control over what updates
  Gradio: Implicit per-function state — simple for single-function interfaces, awkward for multi-step flows
Concurrency model
  Streamlit: Each session runs in its own thread — shared objects need @st.cache_resource for safety
  Dash (Plotly): Async callbacks supported — better suited for high-concurrency production workloads
  Gradio: Single-user focus by default — sharing a model demo link spins up separate instances
Production deployment
  Streamlit: Docker, Streamlit Community Cloud, Kubernetes with sticky sessions
  Dash (Plotly): Docker, Gunicorn/uWSGI, standard WSGI deployment — same as any Flask app
  Gradio: Hugging Face Spaces (native), Docker, or standalone server
Custom JavaScript
  Streamlit: Supported via st.components.v1 — but requires wrapping components manually
  Dash (Plotly): Native — arbitrary Dash components can include React and JavaScript
  Gradio: Not supported — Gradio controls the entire frontend

🎯 Key Takeaways

  • Streamlit's re-run-on-interaction model is its superpower and its main operational risk — mastering caching is non-negotiable before any production deployment.
  • @st.cache_data is for serializable values like DataFrames and API responses; @st.cache_resource is for shared non-serializable objects like database connection pools and ML models.
  • st.session_state initialization must always be guarded with a 'not in' check — without it, every re-run resets the state and users lose their progress.
  • Use st.form() to batch widget interactions into a single re-run on submit — it is the single most effective way to prevent re-run storms during multi-input data entry.
  • Never commit .streamlit/secrets.toml to version control — manage credentials via your platform's native secrets manager and rotate anything that was ever exposed.

Interview Questions on This Topic

  • Q: Describe three concrete strategies for optimizing a slow Streamlit app that has 50 concurrent users and re-runs taking 8+ seconds. (Mid-level)
    First and most impactful: wrap every data loading function with @st.cache_data. Without caching, every re-run by every user re-executes the full data pipeline. With caching, 50 users sharing the same parameters hit the cache instead of the database — this alone can reduce load by 98%. Second: cache shared infrastructure — database connection pools and ML models — with @st.cache_resource so they are created once per process instead of once per re-run. Third: use st.form() for multi-input workflows so a re-run fires once on submit instead of once per keystroke. Additionally, minimize work in the script body — move expensive setup into cached functions and keep the top-level script as lightweight as possible. Use st.empty() and st.container() for partial UI updates where appropriate, and point the dashboard at a read replica rather than the production primary database.
  • Q: Explain the difference between @st.cache_data and @st.cache_resource. What happens if you try to cache an open file handle or database connection with @st.cache_data? (Mid-level)
    @st.cache_data serializes the return value using pickle and stores a copy per cache key. It is designed for data: DataFrames, lists, dicts, primitives. If you try to cache an open file handle or database connection with @st.cache_data, it will raise a serialization error at runtime because file handles and connection objects cannot be pickled. @st.cache_resource does not serialize — it stores the original object reference directly in memory. It is designed for infrastructure objects: database connection pools, ML models loaded into GPU memory, open file handles. The critical behavioral difference: @st.cache_data creates a separate copy per session, making it safe for multi-user apps. @st.cache_resource shares the same object instance across all sessions — which is what you want for a single connection pool, but it also means you need thread-safe objects.
  • Q: How does st.session_state facilitate the creation of multi-step wizards or complex data entry forms? (Junior)
    st.session_state is a dictionary that persists across script re-runs within a single browser session. For a multi-step wizard, you store the current step index and all accumulated user inputs in session state. Each re-run reads the current step from state, renders the appropriate UI for that step, and updates state when the user advances or goes back. The critical implementation detail: always initialize session state keys with a 'if key not in st.session_state' guard. Without this guard, the initialization line executes on every re-run and resets whatever the user entered on previous steps — this is the most common bug in multi-step Streamlit flows. st.form() pairs naturally with this pattern by ensuring a re-run fires once on submit rather than once per keystroke, so partial form data does not trigger premature state updates.
  • Q: What are the security implications of using st.file_uploader in a public-facing dashboard, and how do you mitigate them? (Mid-level)
    st.file_uploader allows users to upload arbitrary files to your server. The risks are: (1) uploading malicious executables or scripts disguised as data files — an attacker uploads a .csv that is actually a Python script and hopes the app executes it; (2) denial-of-service via extremely large file uploads that exhaust disk space or server memory; (3) path traversal attacks if the file is saved to a predictable or user-controlled location on disk. Mitigations: validate file extensions and MIME types immediately after upload using the python-magic library, not just the filename extension, which is trivially spoofed. Set a hard file size limit using the server.maxUploadSize config option. Process uploaded files entirely in memory — never write them to a predictable path on disk. If disk persistence is required, write to a sandboxed temporary directory with a randomized name. For public-facing apps, add rate limiting at the reverse proxy layer to prevent upload flooding.
  • Q: Can Streamlit run on AWS Lambda or GCP Cloud Run in request-based serverless mode? Explain the architectural constraints. (Senior)
    Streamlit cannot run on AWS Lambda at all. Lambda is a request-response serverless model with a hard execution timeout of 15 minutes and no support for long-lived WebSocket connections. Streamlit's frontend communicates with the server via a persistent WebSocket — this is how widget interactions trigger re-runs without full HTTP round-trips. Lambda terminates connections between requests, which fundamentally breaks the Streamlit communication model. GCP Cloud Run can work, but only in always-allocated CPU mode — not in the default request-based mode. In request-based mode, Cloud Run pauses container CPU between HTTP requests, which drops the WebSocket connection and causes the browser to show 'Connection lost'. With always-allocated CPU and a minimum instance count of 1, Cloud Run keeps the container alive and WebSocket connections stay open. For production Streamlit deployments, the correct targets are ECS Fargate, GKE or Kubernetes, a plain VM behind Nginx, or Streamlit Community Cloud — all of which support long-lived TCP connections without the constraints of pure serverless.

Frequently Asked Questions

Is Streamlit good for production apps or just prototyping?

Streamlit is production-ready for internal tools, data dashboards, and apps with moderate traffic. Teams run it at enterprise scale behind Docker and Kubernetes with proper caching in place. For very high-traffic public apps — thousands of concurrent users — or apps that need fine-grained component-level updates without full script re-runs, Dash or a React plus FastAPI stack may be more appropriate. The limiting factor is not Streamlit's code quality; it is the full-script re-run model, which does not fit every use case.

How do I add authentication to a Streamlit app?

For simple internal tools, the streamlit-authenticator library provides username/password flows with hashed credentials. For apps deployed on Streamlit Community Cloud, you can restrict access to specific GitHub accounts or use OAuth2 with Google or GitHub. For enterprise SSO — SAML, OIDC, Active Directory — the standard approach is to deploy Streamlit behind a reverse proxy like Nginx or Caddy with an authentication layer such as oauth2-proxy or Cloudflare Access. Streamlit itself has no built-in authentication mechanism and should not be exposed to the internet without one.

Why does my Streamlit app lose all its data when I refresh the page?

Refreshing the browser starts a new session, which clears st.session_state entirely. Session state lives in server memory and is scoped to a single browser session — it was never designed to survive a refresh. For data that must persist across page refreshes, browser sessions, or server restarts, write it to an external store: PostgreSQL for structured data, Redis for short-lived key-value state, or S3 for large result files. Load it back at the start of each session.

How can I make my Streamlit app look more professional?

Start with st.set_page_config(layout='wide') to use the full browser width instead of Streamlit's default narrow column. Define a custom theme in .streamlit/config.toml — primary color, background color, and font. Organize content with st.tabs() to reduce vertical scrolling, st.columns() for side-by-side layouts, and st.expander() to hide secondary information behind a toggle. Use st.metric() for KPI cards instead of plain st.write() for numbers. For icons and logos, st.image() accepts URLs and local file paths. If you need more visual control than Streamlit's built-in components allow, st.components.v1.html() lets you inject raw HTML and CSS.

Naren · Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged