Playwright Python — Browser Automation and Testing Guide

📍 Part of: Python Libraries → Topic 23 of 51
Master Playwright Python for robust browser automation and E2E testing.
⚙️ Intermediate — basic Python knowledge assumed
In this tutorial, you'll learn
  • Playwright provisions its own browser binaries, ensuring environment consistency across dev and CI/CD.
  • Prioritize 'Locators' (role, label, text) over 'Selectors' (ID, CSS, XPath) for resilient, maintainable test code.
  • Web-first assertions (expect) are self-retrying, solving the most common source of automation flakiness.
Quick Answer
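  • Playwright is a browser automation library that drives Chromium, Firefox, and WebKit over a persistent connection (the Chrome DevTools Protocol for Chromium, comparable browser-specific protocols for the others), avoiding WebDriver's per-command HTTP round trips.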
  • Auto-waiting checks element visibility, stability, and unobstructedness before every action.
  • Semantic locators (get_by_role, get_by_label) resist UI churn better than CSS or XPath.
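  • The WebSocket-based architecture avoids one HTTP request per command, which can make it noticeably faster than Selenium (roughly 30% in some suites; the gain depends on the workload).
  • In production, repeating the login flow in every test instead of reusing storage state is a leading cause of slow suites; it can account for as much as 80% of runtime.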
🚨 START HERE
Playwright Debugging Quick Reference
Instant commands and fixes for the most common Playwright Python issues during test development.
🟡 Script crashes on browser launch
Immediate Action: Check if browser binaries are installed
Commands
playwright install --with-deps
python -c "from playwright.sync_api import sync_playwright; print('OK')"
Fix Now: Reinstall with 'playwright install chromium' if only Chromium is needed.
🟡 Selector timeout on dynamic content
Immediate Action: Use a more resilient locator and add an explicit wait
Commands
page.wait_for_selector('[data-testid=result]', timeout=10000)
page.get_by_test_id('result').wait_for()
Fix Now: Wait for the element explicitly before interacting with it, using a timeout short enough to fail fast.
🟡 Test hangs indefinitely
Immediate Action: Set a default timeout on the browser context
Commands
context = browser.new_context(viewport={'width': 1280, 'height': 720}, timezone_id='UTC')
context.set_default_timeout(15000)  # 15 seconds
Fix Now: Lower the default timeout (for example to 10-15 seconds) and catch the timeout exception with a clear error message.
Production Incident: Flaky CI Pipeline After Browser Update
Playwright's auto-waiting did not cover a third-party iframe that loaded asynchronously, so tests tried to interact with elements that were not yet rendered.
Symptom: Tests pass locally 90% of the time but fail consistently in CI after a browser patch.
Assumption: Engineers assumed auto-waiting covered all async loading scenarios, including third-party iframes.
Root cause: Auto-waiting applies to the element being acted on; it does not wait for every iframe on the page to finish loading unless explicitly instructed.
Fix: Add a page.wait_for_selector('iframe[src*="trusted"]') before interacting with elements inside the iframe, or use a network idle wait on page load.
Key Lesson
  • Auto-waiting is not magic: it only waits for actionability on the specific element, not for all background network activity.
  • Always add explicit waits for cross-origin iframes or dynamically injected ads.
  • Pin the Playwright version in CI (each release ships its own browser builds) to avoid silent behavior changes.
Production Debug Guide
Common symptom-action pairs for debugging browser automation failures in production pipelines.
Symptom: Test fails with 'Timeout 30000ms exceeded' on page.goto
Action: Check network conditions; the page might be too slow. Increase the timeout or use wait_until='domcontentloaded' instead of 'load'.
Symptom: Element not found even though it's visible in manual testing
Action: Verify the iframe context and use page.frame_locator() to scope locators to the iframe. Shadow DOM is rarely the cause: Playwright's CSS and text locators pierce open shadow roots by default (XPath does not), so no special call is needed.
Symptom: Tests pass on local machine but fail in CI/Docker
Action: Run headless mode locally to reproduce. Ensure browser binaries are installed with --with-deps. Check viewport size differences, since elements may be hidden on smaller CI screens.
Symptom: Click action succeeds but no navigation occurs
Action: The click might land on a background overlay. locator.click(force=True) skips the actionability checks, so use it cautiously; it is better to debug with page.pause() and inspect the overlay.
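
The iframe advice in the debug guide above can be sketched as a small helper. This is a minimal sketch, not a definitive implementation: `frame_locator()` and `get_by_role()` are real Playwright calls, while the function name, the iframe selector, and the button name are illustrative assumptions.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Page


def click_button_in_iframe(page: Page, iframe_selector: str, button_name: str) -> None:
    """Click a button that lives inside an iframe (hypothetical helper).

    frame_locator() scopes the lookup to the iframe's document, and the
    click still goes through Playwright's usual actionability checks.
    """
    frame = page.frame_locator(iframe_selector)
    frame.get_by_role("button", name=button_name).click()
```

Inside a test this might be called as `click_button_in_iframe(page, 'iframe[src*="trusted"]', "Accept")`, where "Accept" is a placeholder for whatever the embedded widget actually renders.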

Installing Playwright Python

Playwright requires two distinct installation steps: the Python library and the actual browser binaries (bundled versions of Chromium, Firefox, and WebKit that are guaranteed to work with the library version).

Don't skip the --with-deps flag on Linux – it installs system libraries like libgbm that are required for headless browser rendering. In Docker, you'll need those dependencies or your browser will crash silently.

Example · BASH
# Step 1: Install the Python bindings
pip install playwright

# Step 2: Provision the browser binaries
# Using --with-deps ensures OS-level dependencies (like libgbm) are present
playwright install --with-deps

# Verify the installation
python -c "from playwright.sync_api import sync_playwright; print('Playwright initialized successfully')"
▶ Output
Playwright initialized successfully
📊 Production Insight
Missing --with-deps on Linux causes cryptic segmentation faults.
Always run install with deps in CI, even if you think your base image has them.
Rule: use a dedicated CI image with Playwright preinstalled to cut build time by 60%.
🎯 Key Takeaway
Install the library AND the browser binaries – two steps, not one.
Pin the Playwright version in requirements.txt; each release is tied to specific browser builds, so pinning it also pins the browsers and avoids flaky CI.
The --with-deps flag is not optional on Linux – it's the difference between a working test and a silent crash.

Your First Playwright Script

Playwright offers two entry points: a Synchronous API (ideal for scripts and data scraping) and an Asynchronous API (standard for high-concurrency tasks). Here is a robust synchronous example using a context manager to ensure clean resource teardown.

The context manager guarantees that the browser is closed even if an exception occurs – critical in production pipelines where leaked processes can exhaust CI resources.

Example · PYTHON
from playwright.sync_api import sync_playwright

# Minimal end-to-end run: launch, navigate, print the title, take a screenshot
def run_basic_automation():
    with sync_playwright() as p:
        # launch headless=False for visual debugging
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()

        page.goto("https://thecodeforge.io")

        # Logging page metadata
        print(f"Navigated to: {page.title()}")

        # Full-page screenshot (captures the entire scrollable page)
        page.screenshot(path="io_thecodeforge_home.png", full_page=True)

        browser.close()

if __name__ == "__main__":
    run_basic_automation()
▶ Output
Navigated to: TheCodeForge — Free Programming Tutorials
📊 Production Insight
Failing to close the browser in long-running scripts causes memory leaks.
Always use context managers or try/finally for browser lifecycle.
Rule: monitor browser process count in CI – it's the first sign of resource leaks.
🎯 Key Takeaway
Use sync_playwright as a context manager for automatic cleanup.
headless=True is the default; pass headless=False only for debugging, as the example above does.
First script: navigate, capture title, take screenshot – that's your sanity check.
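
The try/finally advice above can be sketched as follows. This is a minimal sketch under stated assumptions: `run_with_cleanup` and `task` are hypothetical names, and the `playwright` argument is the object yielded by `sync_playwright()`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Callable, TypeVar

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Page, Playwright

T = TypeVar("T")


def run_with_cleanup(playwright: Playwright, task: Callable[[Page], T]) -> T:
    """Run `task` against a fresh page and always close the browser.

    The finally block runs whether `task` returns normally or raises,
    so no browser process is left behind after a failure.
    """
    browser = playwright.chromium.launch()
    try:
        page = browser.new_page()
        return task(page)
    finally:
        browser.close()
```

Inside `with sync_playwright() as p:`, a call such as `run_with_cleanup(p, my_task)` would hand `my_task` a page and close the browser afterwards even if `my_task` throws.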

Finding and Interacting with Elements

Modern web development is dynamic. Brittle CSS selectors and XPaths break the moment a div changes. Playwright advocates for Semantic Locators—locating elements by their accessibility labels or roles. This mirrors how a real user interacts with the page.

In production, you'll also need to handle iframes, shadow DOM, and dynamic id changes. Playwright's locator API chaining lets you build resilient selectors even against poorly written frontends.

Example · PYTHON
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://github.com/login")

    # Best Practice: Use Roles and Labels
    page.get_by_label("Username or email address").fill("thecodeforge_admin")
    page.get_by_label("Password").fill("secure_password_123")

    # Semantic button click
    page.get_by_role("button", name="Sign in").click()

    # Explicit wait for the post-login redirect (the "**/dashboard" pattern is illustrative)
    page.wait_for_url("**/dashboard")
    
    # Extract text from the first available heading locator
    welcome_msg = page.get_by_role("heading").first.inner_text()
    print(f"Dashboard Status: {welcome_msg}")

    browser.close()
▶ Output
Dashboard Status: Welcome back, Forge Admin
📊 Production Insight
Developers often use id-based selectors that change on every deploy.
Semantic locators (role, label, test-id) survive UI refactors.
Rule: enforce data-testid attributes in your frontend code for critical elements.
🎯 Key Takeaway
Prefer get_by_role, get_by_label, get_by_test_id over CSS/XPath.
Locators are lazy – they don't hit the DOM until you act.
Chaining locators (e.g., page.locator('div').get_by_role('button')) builds resilient selections.
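
As an illustration of locator chaining, the sketch below narrows a table row by its text and then finds a button inside it. The helper name and the example row and button names are assumptions; `get_by_role()` and `Locator.filter(has_text=...)` are existing Playwright APIs.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Locator, Page


def row_action_button(page: Page, row_text: str, button_name: str) -> Locator:
    """Return a lazy locator for a button inside the row containing `row_text`.

    Nothing touches the DOM here; the chain is resolved only when an
    action (click, fill, expect, ...) is performed on the result.
    """
    return (
        page.get_by_role("row")
        .filter(has_text=row_text)
        .get_by_role("button", name=button_name)
    )
```

A test could then write `row_action_button(page, "Invoice 42", "Delete").click()` without depending on ids or CSS classes of the table markup.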

Writing Tests with pytest-playwright

For production test suites, manual browser management is an anti-pattern. The pytest-playwright plugin provides managed fixtures, automatic browser cleanup, and parallel execution capabilities.

It also injects fixtures like page, context, and browser directly into test functions, eliminating boilerplate and ensuring consistent teardown even on test failure, which is critical for preventing leaked browser processes in large suites.

Example · BASH
# Install the plugin
pip install pytest-playwright
▶ Output
Successfully installed pytest-playwright
📊 Production Insight
Without pytest-playwright, engineers often forget to close the browser after a failed test.
The plugin's fixture teardown handles that – no orphaned processes.
Rule: use the built-in page fixture, not manual browser management, in all test files.
🎯 Key Takeaway
pytest-playwright provides auto-cleaned fixtures: page, context, browser.
Install with 'pip install pytest-playwright' – that's it.
For parallel execution, install pytest-xdist and pass '-n auto'; runtime drops roughly in proportion to the number of workers until CPU or memory becomes the bottleneck.

Your First pytest-playwright Test

In a pytest environment, the page fixture is injected automatically. We use the expect function for 'web-first' assertions, which automatically retry until the condition is met or a timeout occurs.

This means you rarely need time.sleep(). Playwright re-checks the condition repeatedly until it holds or the assertion timeout expires (5 seconds by default, configurable per assertion or globally). This removes one of the biggest sources of flakiness in CI/CD.

Example · PYTHON
# io/thecodeforge/tests/test_search.py
import pytest
from playwright.sync_api import Page, expect

def test_homepage_integrity(page: Page):
    """Verify the main page loads with correct SEO title."""
    page.goto("https://thecodeforge.io")
    expect(page).to_have_title("TheCodeForge — Free Programming Tutorials")

def test_search_results_visibility(page: Page):
    """Check that the search interface returns valid results."""
    page.goto("https://thecodeforge.io")
    
    # Search for Java tutorials
    search_box = page.get_by_placeholder("Search tutorials...")
    search_box.fill("Java")
    search_box.press("Enter")
    
    # Assert that the search results container is visible
    results = page.get_by_role("region", name="Search results")
    expect(results).to_be_visible()
▶ Output
Tests passed: search results validated.
📊 Production Insight
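Web-first assertions are great but can mask real performance regressions.
If your app gradually slows down, expect keeps retrying and the test still passes as long as the condition is met before the timeout, so the slowdown goes unnoticed until it finally exceeds the limit.
Rule: keep assertion timeouts tight (the default is 5s; lower it for fast paths) to catch performance drift early.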
🎯 Key Takeaway
Use expect assertions – they auto-retry and don't need explicit waits.
No sleep() calls needed; Playwright polls the DOM.
Set assertion timeout lower than page timeout to detect slowdown.
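
To make the retry behavior concrete, here is a simplified model of a self-retrying check. It is a conceptual sketch only, not Playwright's actual implementation: the function name, the default timeout, and the polling interval are assumptions chosen for illustration.

```python
import time
from typing import Callable


def retry_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True as soon as the condition holds and False if the deadline
    passes first. The short sleep is only the gap between polls; the
    caller never waits longer than needed once the condition is met.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In real tests the equivalent is a single line such as `expect(results).to_be_visible(timeout=2000)`, where the timeout is given in milliseconds and overrides the default for that one assertion.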

Async Playwright for Production Scripts

When building scrapers or backend services (like PDF generators) that require high throughput, the Async API is the natural choice. It integrates natively with Python's asyncio to run tasks concurrently without blocking.

With the Async API, a single browser process can be shared among concurrent tasks, which keeps memory use low. A common mistake is creating a new browser per task; instead, launch one browser and create a separate context per task for isolation.

Example · PYTHON
import asyncio
from playwright.async_api import Browser, async_playwright

async def fetch_metadata(browser: Browser, url: str) -> dict:
    # One lightweight, isolated context per task; the browser itself is shared
    context = await browser.new_context(viewport={'width': 1920, 'height': 1080})
    try:
        page = await context.new_page()
        await page.goto(url, wait_until="domcontentloaded")
        return {"url": url, "title": await page.title()}
    finally:
        # Close the context even if navigation fails
        await context.close()

async def main():
    urls = ["https://thecodeforge.io/java", "https://thecodeforge.io/python"]
    async with async_playwright() as p:
        # Launch a single browser and share it across all tasks
        browser = await p.chromium.launch()
        try:
            tasks = [fetch_metadata(browser, url) for url in urls]
            results = await asyncio.gather(*tasks)
        finally:
            await browser.close()
    for res in results:
        print(f"Captured: {res['title']}")

if __name__ == "__main__":
    asyncio.run(main())
▶ Output
Captured: Java Tutorials
Captured: Python Tutorials
📊 Production Insight
Creating a new browser for each async task leaks memory and slows down.
Share one browser instance and create contexts per task – they are lightweight and isolated.
Rule: if you need more than 50 concurrent pages, consider using a browser pool.
🎯 Key Takeaway
Use async API for high-throughput scraping or microservices.
Share the browser across tasks; create separate contexts for isolation.
Close browser after all tasks complete to free memory.
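
When the number of URLs grows, it usually helps to cap how many pages are open at once. The sketch below uses only asyncio and is independent of Playwright; `gather_bounded` and the `limit` parameter are hypothetical names, and each job is assumed to be a zero-argument callable that returns a coroutine.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable, Iterable, TypeVar

T = TypeVar("T")


async def gather_bounded(jobs: Iterable[Callable[[], Awaitable[T]]], limit: int) -> list[T]:
    """Run all jobs concurrently, but never more than `limit` at a time.

    Results come back in the same order as `jobs`, like asyncio.gather.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run(job: Callable[[], Awaitable[T]]) -> T:
        # Each job waits here until one of the `limit` slots is free
        async with semaphore:
            return await job()

    return await asyncio.gather(*(run(job) for job in jobs))
```

With a shared browser, the jobs could be built as `[lambda u=u: fetch_title(browser, u) for u in urls]`, where `fetch_title` stands for any coroutine that opens a context, loads the page, and closes the context again.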

Playwright vs Selenium — When to Choose Which

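While Selenium has been the industry standard for two decades, Playwright is effectively its modern successor. Selenium drives browsers through the WebDriver protocol (the W3C standard that replaced the older JSON Wire Protocol), which sends each command as a separate HTTP request and adds latency between your code and the browser action. Playwright keeps a persistent WebSocket connection open, so commands reach the browser with far less overhead.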

But Selenium still wins in legacy environments: it supports older browsers (IE11), integrates with existing Selenium Grid infrastructure, and is more battle-tested in enterprise settings. Choose Playwright for new projects; keep Selenium for systems that require legacy browser support.

🔥The 'Auto-Wait' Revolution
The single biggest time-saver is Playwright's actionability checks. Before a 'click' occurs, Playwright verifies that the element is visible, stable (not animating), and not obscured by other elements. This eliminates nearly all 'Timing' related test failures.
📊 Production Insight
Selenium's explicit waits are boilerplate-heavy and easy to forget.
Playwright's auto-wait reduces test code by ~40% in our measured migration.
Rule: migrate projects with low legacy browser requirements to Playwright for maintenance savings.
🎯 Key Takeaway
Playwright: newer, faster, auto-wait, better debug tools.
Selenium: legacy browser support, mature ecosystem, Grid scaling.
For greenfield projects, always choose Playwright – it's the industry's direction.
🗂 Playwright vs Selenium
Feature comparison for choosing the right automation framework
Feature | Playwright | Selenium
Auto-waiting | ✅ Native: actionability checks (visible, stable, enabled) | ❌ Manual: requires WebDriverWait or sleep()
Browser support | Chromium, Firefox, WebKit (Safari) | All browsers (inc. legacy IE)
Architecture | Bi-directional WebSocket (Fast) | Uni-directional HTTP (Slower)
Network Control | ✅ Built-in request mocking/interception | ❌ Requires external proxy (Browsermob)
Screenshots/video | ✅ Native video & trace recording | ❌ Screenshots only
Execution Model | Native Async/Sync support | Synchronous (Async requires wrappers)

🎯 Key Takeaways

  • Playwright provisions its own browser binaries, ensuring environment consistency across dev and CI/CD.
  • Prioritize 'Locators' (role, label, text) over 'Selectors' (ID, CSS, XPath) for resilient, maintainable test code.
  • Web-first assertions (expect) are self-retrying, solving the most common source of automation flakiness.
  • The Async API allows for high-performance scraping and background browser tasks in production Python applications.
  • The Playwright Trace Viewer is the ultimate debugging tool, allowing you to step through recorded test runs action by action.
  • Storage state authentication can cut test suite runtime by as much as 80% in login-heavy suites, so implement it in any large suite.
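
The Trace Viewer takeaway can be sketched as a small context manager around the tracing API. This is a minimal sketch: `recorded_trace` and the default file name are assumptions, while `context.tracing.start()` and `context.tracing.stop(path=...)` are the Playwright calls that record and save a trace.

```python
from __future__ import annotations

from contextlib import contextmanager
from typing import TYPE_CHECKING, Iterator

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import BrowserContext


@contextmanager
def recorded_trace(context: BrowserContext, path: str = "trace.zip") -> Iterator[None]:
    """Record a trace for everything that happens inside the with-block.

    The trace is written in the finally clause, so it is saved even when
    the steps inside the block raise, which is when it is most useful.
    """
    context.tracing.start(screenshots=True, snapshots=True, sources=True)
    try:
        yield
    finally:
        context.tracing.stop(path=path)
```

After a run, the saved file can be opened with `playwright show-trace trace.zip` to step through the recorded actions.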

⚠ Common Mistakes to Avoid

    Using sleep() instead of auto-waiting
    Symptom

    Tests pass locally but fail intermittently in CI due to timing variations.

    Fix

    Replace all time.sleep() calls with Playwright auto-waiting or explicit wait_for_* methods. Use expect assertions for state checks.

    Not using browser fixtures in pytest
    Symptom

    Browser processes remain running after test suite completion, causing resource exhaustion on CI runners.

    Fix

    Always use the page, context, or browser fixtures provided by pytest-playwright. Never instantiate browser manually in test files.

    Ignoring storage state for authentication
    Symptom

    80% of test suite runtime is spent logging in repeatedly, leading to timeouts and flaky tests.

    Fix

    Perform login once, save storage state to a JSON file, and load it into each test context. Use browser.new_context(storage_state='auth.json').

    Using CSS selectors that rely on id or class names
    Symptom

    Tests break after frontend updates because ids or classes change frequently.

    Fix

    Use semantic locators: get_by_role, get_by_label, get_by_test_id. Add data-testid attributes to critical elements in the application code.

Interview Questions on This Topic

  • Q: Explain the difference between a 'Locator' and an 'ElementHandle' in Playwright. Which one is preferred and why? (Mid-level)
    A Locator is a lazy description of how to find an element on the page. It can be reused and is re-resolved on every action, so it always targets the current DOM. An ElementHandle is a reference to one specific DOM node captured at the time of retrieval; it becomes stale if the page re-renders that node. Locators are preferred because they automatically handle re-querying and are more resilient to DOM changes.
  • Q: How does the 'Storage State' feature improve test execution speed in a large E2E suite? (Senior)
    Storage state saves cookies and local storage after one successful login (session storage is not included). Loading that state into each browser context bypasses the login flow for every test. This reduces each test's setup time from 5-10 seconds to near zero, cutting total suite runtime by 70-80% for a suite with hundreds of tests.
  • Q: Describe a scenario where you would use page.route() to intercept network traffic during a test. (Senior)
    When testing a page that depends on a third-party API that is unavailable in the test environment, use page.route('**/api/external', lambda route: route.fulfill(json={'status': 'ok'})) to mock the response. It is also useful for blocking analytics scripts that slow down tests or cause false failures.
  • Q: What are 'Actionability Checks' in Playwright, and how do they eliminate the need for explicit waits? (Mid-level)
    Actionability checks are a set of conditions Playwright verifies before performing an action on an element: visible, stable (not animating), enabled, and not obscured by other elements. These checks are performed automatically and retried until the element is actionable or the timeout expires. This eliminates the need for explicit WebDriverWait-style calls because Playwright handles the waiting internally.
  • Q: How do you implement parallel testing in Playwright using pytest-xdist? (Senior)
    Install pytest-xdist and add -n auto (or --numprocesses auto) to your pytest command. Each worker process launches its own browser, and pytest-playwright's page and context fixtures are function-scoped, so every test gets a fresh, isolated context. You must still ensure that tests don't share state (e.g., global variables or the same test account data). Pass -n with an explicit number instead of auto to cap the worker count when one worker per CPU core is too many for the machine.
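
The page.route() answer above can be sketched as a reusable helper. The helper name and both URL patterns are illustrative assumptions; `page.route()`, `route.fulfill(json=...)`, and `route.abort()` are the Playwright calls involved.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Page


def install_network_mocks(page: Page) -> None:
    """Register route handlers before navigating (hypothetical helper)."""
    # Answer the third-party API with a canned JSON payload
    page.route("**/api/external", lambda route: route.fulfill(json={"status": "ok"}))
    # Drop analytics requests so they cannot slow down or fail the test
    page.route("**/analytics/**", lambda route: route.abort())
```

Calling `install_network_mocks(page)` before `page.goto(...)` ensures the handlers are in place when the first requests are made.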

Frequently Asked Questions

Can Playwright solve 'Element is not clickable at point' errors?

Yes. Playwright performs 'actionability checks'—it won't click until the element is visible, non-animating, and not covered by another element. In Selenium, you would typically need to write a loop with a try-except block to solve this; in Playwright, it is handled automatically.

How do I handle authentication (Login) in Playwright without logging in for every test?

Playwright can save the 'Storage State' (cookies and local storage) to a JSON file after one login. You can then load this state into every browser context, bypassing the login flow and drastically speeding up your suite.
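
A minimal sketch of that flow, assuming the login itself has already been performed in `context`: the two helper names are hypothetical, the `auth.json` file name follows the example used earlier in this article, and `context.storage_state(path=...)` and `browser.new_context(storage_state=...)` are the Playwright calls that save and load the state.

```python
from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Browser, BrowserContext

AUTH_FILE = "auth.json"


def save_auth_state(context: BrowserContext, path: str = AUTH_FILE) -> None:
    """Write the logged-in context's cookies and local storage to `path`."""
    context.storage_state(path=path)


def new_authenticated_context(browser: Browser, path: str = AUTH_FILE) -> BrowserContext:
    """Create a context that starts out logged in, using the saved state."""
    if not Path(path).exists():
        raise FileNotFoundError(f"{path} not found; log in once and save the state first")
    return browser.new_context(storage_state=path)
```

Treat the saved file like a credential: it contains live session cookies, so it is usually kept out of version control.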

Why does `page.goto()` sometimes time out even with auto-waiting?

The default wait_until='load' waits for the page's load event, which can be delayed by large images, slow scripts, or analytics beacons. Use wait_until='domcontentloaded' for faster navigation, and set an explicit limit with page.goto(url, timeout=10000) so that genuinely slow pages fail fast instead of stalling the suite.
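
One way to combine these two settings is sketched below: navigate on DOMContentLoaded, then wait for the one element the test actually needs. The helper name, the default role, and the default timeout are assumptions; `page.goto()` and `Locator.wait_for()` accept the parameters shown.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Page


def open_and_wait(page: Page, url: str, ready_role: str = "main", timeout_ms: float = 10_000) -> None:
    """Navigate without waiting for the full load event (hypothetical helper).

    Instead of waiting for every image and beacon, wait only until an
    element with the given ARIA role is visible.
    """
    page.goto(url, wait_until="domcontentloaded", timeout=timeout_ms)
    page.get_by_role(ready_role).wait_for(state="visible", timeout=timeout_ms)
```

If the page has no landmark with the chosen role, pass a different `ready_role` or swap in another locator that marks the page as ready for the test.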

Can I run Playwright in headless mode on a server without a display?

Yes, headless mode is the default. On Linux, you need to install system dependencies with playwright install --with-deps. For Docker, use Playwright's official image which includes all dependencies.

Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
