Key Takeaways
- pytest for everything: Test functions start with test_, files start with test_. Zero config to get started.
- Fixtures for setup/teardown: @pytest.fixture with yield for cleanup. Share via conftest.py.
- Coverage ≠ quality: 90% coverage with bad tests is worse than 70% coverage with good ones. Test behaviour, not implementation.
- Hypothesis for edge cases: Property-based testing finds bugs humans miss — use it for pure functions and validators.
Introduction
Direct Answer: How do I set up pytest with coverage and CI in Python 2026?
Install: pip install pytest pytest-cov pytest-asyncio hypothesis. Write tests in test_*.py files with def test_*() functions. Run: pytest. With coverage: pytest --cov=myapp --cov-report=term-missing. For async: @pytest.mark.asyncio. For parametrise: @pytest.mark.parametrize("input,expected", [(1,1),(2,4)]). For fixtures: @pytest.fixture def db(): ... in conftest.py. GitHub Actions: create .github/workflows/test.yml with run: pytest --cov=myapp --cov-report=xml and uses: codecov/codecov-action@v4. Run the full suite locally before pushing — a CI pass is only valuable if the tests are meaningful.
Part 1: Basic pytest Setup
pip install pytest pytest-cov pytest-asyncio hypothesis --break-system-packages
mkdir -p myapp tests
touch myapp/__init__.py tests/__init__.py
# myapp/validators.py: module to test
import re


def validate_email(email: str) -> bool:
    """Return True if email format is valid."""
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email.strip()))


def calculate_discount(price: float, discount_pct: float) -> float:
    """Apply a discount percentage to a price."""
    if not 0 <= discount_pct <= 100:
        raise ValueError(f"Discount must be 0-100, got {discount_pct}")
    if price < 0:
        raise ValueError(f"Price cannot be negative, got {price}")
    return round(price * (1 - discount_pct / 100), 2)


def chunk_list(lst: list, n: int) -> list[list]:
    """Split a list into chunks of size n."""
    if n <= 0:
        raise ValueError("Chunk size must be positive")
    return [lst[i:i+n] for i in range(0, len(lst), n)]
# tests/test_validators.py
import pytest
from myapp.validators import validate_email, calculate_discount, chunk_list


class TestValidateEmail:
    def test_valid_email(self):
        assert validate_email("[email protected]") is True

    def test_valid_email_with_subdomain(self):
        assert validate_email("[email protected]") is True

    def test_invalid_no_at(self):
        assert validate_email("not-an-email") is False

    def test_invalid_no_domain(self):
        assert validate_email("user@") is False

    def test_empty_string(self):
        assert validate_email("") is False

    @pytest.mark.parametrize("email,expected", [
        ("[email protected]", True),
        ("[email protected]", True),
        ("no-domain@", False),
        ("@no-user.com", False),
        ("spaces [email protected]", False),
    ])
    def test_parametrized(self, email, expected):
        assert validate_email(email) == expected


class TestCalculateDiscount:
    def test_ten_percent_discount(self):
        assert calculate_discount(100.0, 10) == 90.0

    def test_zero_discount(self):
        assert calculate_discount(100.0, 0) == 100.0

    def test_full_discount(self):
        assert calculate_discount(100.0, 100) == 0.0

    def test_invalid_discount_over_100(self):
        with pytest.raises(ValueError, match="Discount must be 0-100"):
            calculate_discount(100.0, 101)

    def test_negative_price_raises(self):
        with pytest.raises(ValueError, match="Price cannot be negative"):
            calculate_discount(-10.0, 10)


class TestChunkList:
    def test_even_split(self):
        assert chunk_list([1,2,3,4], 2) == [[1,2],[3,4]]

    def test_uneven_split(self):
        assert chunk_list([1,2,3,4,5], 2) == [[1,2],[3,4],[5]]

    def test_empty_list(self):
        assert chunk_list([], 3) == []

    def test_chunk_larger_than_list(self):
        assert chunk_list([1,2,3], 10) == [[1,2,3]]

    def test_invalid_chunk_size(self):
        with pytest.raises(ValueError):
            chunk_list([1,2,3], 0)
pytest -v
Expected output:
tests/test_validators.py::TestValidateEmail::test_valid_email PASSED
tests/test_validators.py::TestValidateEmail::test_invalid_no_at PASSED
...
tests/test_validators.py::TestChunkList::test_even_split PASSED
========================= 20 passed in 0.14s =========================
Part 2: Fixtures and conftest.py
# tests/conftest.py: shared fixtures
import sqlite3

import pytest


@pytest.fixture(scope="session")
def db_path(tmp_path_factory):
    """Create a temporary test database for the entire test session."""
    db = tmp_path_factory.mktemp("data") / "test.db"
    return str(db)


@pytest.fixture(scope="session")
def db_connection(db_path):
    """Session-scoped database connection: created once, shared across all tests."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY, email TEXT UNIQUE, name TEXT, active INTEGER DEFAULT 1
    )""")
    conn.commit()
    yield conn
    conn.close()


@pytest.fixture(autouse=True)
def clean_db(db_connection):
    """Clean database before each test (function-scoped by default)."""
    db_connection.execute("DELETE FROM users")
    db_connection.commit()
    yield
    # No teardown needed: the next test's autouse fixture handles cleanup
# tests/test_db.py: using fixtures
import sqlite3

import pytest


def test_insert_user(db_connection):
    db_connection.execute("INSERT INTO users (email, name) VALUES (?, ?)", ("[email protected]", "Alice"))
    db_connection.commit()
    row = db_connection.execute("SELECT name FROM users WHERE email = ?", ("[email protected]",)).fetchone()
    assert row[0] == "Alice"


def test_unique_email_constraint(db_connection):
    db_connection.execute("INSERT INTO users (email, name) VALUES (?, ?)", ("[email protected]", "Bob"))
    db_connection.commit()
    # A second row with the same email must violate the UNIQUE constraint
    with pytest.raises(sqlite3.IntegrityError):
        db_connection.execute("INSERT INTO users (email, name) VALUES (?, ?)", ("[email protected]", "Carol"))
Part 3: Coverage Report
pytest --cov=myapp --cov-report=term-missing --cov-report=html
Expected output:
---------- coverage: platform linux, python 3.12.3 ----------
Name Stmts Miss Cover Missing
---------------------------------------------------------
myapp/__init__.py 0 0 100%
myapp/validators.py 21 0 100%
---------------------------------------------------------
TOTAL 21 0 100%
Coverage HTML written to dir htmlcov/
========================= 20 passed in 0.47s =========================
Part 4: Hypothesis — Property-Based Testing
# tests/test_properties.py
from hypothesis import given, strategies as st
from myapp.validators import calculate_discount, chunk_list


@given(
    price=st.floats(min_value=0.01, max_value=10000, allow_nan=False, allow_infinity=False),
    discount=st.floats(min_value=0, max_value=100, allow_nan=False)
)
def test_discount_always_lowers_price(price, discount):
    """Property: discount should always reduce or maintain price."""
    result = calculate_discount(round(price, 2), round(discount, 2))
    assert result <= round(price, 2)
    assert result >= 0


@given(
    lst=st.lists(st.integers()),
    n=st.integers(min_value=1, max_value=50)
)
def test_chunking_preserves_all_elements(lst, n):
    """Property: chunking should never lose or add elements."""
    chunks = chunk_list(lst, n)
    flattened = [item for chunk in chunks for item in chunk]
    assert flattened == lst
    assert all(len(chunk) <= n for chunk in chunks)
pytest tests/test_properties.py -v
Expected output:
tests/test_properties.py::test_discount_always_lowers_price PASSED
tests/test_properties.py::test_chunking_preserves_all_elements PASSED
By default Hypothesis runs each test with up to 100 generated inputs, including edge cases like price=0.01, discount=0.0, empty lists, and single-element lists. Add --hypothesis-show-statistics to the pytest command to see how many examples each test actually ran.
Part 5: GitHub Actions CI Pipeline
# .github/workflows/test.yml
name: Test
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12", "3.13"]  # Test on multiple Python versions
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: "pip"
      - name: Install dependencies
        run: |
          pip install -e ".[dev]"  # Assumes pyproject.toml with [project.optional-dependencies.dev]
      - name: Run tests with coverage
        run: |
          pytest --cov=myapp --cov-report=xml --cov-report=term-missing -v
      - name: Enforce coverage threshold
        run: |
          pytest --cov=myapp --cov-fail-under=80  # Fail if coverage drops below 80%
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml
Conclusion
pytest with coverage, fixtures, and hypothesis provides a complete sovereign testing stack. The GitHub Actions pipeline ensures every PR is tested before merge, coverage is tracked, and the test suite runs on multiple Python versions.
People Also Ask
What coverage percentage should I aim for?
80% is a reasonable minimum for production code. 100% is rarely the right goal — it’s easy to write tests that cover lines without testing behaviour. Aim for high coverage on business logic (validation, calculations, data transformations) and lower coverage on boilerplate (config loading, simple getters). Coverage tools show you what’s NOT tested — use them to find gaps in critical paths, not to chase a number.
Part 3: Advanced Fixture Patterns
Fixtures are one of pytest’s most powerful features. Use them to encapsulate setup, teardown, and shared context.
3.1 Session vs function scope
A session-scoped fixture creates resources once per test session, while a function-scoped fixture recreates them for every test.
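import pytest


@pytest.fixture(scope='session')
def data_dir(tmp_path_factory):
    # Created once and reused by every test in the session
    return tmp_path_factory.mktemp('data')


@pytest.fixture
def sample_file(data_dir):
    # Rewritten for every test, so each test starts from known contents
    path = data_dir / 'sample.csv'
    path.write_text('id,name\n1,Alice\n')
    return path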
Use function scope for isolation and session scope for expensive setup such as database containers.
3.2 Auto-use fixtures
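An autouse fixture runs automatically for every test that can see it: the whole directory tree when it is defined in conftest.py, or a single module when it is defined in a test file.
@pytest.fixture(autouse=True)
def enforce_strict(monkeypatch):
    # Affects subprocesses started by the test; the running interpreter
    # already read PYTHONWARNINGS at startup
    monkeypatch.setenv('PYTHONWARNINGS', 'error')
This is useful for enforcing policies, such as no network calls during tests.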
Part 4: Property-Based Testing with Hypothesis
Hypothesis explores edge cases by generating inputs automatically. Use it for validators and pure functions.
from hypothesis import given, strategies as st
from myapp.validators import calculate_discount
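@given(
    price=st.floats(min_value=0, max_value=10000),
    discount_pct=st.floats(min_value=0, max_value=100),
)
def test_calculate_discount_falls_in_range(price, discount_pct):
    result = calculate_discount(price, discount_pct)
    # Compare against the rounded price: calculate_discount rounds to 2 places,
    # so a price like 0.006 legitimately yields 0.01
    assert 0 <= result <= round(price, 2)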
Hypothesis can find floating-point edge cases and unexpected behavior faster than hand-written tests.
Part 5: Async Testing with pytest-asyncio
For async Python code, use pytest.mark.asyncio or anyio.
import pytest
from myapp.api import fetch_data
@pytest.mark.asyncio
async def test_fetch_data_returns_json():
    # The test must be a coroutine function for await to be valid
    response = await fetch_data('http://localhost:8000/data')
    assert response['status'] == 'ok'
Use httpx.AsyncClient for local async HTTP tests and avoid live external services by mocking the network layer.
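Mocking the network layer for async code can be done with the standard library alone. The sketch below assumes a hypothetical fetch_status coroutine that takes its HTTP client as a parameter; the real fetch_data in myapp.api may be shaped differently. It uses asyncio.run so it works without any plugin; under pytest-asyncio the test could be an async def instead.

```python
import asyncio
from unittest.mock import AsyncMock


async def fetch_status(client, url):
    """Hypothetical code under test: await the client and return the status field."""
    response = await client.get(url)
    return response["status"]


def test_fetch_status_uses_mocked_client():
    # AsyncMock makes client.get awaitable without touching the network
    client = AsyncMock()
    client.get.return_value = {"status": "ok"}

    result = asyncio.run(fetch_status(client, "http://localhost:8000/data"))

    assert result == "ok"
    client.get.assert_awaited_once_with("http://localhost:8000/data")
```

Passing the client in as an argument is a design choice that makes the dependency easy to replace in tests.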
Part 6: Mocking and Isolation
Mock external dependencies with unittest.mock.
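from unittest.mock import patch
# Assumption: notify_user lives in a hypothetical myapp/notifications.py that does
# "from myapp.external import send_email"; patch the name where it is looked up
from myapp.notifications import notify_user


@patch('myapp.notifications.send_email')
def test_notify_user(mock_send_email):
    notify_user('[email protected]')
    mock_send_email.assert_called_once()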
Mocking keeps tests deterministic and prevents accidental external side effects.
Part 7: Coverage Reporting and Quality Gates
Use coverage metrics as a gate, but focus on meaningful tests.
pytest --cov=myapp --cov-report=term-missing --cov-report=html
Add --cov-report=xml to also write a coverage.xml file, which CI integrations such as Codecov or SonarQube consume. Set a reasonable threshold like 80% for critical modules, and make exceptions where necessary.
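A per-module threshold with explicit exceptions can be checked by reading coverage.xml directly. The sketch below assumes the Cobertura-style layout that coverage.py writes, with one class element per source file carrying filename and line-rate attributes. The function name, the sample report, and the exempt file are made up for illustration.

```python
import xml.etree.ElementTree as ET


def modules_below_threshold(xml_text, threshold=0.80, exempt=()):
    """Return {filename: line_rate} for every non-exempt file under the threshold."""
    root = ET.fromstring(xml_text)
    failing = {}
    for cls in root.iter("class"):
        filename = cls.get("filename")
        rate = float(cls.get("line-rate", "0"))
        if filename not in exempt and rate < threshold:
            failing[filename] = rate
    return failing


# Hand-written sample report, not real output
SAMPLE = """<coverage line-rate="0.85">
  <packages><package name="myapp"><classes>
    <class name="validators.py" filename="myapp/validators.py" line-rate="1"/>
    <class name="config.py" filename="myapp/config.py" line-rate="0.5"/>
    <class name="api.py" filename="myapp/api.py" line-rate="0.7"/>
  </classes></package></packages>
</coverage>"""

print(modules_below_threshold(SAMPLE, exempt={"myapp/config.py"}))
# {'myapp/api.py': 0.7}
```

In CI, a script like this could exit non-zero when the returned dictionary is not empty.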
Part 8: Docker-Based Test Isolation
Use Docker when the tests require a database or service stack that should not run on the host. A simple docker-compose test environment keeps dependencies isolated.
version: '3.9'
services:
  postgres:
    image: postgres:17
    environment:
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
      POSTGRES_DB: testdb
    ports:
      - '5433:5432'
Run tests against the containerized database and tear it down after the suite completes.
Part 9: GitHub Actions CI Pipeline
A reproducible CI workflow is essential for sovereign development.
name: Python Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install uv --break-system-packages
      - run: uv sync
      - run: uv run pytest --cov=myapp --cov-report=xml
      - uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml
This workflow ensures tests run in a clean environment and coverage reports are generated consistently.
Part 10: Test Data Management
Keep test data small and deterministic. Use fixtures to generate data rather than embedding large files in the repository.
@pytest.fixture
def sample_users():
    return [
        {'id': 1, 'email': '[email protected]'},
        {'id': 2, 'email': '[email protected]'},
    ]
If you need larger datasets, store them in a tests/fixtures folder and load them with json.load() or pandas.read_csv() during tests.
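Loading a stored dataset can be wrapped in a small helper so tests do not repeat path handling. This is a minimal sketch: load_json_fixture is a hypothetical name, and the demo writes its own temporary file so that it runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_json_fixture(name, fixtures_dir):
    """Read one JSON file from the fixtures directory and return the parsed data."""
    with (Path(fixtures_dir) / name).open(encoding="utf-8") as fh:
        return json.load(fh)


# Demo with a throwaway directory standing in for tests/fixtures
with tempfile.TemporaryDirectory() as tmp:
    sample = [{"id": 1, "name": "Alice"}]
    (Path(tmp) / "users.json").write_text(json.dumps(sample), encoding="utf-8")
    users = load_json_fixture("users.json", tmp)
```

In a real suite, fixtures_dir would typically be Path(__file__).parent / "fixtures", and the helper could be exposed through a pytest fixture in conftest.py.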
Part 11: Security and Dependency Checks
Run pip-audit or safety as part of your test pipeline to catch vulnerable dependencies.
uv run pip-audit
This adds a security dimension to your test workflow and keeps dependencies in check.
Part 12: Regression Tests and Baselines
When a bug is fixed, add a regression test to prevent it from returning. Keep a dedicated regression suite for critical business logic.
Part 13: Ownership and Test Documentation
Document your testing practices in a TESTING.md file. Include:
- how to run tests locally
- how to add new tests
- CI expectations
- test naming conventions
This documentation helps new contributors follow your sovereign Python standards.
Part 16: Test Strategy and the Testing Pyramid
A strong test suite balances unit, integration, and end-to-end tests.
- unit tests: fast, isolated, logic-level checks
- integration tests: verify interaction between components (e.g. DB + API)
- end-to-end tests: validate the full user flow
Aim for a high volume of unit tests, a moderate number of integration tests, and a small number of E2E tests. This keeps the suite fast and maintainable.
Part 17: Mutation Testing and Test Quality
Mutation testing measures whether your tests catch intentional code changes. Tools like mutmut or cosmic-ray can help.
A simple mutation test example:
uv run mutmut run
uv run mutmut results
Use mutation testing sparingly to identify weak assertions and improve test coverage quality.
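To make the idea concrete, here is a hand-made illustration of what a mutation tool checks. It is not how mutmut or cosmic-ray work internally; the mutant is written by hand, and the function names are hypothetical stand-ins for calculate_discount.

```python
def apply_discount(price, pct):
    """Original implementation."""
    return round(price * (1 - pct / 100), 2)


def apply_discount_mutant(price, pct):
    """Mutant: the minus sign has been flipped to a plus."""
    return round(price * (1 + pct / 100), 2)


def weak_test(fn):
    # Only checks the zero-discount case, where both versions agree
    return fn(100.0, 0) == 100.0


def strong_test(fn):
    # Pins down the direction of the discount, so the mutant is caught
    return fn(100.0, 10) == 90.0


print("weak test kills mutant:  ", not weak_test(apply_discount_mutant))
print("strong test kills mutant:", not strong_test(apply_discount_mutant))
```

A mutant that survives, as it does against the weak test here, points at an assertion worth strengthening.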
Part 18: Test Failure Debugging
When a test fails, reproduce it locally with pytest -k substring. Use the -vv flag for verbose output.
pytest -k "test_invalid_discount" -vv
If a test is flaky, isolate it and use pytest-rerunfailures in CI as a temporary mitigation while you fix the root cause.
Part 19: Mocking External Services
For local sovereignty, avoid hitting external services during tests. Use mocking for HTTP, database, or file storage calls.
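from unittest.mock import patch
# Assumption: fetch_remote_data is defined in myapp/external.py, which imports requests
from myapp.external import fetch_remote_data


@patch('myapp.external.requests.get')
def test_fetch_remote_data(mock_get):
    mock_get.return_value.json.return_value = {'status': 'ok'}
    assert fetch_remote_data('http://example.com') == 'ok'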
Mocking keeps tests deterministic and ensures they can run offline.
Part 20: Test Data Factories and Builders
Use data factories to generate realistic test objects without repetition.
import factory
from myapp.models import User


class UserFactory(factory.Factory):
    class Meta:
        model = User

    username = factory.Faker('user_name')
    email = factory.Faker('email')
Factories make tests easy to read and maintain, especially when the same object shape is reused.
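If adding factory_boy is not an option, the builder idea can be sketched in plain Python. This is a swapped-in alternative, not the factory_boy API: the User dataclass is a stand-in for myapp.models.User, and make_user is a hypothetical helper.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class User:
    """Stand-in for myapp.models.User, for illustration only."""
    username: str
    email: str
    active: bool = True


_sequence = count(1)


def make_user(**overrides):
    """Build a User with unique defaults; keyword arguments override any field."""
    n = next(_sequence)
    fields = {"username": f"user{n}", "email": f"user{n}@example.test"}
    fields.update(overrides)
    return User(**fields)
```

A test then states only what matters to it, for example make_user(active=False), and leaves the remaining fields to the defaults.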
Part 21: Local Test Coverage and Quality Gates
Use coverage reports in CI and local dev. Fail the build if coverage drops below your threshold.
Example pytest.ini snippet:
[pytest]
addopts = --cov=myapp --cov-fail-under=75
This ensures your tests provide a minimum level of coverage even as the code evolves.
Part 22: Docker-Based Test Isolation for Databases
A containerized database makes integration tests safe.
docker compose up -d postgres
uv run pytest -q
docker compose down -v
Keep the test database configuration separate from production and use environment variables for credentials.
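Reading credentials from the environment can be as small as one helper. In the sketch below the TEST_DB_* variable names and the function name are assumptions. The defaults mirror the test/test/testdb credentials and the 5433 host port from the compose example earlier in this guide.

```python
import os


def build_test_database_url(env=None):
    """Assemble a PostgreSQL URL from environment variables, with test defaults."""
    env = os.environ if env is None else env
    user = env.get("TEST_DB_USER", "test")          # hypothetical variable names
    password = env.get("TEST_DB_PASSWORD", "test")
    host = env.get("TEST_DB_HOST", "localhost")
    port = env.get("TEST_DB_PORT", "5433")
    name = env.get("TEST_DB_NAME", "testdb")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


print(build_test_database_url({}))
# postgresql://test:test@localhost:5433/testdb
```

Accepting the mapping as a parameter keeps the helper itself easy to test without touching the real environment.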
Part 23: Testing Asynchronous Workflows
For async code, use pytest-asyncio or asyncio fixtures.
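import pytest
from httpx import ASGITransport, AsyncClient
# Assumption: the ASGI application object is importable as myapp.main.app
from myapp.main import app


@pytest.mark.asyncio
async def test_async_endpoint():
    # httpx 0.28 removed the app= shortcut; pass an ASGITransport instead
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url='http://test') as client:
        response = await client.get('/ping')
    assert response.status_code == 200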
Async tests should be stable and avoid relying on network timing.
Part 24: Contract and Schema Testing
If your application exposes APIs, test the contract with schema validation.
Use pydantic models or JSON Schema to validate responses.
from pydantic import BaseModel
class UserResponse(BaseModel):
    id: int
    email: str
    active: bool
Then validate the API response in a test:
user = UserResponse.model_validate(response.json())  # Pydantic v2; parse_obj is the deprecated v1 spelling
This catches schema drift early.
Part 25: Security-Focused Tests
Add tests for common security concerns such as injection, invalid input, and permission boundaries.
Examples:
- ensure SQL injection inputs do not crash queries
- verify unauthorized users cannot access protected endpoints
- check that secrets are not included in logs
Security tests should be part of the standard suite, not an afterthought.
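The first item in the list above can be sketched with the standard library. The example assumes a hypothetical find_user helper and an in-memory SQLite table. The point is that a parameterized query treats the payload as plain data.

```python
import sqlite3


def find_user(conn, email):
    # The ? placeholder makes the driver treat `email` strictly as a value
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


def test_injection_string_is_treated_as_data():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.test",))

    payload = "' OR '1'='1"
    # No row matches the literal payload, and the query does not raise
    assert find_user(conn, payload) == []
    # The table is untouched
    assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1
    conn.close()
```

The same shape works for other payloads, such as strings containing semicolons or comment markers.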
Part 26: Performance Regression Tests
For critical code paths, add very small performance assertions.
import time
start = time.perf_counter()
result = expensive_operation()
assert time.perf_counter() - start < 0.5
These tests can catch regressions in hot paths, but keep them generous enough to avoid flaky failures.
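One way to keep such a check generous is to time several runs and compare the fastest one against the budget, which filters out one-off scheduling noise. In this sketch best_of is a hypothetical helper and expensive_operation is a stand-in workload, not code from the application.

```python
import time


def best_of(func, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def expensive_operation():
    # Stand-in workload for illustration only
    return sum(i * i for i in range(10_000))


def test_expensive_operation_stays_fast():
    # Budget is deliberately loose; tighten it only with measurements
    assert best_of(expensive_operation) < 0.5
```

For anything more precise than a coarse budget, a dedicated tool such as pytest-benchmark is likely a better fit.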
Part 27: Local Test Suite Maintenance
Review the test suite periodically. Remove obsolete tests, refactor duplicated fixtures, and update coverage goals as the codebase evolves.
A healthy test suite is one that is fast, reliable, and relevant.
Part 28: Developer Onboarding and Test Culture
Encourage developers to run the test suite before commits, and make it easy to do so with shortcuts.
Example Makefile:
.PHONY: test coverage
test:
	uv run pytest
coverage:
	uv run pytest --cov=myapp --cov-report=term-missing
A make test command reduces friction and helps enforce the testing culture.
Part 29: Canary and Staging Test Environments
Run tests in a staging environment before production. This can be a local VM, a containerized stack, or a dedicated staging server.
The staging environment should mirror production dependencies and configuration as closely as possible.
Part 30: Final Sovereign Testing Checklist
- unit tests cover the core logic
- integration tests verify component interactions
- E2E tests exercise the full flow
- fixtures are reusable and isolated
- async code is tested with pytest-asyncio
- external calls are mocked for offline testing
- coverage thresholds are enforced in CI
- documentation explains how to run and extend tests
- security and contract tests are included
- restore and rerun regeneration scripts are documented
Part 31: Test Automation and Local Developer Feedback
Use watch mode or test runners that re-run tests on file changes. This gives developers quick feedback and encourages test-driven development.
Example with pytest-watch:
uv run ptw --onfail "echo tests failed"
This keeps the feedback loop tight and encourages better test hygiene.
Part 32: Test Dependency Management
Keep test dependencies separated from runtime dependencies. Use a requirements-dev.txt file or a separate environment.
Example:
pytest
pytest-asyncio
pytest-cov
factory-boy
mypy
This makes it easier to reproduce the test environment locally.
Part 33: Static Analysis and Type Checking
Combine tests with static analysis. Run mypy, ruff, or pyright as part of your testing pipeline.
A command like this is useful:
uv run ruff check myapp tests
uv run mypy myapp
Static checks catch issues before the code reaches runtime tests.
Part 34: Test Data Security
Treat test fixtures as code. Do not embed real production data or secrets inside test files.
If you need realistic data shapes, anonymize or synthesize it.
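Synthesized data can stay deterministic by seeding the random generator. The sketch below uses hypothetical names and the reserved example.test domain, so no generated address can belong to a real person.

```python
import random

FIRST_NAMES = ["Alice", "Bob", "Carol", "Dave"]


def synthetic_users(n, seed=0):
    """Generate n fake users; the same seed always yields the same list."""
    rng = random.Random(seed)  # private generator, independent of global state
    users = []
    for i in range(1, n + 1):
        name = rng.choice(FIRST_NAMES)
        users.append(
            {"id": i, "name": name, "email": f"{name.lower()}{i}@example.test"}
        )
    return users
```

Because the generator is seeded, a failing test can be reproduced exactly by rerunning with the same seed.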
Part 35: Acceptance and Behavior Tests
Acceptance tests validate end-to-end behavior from the user perspective. Use tools such as selenium, playwright, or API-level acceptance frameworks.
Write acceptance tests for the most critical workflows and run them periodically, not necessarily on every commit.
Part 36: Test Result Reporting and Diagnostics
Make test results easy for developers to analyze. Use pytest --junitxml=results.xml with a CI test report viewer, or generate a local HTML report.
This helps teams see failing tests, durations, and flaky behavior quickly.
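A JUnit XML report can also be summarized with a few lines of standard-library code. The sketch below assumes the usual testcase elements with classname, name, and time attributes. The function names and the sample report are made up for illustration.

```python
import xml.etree.ElementTree as ET


def slowest_tests(junit_xml, limit=3):
    """Return (seconds, test id) pairs for the slowest test cases."""
    root = ET.fromstring(junit_xml)
    timed = [
        (float(case.get("time", "0")), f'{case.get("classname")}::{case.get("name")}')
        for case in root.iter("testcase")
    ]
    return sorted(timed, reverse=True)[:limit]


def failed_tests(junit_xml):
    """Return the names of test cases that contain a failure or error element."""
    root = ET.fromstring(junit_xml)
    return [
        case.get("name")
        for case in root.iter("testcase")
        if case.find("failure") is not None or case.find("error") is not None
    ]


# Hand-written sample report, not real output
SAMPLE_REPORT = """<testsuites><testsuite name="pytest" tests="3" failures="1">
  <testcase classname="tests.test_validators" name="test_valid_email" time="0.001"/>
  <testcase classname="tests.test_db" name="test_insert_user" time="0.250"/>
  <testcase classname="tests.test_db" name="test_unique_email_constraint" time="0.040">
    <failure message="assert 1 == 2">traceback text</failure>
  </testcase>
</testsuite></testsuites>"""

print(slowest_tests(SAMPLE_REPORT, limit=2))
print(failed_tests(SAMPLE_REPORT))
```

For a quick look without any script, pytest's built-in --durations=10 option prints the ten slowest tests at the end of a run.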
Part 37: Flaky Test Management
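Identify flaky tests and quarantine them with a marker until they are fixed. A bare marker only labels the test, so register it in your pytest configuration and deselect it with pytest -m "not flaky" in the main pipeline.
import pytest


@pytest.mark.flaky
def test_unstable_behavior():
    assert some_unstable_function()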
Keep the majority of the suite stable and fast.
Part 38: Final Developer-Focused Checklist
- tests run quickly on local machines
- test failures provide clear diagnostics
- test dependencies are isolated
- static analysis runs alongside tests
- no production secrets are used in test fixtures
- acceptance tests cover high-value workflows
- flaky tests are tracked and improved
- documentation explains the testing workflow
Part 39: Test Failure Patterns and Root Cause Analysis
When tests fail repeatedly, look for patterns such as order dependence, shared state, or timing issues.
A common root cause is insufficient isolation between tests. Make sure fixtures clean up after themselves and do not share mutable global state.
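A fixture that snapshots and restores shared state is one way to remove this kind of order dependence. In the sketch below REGISTRY stands for any module-level mutable object, and all names are hypothetical.

```python
from contextlib import contextmanager

import pytest

REGISTRY = {}  # module-level state that every test in the process can see


def register(name, value):
    REGISTRY[name] = value


@contextmanager
def isolated_registry():
    """Snapshot REGISTRY on entry and restore it on exit, even after a failure."""
    snapshot = dict(REGISTRY)
    try:
        yield REGISTRY
    finally:
        REGISTRY.clear()
        REGISTRY.update(snapshot)


@pytest.fixture
def clean_registry():
    # Tests that request this fixture cannot leak entries into later tests
    with isolated_registry() as registry:
        yield registry
```

Running the suite in random order, for example with the pytest-randomly plugin, is a common way to surface tests that still depend on leftover state.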
Part 40: Local Regression Testing and Snapshot Tests
Use snapshot tests for stable outputs that should not change over time. This is useful for CLI output, serialized JSON, or generated documentation.
Example with pytest-approvaltests:
from approvaltests import verify


def test_rendered_output():
    verify(render_report())
Snapshot tests capture regressions in structured output.
Part 41: Test Environment Parity
Keep the test environment close to production. That means matching Python versions, package dependencies, and configuration options.
If your production app runs in Docker, run the test suite in the same image for the most accurate validation.
Part 42: Managing Test Flows and Multi-Step Scenarios
For workflows that involve multiple steps, write tests that assert each step explicitly.
For example, a sign-up workflow test can verify:
- user creation
- email confirmation token generation
- profile initialization
Breaking multi-step scenarios into assertions helps reveal exactly where a failure occurs.
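The sign-up steps listed above can be sketched as one test with a labeled assertion per step. Everything here is hypothetical: the three functions and the dictionary used as a database are stand-ins for real application code.

```python
import secrets


def create_user(db, email):
    user = {"id": len(db["users"]) + 1, "email": email, "confirmed": False}
    db["users"].append(user)
    return user


def issue_confirmation_token(db, user):
    token = secrets.token_urlsafe(16)
    db["tokens"][token] = user["id"]
    return token


def init_profile(db, user):
    profile = {"user_id": user["id"], "display_name": user["email"].split("@")[0]}
    db["profiles"][user["id"]] = profile
    return profile


def test_sign_up_workflow():
    db = {"users": [], "tokens": {}, "profiles": {}}

    user = create_user(db, "alice@example.test")
    assert user["id"] == 1, "step 1: user creation"

    token = issue_confirmation_token(db, user)
    assert db["tokens"][token] == user["id"], "step 2: confirmation token"

    profile = init_profile(db, user)
    assert profile["display_name"] == "alice", "step 3: profile initialization"
```

The message after each assert names the step, so a failure report shows where the workflow broke without reading the traceback in detail.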
Part 43: Local Performance Regression Monitoring
Track test suite duration over time. If test run time grows significantly, identify the slow tests and decide whether to optimize them or move them to a separate pipeline.
A local developer workflow should stay under a few minutes for the core suite.
Part 44: Final Testing Operations Summary
Review your testing operations periodically, confirm that onboarding docs are current, and ensure the suite remains fast, deterministic, and comprehensive.
Part 45: Local Test Environment Hygiene
Keep temporary files and test artifacts isolated. Use temporary directories and tmp_path fixtures so tests do not leave residual files behind.
def test_writes_file(tmp_path):
    path = tmp_path / 'test.txt'
    path.write_text('hello')
    assert path.read_text() == 'hello'
45.1 Cleanup and orphaned resources
Ensure that any external resources created by tests are cleaned up. If a test starts a local server, stop it at the end of the test.
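A fixture with a try/finally teardown guarantees the server stops even when the test body fails. The sketch below uses only the standard library for the server. The handler, helper, and fixture names are hypothetical, and the server listens on a free localhost port chosen by the operating system.

```python
import http.client
import threading
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import pytest


class _OkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


@contextmanager
def running_server():
    """Start a local HTTP server and always shut it down on exit."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), _OkHandler)  # port 0: OS picks
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield server.server_address  # (host, port)
    finally:
        server.shutdown()       # stop serve_forever
        server.server_close()   # release the socket
        thread.join(timeout=5)


def get_body(address):
    host, port = address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()


@pytest.fixture
def local_server():
    with running_server() as address:
        yield address
```

A test that requests local_server receives the address and can call get_body(address); the teardown runs whether the test passes or fails.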
45.2 Test environment reproducibility
Keep a conftest.py that defines shared fixtures and environment setup. This makes it easy to reproduce the test configuration across machines.
A clean test environment reduces failures caused by developer-specific settings and makes the suite more reliable.
Part 46: Local Test Review and Refactor
Periodically review the test suite and refactor high-maintenance tests. Remove duplicated setup, simplify assertions, and keep the suite readable.
A cleaner test suite reduces cognitive load and makes it easier for developers to keep tests current.
Part 47: Developer Collaboration and Review
Encourage code review of tests as well as production code. A strong review process catches brittle assumptions, duplicated effort, and hidden dependencies.
Reviewers should focus on whether tests are meaningful, whether they would fail if the logic changes, and whether they help document the intended behavior.
Further Reading
- Python 3.12 Getting Started Guide 2026: Python setup before testing
- GitHub Actions Tutorial 2026: CI/CD foundation
- Build a REST API with FastAPI: testing FastAPI routes with httpx
Tested on: Ubuntu 24.04 LTS. pytest 8.3.4, hypothesis 6.112.0. Last verified: May 2, 2026.