test: add e2e tests for Crawlee crawlers as Apify Actors #784
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## master #784 +/- ##
==========================================
- Coverage 82.06% 81.57% -0.49%
==========================================
Files 46 46
Lines 2698 2698
==========================================
- Hits 2214 2201 -13
- Misses 484 497 +13
Flags with carried forward coverage won't be shown.
force-pushed from df6a90a to 3f14ef3
@janbuchar, @Mantisus - not necessarily requesting a review, that's up to you; mostly just keeping you informed about the SDK test suite improvements.
Add 6 e2e tests (one per crawler type) verifying that each Crawlee crawler works correctly when deployed as an Actor on the Apify platform. Each test exercises link discovery, data extraction (push_data), and KVS storage against a local 5-page e-commerce test server. Crawlers covered: BasicCrawler, HttpCrawler, BeautifulSoupCrawler, ParselCrawler, PlaywrightCrawler, AdaptivePlaywrightCrawler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
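As a stdlib-only illustration of the extraction step these tests verify, a small parser can pull product titles out of a listing page. The markup (`<h3 class="product-title">`) is an assumption for the sketch, not the real test-server fixture:

```python
from html.parser import HTMLParser


class ProductTitleParser(HTMLParser):
    """Collect text from <h3 class="product-title"> elements (assumed markup)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.titles: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        if tag == 'h3' and ('class', 'product-title') in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'h3':
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())
```

In the real tests this role is played by the crawler's own parsing layer (BeautifulSoup, Parsel, Playwright selectors, etc.); the sketch only shows what "data extraction" means here.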
The Playwright browser process uses ~244 MB at startup, exceeding the 256 MB default. Both the PlaywrightCrawler and AdaptivePlaywrightCrawler tests timed out due to memory pressure. Add a memory_mbytes parameter to make_actor and set it to 1024 MB for the Playwright tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
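The memory bump can be expressed as a small helper; the function name is hypothetical, while the 256 MB default and the 1024 MB Playwright value come from the commit message:

```python
def actor_memory_mbytes(crawler_type: str) -> int:
    """Pick the Actor memory limit for an e2e test run (hypothetical helper).

    The Playwright browser process alone needs ~244 MB at startup, which
    exceeds the 256 MB platform default, so browser-based crawlers get 1 GB.
    """
    if 'playwright' in crawler_type.lower():
        return 1024
    return 256
```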
…wler e2e tests

Move Actor source code from triple-quoted string constants into standalone files under actor_source/, so they benefit from syntax highlighting, linting, and type-checking. Load them at runtime via Path.read_text() helpers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These functions are imported across modules, so they are part of the test package's public API and shouldn't use the private naming convention.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…traints

The env var was never set anywhere. Use sys.version_info directly. Also drop version constraints from additional_requirements in the e2e tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
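Deriving the version directly from the interpreter might look like this (helper name is an assumption):

```python
import sys


def running_python_version() -> str:
    """Return 'major.minor' of the current interpreter (hypothetical helper).

    Reads sys.version_info directly instead of an environment variable
    that was never set anywhere.
    """
    return f'{sys.version_info.major}.{sys.version_info.minor}'
```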
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r e2e tests

Consolidate the two separate server.py files (actor_source_base and test_crawlee_crawlers/actor_source) into a single base server with a category-based depth structure and an infinite /deep/N chain. Add max_crawl_depth=2 to all crawler constructors to test depth limiting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
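The infinite /deep/N chain can be sketched as a page generator in which page N links only to page N+1, so the crawl terminates only through max_crawl_depth (the exact markup the shared server emits is an assumption):

```python
def deep_page_html(n: int) -> str:
    """Render page N of the infinite /deep/N chain (assumed markup).

    Each page links to the next one, so without a depth limit the
    crawl would never end; max_crawl_depth=2 cuts it off.
    """
    return (
        f'<html><body><h1>Depth {n}</h1>'
        f'<a href="/deep/{n + 1}">deeper</a></body></html>'
    )
```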
Add direct product links to the base server homepage so Scrapy spiders (which look for /products/ links on the start page) work without their own server.py. Now all e2e tests share a single server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update test_actor_on_platform_max_crawl_depth to use the /deep/N URL pattern from the shared server instead of the old infinite pagination URLs that no longer exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
force-pushed from 9f3cb17 to 5277d25
Maybe I'll be able to check this out in more detail later; for now I'll just share one thing - we should collocate the "Test Actors" with their test cases instead of having two big folders, actors and tests.
I absolutely agree. However, since this PR only adds new E2E tests to the existing setup, I'd prefer not to change the structure here. Let's handle this in a dedicated PR and discuss the structure beforehand.
Summary
- Each test exercises link discovery (enqueue_links/add_requests), data extraction (push_data), and KVS storage (Actor.set_value).
- conftest.py provides an ASGI test server, a Playwright Dockerfile template, product data expectations, and a _verify_crawler_results helper that checks run status, dataset contents, and KVS records.

Motivation
Issue
Test plan
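A minimal sketch of the verification helper described in the summary; the data shapes are assumptions for illustration, while the real helper works against the Apify client's run, dataset, and key-value store objects:

```python
def verify_crawler_results(run, dataset_items, kvs_record, min_items=1):
    """Check the three things each e2e test asserts (hypothetical shapes):

    the Actor run succeeded, the dataset holds at least min_items scraped
    items, and the key-value store record exists.
    """
    assert run['status'] == 'SUCCEEDED', f"run ended as {run['status']}"
    assert len(dataset_items) >= min_items, 'dataset has too few items'
    assert kvs_record is not None, 'missing key-value store record'
```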