website-to-screenshot
Discover all pages of a website via sitemap or full crawl, capture a full-page PNG screenshot of each page, and bundle the results into a single ZIP archive. Always runs asynchronously. Requires a paid plan.
Endpoint
```
POST /v1/convert/website-to-screenshot
Content-Type: application/json
```
Output format: ZIP archive containing one PNG screenshot per discovered page.
Mode: Always asynchronous. Returns HTTP 202 immediately.
Authentication
This endpoint requires a private API key. Public keys are not supported for website capture.
```
X-API-Key: sk_live_your_private_key
```
Request Parameters
Website Discovery Parameters
| Parameter | Type | Required | Default | Description | Plan Gating |
|---|---|---|---|---|---|
| `url` | string | Yes | -- | The website base URL (e.g. `https://example.com`). Used as the root for page discovery. | -- |
| `crawl_mode` | string | No | `"auto"` | URL discovery method. One of `"auto"`, `"sitemap"`, or `"full"`. See Crawl Modes below. | Sitemap requires Starter+, full requires Pro+ |
| `include_patterns` | string[] | No | null | Regex patterns to whitelist discovered URLs. Only used in full crawl mode. | -- |
| `exclude_patterns` | string[] | No | System defaults | Regex patterns to blacklist URLs. Only used in full crawl mode. When omitted, uses built-in defaults that exclude static assets, login/admin/cart pages, and deep pagination. | -- |
Notification Parameters
| Parameter | Type | Required | Default | Description | Plan Gating |
|---|---|---|---|---|---|
| `output_filename` | string | No | Auto-generated | Custom base name for the output ZIP file. A timestamp is appended automatically. | -- |
| `notification_email` | string | No | Project owner email | Email address to notify when the job completes. | -- |
| `callback_url` | string | No | -- | Webhook URL to receive a POST request on completion. | Requires webhook access |
Browser & Rendering Parameters
These settings apply to the capture of each individual page within the website.
| Parameter | Type | Required | Default | Description | Plan Gating |
|---|---|---|---|---|---|
| `viewport_width` | integer | No | 1920 | Browser viewport width in pixels. The screenshot width matches this value. | -- |
| `viewport_height` | integer | No | 1080 | Browser viewport height in pixels. Used as a reference for rendering and viewport-unit calculation. | -- |
| `load_media` | boolean | No | true | Wait for all images and videos to fully load before capture. | -- |
| `enable_scroll` | boolean | No | true | Scroll each page to trigger lazy-loaded content. | -- |
| `handle_sticky_header` | boolean | No | true | Detect sticky/fixed headers and handle them before capture. | -- |
| `handle_cookies` | boolean | No | true | Auto-dismiss cookie consent banners. | -- |
| `wait_for_images` | boolean | No | true | Wait for all `<img>` elements to finish loading. | -- |
Authentication & Custom Requests
| Parameter | Type | Required | Default | Description | Plan Gating |
|---|---|---|---|---|---|
| `auth` | object | No | null | HTTP Basic Auth credentials applied to every page. Format: `{"username": "...", "password": "..."}`. | Requires basic auth access |
| `cookies` | array | No | null | Array of cookie objects injected before each page load. Maximum 50 cookies. | Requires basic auth access |
| `headers` | object | No | null | Custom HTTP headers sent with every request. Maximum 20 headers. | Requires basic auth access |
The `single_page` and `pdf_options` parameters are not applicable to screenshots; each page is always captured as a single full-page PNG image.
Crawl Modes
"auto" (default)
Uses the highest crawl mode your plan allows: a full crawl on plans that support full crawling, otherwise sitemap discovery.
"sitemap"
Discovers pages by parsing the website's `sitemap.xml`:
- Fetches `{base_url}/sitemap.xml` (30-second timeout)
- If the root element is `<sitemapindex>`, recursively fetches each child sitemap
- Extracts all `<url><loc>` entries from `<urlset>` elements
- Returns the full list of discovered URLs

Returns an error if the sitemap is missing, returns a non-200 status, contains invalid XML, or has no URLs.
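The discovery steps above can be sketched as a small recursive parser. This is an illustrative sketch, not the service's actual implementation; the `fetch` callable (mapping a sitemap URL to its XML body, e.g. an HTTP GET with a 30-second timeout) is a hypothetical stand-in:

```python
import xml.etree.ElementTree as ET

def discover_urls(xml_text, fetch):
    """Recursively collect page URLs from sitemap XML.

    `fetch` is a caller-supplied function that returns the XML body
    for a child sitemap URL (hypothetical; in practice an HTTP GET).
    """
    root = ET.fromstring(xml_text)
    tag = root.tag.rsplit("}", 1)[-1]  # strip the XML namespace prefix, if any
    locs = [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "loc"]
    if tag == "sitemapindex":
        # A sitemap index points at child sitemaps; recurse into each one.
        urls = []
        for child in locs:
            urls.extend(discover_urls(fetch(child), fetch))
        return urls
    if tag == "urlset":
        return locs
    raise ValueError("Unrecognized sitemap format")
```

The two branches mirror the documented behavior: `<sitemapindex>` roots recurse, `<urlset>` roots yield their `<loc>` entries, and anything else is an error.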
"full"
Performs a comprehensive two-phase crawl:
Phase 1 -- Seed discovery:
- Parses `robots.txt` for sitemap directives and crawl rules
- Checks standard sitemap paths (`/sitemap.xml`, `/wp-sitemap.xml`, `/sitemap_index.xml`, etc.)
- Discovers RSS/Atom feeds from `<link>` tags and common feed paths
- Extracts seed URLs from all discovered sources

Phase 2 -- Breadth-first link crawl:
- Starts from the base URL plus all seed URLs
- Visits each page and enqueues same-domain links
- Applies `include_patterns` and `exclude_patterns` to filter links
- Respects `robots.txt` rules
- Detects and avoids infinite URL traps (calendar pages, faceted filters, etc.)
- Deduplicates URLs by normalizing scheme, host, and query parameters, and stripping tracking parameters (`utm_*`, `fbclid`, `gclid`, etc.)

Default exclude patterns (when `exclude_patterns` is not provided):
- Static assets: `*.pdf`, `*.zip`, `*.jpg`, `*.png`, `*.gif`, `*.svg`, `*.css`, `*.js`, `*.xml`, `*.json`, `*.mp4`, `*.webm`, `*.woff`, `*.woff2`
- Protected paths: `/login`, `/admin`, `/cart`, `/checkout`
- Deep pagination: URLs with `page=` parameters exceeding 3 digits
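The deduplication step can be illustrated with a small URL normalizer. This is a hypothetical sketch of the idea; the crawler's actual normalization rules may differ:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PARAMS = ("fbclid", "gclid")

def normalize_url(url):
    """Canonicalize a URL for dedup: lowercase scheme/host, drop the
    fragment and tracking parameters, sort the remaining query string."""
    parts = urlsplit(url)
    query = [
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.startswith("utm_") and k not in TRACKING_PARAMS
    ]
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path or "/",
        urlencode(sorted(query)),
        "",  # fragments never identify a distinct page
    ))
```

For example, `normalize_url("HTTPS://Example.com/blog?utm_source=x&id=7")` yields `https://example.com/blog?id=7`, so variants of the same page collapse to one crawl entry.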
Response
202 Accepted (immediate)
```json
{
  "status": "processing",
  "batch_id": "550e8400-e29b-41d4-a716-446655440000",
  "url_count": 42,
  "total_discovered": 42,
  "discovery_method": "sitemap",
  "output_format": "zip"
}
```
| Field | Description |
|---|---|
| `batch_id` | UUID for tracking the job via batch status polling or webhook. |
| `url_count` | Number of pages that will be captured. |
| `total_discovered` | Total pages discovered by the crawl. |
| `discovery_method` | `"sitemap"` or `"full_crawl"`, depending on the effective crawl mode. |
Batch Status Polling
Poll with the batch_id from the 202 response:
```
GET /v1/convert/batch/{batch_id}
X-API-Key: sk_live_your_private_key
```
Returns aggregate status, per-URL statuses, and a presigned download URL for the ZIP when complete. See Batch Status Polling for the full response schema.
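A polling loop is usually wrapped with a capped exponential backoff so clients don't hammer the API. A sketch with the status fetcher injected as a callable (the terminal statuses match those used in the code examples below; the helper name is ours, not part of any SDK):

```python
import time

TERMINAL_STATUSES = {"completed", "partial", "failed"}

def wait_for_batch(fetch_status, base_delay=5, max_delay=60, sleep=time.sleep):
    """Poll `fetch_status()` until the batch reaches a terminal state.

    `fetch_status` is any zero-argument callable returning the batch
    status dict (e.g. an HTTP GET on /v1/convert/batch/{batch_id}).
    Returns the final status dict.
    """
    delay = base_delay
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

Injecting `fetch_status` and `sleep` keeps the loop testable and lets the same helper work with any HTTP client.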
Webhook Callback Payload
When callback_url is provided, Enconvert sends a POST request on completion:
```json
{
  "job_id": "batch-uuid",
  "status": "success",
  "batch_id": "550e8400-e29b-41d4-a716-446655440000",
  "gcs_uri": "env/files/{project_id}/url-to-screenshot/website_20260405_123456789.zip",
  "filename": "website_20260405_123456789.zip",
  "file_size": 12345678,
  "total_tasks": 42,
  "successful_tasks": 40,
  "failed_tasks": 2,
  "tasks": [
    {"url": "https://example.com/", "status": "success", "filename": "example_20260405_001.png"},
    {"url": "https://example.com/about", "status": "success", "filename": "example_20260405_002.png"},
    {"url": "https://example.com/broken", "status": "failed", "error": "Timeout"}
  ]
}
```
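A webhook receiver will usually want to separate failed tasks from successful ones in this payload. A minimal helper sketch, using only the field names shown in the payload above:

```python
def summarize_callback(payload):
    """Split a completion payload's tasks into successful URLs and a
    map of failed URL -> error message."""
    tasks = payload.get("tasks", [])
    ok = [t["url"] for t in tasks if t["status"] == "success"]
    failed = {t["url"]: t.get("error", "unknown")
              for t in tasks if t["status"] == "failed"}
    return ok, failed
```

With the example payload, `failed` would contain `{"https://example.com/broken": "Timeout"}`, which a receiver might log or re-submit as individual captures.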
Email Notification
A completion email is sent to notification_email (or the project owner's email by default) when the job finishes, regardless of success or failure.
Subscription Plan Gating
| Feature | Free | Starter | Pro | Enterprise |
|---|---|---|---|---|
| Website capture | No | Yes | Yes | Yes |
| Sitemap crawl mode | No | Yes | Yes | Yes |
| Full crawl mode | No | No | Yes | Yes |
| Webhook callbacks | No | No | Yes | Yes |
| HTTP Basic Auth | No | Yes | Yes | Yes |
| Cookie injection | No | Yes | Yes | Yes |
| Custom headers | No | Yes | Yes | Yes |
| Batch size limit | 0 | Plan-based | Plan-based | Unlimited |
| Monthly conversions | 100 | Plan-based | Plan-based | Unlimited |
Requests that use a feature not available on the current plan return `403 Forbidden`.
Code Examples
Python (Private Key)
```python
import requests
import time

# Start the website capture
response = requests.post(
    "https://api.enconvert.com/v1/convert/website-to-screenshot",
    headers={"X-API-Key": "sk_live_your_private_key"},
    json={
        "url": "https://example.com",
        "crawl_mode": "sitemap",
        "output_filename": "example-screenshots",
        "viewport_width": 1440,
        "viewport_height": 900
    }
)
data = response.json()
print(f"Batch ID: {data['batch_id']}")
print(f"Pages found: {data['url_count']}")

# Poll for completion
batch_id = data["batch_id"]
while True:
    status = requests.get(
        f"https://api.enconvert.com/v1/convert/batch/{batch_id}",
        headers={"X-API-Key": "sk_live_your_private_key"}
    ).json()
    print(f"Status: {status['status']} ({status['completed']}/{status['total']})")
    if status["status"] in ("completed", "partial", "failed"):
        if status.get("zip_download_url"):
            print(f"Download: {status['zip_download_url']}")
        break
    time.sleep(5)
```
PHP (Private Key)
```php
$ch = curl_init("https://api.enconvert.com/v1/convert/website-to-screenshot");
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => [
        "Content-Type: application/json",
        "X-API-Key: sk_live_your_private_key"
    ],
    CURLOPT_POSTFIELDS => json_encode([
        "url" => "https://example.com",
        "crawl_mode" => "sitemap",
        "output_filename" => "example-screenshots",
        "viewport_width" => 1440,
        "viewport_height" => 900
    ])
]);
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

echo "Batch ID: " . $response["batch_id"] . "\n";
echo "Pages found: " . $response["url_count"] . "\n";
```
Node.js (Private Key)
```javascript
const response = await fetch("https://api.enconvert.com/v1/convert/website-to-screenshot", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-API-Key": "sk_live_your_private_key"
  },
  body: JSON.stringify({
    url: "https://example.com",
    crawl_mode: "sitemap",
    output_filename: "example-screenshots",
    viewport_width: 1440,
    viewport_height: 900
  })
});
const data = await response.json();
console.log(`Batch ID: ${data.batch_id}`);
console.log(`Pages found: ${data.url_count}`);

// Poll for completion
const poll = async () => {
  const status = await fetch(
    `https://api.enconvert.com/v1/convert/batch/${data.batch_id}`,
    { headers: { "X-API-Key": "sk_live_your_private_key" } }
  ).then(r => r.json());
  console.log(`Status: ${status.status} (${status.completed}/${status.total})`);
  if (["completed", "partial", "failed"].includes(status.status)) {
    if (status.zip_download_url) console.log(`Download: ${status.zip_download_url}`);
    return;
  }
  setTimeout(poll, 5000);
};
poll();
```
Go (Private Key)
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]interface{}{
		"url":             "https://example.com",
		"crawl_mode":      "sitemap",
		"output_filename": "example-screenshots",
		"viewport_width":  1440,
		"viewport_height": 900,
	})

	req, _ := http.NewRequest("POST", "https://api.enconvert.com/v1/convert/website-to-screenshot", bytes.NewBuffer(body))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", "sk_live_your_private_key")

	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()

	respBody, _ := io.ReadAll(resp.Body)
	fmt.Println(string(respBody))
}
```
With Webhook Callback
```json
{
  "url": "https://example.com",
  "crawl_mode": "full",
  "callback_url": "https://your-server.com/webhook/enconvert",
  "output_filename": "example-full-site-screenshots",
  "include_patterns": [".*\\/blog\\/.*", ".*\\/docs\\/.*"]
}
```
With Authentication (Password-Protected Site)
```json
{
  "url": "https://staging.example.com",
  "crawl_mode": "sitemap",
  "auth": {
    "username": "admin",
    "password": "staging-password"
  },
  "cookies": [
    {"name": "session_token", "value": "abc123", "domain": "staging.example.com"}
  ]
}
```
Error Responses
| Status | Condition |
|---|---|
| 400 Bad Request | Missing or empty `url` parameter |
| 400 Bad Request | No URLs found in sitemap |
| 400 Bad Request | Timeout fetching sitemap (30-second limit) |
| 400 Bad Request | Non-200 response from sitemap URL |
| 400 Bad Request | Invalid XML in sitemap |
| 400 Bad Request | Unrecognized sitemap format |
| 400 Bad Request | No pages discovered (full crawl found zero URLs) |
| 400 Bad Request | Invalid `auth`, `cookies`, or `headers` structure |
| 402 Payment Required | Monthly conversion limit exceeded by discovered page count |
| 402 Payment Required | Storage limit reached |
| 403 Forbidden | Website crawling not available on current plan (Free plan) |
| 403 Forbidden | Full crawl mode requires Pro plan or higher |
| 403 Forbidden | Discovered page count exceeds batch size limit |
| 403 Forbidden | Feature not available on plan (webhook, basic auth) |
| 500 Internal Server Error | Crawl or capture failure |
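When handling these responses programmatically, it helps to distinguish retryable failures from input and plan problems. A hypothetical classifier sketch based on the table above (the action labels are ours):

```python
def classify_error(status_code):
    """Map an error status from this endpoint to a suggested client action."""
    if status_code == 400:
        return "fix-request"   # bad input or sitemap problem; retrying won't help
    if status_code == 402:
        return "upgrade-plan"  # monthly conversion or storage limit reached
    if status_code == 403:
        return "check-plan"    # feature, crawl mode, or batch size not on this plan
    if status_code >= 500:
        return "retry"         # transient crawl or capture failure
    return "inspect"
```

The key distinction is that only 5xx failures are worth an automatic retry; 4xx responses require changing the request or the plan first.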
Limits
| Limit | Value |
|---|---|
| Sitemap fetch timeout | 30 seconds |
| Global crawl timeout (full mode) | 10 minutes |
| Max crawl depth (full mode) | 10 levels |
| Per-page crawl timeout (full mode) | 30 seconds |
| Crawler memory limit | 512 MB |
| Infinite trap threshold | 20 URLs per URL pattern |
| Robots.txt fetch timeout | 10 seconds |
| Max pages per crawl | Plan's batch size limit |
| Maximum cookies per request | 50 |
| Maximum custom headers per request | 20 |
| Webhook delivery timeout | 30 seconds |
| Monthly conversions | Plan-dependent |
| File retention | Plan-dependent |