Performance and Capacity
This page covers capacity planning, health check behavior, burst traffic patterns, and performance best practices for your private fleet.
Fleet Capacity
Your fleet's total concurrency is the product of the number of workers and the per-worker concurrency limit.
Total capacity = workers × per-worker concurrency
Example: 3 workers × 6 sessions = 18 concurrent sessions
Set your client-side concurrency limit below total fleet capacity to leave headroom for burst variance. For example, with 18 total slots, limiting client concurrency to 15–16 prevents the queue from filling during minor load spikes.
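The arithmetic above can be sketched in a few lines. The helper names and the 15% headroom fraction are illustrative assumptions, not Browserless settings; only the capacity formula comes from this page.

```typescript
// Hypothetical helper types and names, for illustration only.
interface FleetConfig {
  workers: number;
  perWorkerConcurrency: number;
}

// Total capacity = workers × per-worker concurrency.
function totalCapacity(fleet: FleetConfig): number {
  return fleet.workers * fleet.perWorkerConcurrency;
}

// Reserve a fraction of slots as headroom (the fraction is an assumption to tune).
function clientConcurrencyLimit(fleet: FleetConfig, headroomFraction = 0.15): number {
  const capacity = totalCapacity(fleet);
  const reserved = Math.ceil(capacity * headroomFraction);
  return Math.max(1, capacity - reserved);
}

const exampleFleet: FleetConfig = { workers: 3, perWorkerConcurrency: 6 };
console.log(totalCapacity(exampleFleet));          // 18
console.log(clientConcurrencyLimit(exampleFleet)); // 15
```

With 3 workers at 6 sessions each, this reserves 3 of the 18 slots, which lands in the 15–16 range suggested above.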
For concurrency and queue configuration, see Worker Settings.
Health Checks and Rejection Thresholds
Workers monitor CPU and memory utilization before accepting new sessions. When a worker's CPU or memory exceeds 90%, it rejects incoming requests with:
HTTP 500: "Health checks have failed, rejecting"
This is expected behavior under sustained overload: the worker is protecting itself from becoming unresponsive. If this occurs regularly, reduce burst load, stagger requests (see Burst Traffic), or add workers to the fleet.
Health check rejections may not appear in standard dashboard metrics. If you see unexplained HTTP 500 errors with this message, your workers are hitting resource limits.
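Because these rejections may be missing from the dashboard, it can help to count them on the client. The sketch below matches the status code and message quoted above; the `ResponseSummary` shape and the `RejectionCounter` class are hypothetical names, and how you capture the response body depends on your HTTP client.

```typescript
// Message text as quoted in this section; everything else is illustrative.
const HEALTH_REJECTION_MESSAGE = "Health checks have failed, rejecting";

// Hypothetical minimal view of a response: status code plus body text.
interface ResponseSummary {
  status: number;
  body: string;
}

// True only for HTTP 500 responses that carry the health check message.
function isHealthCheckRejection(res: ResponseSummary): boolean {
  return res.status === 500 && res.body.includes(HEALTH_REJECTION_MESSAGE);
}

// Tallies rejections so the rate can be exported to your own metrics.
class RejectionCounter {
  private rejected = 0;
  private total = 0;

  record(res: ResponseSummary): void {
    this.total += 1;
    if (isHealthCheckRejection(res)) this.rejected += 1;
  }

  // Fraction of recorded responses that were health check rejections.
  rate(): number {
    return this.total === 0 ? 0 : this.rejected / this.total;
  }
}
```

A rejection rate that stays above zero across several runs is a sign that workers are hitting resource limits regularly rather than in a one-off spike.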
Burst Traffic
When many requests arrive simultaneously (for example, from a scheduled job), all requests compete for the same workers at the same instant. Browser launch is the most CPU-intensive phase, so a burst of launches can overload a worker even when the fleet has available capacity overall.
Stagger your initial requests by 5–10 seconds. Once the first batch of browsers is running, subsequent requests queue naturally and don't spike CPU in the same way. Even spreading the first few requests provides significant relief.
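One way to stagger a batch on the client is to spread the start times evenly across a window. This is a minimal sketch: `launchStaggered` and `pause` are hypothetical helpers, and reading the 5–10 seconds as the total window for the initial batch (rather than the gap between each pair of requests) is an assumption.

```typescript
// Resolves after the given number of milliseconds.
const pause = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Starts each task in order, spacing the starts evenly across spreadWindowMs,
// then waits for all of them to finish.
async function launchStaggered<T>(
  tasks: Array<() => Promise<T>>,
  spreadWindowMs: number,
): Promise<T[]> {
  const gapMs = tasks.length > 1 ? spreadWindowMs / tasks.length : 0;
  const running: Promise<T>[] = [];
  let first = true;
  for (const task of tasks) {
    if (!first) await pause(gapMs);
    first = false;
    const started = task();
    // Mark early failures as handled; Promise.all below still reports them.
    started.catch(() => undefined);
    running.push(started);
  }
  return Promise.all(running);
}
```

For example, `launchStaggered(jobs, 10_000)` would start the jobs across roughly ten seconds instead of all at once, where each entry in `jobs` is a function that opens a session and does its work.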
Performance Degradation Over Time
Workers can exhibit gradual performance degradation over weeks of continuous operation: session connection times increase, page load times lengthen, and screenshot or PDF generation slows. This degradation is visible in your own client-side timing metrics but may not appear in the dashboard.
When you observe an upward trend in session times over days:
- Perform a Relaunch from the dashboard. This provisions new VMs and typically restores baseline performance.
- If Relaunch from the dashboard does not improve performance, contact Browserless support to request a relaunch onto new hardware.
See Fleet Operations for the distinction between Restart and Relaunch.
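Since this degradation shows up only in client-side timings, a simple trend check over your own measurements can flag it. The sketch below fits a least-squares line to one sample per day (for instance, the daily median session connection time); the function names and the threshold are illustrative assumptions, not Browserless guidance.

```typescript
// Least-squares slope of the samples against their index (0, 1, 2, ...).
// With one sample per day, the result is in "units per day".
function trendSlope(samples: number[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((sum, y) => sum + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  samples.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) * (x - meanX);
  });
  return numerator / denominator;
}

// Flags an upward trend once the slope exceeds a threshold you choose.
function isDegrading(dailyMediansMs: number[], thresholdMsPerDay: number): boolean {
  return trendSlope(dailyMediansMs) > thresholdMsPerDay;
}

// Hypothetical daily median connection times in milliseconds.
const dailyMediansMs = [800, 820, 850, 870, 900];
console.log(trendSlope(dailyMediansMs)); // 25
```

A sustained positive slope over several days is the kind of signal that would prompt the Relaunch described above.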
Client-Side Best Practices
These practices reduce unnecessary resource consumption on your workers:
- Always close browser connections in finally or catch blocks. Unclosed sessions occupy a concurrency slot until the global timeout fires.
- Use shorter global timeouts. A 5-minute timeout means a crashed session occupies a slot for up to 5 minutes. A 1–2 minute timeout cleans up zombie sessions faster. See Worker Settings to configure the global timeout.
- Avoid serializing large objects inside browser sessions. Logging or serializing large DOM structures or data objects causes memory spikes on the worker.
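The first practice can be sketched as a small wrapper that always closes the session. `ClosableSession` and `withSession` are hypothetical names; the only assumption about your automation library is that its browser object exposes an async `close()` method.

```typescript
// Hypothetical minimal shape of a browser connection.
interface ClosableSession {
  close(): Promise<void>;
}

// Opens a session, runs the work, and closes the session on success or failure.
async function withSession<S extends ClosableSession, R>(
  connect: () => Promise<S>,
  work: (session: S) => Promise<R>,
): Promise<R> {
  const session = await connect();
  try {
    return await work(session);
  } finally {
    // Runs whether work() resolved or threw, so the concurrency slot is
    // released without waiting for the global timeout. A failure inside
    // close() is ignored here so it cannot mask the original error.
    await session.close().catch(() => undefined);
  }
}
```

Routing every session through one wrapper like this keeps the cleanup in a single place, rather than relying on each call site to remember its own finally block.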
Special Instance Types
Standard instances cover the requirements of 99% of private deployments. For specific workloads, Browserless can provision special instances. A few examples:
- GPU-enabled instances: suited for rendering-heavy workloads such as maps, complex data visualizations, or WebGL content.
- Apple M1 instances: suited for workloads that benefit from ARM-native performance.
To request a special instance type, contact Browserless support. Changing instance type requires a Relaunch.