Efficiency gains that pay off
Server-like concurrency, in a serverless world
In-function concurrency enables a single Vercel Function to handle multiple invocations simultaneously, optimizing resource usage and turning efficiency gains into savings.
Traditional serverless compute, by contrast, leaves resources idle while a function waits on I/O. Sharing an instance across invocations significantly reduces your compute footprint and improves resource efficiency.
“Many of our API endpoints were lightweight and involved external requests, resulting in idle compute time. By leveraging in-function concurrency, we were able to share compute resources between invocations, cutting costs by over 50% with zero code changes.”
AI workloads
Run tasks with reduced latency and higher concurrency, delivering faster, scalable results for all users—regardless of the workload size.
Business-critical APIs
Ensure fast, resilient API responses under heavy traffic, keeping experiences smooth and consistent.
Server-side and partial pre-rendering
Generate dynamic pages with minimal latency, allowing for faster load times and seamless interactions.
Middleware
Perform authentication checks and apply personalization with the power of fluid compute.
Vercel Functions
Bridging servers and serverless.
Vercel Functions take the best of servers and serverless to create a new compute model, scaling business-critical workloads efficiently across global environments.
Features compared across compute models: cold start handling, scaling, concurrency, operational overhead, pricing model, and CPU efficiency.
Fluid compute for dynamic web applications
In-function concurrency
Run multiple invocations on a single function instance, reducing idle compute time and lowering costs.
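The idea can be sketched with a Web-standard handler (the `sharedClient` and its endpoint are hypothetical): state at module scope is created once per instance and shared by every invocation that instance serves, so while one invocation awaits slow I/O, the same instance can process others.

```typescript
// Module scope: created once per function instance and shared by every
// invocation that instance serves concurrently.
const sharedClient = {
  // Hypothetical slow upstream call. While one invocation awaits it,
  // the same instance is free to start serving other invocations.
  async getUser(id: number): Promise<{ id: number }> {
    await new Promise((resolve) => setTimeout(resolve, 50));
    return { id };
  },
};

// Handler invoked once per request (Web-standard Request/Response).
async function handler(req: Request): Promise<Response> {
  const id = Number(new URL(req.url).searchParams.get("id"));
  const user = await sharedClient.getUser(id);
  return Response.json(user);
}

// Two invocations overlapping on one instance: total time is ~50 ms,
// not ~100 ms, because the awaits interleave on the shared instance.
const start = Date.now();
const responses = await Promise.all([
  handler(new Request("https://example.com/api/user?id=1")),
  handler(new Request("https://example.com/api/user?id=2")),
]);
const elapsed = Date.now() - start;
```

In the dedicated-instance model, each of those requests would hold its own instance for the full 50 ms of idle waiting; here the instance does useful work for both.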
Cold-start reduction
Functions are pre-warmed and optimized with bytecode caching, ensuring faster response times.
Streaming
Send data to users as it becomes available, improving performance for AI, media, and real-time apps.
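A sketch of the pattern using the Web Streams API (the chunk source is hypothetical, standing in for e.g. an LLM token stream): the response body is a `ReadableStream`, so each chunk can reach the client as soon as it is enqueued.

```typescript
// Return a streaming Response: chunks are flushed to the client as
// they are produced, instead of buffering the whole body first.
function streamHandler(): Response {
  const chunks = ["partial ", "results ", "arrive early"]; // hypothetical producer
  const encoder = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    async start(controller) {
      for (const chunk of chunks) {
        // Simulate a slow producer between chunks.
        await new Promise((resolve) => setTimeout(resolve, 10));
        controller.enqueue(encoder.encode(chunk));
      }
      controller.close();
    },
  });
  return new Response(body, { headers: { "Content-Type": "text/plain" } });
}
```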
Cross-region failover
Ensures high availability by rerouting traffic to backup regions during outages.
Dynamic scaling
Automatically adjusts concurrency and resource allocation based on real-time demand.
Post-response tasks
Keep functions running after sending a response to handle tasks like logging or database updates.
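On Vercel this is exposed as `waitUntil` from `@vercel/functions`; the sketch below uses a local stand-in for it so the flow runs anywhere: the response returns immediately, and the queued work finishes afterwards.

```typescript
// Stand-in for a runtime-provided `waitUntil` (on Vercel it comes from
// `@vercel/functions`): register work that must finish before the
// instance is reclaimed, without delaying the response.
const pending: Promise<void>[] = [];
function waitUntil(task: Promise<void>): void {
  pending.push(task);
}

const events: string[] = [];

async function respond(): Promise<Response> {
  // Queue a background task, e.g. writing analytics or audit logs.
  waitUntil(
    new Promise<void>((resolve) => setTimeout(resolve, 20)).then(() => {
      events.push("logged");
    }),
  );
  events.push("responded");
  return new Response("ok");
}

const res = await respond();
// At this point the client already has its response, while the
// logging task is still in flight.
await Promise.all(pending); // the runtime drains queued tasks afterwards
```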