Resource Allocation Excel

Workload-Adapted Resource Allocation for LLM Distributed Serving in Serverless Clusters

Abstract: Large language models increasingly rely on pipeline parallelism for distributed inference, but existing systems face critical challenges in serverless environments: heterogeneous request ...
