We needed to handle over 25,000 inferences per second (over 1 billion inferences per day) at a latency of under 20 ms.
The importance of benchmarking
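Targets like these only mean something if they are measured the same way every time, with throughput and tail latency reported together. Below is a minimal load-test sketch for doing that against a hypothetical HTTP inference endpoint; the URL, payload shape, request count, and concurrency level are all illustrative placeholders, not the actual service described above.

```python
# Minimal benchmark sketch: measure sustained throughput and latency percentiles
# against a hypothetical inference endpoint. INFER_URL and PAYLOAD are placeholders.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

INFER_URL = "http://localhost:8080/predict"   # hypothetical endpoint
PAYLOAD = {"features": [0.1, 0.2, 0.3]}       # hypothetical request body
NUM_REQUESTS = 10_000
CONCURRENCY = 64

def one_request(_):
    # Time a single inference call, returning its latency in milliseconds.
    start = time.perf_counter()
    requests.post(INFER_URL, json=PAYLOAD, timeout=1.0)
    return (time.perf_counter() - start) * 1000

wall_start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = sorted(pool.map(one_request, range(NUM_REQUESTS)))
wall_elapsed = time.perf_counter() - wall_start

p50 = statistics.median(latencies)
p99 = latencies[int(0.99 * len(latencies)) - 1]
throughput = NUM_REQUESTS / wall_elapsed

print(f"throughput: {throughput:,.0f} req/s")
print(f"p50 latency: {p50:.1f} ms, p99 latency: {p99:.1f} ms")
```

Reporting the p99 alongside the median matters here: a service can average well under 20 ms while its tail blows past the budget under load, which is exactly what this kind of benchmark is meant to catch.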