Identifying key DL model deployment requirements
To choose the most suitable deployment strategy from the available options, you first need to identify and define seven key requirements: latency and availability, cost, scalability, model hardware, data privacy, safety, and trust and reliability. Let’s dive into each of these requirements in detail:
- Latency and availability requirements: These two requirements are closely connected and should be defined together. Availability requirements refer to the desired level of uptime and accessibility of the model’s predictions. Latency requirements refer to the maximum acceptable delay or response time within which the model must return predictions or results. A deployment with a low availability requirement can usually tolerate high-latency predictions, and vice versa. One reason is that a low-latency capable infrastructure can’t ensure low latency if it is not available when model...