Too Long; Didn't Read
A model server is a web server that hosts a deep learning model and exposes it over standard network protocols, so any device on the same network can send it inference requests. In this writeup, we explore the part of a deployment that makes a deep learning model available across the web for inference: the model server. Our example works with images and covers two interfaces, a REST API (request-response) and a gRPC API. We will first learn how to build our own model server, and then explore the Triton Inference Server (by Nvidia).
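To make the idea of a model server concrete, here is a minimal sketch of the REST request-response pattern using only Python's standard library. It is an illustration under stated assumptions, not the implementation built later in the writeup: the `/predict` endpoint, the `predict` function, and the `make_server` helper are hypothetical names, and the "model" is a placeholder that only reports the size of the uploaded image.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict(image_bytes: bytes) -> dict:
    """Hypothetical stand-in for a real model's forward pass."""
    # A real server would decode the image and run the network here.
    return {"num_bytes": len(image_bytes), "label": "placeholder"}


class InferenceHandler(BaseHTTPRequestHandler):
    """Accepts raw image bytes via POST /predict and replies with JSON."""

    def do_POST(self):
        if self.path != "/predict":  # assumed endpoint name
            self.send_error(404, "unknown endpoint")
            return
        # Read exactly the number of bytes the client declared.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        payload = json.dumps(predict(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # silence per-request logging for this sketch


def make_server(host="127.0.0.1", port=8000):
    """Build the server without starting it; the caller decides when to serve."""
    return ThreadingHTTPServer((host, port), InferenceHandler)
```

Calling `make_server().serve_forever()` starts the server, after which any client on the network can POST image bytes to `/predict` and read the JSON reply. A gRPC interface follows the same request-response shape, but defines the messages in a `.proto` file and sends them in a binary encoding over HTTP/2.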