I like this abstraction
At work, we host a lot of inference endpoints, and for that we use SageMaker. SageMaker uses this abstraction, and I’ve grown quite fond of it:
- `default_model_fn(model_dir, context=None)`: This loads the model and acts as a `setup()` function. It returns the model and any other related services your endpoint needs.
- `default_input_fn(input_data, content_type, context=None)`: This is basically an input adapter. Take the raw, unsanitized input as a string or bytes and parse it. It returns validated input.
- `default_predict_fn(data, model, context=None)`: This is your bread and butter: take the model (and other related services) and make a prediction by passing the data to `model.forward()`. It returns a prediction.
- `default_output_fn(prediction, accept, context=None)`: This is basically an output adapter. Take the prediction the model made and serialize it to whatever output type your endpoint needs to spit out. It returns serialized output.
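A minimal sketch of the four hooks in plain Python, assuming a JSON-in/JSON-out endpoint; the stand-in model (a doubling function) and the payload format are my own illustrative choices, not SageMaker specifics:

```python
import json

def model_fn(model_dir, context=None):
    # Setup step: load the model from model_dir. Here a stand-in
    # callable takes the place of a real deserialized model.
    def model(xs):
        return [x * 2 for x in xs]
    return model

def input_fn(input_data, content_type, context=None):
    # Input adapter: parse and validate the raw request body.
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    payload = json.loads(input_data)
    if not isinstance(payload, list):
        raise ValueError("Expected a JSON list of numbers")
    return payload

def predict_fn(data, model, context=None):
    # Core inference step: hand the validated input to the model.
    return model(data)

def output_fn(prediction, accept, context=None):
    # Output adapter: serialize the prediction for the response.
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction)

# Per request, the serving container chains the four together:
model = model_fn("/opt/ml/model")
data = input_fn("[1, 2, 3]", "application/json")
pred = predict_fn(data, model)
body = output_fn(pred, "application/json")
print(body)  # [2, 4, 6]
```

The appeal is that each stage has one job: you can override just the adapter you need (say, accept CSV input) without touching loading or inference.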