
At work we host a lot of inference endpoints, and for that we use SageMaker. SageMaker uses the following abstraction, and I’ve grown quite fond of it:

  • default_model_fn(model_dir, context=None): This loads the model and acts as a setup() function. It returns the model and any other related services your endpoint needs.
  • default_input_fn(input_data, content_type, context=None): This is basically an input adapter. Take the raw, unsanitized input (string or bytes) and parse it. It returns validated input.
  • default_predict_fn(data, model, context=None): This is your bread and butter: take the parsed data and the model (plus any related services) and make a prediction, e.g. by passing the data to model.forward(). It returns a prediction.
  • default_output_fn(prediction, accept, context=None): This is basically an output adapter. Take the prediction the model made and serialize it to whatever output type your endpoint needs to emit. It returns the serialized output.
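The four handlers above can be sketched end to end. This is a minimal, self-contained illustration of the pattern, not SageMaker's actual implementation: the "model" is a stand-in function so the example runs without SageMaker or a deep-learning framework, and the final wiring mimics what the serving stack does on each request.

```python
import json


def model_fn(model_dir, context=None):
    """Setup: load the model once. Here a toy 'model' that doubles each input;
    a real handler would load weights from model_dir."""
    return lambda xs: [2 * x for x in xs]


def input_fn(input_data, content_type, context=None):
    """Input adapter: parse raw string/bytes into validated input."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    data = json.loads(input_data)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON list of numbers")
    return data


def predict_fn(data, model, context=None):
    """Core inference: pass the validated input to the model."""
    return model(data)


def output_fn(prediction, accept, context=None):
    """Output adapter: serialize the prediction for the response."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction)


# Wiring, as the serving stack would do it per request:
model = model_fn("/opt/ml/model")                      # once, at startup
parsed = input_fn("[1, 2, 3]", "application/json")     # per request
prediction = predict_fn(parsed, model)
body = output_fn(prediction, "application/json")       # "[2, 4, 6]"
```

Note the separation: the model loads once at startup, while the three adapter/predict steps run per request, which is exactly what makes this abstraction pleasant to test in isolation.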
