snappet pipeline 1

we are building a unified architecture now for pipelines at snappet

this is cool

question to answer: what is our first output going to be? our milestone? simple model with some features

steps of the pipeline we are setting up

some things to take into account

we must support sequences from the start: some models use raw rows and others use sequences, we must handle this
we want our models to work in snappet ids and have the model as part of the mapper
we want to handle cold starts (unknown ids) gracefully with defaults or non predictions
pad and ignore for data prep otherwise missing data becomes a thing
model should be able to init lightweight
after training model should have schema/translation baked in