On this page, we'll customize our Potassium app.
Iteration speed when developing AI/ML backends can feel slow, because every code change traditionally means reloading the model into memory.

Thankfully, the Banana CLI dev server includes Hot Reload, which watches your files for changes and only reloads the model if the init block changes.
See it for yourself by changing the `@app.handler()` logic to return all outputs from the inference, rather than just the 0th index.

Before:

```python
return Response(
    json = {"outputs": outputs[0]},
)
```

After:

```python
return Response(
    json = {"outputs": outputs},
)
```
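To see what that one-line change does to the response shape: assuming the earlier pages set up a Hugging Face fill-mask pipeline (an assumption here), `outputs` is a list of prediction dicts ordered by score. A plain-Python sketch, with placeholder values:

```python
# Hypothetical shape of `outputs` from a fill-mask pipeline:
# a list of prediction dicts, ordered by score (placeholder values).
outputs = [
    {"score": 0.13, "token_str": "fashion"},
    {"score": 0.11, "token_str": "role"},
    {"score": 0.02, "token_str": "model"},
]

# Old handler body: only the top prediction is serialized.
old_body = {"outputs": outputs[0]}

# New handler body: every prediction is serialized.
new_body = {"outputs": outputs}

print(type(old_body["outputs"]).__name__)  # dict
print(len(new_body["outputs"]))            # 3
```

So the old handler returned a single dict, while the new one returns the whole ranked list.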
Save, and you'll see the dev server update immediately:

```
Hot reloading 🔥
```
Then call the model:

```shell
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Hello I am a [MASK] model."}' http://localhost:8000/
```
and you'll see the new handler logic run:

```json
{
  "outputs": [
    {
      "score": 0.13177461922168732,
      "sequence": "hello i am a fashion model.",
      "token": 4827,
      "token_str": "fashion"
    },
    {
      "score": 0.1120428815484047,
      "sequence": "hello i am a role model.",
      "token": 2535,
      "token_str": "role"
    },
    {
      "score": 0.022045975551009178,
      "sequence": "hello i am a model model.",
      "token": 2944,
      "token_str": "model"
    }
  ]
}
```
On the next page, we'll use the client-side SDKs to call our Potassium server.