Customizing

On this page, we'll customize our Potassium app.
Iterating on AI/ML backends can feel very slow when the model has to be reloaded into memory on every code change.
Thankfully, the Banana CLI dev server includes Hot Reload, which watches your app.py for changes and reloads the models only when the init block changes.
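How the dev server detects an init-block change isn't described on this page. As a rough illustration only (this is not the Banana CLI's actual implementation), one way to decide whether a model reload is needed is to fingerprint the function decorated with `@app.init` and compare that fingerprint across saves. The helper names `init_block_fingerprint` and `needs_model_reload` are hypothetical, and the sketch uses only the Python standard library.

```python
import ast
import hashlib


def init_block_fingerprint(source: str) -> str:
    """Return a hash of the function decorated with `@<name>.init`, or "" if none.

    Hypothetical helper: assumes the init block is a top-level or nested
    function whose decorator is an attribute access ending in `.init`.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.FunctionDef):
            continue
        for decorator in node.decorator_list:
            # `@app.init` parses as ast.Attribute; `@app.handler()` is an ast.Call
            # and is therefore ignored here.
            if isinstance(decorator, ast.Attribute) and decorator.attr == "init":
                # ast.dump ignores comments and whitespace, so only real
                # code changes alter the fingerprint.
                return hashlib.sha256(ast.dump(node).encode("utf-8")).hexdigest()
    return ""


def needs_model_reload(old_source: str, new_source: str) -> bool:
    """True when the init block differs between two versions of app.py."""
    return init_block_fingerprint(old_source) != init_block_fingerprint(new_source)
```

Under this scheme, an edit that only touches the handler leaves the fingerprint unchanged, so the model could stay in memory while the rest of the file is reloaded.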
See it for yourself by changing the @app.handler() logic to return all outputs from the inference, rather than just the 0th index.
Change the return statement from:
    return Response(
        json={"outputs": outputs[0]},
        status=200
    )
to:
    return Response(
        json={"outputs": outputs},
        status=200
    )
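To make the effect of this change concrete, here is a small standalone sketch in plain Python (no Potassium or model required). The `outputs` list is a hypothetical stand-in for what the model returns inside the handler, reusing two entries from the sample response shown on this page; `first_only` and `all_outputs` are illustrative names, not part of Potassium.

```python
import json

# Hypothetical stand-in for the model's result inside the handler.
outputs = [
    {
        "score": 0.13177461922168732,
        "sequence": "hello i am a fashion model.",
        "token": 4827,
        "token_str": "fashion",
    },
    {
        "score": 0.1120428815484047,
        "sequence": "hello i am a role model.",
        "token": 2535,
        "token_str": "role",
    },
]


def first_only(outputs):
    """JSON body before the change: "outputs" holds a single object."""
    return {"outputs": outputs[0]}


def all_outputs(outputs):
    """JSON body after the change: "outputs" holds the full list."""
    return {"outputs": outputs}


if __name__ == "__main__":
    print(json.dumps(first_only(outputs), indent=2))
    print(json.dumps(all_outputs(outputs), indent=2))
```

Note that the type of the "outputs" field changes from an object to an array, so any client that reads the response needs to handle a list after this edit.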
Save app.py and you'll see the dev server update immediately:
------
Hot reloading 🔥
Reloaded
------
Then call the model:
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Hello I am a [MASK] model."}' http://localhost:8000/
and you'll see the new handler logic run:
{"outputs": [
    {
        "score": 0.13177461922168732,
        "sequence": "hello i am a fashion model.",
        "token": 4827,
        "token_str": "fashion"
    },
    {
        "score": 0.1120428815484047,
        "sequence": "hello i am a role model.",
        "token": 2535,
        "token_str": "role"
    },
    ...
    {
        "score": 0.022045975551009178,
        "sequence": "hello i am a model model.",
        "token": 2944,
        "token_str": "model"
    }
]}
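If you'd rather send the same request from Python than from curl, the following is a minimal sketch using only the standard library. It assumes the dev server is listening on http://localhost:8000/ as in the curl command above; `build_request` and `call_model` are hypothetical helper names, not part of any Banana SDK.

```python
import json
import urllib.request


def build_request(prompt: str, url: str = "http://localhost:8000/") -> urllib.request.Request:
    """Build a POST request equivalent to the curl command above."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_model(prompt: str, url: str = "http://localhost:8000/") -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, url)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    result = call_model("Hello I am a [MASK] model.")
    # With the updated handler, "outputs" is a list of candidate completions.
    for item in result["outputs"]:
        print(item["token_str"], item["score"])
```

Building the request is kept separate from sending it so the payload and headers can be inspected without a server running.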
On the next page, we'll use the client-side SDKs to call our Potassium server.