Potassium

On this page, we'll get familiar with Potassium.
Opening up app.py, you'll see:
```python
from potassium import Potassium, Request, Response
from transformers import pipeline
import torch

app = Potassium("my_app")

# @app.init runs at startup, and loads models into the app's context
@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)
    context = {
        "model": model
    }
    return context

# @app.handler runs for every call
@app.handler()
def handler(context: dict, request: Request) -> Response:
    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)
    return Response(
        json={"outputs": outputs[0]},
        status=200
    )

if __name__ == "__main__":
    app.serve()
```
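Once the app is serving, the handler expects a POST request whose JSON body carries a `prompt` field. Here's a minimal sketch of building such a request with the standard library; the URL and port are assumptions (check your app's `serve()` settings), and note that `bert-base-uncased`'s fill-mask pipeline expects a `[MASK]` token in the prompt:

```python
import json
import urllib.request

# JSON body the handler above reads via request.json.get("prompt").
payload = json.dumps({"prompt": "The capital of France is [MASK]."}).encode()

# The URL/port are assumptions for illustration; adjust to your deployment.
req = urllib.request.Request(
    "http://localhost:8000/",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send the call once the server is running.
```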
Each Potassium app has two vital components:

- `@app.init` runs on startup and loads any heavy objects, such as models, into memory. Its return value is saved as the app's context, for use later.
- `@app.handler()` is the HTTP POST request handler, run on every call. In this example, it uses the preloaded model from the context and the prompt from the input JSON to run inference and return the output.
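The init-then-handler flow can be sketched in plain Python, without Potassium or a real model; the "model" below is a stand-in function, purely for illustration:

```python
# A minimal sketch of the init/handler pattern, with a stand-in "model".

def init():
    # Load heavy objects once; the return value becomes the shared context.
    model = lambda prompt: {"echo": prompt}
    return {"model": model}

def handler(context, request_json):
    # Every call reuses the preloaded model from the context.
    model = context.get("model")
    outputs = model(request_json.get("prompt"))
    return {"json": {"outputs": outputs}, "status": 200}

context = init()                                  # runs once, at startup
response = handler(context, {"prompt": "hello"})  # runs on every request
```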
Loading a model from disk to GPU can take many minutes, depending on the model. For this reason, we load models in advance of the handlers, so they're hot and ready to go.
In the next section, we'll customize our Potassium app.