`app.py` is the most important file in the framework.
A full, working Potassium app looks like this:
```python
from potassium import Potassium, Request, Response
from transformers import pipeline
import torch

app = Potassium("my_app")

# @app.init runs at startup, and initializes the app's context
@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)

    context = {
        "model": model,
        "hello": "world"
    }

    return context

# @app.handler is an http post handler running for every call
@app.handler()
def handler(context: dict, request: Request) -> Response:
    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)

    return Response(
        json = {"outputs": outputs},
        status=200
    )

if __name__ == "__main__":
    app.serve()
```
Documentation
@app.init
```python
@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)

    return {
        "model": model
    }
```
The `@app.init` decorated function runs once on server startup, and is used to load any reusable, heavy objects, such as your AI model, loaded to GPU.
The return value is a dictionary which is saved as the app's `context` and is passed to the handler functions later.
There may only be one `@app.init` function.
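To make the init/context flow concrete, here is a toy sketch of the pattern in plain Python. `MiniApp`, `start`, and `call` are hypothetical names invented for this illustration; this is not Potassium's implementation, only a simplified model of "load once at startup, reuse on every call".

```python
from typing import Any, Callable, Dict, List, Optional


class MiniApp:
    """Hypothetical stand-in for the init/handler pattern (not Potassium's code)."""

    def __init__(self) -> None:
        self._init_func: Optional[Callable[[], Dict[str, Any]]] = None
        self._handler: Optional[Callable[[Dict[str, Any], Dict[str, Any]], Any]] = None
        self._context: Dict[str, Any] = {}

    def init(self, func):
        # Mirror the "only one init function" rule described above.
        if self._init_func is not None:
            raise RuntimeError("only one init function is allowed")
        self._init_func = func
        return func

    def handler(self, func):
        self._handler = func
        return func

    def start(self) -> None:
        # Run the init function once and keep its return value as the context.
        self._context = self._init_func()

    def call(self, payload: Dict[str, Any]) -> Any:
        # Every call receives the same context that init produced.
        return self._handler(self._context, payload)


app = MiniApp()
load_log: List[str] = []  # records how many times the "model" was loaded


@app.init
def init():
    load_log.append("loaded")
    # str.upper stands in for a heavy model object.
    return {"model": str.upper}


@app.handler
def handler(context, payload):
    model = context.get("model")
    return {"outputs": model(payload["prompt"])}


app.start()
first = app.call({"prompt": "hello"})
second = app.call({"prompt": "world"})
```

Both calls reuse the object created in `init`, which is the reason to put expensive model loading there rather than in the handler.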
@app.handler()
```python
@app.handler("/")
def handler(context: dict, request: Request) -> Response:
    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)

    return Response(
        json = {"outputs": outputs},
        status=200
    )
```
The `@app.handler` decorated function runs for every HTTP call, and is used to run inference or training workloads against your model(s).
Banana serverless currently only supports handlers at the root `"/"`.
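For reference, a minimal client sketch for calling the root handler, using only the Python standard library. It assumes the app is running locally and reachable at `http://localhost:8000/`; the host and port are assumptions, so adjust them to however you serve the app. `build_request` and `call_handler` are hypothetical helper names, not part of Potassium.

```python
import json
import urllib.request

# Assumption: the app is served locally on port 8000; change this for your setup.
URL = "http://localhost:8000/"


def build_request(prompt: str, url: str = URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the prompt the handler reads."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_handler(prompt: str, url: str = URL) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(build_request(prompt, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the server running, something like this should return the handler's JSON:
# call_handler("Hello I am a [MASK] model.")
```

The payload key `"prompt"` matches what the handler reads via `request.json.get("prompt")`; the response is expected to carry an `"outputs"` key, as in the handler above.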
Advanced Documentation
Refer to the Potassium GitHub repo for additional and experimental features.