Welcome to 🍌 Docs

Banana provides serverless GPU inference hosting for machine learning models. It autoscales with your traffic, making it one of the cheapest ways to run GPU-backed machine learning servers in the cloud.
Banana has two parts: client-side SDKs that you add to your code to make inference calls, and a serving framework that runs on the GPUs to handle those calls.
To deploy the serving framework, you push to main; Banana builds and optimizes your model, then deploys it to GPUs, ready to be called from the SDK.
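At its core, a client-side call to a deployed model is an authenticated HTTPS POST carrying JSON inputs. The sketch below illustrates that shape only; the function name `build_inference_request`, the endpoint path, and the payload field names are assumptions for illustration, not Banana's actual SDK or wire format.

```python
import json

# Hypothetical endpoint; the real host and path come from the SDK.
API_BASE = "https://api.example-banana-host.dev"

def build_inference_request(api_key: str, model_key: str, model_inputs: dict) -> dict:
    """Assemble the kind of JSON payload a client SDK would POST to a
    deployed model. Field names here are illustrative, not Banana's
    actual wire format."""
    return {
        "url": f"{API_BASE}/infer",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "apiKey": api_key,          # authenticates the caller
            "modelKey": model_key,      # identifies the deployed model
            "modelInputs": model_inputs # inputs forwarded to the GPU server
        }),
    }

request = build_inference_request("my-api-key", "my-model-key", {"prompt": "Hello"})
```

The SDK hides this plumbing; in practice you supply only your API key, the model key, and the inputs, and receive the model's outputs back as JSON.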
Deploy your first model on Banana with our quickstart tutorial.
Alternatively, jump straight to the documentation you need using the navigation bar on the left.