Our vision is to make Banana the best way to scale ML in production.
This is a big vision, and so we're starting off focused on winning in a niche.
Right now we do best on inference jobs that:

- you need full control over: the code, the weights, and the optimizations, rather than a black-box API
- take minutes, not hours, to complete
- require less than 40 GB of GPU RAM
- need to run fast, and so require a strong GPU such as an A100
- see spiky traffic driven by unpredictable user demand
- must scale quickly, so customers get a good experience without you paying for idle GPUs
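With spiky traffic, a burst can arrive before new replicas finish warming up, so a client-side retry with exponential backoff helps smooth over that window. This is a minimal hypothetical sketch; the helper name, error, and stub are illustrative, not Banana's actual API:

```python
import time

def call_with_backoff(send, max_attempts=5, base_delay=0.5):
    """Retry a call that can fail transiently while replicas scale up.

    `send` is any zero-arg callable that raises on a transient error
    (e.g. an HTTP 503 while no replica is warm yet).
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))

# Usage with a stub that fails twice (simulating cold replicas), then succeeds.
calls = {"n": 0}
def stub():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503: scaling up")
    return {"ok": True}

result = call_with_backoff(stub, base_delay=0.01)
```

The same pattern applies whatever client library you use: treat "no warm replica yet" as retryable rather than fatal.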
But you always surprise us. Our users have figured out how to do things on Banana that we didn't expect:

- some training jobs run fine, as long as they finish in minutes (not hours) to avoid timeouts
- multiple models can safely run in one app
- batch jobs can still save you money if you know your traffic and time your scaling well
- out-of-the-box APIs still run on Banana; you don't need to customize if you don't want to
- you can keep machines always on to maximize speed; you don't have to scale up and down
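One way users fit training into the minutes-not-hours window is to split a long run into short, checkpointed chunks: each job does a bounded slice of work, persists its progress, and the next job resumes from the checkpoint. A hypothetical sketch with a toy loop standing in for a real optimizer step (all names here are illustrative):

```python
import json
import os
import tempfile

def run_chunk(state_path, total_steps, steps_per_chunk):
    """Run one short, resumable slice of a longer training loop.

    Each invocation performs at most `steps_per_chunk` steps, then
    persists progress, so every job finishes well inside a timeout.
    Returns True once all steps are done.
    """
    # Resume from the last checkpoint if one exists.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "loss": None}

    end = min(state["step"] + steps_per_chunk, total_steps)
    for step in range(state["step"], end):
        # Placeholder for a real optimizer step.
        state["loss"] = 1.0 / (step + 1)
    state["step"] = end

    with open(state_path, "w") as f:
        json.dump(state, f)
    return state["step"] >= total_steps

# Drive the chunks: each call here would be one short job on the platform.
path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
runs = 0
while not run_chunk(path, total_steps=10, steps_per_chunk=4):
    runs += 1
runs += 1  # count the final, completing run
```

In a real setup the checkpoint would hold model and optimizer state (and live in durable storage, not a local temp dir), but the scheduling shape is the same: many short jobs instead of one long one.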
What this means
Banana has GPUs you can use to run your ML apps.
There are many ways to do this, and we focus primarily on the areas where we want to offer the best experience. This doesn't stop you from exploring other ways to use Banana, but please note the experience may not be as smooth.
If you are running a high-traffic workload and require native support for a use case not listed above, please contact our sales team at https://www.banana.dev/sales and we may be able to prioritize it accordingly.