Exo CUDA Inference API provides distributed GPU-accelerated inference for AI models using NVIDIA CUDA.

## Features

- OpenAI-compatible API (use OpenAI SDK directly)
- Support for Llama 3, Mistral, Stable Diffusion XL, Whisper, and more
- NVIDIA GPU acceleration for fast inference
- Scalable infrastructure handling thousands of requests
- JSON & streaming response support
- Multiple model…
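
The listing advertises streaming responses and OpenAI compatibility but does not describe the wire format. The sketch below assumes the stream follows the OpenAI convention of server-sent `data:` lines carrying JSON chunks and ending with `data: [DONE]`; that is an assumption, not something the listing confirms. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style streaming lines.

    Assumption: each event is a line of the form 'data: {json}', and the
    stream ends with 'data: [DONE]'. The listing does not confirm this.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP response body or from a recorded sample for offline testing.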

2 subscribers
6 endpoints
The in-depth APIMemo review for this API hasn't been published yet — the data below comes straight from the public marketplace listing.

exo-cuda-inference endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | chat: /v1/chat/completions | OpenAI-compatible chat completion endpoint. Send a conversation and receive an AI-generated text response. |
| POST | completions: /v1/completions | Text completion endpoint. Generates AI text completions for given prompts. |
| POST | embeddings: /v1/embeddings | Generates vector embeddings for text. Useful for semantic search and similarity. |
| POST | images: /v1/images/generations | Generates images from text prompts using Stable Diffusion XL. |
| POST | transcriptions: /v1/audio/transcriptions | Converts audio to text. Upload an audio file and get a transcription. |
| GET | models: /v1/models | Lists all available AI models on this inference API. |
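
As an illustration of the chat endpoint, here is a minimal Python sketch that builds (but does not send) a request for `/v1/chat/completions` using only the standard library. The listing gives neither the host nor the authentication scheme, so `BASE_URL`, the bearer-token header, and the helper name `build_chat_request` are all placeholders. The `model` and `messages` fields follow the OpenAI chat format that the listing claims compatibility with.

```python
import json
import urllib.request

# Assumption: placeholder host. The listing does not state the real base URL.
BASE_URL = "https://example.invalid"


def build_chat_request(messages, model, api_key, stream=False):
    """Build a POST request for the chat completions endpoint (not sent)."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. The marketplace may require
            # different headers; check the listing's own documentation.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )
```

With a real host and key, the request could be sent with `urllib.request.urlopen(req)`. The model identifier should come from the `GET /v1/models` endpoint rather than being guessed.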

exo-cuda-inference pricing

| Plan | Price | Rate limit | Quotas |
| --- | --- | --- | --- |
| BASIC | Free | not listed | 1,000 requests / month |
| PRO | $19 / month | not listed | 50,000 requests / month |
| ULTRA | $49 / month | not listed | 200,000 requests / month |
| MEGA | $149 / month | not listed | 1,000,000 requests / month |
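
To compare the plans, the following Python sketch picks the cheapest plan whose monthly quota covers an expected request volume and computes the effective price per 1,000 requests at full quota. The plan figures are taken from the listing. The function names are hypothetical, and the sketch assumes requests beyond the quota are simply unavailable, since the listing does not mention overage pricing.

```python
# (plan name, price in USD per month, requests per month), from the listing.
PLANS = [
    ("BASIC", 0, 1_000),
    ("PRO", 19, 50_000),
    ("ULTRA", 49, 200_000),
    ("MEGA", 149, 1_000_000),
]


def cheapest_plan(monthly_requests):
    """Return the first (cheapest) plan whose quota covers the volume."""
    for name, _price, quota in PLANS:
        if monthly_requests <= quota:
            return name
    return None  # volume exceeds every listed quota


def price_per_thousand(plan_name):
    """Effective USD price per 1,000 requests when the quota is fully used."""
    for name, price, quota in PLANS:
        if name == plan_name:
            return price / (quota / 1_000)
    raise KeyError(plan_name)
```

At full quota this works out to roughly $0.38 per 1,000 requests on PRO, $0.245 on ULTRA, and $0.149 on MEGA.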
