LocalAI

LocalAI

OpenAI-compatible local AI inference server and API

42.1kstars
3.4kforks
Last commit: 19h ago
Repo age: 3y old
LocalAI screenshot

LocalAI is a self-hostable AI inference server that provides a drop-in, OpenAI-compatible REST API for running models locally or on-premises. It supports multiple model families and backends, enabling text, image, and audio workloads on consumer hardware, with optional GPU acceleration.

Key Features

  • OpenAI-compatible REST API for integrating with existing apps and SDKs
  • Multi-backend local inference, including GGUF via llama.cpp and Transformers-based models
  • Image generation support (Diffusers/Stable Diffusion-class workflows)
  • Audio capabilities such as speech generation (TTS) and voice-related features
  • Web UI for basic testing and model management
  • Model management via gallery and configuration files, with automatic backend selection
  • Optional distributed and peer-to-peer inference capabilities

Use Cases

  • Replace cloud LLM APIs for private chat and internal tooling
  • Run local multimodal prototypes (text, image, audio) behind a unified API
  • Provide an on-prem inference endpoint for products needing OpenAI API compatibility

Limitations and Considerations

  • Capabilities and quality depend heavily on the selected model and backend
  • Some advanced features may require GPU-specific images or platform-specific setup

LocalAI is a practical foundation for building a local-first AI stack, especially when OpenAI API compatibility is a requirement. It offers flexible deployment options and broad model support to cover common generative AI workloads.

Categories:

Tags:

Tech Stack:

Share:

Similar Services

Ollama

Ollama

Run and manage large language models locally with an API

159.6k
14.2k
Last commit: 16h ago

Ollama is a local LLM runtime that lets you pull, run, and customize models, offering a CLI and REST API for chat, generation, and embeddings.

Alternative to:
OpenAI API
OpenAI API
+15
Jina

Jina

Cloud-native Python framework for serving multimodal AI services

21.8k
2.2k
Last commit: 9mo ago

Open-source Python framework to build, scale, and deploy multimodal AI services and pipelines with gRPC/HTTP/WebSocket support and Kubernetes/Docker integration.

Alternative to:
Baseten
Baseten
+12
Willow

Willow

Open-source, privacy-focused voice assistant platform

3k
113
Last commit: 6mo ago

Self-hosted voice assistant platform for ESP32 devices with on-device wake-word and command recognition, Home Assistant integration, and an optional inference server for...

Alternative to:
Amazon Alexa
Amazon Alexa
+9
Speaches

Speaches

OpenAI API-compatible server for speech-to-text and text-to-speech

2.8k
356
Last commit: 20d ago

Self-hosted, OpenAI API-compatible server for streaming transcription, translation, and speech generation using faster-whisper and TTS engines like Piper and Kokoro.

Alternative to:
OpenAI API
OpenAI API
+9
Unblink

Unblink

AI camera monitoring with federated vision workers

1.3k
152
Last commit: 1d ago

Open-source AI camera monitoring that routes camera streams through a relay/node proxy and broadcasts frames to federated AI workers for detections, summaries, and alerts...

Alternative to:
Blue Iris
Blue Iris
+10
withoutBG

withoutBG

Open-source image background removal with local models and hosted API

755
33
Last commit: 1mo ago

Open-source background-removal toolkit offering Focus/Snap local models, a Docker web app and Python SDK, plus a Pro API (Inferentia‑accelerated) for production use.

Alternative to:
remove.bg
remove.bg