OpenSource-Hub
L

LocalAI

SHA-256
46.2k stars·AI Productivity·SHA-256 checksum verified

LocalAI is the open-source AI engine to run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. Drop-in API compatibility with OpenAI, Anthropic, and ElevenLabs.

Smart Download

Download Download Version

v4.2.2 · 130.5 MB

Run any AI model on any hardware locally. No GPU needed, drop-in API replacement for OpenAI.

Core Features

  • No GPU needed: runs on CPU, Apple Silicon, AMD/Intel/Vulkan GPUs, and more
  • Drop-in API replacement: fully compatible with OpenAI, Anthropic, ElevenLabs - zero code change
  • 36+ backends: llama.cpp, vLLM, transformers, whisper, diffusers, MLX, etc.
  • Multi-user ready: API key auth, user quotas, role-based access control
  • Built-in AI agents: tool use, RAG, MCP protocol for autonomous agents

What It Can't Do

  • macOS DMG is not signed by Apple; first run requires quarantine attribute removal. Docker GPU acceleration requires proper driver installation and device mapping. Model downloads are large (several GB); ensure stable internet. Some backends (e.g. vLLM) need modern hardware.

Use Cases

  • Self-hosting LLM chatbots as a drop-in replacement for OpenAI API
  • Running speech recognition or image generation on edge devices like Raspberry Pi
  • Building a team AI platform with user permissions and quota management
  • Local AI development and testing without internet or API costs

Detailed Introduction

LocalAI is a free, open-source AI engine that lets you run large language models, image generators, voice assistants, and more on your own hardware — even without a GPU. It provides a drop-in replacement for the OpenAI API, so you can switch from cloud services to local hosting with zero code changes. With 36+ backends (llama.cpp, vLLM, transformers, whisper, diffusers, MLX, etc.), it supports NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or pure CPU. Built-in multi-user authentication, role-based access, and AI agents with tool use, RAG, and MCP enable enterprise-grade deployments. All data stays on your infrastructure, ensuring complete privacy.

Troubleshooting & FAQ (2)

Troubleshooting
How to fix 'reasoning_effort=none' not working in LocalAI 4.3.4?

This is a known regression in LocalAI versions after 4.0.0 (issue #10072). The parameter reasoning_effort=none should prevent the model from producing reasoning tokens and speed up responses, but a bug in newer versions causes it to be ignored. As a temporary workaround, downgrade to LocalAI v4.0.0 or v3.12.1, where the feature was reported to function correctly with llama-cpp backend models like Qwen3. If downgrading is not possible, you can also try forcing the model to skip reasoning by setting top_p=0 and temperature=0, or using a non-reasoning model for latency-sensitive tasks. For a permanent fix, monitor the GitHub issue #10072 and upgrade once the patch is released. Ensure your model configuration file correctly maps the reasoning_effort option to the backend parameter (e.g., in llama-cpp, it should map to --reasoning-effort none).

GitHub Issue #10072
Troubleshooting
Why are some LocalAI v4.3.2 Docker images missing from Docker Hub?

A CI build failure prevented publishing of several v4.3.2 tags. Affected missing tags: v4.3.2, v4.3.2-gpu-nvidia-cuda-12, v4.3.2-gpu-nvidia-cuda-13, v4.3.2-gpu-vulkan, v4.3.2-gpu-intel. Successfully published tags: v4.3.2-gpu-hipblas, v4.3.2-nvidia-l4t-arm64, v4.3.2-nvidia-l4t-arm64-cuda-13. As a workaround, use the localai/localai:master image.

GitHub Issue #10041

Tags

LLMAI EngineSelf-hostedOpenAI CompatiblePrivacyMulti-user

Getting Started

1

Download installer

Click the button above to download the installer for your system

2

Install the software

Install the appropriate package for your distro (dpkg / rpm / AppImage)

3

macOS: Download the DMG, drag to Applications. First launch may require: sudo xattr -d com.apple.quarantine /Applications/LocalAI.app

4

Docker (CPU): docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

5

Docker (NVIDIA GPU): add --gpus all e.g. docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

Install Guide
  1. macOS: Download the DMG, drag to Applications. First launch may require: sudo xattr -d com.apple.quarantine /Applications/LocalAI.app
  2. Docker (CPU): docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
  3. Docker (NVIDIA GPU): add --gpus all e.g. docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
File Integrity

SHA-256 checksum verified

Checksum extracted from GitHub official Release page

SHA256 Checksum

544eb221c2a5ec84467c1eb92851d98348c5e8eec9bf0346bd942e302faad73b

This checksum is extracted from the GitHub Release page. Verify file integrity after download.

All SHA-256 checksums on this platform are extracted from the project's official GitHub Release page, without any modification. You can independently verify them on the GitHub Releases page.

Open Source Transparency

View GitHub Source
Environment Guide

Uninstall Info

macOS: Move LocalAI.app to Trash and empty it. Docker: docker stop local-ai; docker rm local-ai; docker rmi localai/localai:latest and related tags.

No Extra Dependencies

Ready to use after download. No additional runtime required.

Project Info
LicenseMIT
Last Updated2026-06-26 06:55:08
GitHub RepositoryOfficial Website

Having issues? Check the FAQ below

2 FAQs

Similar Projects