Skip to main content
GreenGeeks logotype

Chroma VPS Docker Hosting

Self-hosting Chroma on a GreenGeeks VPS gives a private RAG backend with fast retrieval, RAM headroom for HNSW indices, and zero per-query cloud fees.

  • Private RAG with no cloud fees
  • RAM headroom for HNSW
  • 12-ms vector queries
Chroma VPS Docker Hosting | GreenGeeks
GoogleTrustpilotWordPresscPanelPHP
Why GreenGeeks

Why Run Chroma on GreenGeeks

GreenGeeks gives a Chroma server strong CPU throughput, RAM for the HNSW index, fast SSD for persistence, 24/7 uptime, and a three-to-one renewable power match.

Strong CPU Throughput for Queries

Strong CPU throughput lets the HNSW index answer top-k vector queries with low latency under load.

RAM Headroom for HNSW Indices

Generous RAM keeps the hot HNSW index resident in memory, where insert and query latency stay flat.

Fast SSD for SQLite and Index

Fast SSD storage holds chroma.sqlite3 and the HNSW directory, without slow writes blocking queries.

24/7 Uptime for a Private RAG API

A 24/7 uptime target keeps a private RAG API reachable for every retrieval call an agent runs daily.

Self-Managed VPS

Self-Managed VPS Plans

Full root access, guaranteed resources, and unmetered transfer — you take control.

VPS 4GB

Start small with reliable VPS performance.

Special PriceSave 50%
Original price: $19.99$9.99/month

Renews at $19.99/month

Core Resources

  • 2 vCPU
  • 4 GB RAM
  • 80 GB SSD Storage
  • Unmetered Transfer
30-day money back guarantee!

VPS 8GB

Scale up apps, databases, and containers.

Special PriceSave 50%
Original price: $39.99$19.99/month

Renews at $39.99/month

Core Resources

  • 4 vCPU
  • 8 GB RAM
  • 160 GB SSD Storage
  • Unmetered Transfer
30-day money back guarantee!
Most Popular

VPS 16GB

Run production workloads with more resources.

Special PriceSave 50%
Original price: $79.99$39.99/month

Renews at $79.99/month

Core Resources

  • 8 vCPU
  • 16 GB RAM
  • 320 GB SSD Storage
  • Unmetered Transfer
30-day money back guarantee!

VPS 32GB

High-capacity VPS for demanding applications.

Special PriceSave 45%
Original price: $109.99$59.99/month

Renews at $109.99/month

Core Resources

  • 16 vCPU
  • 32 GB RAM
  • 640 GB SSD Storage
  • Unmetered Transfer
30-day money back guarantee!

What is Chroma?

Chroma is an open-source, AI-native embedding database for storing, indexing, and retrieving high-dimensional vectors that LLM applications use to look up relevant context at query time. The project is released under Apache 2.0, free for commercial use, with more than 26,000 GitHub stars and over 11 million downloads per month as of 2025. The server-mode binary listens on TCP port 8000 by default.

The open-source Chroma server is the self-hosted path the company also sells as Chroma Cloud. On a VPS the operator runs it from PyPI as chroma run --path /chroma_db_path or from the chromadb/chroma Docker image, with the persistent directory on local SSD and the HNSW indices loaded into RAM at start.

What You Can Build with Chroma

Retrieval-augmented generation pipelines for chatbots over private documents are the dominant use case. PDFs, Markdown, HTML, and support tickets are chunked, embedded, stored in Chroma, and the top-k chunks for each user query are passed to an LLM such as Claude, GPT-4, or a locally hosted model on the same VPS. Semantic search over legal documents, scientific papers, or internal wikis follows.

Beyond RAG, Chroma serves as long-term memory for AI agents that store observations and recall relevant ones to ground later decisions. Small to mid-scale recommendation systems compare item embeddings by cosine distance, and code search systems retrieve similar snippets from an embedded corpus of repositories. The official Chroma MCP server exposes collections to Model Context Protocol clients.

What You Can Build with Chroma

The Key Features of Chroma

Collections are the primary unit of organization. Each collection holds embeddings, documents, and metadata, with its own embedding function and its own HNSW index, under a tenant and database hierarchy for multi-tenant setups. The default embedding function is Sentence Transformers all-MiniLM-L6-v2 at 384 dimensions, run through ONNX Runtime with no external API key. OpenAI, Cohere, Google, Hugging Face, Jina, Voyage, and Ollama are supported as alternates.

Supported HNSW distance metrics are L2 (the default), cosine, and inner product, set when the collection is created. Metadata filtering at query time uses operators such as $eq, $gt, $in, $and, and $or, with full-text $contains operators on document text. Hybrid search combining dense vectors, full-text, and regex reached general availability in 2025.

The Key Features of Chroma

Frequently Asked Questions

Everything you need to know about self-hosting Chroma on GreenGeeks VPS.

Chroma is an open-source, AI-native embedding database for storing and querying high-dimensional vectors that LLM applications use to retrieve relevant context at query time during inference. It is released under the Apache 2.0 license, free for commercial use, and has gathered more than 26,000 GitHub stars and over 11 million monthly downloads as of 2025. The project ships a Python and JavaScript client plus a server binary, and treats local development and production deployments as the same API surface for application code.

The dominant use case is retrieval-augmented generation, where a chatbot grounds its answers in a private document corpus through nearest-neighbor lookup. Semantic search over legal documents, scientific papers, support tickets, or internal wikis follows the same pattern at a different scale. Other common uses include long-term memory for AI agents, small to mid-scale recommendation systems, and code search over an embedded corpus of source repositories. Each pattern stores embeddings in collections and queries them by similarity to a query vector.

Chroma supports three client modes for local and remote use. EphemeralClient stores data in memory only for tests, PersistentClient writes to a local directory for development and small deployments, and HttpClient connects to a separately running Chroma server. The server mode is the common production path, started with chroma run --path /your/db/path or from the chromadb/chroma Docker image. The same Python or JavaScript code talks to all three modes, so an application can move from local development to a remote VPS without code changes.

The official image is published as chromadb/chroma on Docker Hub for self-hosters. The common command is docker run -p 8000:8000 -v /path/on/host:/chroma/chroma chromadb/chroma, which exposes port 8000 and mounts a host directory as the persistent volume. On a VPS with root access, the operator can install Docker Engine, run the image as a long-lived service, and back up the volume by snapshotting the host directory. The same image is the one used by most self-hosted Chroma tutorials in 2025.

Chroma supports three HNSW distance metrics. L2 squared Euclidean is the default, with cosine and inner product as the alternatives. The metric is set when the collection is created and is fixed for that collection, since the HNSW index is built around it. L2 and inner product are sensitive to vector magnitude, so embeddings are commonly normalized before being added when those metrics are used. Cosine distance handles the magnitude issue internally and is the safe pick for general semantic search workloads.

The open-source single-node version is free under the Apache 2.0 license, with no limits on collection count, document count, or commercial use. Chroma Cloud is a separate paid managed service from the same company, with a free tier up to roughly one million embeddings, and usage-based pricing on storage at around two cents per gigabyte per month above that. Self-hosting on a VPS keeps both the data and the cost predictable for any team running a private retrieval pipeline.

Documents are converted into numerical embeddings by an embedding model, stored in a Chroma collection alongside metadata, and indexed in a Hierarchical Navigable Small World graph. At query time the question is embedded the same way and the database returns the nearest vectors by L2, cosine, or inner-product distance. Each collection has its own embedding function and its own HNSW index, with metadata filters such as $eq, $gt, and $in applied at the same step as the vector search itself.

The minimum recommended RAM is 2 GB on the host. For production sizing, the official formula is N (millions of vectors) equals R (system RAM in GB) times 0.245, with one-thousand-and-twenty-four-dimensional embeddings, three metadata records, and a small document per embedding. That works out to roughly four gigabytes of RAM per one million vectors at 1024 dimensions, with the HNSW index resident in memory at all times during operation. Smaller embeddings like the 384-dim default need proportionally less RAM headroom.

The default embedding function is Sentence Transformers all-MiniLM-L6-v2 at 384 dimensions, run locally through ONNX Runtime. No external API key is required to get started, which is one of the reasons people pick Chroma for prototyping over services that need cloud credentials on the first install. Alternative providers are wired in by name in the client config, including OpenAI, Cohere, Google PaLM and Gemini, Hugging Face, Jina, Voyage, and Ollama, plus custom embedding functions written against a simple Python interface.

A chroma.sqlite3 file at the persistent directory holds system metadata for tenants, databases, collections, and segments. Each collection gets a UUID-named subdirectory containing its HNSW index files alongside the index metadata. Backups happen at the filesystem level, by snapshotting or copying the persistent directory with the server briefly stopped for a consistent point-in-time copy. The older DuckDB backend was removed in version 0.4.0 back in 2023 in favor of SQLite as the unified store for both local and client-server deployments.

Launch your Chroma DB on a VPS

Run self-hosted Chroma on GreenGeeks VPS hosting — strong CPU throughput, RAM headroom for HNSW indices, fast SSD persistence, and 24/7 uptime, all on 300% renewable-powered servers.

  • 24/7 uptime keeps your private RAG API always reachable.

  • RAM headroom keeps the HNSW index resident for flat latency.

  • Fast SSD persistence for chroma.sqlite3 and index files.

  • 300% renewable energy match on every VPS.