Question 1

What is freelm?

Accepted Answer

freelm is a free, open-source LLM client and gateway for Python (pip install freelm) and Node.js/TypeScript (npm install freelm). It pools six free-tier LLM providers — OpenRouter, Google AI Studio (Gemini), NVIDIA NIM, Groq, Cerebras, and Mistral — behind a single OpenAI-compatible call. It handles API-key rotation, cross-provider failover, circuit breaking, quota-aware routing, and live free-model discovery automatically. You supply whichever free keys you have and freelm keeps your application talking to an LLM even when one provider rate-limits or goes down. It is MIT-licensed and maintained by Shahriar Labs.

Question 2

Is freelm actually free?

Accepted Answer

Yes. The freelm package itself is MIT-licensed and costs nothing. It routes requests exclusively to the free tiers of supported providers — OpenRouter (:free models), Google AI Studio (free quota), NVIDIA NIM (build credits), Groq (30 RPM / 14,400 req/day), Cerebras (~1M tokens/day), and Mistral (Experiment tier: 2 RPM, 1B tokens/month). None of these providers require a credit card for their free tier. Your only cost is your own compute and bandwidth; freelm does not proxy requests through any paid service. Rate limit numbers are defaults as of June 2026 and may change — freelm lets you override rpm and rpd per provider.

Question 3

How do I install freelm for Python?

Accepted Answer

Install freelm from PyPI with: pip install freelm (requires Python 3.8 or later). Then set one or more provider API keys as environment variables — for example OPENROUTER_API_KEY, GEMINI_API_KEY, GROQ_API_KEY — and call freelm.FreeLLM.from_env() to create a client. The client auto-detects which keys are present and builds a pool from them. You can also pass providers explicitly: FreeLLM([OpenRouter('sk-or-...'), GoogleAIStudio('AIza...')]) for full control over which providers are used and in what order. No further configuration is required to start making chat completions.

Question 4

How do I install freelm for Node.js or TypeScript?

Accepted Answer

Install the npm package with: npm install freelm (requires Node.js 18 or later). Import and use it as: import { FreeLLM } from 'freelm'; const llm = FreeLLM.fromEnv(); console.log(await llm.text('Hello')). The Node.js port has zero runtime dependencies — it uses the built-in fetch API. It supports the same API surface as the Python version: chat(), text(), stream(), health(), and the drop-in OpenAI shim via import { OpenAI } from 'freelm/compat'. TypeScript types are bundled. The same environment variables (OPENROUTER_API_KEY, GEMINI_API_KEY, etc.) are used for key loading.

Question 5

How does the OpenAI drop-in shim work?

Accepted Answer

freelm ships a compatibility shim that mirrors the OpenAI Python SDK and JS SDK interfaces exactly. In Python, replace 'from openai import OpenAI' with 'from freelm.compat import OpenAI' and your existing client.chat.completions.create(...) calls will work unchanged — backed by FreeLLM.from_env() instead of the OpenAI API. In Node.js, replace 'import OpenAI from openai' with 'import { OpenAI } from freelm/compat'. Use model='auto' (or any virtual model alias) to let freelm pick the best available free model. No other code changes are required, making migration instantaneous.

Question 6

How does the automatic failover and circuit breaker work?

Accepted Answer

When freelm receives a 429 rate-limit response, it cools that key and rotates to the next available key or provider. A 5xx server error or timeout triggers a circuit breaker: that key's breaker opens, requests skip it, and after a cooldown the breaker half-opens to allow one test request through. A 401/403 auth error disables the key permanently for that session. The failover is interleaved across providers — the best model of every provider is tried before any provider's second model — ensuring every provider is reached quickly rather than exhausting one provider's model list first. The max_attempts parameter (default 12) caps total tries per call.

Question 7

What are virtual models and how do they work?

Accepted Answer

Because model names differ across providers (e.g. 'llama-3.3-70b-instruct:free' on OpenRouter vs 'llama3-70b-8192' on Groq), freelm lets you request by intent rather than exact model ID. The built-in aliases are: 'auto' or 'chat' (any available chat model), 'chat:large' or 'large' (a larger/stronger model), 'chat:fast' or 'fast' (a fast/cheap model), 'chat:small' or 'small' (the smallest model), and any 'vendor/model-id' passthrough for exact control. freelm resolves each alias to a concrete model per provider. Free model IDs change constantly, so it discovers them live via each provider's /models API and caches results to disk.

Question 8

What routing strategies does freelm support?

Accepted Answer

freelm supports four routing strategies set at client construction: priority (providers tried in ascending priority integer, deterministic), round_robin (rotates which provider goes first each call, spreading load evenly), quota_aware (ranks providers by current remaining quota headroom — keys nearing their daily limit score lower, pushing traffic to less-used providers), and latency (prefers the provider with the lowest observed EWMA latency, measured per call). The default is priority. You switch strategies per FreeLLM instance. Within any strategy, failover is always interleaved across providers so no single provider monopolizes retries.

One client. Six free LLM providers.
Always up.

Quickstart

Supported Free Providers

OpenRouter

Google AI Studio

NVIDIA NIM

Groq

Cerebras

Mistral

How it keeps working

Multi-Provider Pooling

Automatic Failover

Drop-in OpenAI Shim

Circuit Breakers & Quota Guards

Live Model Discovery

Four Routing Strategies

Frequently Asked Questions

What is freelm?

Is freelm actually free?

How do I install freelm for Python?

How do I install freelm for Node.js or TypeScript?

How does the OpenAI drop-in shim work?

How does the automatic failover and circuit breaker work?

What are virtual models and how do they work?

What routing strategies does freelm support?

Start using free LLMs in 60 seconds

One client. Six free LLM providers.Always up.

Quickstart

Supported Free Providers

OpenRouter

Google AI Studio

NVIDIA NIM

Groq

Cerebras

Mistral

How it keeps working

Multi-Provider Pooling

Automatic Failover

Drop-in OpenAI Shim

Circuit Breakers & Quota Guards

Live Model Discovery

Four Routing Strategies

Frequently Asked Questions

What is freelm?

Is freelm actually free?

How do I install freelm for Python?

How do I install freelm for Node.js or TypeScript?

How does the OpenAI drop-in shim work?

How does the automatic failover and circuit breaker work?

What are virtual models and how do they work?

What routing strategies does freelm support?

Start using free LLMs in 60 seconds

One client. Six free LLM providers.
Always up.