"""
Tutorial: Loading different LLM providers with a unified helper function.

This example demonstrates how to use the load_llm utility to switch between
multiple LLM providers (OpenAI, OpenRouter, Groq, Ollama) without changing
downstream code. The load_llm function abstracts away provider-specific setup
and returns a LangChain-compatible LLM instance.

API Key Setup:
- Store your API keys in a .env file in the project root.
- The keys are loaded automatically when llm_loader is imported.

Example .env file:
    OPENAI_API_KEY=your_openai_key_here
    OPENROUTER_API_KEY=your_openrouter_key_here
    GROQ_API_KEY=your_groq_key_here

For Ollama, no API key is required (it runs locally).
"""
from llm_loader import load_llm
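# For reference, below is a minimal sketch of what llm_loader.load_llm might
# look like internally. This is an assumption for illustration, not the actual
# module: it maps each provider name to the corresponding LangChain chat-model
# class and points OpenRouter at its OpenAI-compatible endpoint.
#
#     import os
#     from dotenv import load_dotenv
#     from langchain_groq import ChatGroq
#     from langchain_ollama import ChatOllama
#     from langchain_openai import ChatOpenAI
#
#     load_dotenv()  # reads API keys from .env into os.environ
#
#     def load_llm(provider: str, model: str):
#         if provider == "openai":
#             return ChatOpenAI(model=model)
#         if provider == "openrouter":
#             # OpenRouter exposes an OpenAI-compatible API, so the OpenAI
#             # client class can be reused with a different base URL.
#             return ChatOpenAI(
#                 model=model,
#                 base_url="https://openrouter.ai/api/v1",
#                 api_key=os.environ["OPENROUTER_API_KEY"],
#             )
#         if provider == "groq":
#             return ChatGroq(model=model)
#         if provider == "ollama":
#             return ChatOllama(model=model)
#         raise ValueError(f"Unsupported provider: {provider}")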
# --- OpenAI Example ---
# Loads an OpenAI model (here, GPT-4o-mini).
# Requires a valid OpenAI API key in your environment (.env or system variable).
llm = load_llm("openai", "gpt-4o-mini")
print(llm.invoke("Hello from OpenAI!"))
# --- OpenRouter Example ---
# Loads a model via OpenRouter (a router service that provides access to multiple LLMs).
# In this case, it uses the open-source GPT-OSS-20B model (free tier).
# Requires OPENROUTER_API_KEY in your environment.
llm = load_llm("openrouter", "openai/gpt-oss-20b:free")
print(llm.invoke("Hello from OpenRouter!"))
# Output:
# content='Hello! 👋 How can I help you today?' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 90, 'prompt_tokens': 76, 'total_tokens': 166, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'openai/gpt-oss-20b:free', 'system_fingerprint': None, 'id': 'gen-1755707385-JsQdKdbu290XqI1iG22Q', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None} id='run--a0bc7ae6-fee2-45cc-bb41-9f5401b8fbc7-0' usage_metadata={'input_tokens': 76, 'output_tokens': 90, 'total_tokens': 166, 'input_token_details': {}, 'output_token_details': {}}
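# Note: invoke() returns a LangChain AIMessage, which is why print() shows the
# content plus token-usage and response metadata, as above. If you only want
# the generated text, read the message's .content attribute:
response = llm.invoke("Hello from OpenRouter!")
print(response.content)  # prints just the reply text, no metadata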
# --- Groq Example ---
# Loads a Groq-hosted model (here, Llama 3.1 8B Instant).
# Groq specializes in low-latency inference.
# Requires GROQ_API_KEY in your environment.
llm = load_llm("groq", "llama-3.1-8b-instant")
print(llm.invoke("Hello from Groq!"))
# Output:
# content='Hello from me as well. Groq is a company known for its high-performance AI computing hardware. They specialize in developing custom ASICs (Application-Specific Integrated Circuits) for AI workloads. Their technology is designed to accelerate AI inference and training, making it a key player in the AI hardware space. What brings you to this conversation today?' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 71, 'prompt_tokens': 40, 'total_tokens': 111, 'completion_time': 0.165241813, 'prompt_time': 0.00627803, 'queue_time': 0.19492126, 'total_time': 0.171519843}, 'model_name': 'llama-3.1-8b-instant', 'system_fingerprint': 'fp_510c177af0', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} id='run--a022eb56-8131-4e33-82c6-151c775d5be1-0' usage_metadata={'input_tokens': 40, 'output_tokens': 71, 'total_tokens': 111}
# --- Ollama Example ---
# Loads a local Ollama model (here, Llama 3).
# Ollama runs models locally on your machine without requiring cloud APIs.
# Make sure Ollama is installed and running on your system.
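# If Ollama is not set up yet, install it from https://ollama.com and run
# `ollama pull llama3` once to download the model weights.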
llm = load_llm("ollama", "llama3")
print(llm.invoke("Hello from Ollama!"))
content="Hello there, Ollama! It's great to meet you! What brings you here today? Do you have a question, or are you just looking for some friendly conversation? I'm all ears (or rather, all text)!" additional_kwargs={} response_metadata={'model': 'llama3', 'created_at': '2025-08-20T16:28:02.392780314Z', 'done': True, 'done_reason': 'stop', 'total_duration': 18208991006, 'load_duration': 16401506467, 'prompt_eval_count': 16, 'prompt_eval_duration': 566493103, 'eval_count': 48, 'eval_duration': 1239875258, 'model_name': 'llama3'} id='run--c59ca93d-27ab-4d1b-9404-e7d70242b021-0' usage_metadata={'input_tokens': 16, 'output_tokens': 48, 'total_tokens': 64}