LangChain is one of the most widely used frameworks for building applications powered by large language models. It provides the building blocks for connecting LLMs to external data, tools, memory, and other AI components, turning a bare language model into a useful application. This guide gets you productive with LangChain v0.3 using modern patterns.
What LangChain Does
A raw LLM API call returns text. LangChain helps you:
- Chain multiple AI steps together into a pipeline
- Connect LLMs to tools like search, calculators, and databases
- Add memory so conversations retain context between turns
- Build RAG applications that answer questions from your documents
- Run agents that decide which tools to call based on the task
LangChain works with OpenAI, Anthropic, Google, and any Ollama-compatible local model.
Installation
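pip install langchain langchain-openai langchain-anthropic langchain-community langchain-core
pip install langchain-ollama # for local models
pip install chromadb # for vector storage
pip install pypdf # for the PDF loader used in the RAG section
pip install duckduckgo-search # for the search tool in the agents section (newer releases may expect the ddgs package instead)
pip install python-dotenv # for API key management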
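Create a .env file for your API keys, and call load_dotenv() (imported from the dotenv module) at the top of your script so the keys are read into the environment: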
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
Core Concept: LangChain Expression Language (LCEL)
LangChain v0.3 uses the | (pipe) operator to chain components. This is LCEL (LangChain Expression Language):
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Define components
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
prompt = ChatPromptTemplate.from_template("Explain {topic} in simple terms.")
parser = StrOutputParser()
# Chain them with |
chain = prompt | llm | parser
# Run it
result = chain.invoke({"topic": "quantum entanglement"})
print(result)
Every LCEL component implements the same Runnable interface, with invoke(), stream(), and batch() methods (plus async counterparts), which is what lets you compose them with |.
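To make the pipe idea concrete, here is a toy sketch of how a shared interface plus Python's `__or__` hook allows this kind of composition. It is not LangChain's implementation: the `Step` class and the three fake components are hypothetical stand-ins, and streaming is left out to keep it short.

```python
from typing import Any, Callable, Iterable, List


class Step:
    """Toy composable component (an illustration, not LangChain's Runnable)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        # Run the step on a single input.
        return self.fn(value)

    def batch(self, values: Iterable[Any]) -> List[Any]:
        # Run the step on many inputs, one after another.
        return [self.invoke(v) for v in values]

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for a prompt, a model, and a parser.
fake_prompt = Step(lambda d: f"Explain {d['topic']} in simple terms.")
fake_llm = Step(lambda text: {"content": text.upper()})
fake_parser = Step(lambda msg: msg["content"])

toy_chain = fake_prompt | fake_llm | fake_parser
print(toy_chain.invoke({"topic": "dns"}))
print(toy_chain.batch([{"topic": "tls"}, {"topic": "ssh"}]))
```

Because the composed object is itself a `Step`, it can be piped into further steps, which is the property that makes swapping components cheap.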
Working with Different LLMs
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_ollama import ChatOllama
# OpenAI
gpt = ChatOpenAI(model="gpt-4o", temperature=0.7)
# Anthropic
claude = ChatAnthropic(model="claude-3-5-sonnet-20241022")
# Local model via Ollama
llama = ChatOllama(model="llama3.2", temperature=0.5)
# All have the same interface
for llm in [gpt, claude, llama]:
    response = llm.invoke("What is 2 + 2?")
    print(response.content)
Prompt Templates
Prompt templates separate your prompts from your code:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# Simple template
simple = ChatPromptTemplate.from_template(
    "You are a {role}. Answer: {question}"
)
# Multi-turn chat template
chat = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Be concise."),
    MessagesPlaceholder("history"),  # slot for conversation history
    ("human", "{question}")
])
# Format and inspect
print(chat.format_messages(
    domain="cybersecurity",
    history=[],
    question="What is SQL injection?"
))
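One practical benefit of a template object is that it knows which variables it expects, so a missing value fails early instead of producing a half-filled prompt. The sketch below illustrates that idea with the standard library only; `template_variables` and `render` are hypothetical helpers, not LangChain functions.

```python
import string
from typing import Set


def template_variables(template: str) -> Set[str]:
    """Return the placeholder names in a str.format-style template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def render(template: str, **values: str) -> str:
    """Fill the template, refusing to run if any placeholder has no value."""
    missing = template_variables(template) - values.keys()
    if missing:
        raise KeyError(f"missing template variables: {sorted(missing)}")
    return template.format(**values)


template_text = "You are a {role}. Answer: {question}"
print(template_variables(template_text))
print(render(template_text, role="security analyst", question="What is XSS?"))
```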
Adding Memory to Conversations
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
llm = ChatOpenAI(model="gpt-4o-mini")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{input}")
])
chain = prompt | llm
# Wrap with history management
store = {}
def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]
with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)
# Session ID keeps conversations separate
config = {"configurable": {"session_id": "user-123"}}
print(with_history.invoke({"input": "My name is Alex."}, config=config).content)
print(with_history.invoke({"input": "What's my name?"}, config=config).content)
# Example output (exact wording will vary): "Your name is Alex."
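Conceptually, the history wrapper does three things on each call: look up the session's messages, add them to the prompt, and record the new turn afterwards. The following is a simplified sketch of that loop, not the library's code; `fake_model` and `invoke_with_history` are hypothetical names, and the "model" only reports how much history it was given.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)

# One message list per session ID, created on first use.
histories: Dict[str, List[Message]] = defaultdict(list)


def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for an LLM: reports how many prior messages it saw."""
    return f"I can see {len(messages) - 1} earlier message(s)."


def invoke_with_history(session_id: str, user_input: str) -> str:
    history = histories[session_id]
    # 1. Build the model input from stored history plus the new message.
    messages = history + [("human", user_input)]
    reply = fake_model(messages)
    # 2. Record both sides of the turn so the next call can see them.
    history.append(("human", user_input))
    history.append(("ai", reply))
    return reply


print(invoke_with_history("user-123", "My name is Alex."))
print(invoke_with_history("user-123", "What's my name?"))
print(invoke_with_history("user-456", "Hello"))  # separate session, empty history
```

Note that an in-memory store like this disappears when the process exits; a persistent backend would be needed to keep conversations across restarts.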
Building a RAG Application
RAG (Retrieval-Augmented Generation) lets your LLM answer questions from your documents.
Step 1: Load and Split Documents
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Load a PDF
loader = PyPDFLoader("security-report.pdf")
docs = loader.load()
# Or load a whole folder
# loader = DirectoryLoader("./docs", glob="**/*.pdf", loader_cls=PyPDFLoader)
# docs = loader.load()
# Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = splitter.split_documents(docs)
print(f"Created {len(chunks)} chunks")
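To see what chunk_size and chunk_overlap mean, here is a deliberately simplified fixed-width splitter. It is only an illustration: RecursiveCharacterTextSplitter also tries to break on natural separators such as paragraphs and sentences, which this sketch ignores. The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-width windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(pieces)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.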
Step 2: Create a Vector Store
from langchain_openai import OpenAIEmbeddings
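# This import path still works in v0.3 but is deprecated; newer code installs
# langchain-chroma and uses: from langchain_chroma import Chroma
from langchain_community.vectorstores import Chroma
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Create and persist the vector store
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"
)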
# Or use a local embedding model via Ollama
from langchain_ollama import OllamaEmbeddings
local_embeddings = OllamaEmbeddings(model="nomic-embed-text")
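A vector store answers the question "which stored chunks are closest to this query vector?". The sketch below shows that lookup with hand-made 3-dimensional vectors and cosine similarity. These are assumptions for illustration: real embeddings have hundreds of dimensions, and the metric a given store uses depends on its configuration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]], k: int) -> List[Tuple[float, str]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hand-made toy "embeddings" for three chunks.
toy_index = {
    "SQL injection in login form": [0.9, 0.1, 0.0],
    "Weak TLS configuration": [0.1, 0.9, 0.0],
    "Office lunch menu": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.2, 0.0]  # pretend this embeds a question about database attacks
results = top_k(query_vector, toy_index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```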
Step 3: Build the RAG Chain
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5}
)
prompt = ChatPromptTemplate.from_template("""
Answer the question using only the provided context.
If the answer isn't in the context, say you don't know.
Context:
{context}
Question: {question}
""")
llm = ChatOpenAI(model="gpt-4o-mini")
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
    RunnableParallel({"context": retriever | format_docs, "question": RunnablePassthrough()})
    | prompt
    | llm
    | StrOutputParser()
)
answer = rag_chain.invoke("What vulnerabilities were found in the Q3 audit?")
print(answer)
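The first step of the chain is the least obvious one: a single question string fans out into a dict with two keys, one filled by retrieval and one passed through unchanged. Here is a plain-Python sketch of that data flow, with a hypothetical `fake_retriever` and `Doc` class standing in for the real retriever and document type.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document."""
    page_content: str


def fake_retriever(question: str) -> List[Doc]:
    # Hypothetical retriever: always returns the same two chunks.
    return [Doc("Finding 1: open S3 bucket."), Doc("Finding 2: stale admin accounts.")]


def format_docs(docs: List[Doc]) -> str:
    return "\n\n".join(doc.page_content for doc in docs)


def fan_out(question: str) -> Dict[str, str]:
    # One branch retrieves and formats context; the other passes the input through.
    return {"context": format_docs(fake_retriever(question)), "question": question}


prompt_inputs = fan_out("What was found?")
print(prompt_inputs)
```

The resulting dict has exactly the keys the prompt template expects, which is why the prompt can follow directly in the pipeline.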
Tools and Agents
Agents can call tools to get real-world data or perform actions:
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.tools import tool
from langchain_community.tools import DuckDuckGoSearchRun
import requests
# Built-in tools
search = DuckDuckGoSearchRun()
# Custom tools with the @tool decorator
@tool
def get_ip_info(ip_address: str) -> str:
    """Look up geolocation and ASN info for an IP address."""
    response = requests.get(f"https://ipapi.co/{ip_address}/json/", timeout=5)
    data = response.json()
    return f"IP: {ip_address}, Country: {data.get('country_name')}, ISP: {data.get('org')}"
@tool
def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression such as '42 * 137'."""
    # WARNING: eval() with empty builtins is not a real sandbox. The model controls
    # this string, so use a proper expression parser before exposing this in production.
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error: {e}"
tools = [search, get_ip_info, calculate]
llm = ChatOpenAI(model="gpt-4o", temperature=0)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools. Use them when needed."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad")
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
result = executor.invoke({"input": "What is the current price of Bitcoin and what's 42 * 137?"})
print(result["output"])
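The calculate tool above relies on eval(), and emptying `__builtins__` does not reliably stop a crafted expression from reaching dangerous objects. One alternative is to parse the expression and allow only arithmetic nodes. The sketch below is a minimal version of that approach using the standard `ast` module; `safe_calculate` is a hypothetical name, and exponentiation is left out on purpose so a single expression cannot trigger a huge computation.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Only these operators are permitted; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_calculate(expression: str) -> Number:
    """Evaluate +, -, *, /, % on numeric literals; raise ValueError otherwise."""

    def evaluate(node: ast.AST) -> Number:
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return evaluate(ast.parse(expression, mode="eval"))


print(safe_calculate("42 * 137"))  # 5754
```

To use it as a tool, you could call `safe_calculate` inside the `@tool` function in place of eval() and return `str(result)`, keeping the existing try/except so errors come back to the model as text.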
Output Parsers
Structured output is critical for production apps:
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field  # v0.3 uses Pydantic v2; langchain_core.pydantic_v1 is deprecated
class SecurityFinding(BaseModel):
    severity: str = Field(description="Critical, High, Medium, or Low")
    title: str = Field(description="Short description of the finding")
    affected_component: str = Field(description="What is vulnerable")
    recommendation: str = Field(description="How to fix it")
parser = JsonOutputParser(pydantic_object=SecurityFinding)
prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract security findings from the text as JSON.\n{format_instructions}"),
    ("human", "{text}")
]).partial(format_instructions=parser.get_format_instructions())
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | parser
result = chain.invoke({
    "text": "The login endpoint doesn't rate limit requests, allowing brute force attacks."
})
print(result)
# Example output (values will vary between runs):
# {'severity': 'High', 'title': 'No rate limiting on login', ...}
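The parser hands back a plain dict, so it is worth checking the fields before acting on them; a model can omit a key or invent a severity level. The sketch below shows one way to do that with the standard library, assuming the four fields from SecurityFinding. `validate_finding` is a hypothetical helper, and the JSON string stands in for model output.

```python
import json
from typing import Dict

ALLOWED_SEVERITIES = {"Critical", "High", "Medium", "Low"}
REQUIRED_FIELDS = ("severity", "title", "affected_component", "recommendation")


def validate_finding(raw_json: str) -> Dict[str, str]:
    """Parse model output and reject it if fields are missing or severity is unknown."""
    finding = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if field not in finding]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if finding["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {finding['severity']!r}")
    return finding


# Stand-in for what a model might return.
sample_output = json.dumps({
    "severity": "High",
    "title": "No rate limiting on login",
    "affected_component": "login endpoint",
    "recommendation": "Add rate limiting and account lockout",
})
print(validate_finding(sample_output)["severity"])
```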
Streaming Responses
# Assumes prompt is the {topic} template and llm the model from the LCEL section
chain = prompt | llm | StrOutputParser()
# Stream tokens as they arrive
for chunk in chain.stream({"topic": "neural networks"}):
    print(chunk, end="", flush=True)
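In practice you often want to show chunks as they arrive and also keep the full text for logging or storage. The sketch below shows that pattern with a hypothetical `fake_stream` generator in place of a real chain, so it runs without an API key.

```python
from typing import Iterator, List


def fake_stream(text: str, chunk_size: int = 4) -> Iterator[str]:
    """Hypothetical stand-in for chain.stream(): yields the text in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


collected: List[str] = []
for chunk in fake_stream("Neural networks learn from data."):
    print(chunk, end="", flush=True)  # show each piece immediately
    collected.append(chunk)           # and keep it for later
print()

full_text = "".join(collected)
```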
LangSmith Observability
LangSmith traces every chain invocation for debugging:
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls__..."
os.environ["LANGCHAIN_PROJECT"] = "my-app"
# All chain invocations now appear in LangSmith dashboard
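If traces do not show up, a common cause is a missing or empty environment variable. A small startup check like the one below can catch that early. This is a sketch assuming the three variable names used above; `missing_tracing_vars` is a hypothetical helper, not part of LangSmith.

```python
import os
from typing import List, Mapping

REQUIRED_TRACING_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def missing_tracing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of tracing variables that are unset or empty."""
    return [name for name in REQUIRED_TRACING_VARS if not env.get(name)]


missing = missing_tracing_vars()
if missing:
    print(f"Tracing is not fully configured; missing: {missing}")
else:
    print("Tracing variables are set.")
```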
LangChain gives you the components to build anything from a simple Q&A bot to a complex multi-agent workflow. Start with a simple chain, add retrieval when you need document grounding, and layer in agents when you need the LLM to take actions. The LCEL pipe syntax makes it easy to swap components as your needs evolve.