🔗 LangChain Framework & Applications
Unit 4: Building Production-Ready GenAI Apps
From Simple Chains to Complex AI Systems
🚀 What We'll Build Together
Your Journey
We've learned prompts. Now let's chain them together to build real applications!
🧠 What You'll Learn
- LangChain architecture
- Chaining LLM calls
- Memory for conversations
- Agents with tools
- Document processing
- Production apps
🏗️ What You'll Build
- Multi-step AI workflows
- Chatbots with memory
- Document Q&A systems
- Autonomous agents
- Complex pipelines
- Real applications!
🎯 By the end: You'll have the skills to build production-ready GenAI applications that companies actually use!
🤔 Why LangChain?
The Problem Without LangChain
# Without LangChain - messy, repetitive code
import openai
response1 = openai.ChatCompletion.create(...)
result1 = response1['choices'][0]['message']['content']
response2 = openai.ChatCompletion.create(
messages=[{"role": "user", "content": result1}]
)
result2 = response2['choices'][0]['message']['content']
# Managing conversation history manually...
# Handling errors manually...
# Switching models manually...
With LangChain - Clean & Powerful
# With LangChain - clean, composable
from langchain import LLMChain, PromptTemplate
from langchain.chat_models import ChatOpenAI
chain = LLMChain(llm=ChatOpenAI(), prompt=my_prompt)
result = chain.run(input_text)
# Memory, error handling, model switching - all built in!
🧩 Composable
Build complex flows from simple components
🔄 Reusable
Write once, use everywhere
🚀 Production-Ready
Error handling, logging, monitoring
🏗️ LangChain Architecture Overview
LangChain Core Components
Models
LLMs, Chat Models, Embeddings
Prompts
Templates, Examples, Selectors
Chains
Sequential, Parallel, Router
Memory
Buffer, Summary, Vector Store
Agents
ReAct, OpenAI Functions, Plan-Execute
Tools
Search, Calculators, APIs
💡 Think of it as: LEGO blocks for AI applications. Each component snaps together!
📚 Core Components Explained
1. Models (LLMs)
from langchain_openai import ChatOpenAI
from langchain.llms import OpenAI
# Chat model (GPT-4, GPT-3.5)
chat = ChatOpenAI(model="gpt-4")
# Completion model
llm = OpenAI(temperature=0.7)
Abstraction over different LLM providers
2. Prompt Templates
from langchain import PromptTemplate
prompt = PromptTemplate(
template="Explain {topic} to a {audience}",
input_variables=["topic", "audience"]
)
Reusable prompt structures with variables
3. Output Parsers
from langchain.output_parsers import PydanticOutputParser
parser = PydanticOutputParser(
pydantic_object=MyDataClass
)
Structure LLM outputs into Python objects
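The idea behind an output parser can be shown without LangChain at all: take the model's raw string and turn it into a typed object. A framework-free sketch (the `Tagline` class and the JSON reply shape are invented for illustration):

```python
import json
from dataclasses import dataclass

@dataclass
class Tagline:
    text: str
    tone: str

def parse_tagline(raw: str) -> Tagline:
    """Turn a model's JSON reply string into a typed Python object."""
    data = json.loads(raw)
    return Tagline(text=data["text"], tone=data["tone"])

llm_output = '{"text": "Pure Water, Pure Planet", "tone": "eco"}'
tagline = parse_tagline(llm_output)
print(tagline.text)  # Pure Water, Pure Planet
```

PydanticOutputParser does the same thing, plus generating format instructions for the prompt and validating fields.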
4. Document Loaders
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("file.pdf")
docs = loader.load()
Load data from various sources
⚙️ Installation & Setup
Installing LangChain
# Basic installation
pip install langchain
# With OpenAI
pip install langchain openai
# With other providers
pip install langchain anthropic google-generativeai
# All extras
pip install "langchain[all]"  # quotes keep zsh from expanding the brackets
Basic Setup
import os
from langchain_openai import ChatOpenAI
# Set API key
os.environ["OPENAI_API_KEY"] = "your-key-here"
# Or load from .env file
from dotenv import load_dotenv
load_dotenv()
# Initialize model
llm = ChatOpenAI(temperature=0.7, model="gpt-4o-mini")
⚠️ Security: Never commit API keys to Git! Use environment variables or .env files.
⛓️ What Are Chains?
Definition
Chains = Sequences of calls to LLMs, tools, or data processing steps
Simple Chain Flow
"Translate to French" → Format instruction → Generate response → "Traduire en français"
Types of Chains
LLMChain
Basic: Prompt + LLM
SequentialChain
Multiple steps in order
RouterChain
Choose path based on input
🔗 LLMChain: The Building Block
Basic LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
# 1. Create prompt template
prompt = PromptTemplate(
input_variables=["product"],
template="Generate 3 taglines for {product}"
)
# 2. Initialize LLM
llm = ChatOpenAI(
model="gpt-4o-mini",
temperature=0.8
)
# 3. Create chain (LCEL style)
chain = prompt | llm
# 4. Run chain (LCEL chains take a dict of input variables)
result = chain.invoke({"product": "eco-friendly water bottle"})
print(result.content)
Output:
1. "Hydrate Sustainably, Live Responsibly"
2. "Pure Water, Pure Planet"
3. "Sip Green, Think Clean"
💡 Key Benefit: Reusable! Change the product; the same chain works.
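Under the hood, the `prompt | llm` pipe is just function composition. A framework-free sketch of the idea, with a stubbed "model" in place of a real API call (the `Runnable` class here is a toy, not LangChain's actual implementation):

```python
class Runnable:
    """Minimal stand-in for LangChain's pipe (|) composition."""
    def __init__(self, fn):
        self.fn = fn
    def invoke(self, x):
        return self.fn(x)
    def __or__(self, other):
        # a | b means: run a, feed its output into b
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# A "prompt" step and a stubbed "LLM" step (no API calls here)
prompt = Runnable(lambda d: f"Generate 3 taglines for {d['product']}")
fake_llm = Runnable(lambda text: f"[model reply to: {text}]")

chain = prompt | fake_llm
print(chain.invoke({"product": "water bottle"}))
```

Swapping `fake_llm` for a real model object is all LangChain's version adds conceptually; the composition stays the same.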
🔄 Sequential Chains: Multi-Step Processing
Sequential Chain Flow
from langchain.chains import SimpleSequentialChain
# Chain 1: Generate idea
chain1 = LLMChain(llm=llm, prompt=idea_prompt)
# Chain 2: Write opening
chain2 = LLMChain(llm=llm, prompt=opening_prompt)
# Combine them
overall_chain = SimpleSequentialChain(
chains=[chain1, chain2],
verbose=True
)
result = overall_chain.run("mystery novel")
📝 Prompt Templates in LangChain
Why Templates?
❌ Without Templates
prompt = f"Translate {text} to {lang}"
# Hardcoded, not reusable
# String formatting errors
# No validation
✅ With Templates
prompt = PromptTemplate(
template="Translate {text} to {lang}",
input_variables=["text", "lang"]
)
# Reusable, validated, composable
Advanced Template Features
from langchain import PromptTemplate
# 1. Basic template
basic = PromptTemplate.from_template("Tell me about {topic}")
# 2. Multi-variable template
multi = PromptTemplate(
template="""
You are a {role}.
Task: {task}
Context: {context}
Output format: {format}
""",
input_variables=["role", "task", "context", "format"]
)
# 3. Template with few-shot examples
few_shot = PromptTemplate(
template="""
Examples:
{examples}
Now do: {input}
""",
input_variables=["examples", "input"]
)
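What a template buys you over raw f-strings — named variables plus validation — can be sketched in plain Python (the `format_prompt` helper below is hypothetical, not a LangChain API):

```python
import string

def format_prompt(template: str, **variables) -> str:
    """Fill a template, failing loudly if a variable is missing."""
    expected = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = expected - variables.keys()
    if missing:
        raise ValueError(f"Missing variables: {sorted(missing)}")
    return template.format(**variables)

prompt = format_prompt(
    "Explain {topic} to a {audience}",
    topic="recursion", audience="beginner",
)
print(prompt)  # Explain recursion to a beginner
```

PromptTemplate adds the same validation via `input_variables`, plus composition with other components.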
💬 Chat Prompt Templates
For Chat Models (GPT-4, Claude, etc.)
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
# System message (sets behavior)
system_template = "You are a helpful {role} who {style}."
system_message = SystemMessagePromptTemplate.from_template(system_template)
# Human message (user input)
human_template = "{user_input}"
human_message = HumanMessagePromptTemplate.from_template(human_template)
# Combine into chat prompt
chat_prompt = ChatPromptTemplate.from_messages([
system_message,
human_message
])
# Format and use
messages = chat_prompt.format_prompt(
role="Python tutor",
style="explains with examples",
user_input="How do I use list comprehensions?"
).to_messages()
Result: Properly formatted messages for chat models with system and user roles
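The end result of a chat prompt template is just a list of role-tagged messages. A minimal sketch of the shape `to_messages()` ultimately feeds to chat APIs (the `chat_prompt` helper is invented for illustration):

```python
def chat_prompt(role: str, style: str, user_input: str) -> list[dict]:
    """Assemble system + user messages in the dict shape chat APIs expect."""
    return [
        {"role": "system", "content": f"You are a helpful {role} who {style}."},
        {"role": "user", "content": user_input},
    ]

messages = chat_prompt("Python tutor", "explains with examples",
                       "How do I use list comprehensions?")
print(messages[0]["content"])
```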
🧠 Why Memory Matters
The Problem
LLMs are stateless - they don't remember previous conversations!
❌ Without Memory
User: "My name is Alice"
AI: "Nice to meet you, Alice!"
User: "What's my name?"
AI: "I don't know your name."
✅ With Memory
User: "My name is Alice"
AI: "Nice to meet you, Alice!"
User: "What's my name?"
AI: "Your name is Alice!"
💡 Memory = Store and retrieve conversation history to maintain context
📊 Types of Memory
| Memory Type | Description | Use Case |
|---|---|---|
| ConversationBufferMemory | Stores all messages | Short conversations |
| ConversationBufferWindowMemory | Keeps last N messages | Medium conversations |
| ConversationSummaryMemory | Summarizes old messages | Long conversations |
| ConversationKGMemory | Extracts knowledge graph | Complex relationships |
| VectorStoreMemory | Semantic search over history | Large context retrieval |
⚠️ Token Limits: More memory = more tokens = higher costs. Choose wisely!
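The window strategy from the table is simple enough to sketch in plain Python: keep the last k exchanges and drop the rest. A toy stand-in for ConversationBufferWindowMemory, not its actual implementation:

```python
from collections import deque

class WindowMemory:
    """Keep only the last k exchanges, like ConversationBufferWindowMemory."""
    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # each turn = (user, ai)
    def save(self, user: str, ai: str):
        self.turns.append((user, ai))  # oldest turn auto-evicted past k
    def as_prompt(self) -> str:
        return "\n".join(f"User: {u}\nAI: {a}" for u, a in self.turns)

memory = WindowMemory(k=2)
memory.save("My name is Alice", "Nice to meet you, Alice!")
memory.save("I like hiking", "Hiking is great exercise!")
memory.save("Any trail tips?", "Start early and bring water.")
print(memory.as_prompt())  # only the last two turns survive
```

This is also why window memory can "forget" a name mentioned early on: once that turn falls out of the window, it is gone.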
💾 ConversationBufferMemory Example
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
# 1. Create memory
memory = ConversationBufferMemory()
# 2. Create conversation chain with memory
conversation = ConversationChain(
llm=ChatOpenAI(temperature=0.7),
memory=memory,
verbose=True
)
# 3. Have a conversation
response1 = conversation.predict(input="Hi, I'm learning LangChain")
print(response1)
response2 = conversation.predict(input="What was I just learning about?")
print(response2) # Remembers: "You're learning LangChain!"
# 4. View conversation history
print(memory.buffer)
✨ Magic: The chain automatically includes conversation history in each call!
🤖 What Are Agents?
Definition
Agents = AI systems that can decide which tools to use and in what order to accomplish a task
Agent Decision-Making Loop
🔁 ReAct: Reasoning + Acting
ReAct Pattern
Thought → Action → Observation → (repeat) → Answer
Example: "What's 25% of the population of France?"
Thought 1: I need to find France's population first
Action 1: search("population of France")
Observation 1: 67 million
Thought 2: Now I need to calculate 25% of 67 million
Action 2: calculator("67000000 * 0.25")
Observation 2: 16,750,000
Thought 3: I have the answer
Final Answer: 25% of France's population is 16.75 million
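The loop itself is simple: pick an action, run the tool, feed the observation back. A toy sketch with stubbed tools and a pre-scripted action sequence (real agents let the LLM choose each step; the `search` and `calculator` stubs are made up):

```python
# Hypothetical stub tools standing in for real search / calculator tools
def search(query: str) -> str:
    return "67000000" if "France" in query else "unknown"

def calculator(expr: str) -> str:
    return str(eval(expr))  # toy only; never eval untrusted input

TOOLS = {"search": search, "calculator": calculator}

def react_loop(steps):
    """Run scripted (tool, input) actions, collecting observations."""
    observations = []
    for tool, tool_input in steps:
        observations.append(TOOLS[tool](tool_input))
    return observations

obs = react_loop([("search", "population of France"),
                  ("calculator", "67000000 * 0.25")])
print(obs)  # ['67000000', '16750000.0']
```

In a real agent, the model reads each observation and decides the next action itself, which is exactly what the Thought/Action trace above shows.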
🔧 Building Your First Agent
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
# 1. Initialize LLM
llm = ChatOpenAI(temperature=0)
# 2. Load tools
tools = load_tools([
"serpapi", # Web search
"llm-math", # Calculator
], llm=llm)
# 3. Initialize agent
agent = initialize_agent(
tools=tools,
llm=llm,
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)
# 4. Run agent
result = agent.run("What's the square root of the population of Tokyo?")
print(result)
The agent will:
- Search for Tokyo's population
- Use calculator to find square root
- Return the answer
🛠️ Creating Custom Tools
from langchain.tools import BaseTool
# Method 1: Function decorator
from langchain.tools import tool
@tool
def get_word_length(word: str) -> int:
"""Returns the length of a word"""
return len(word)
# Method 2: Custom Tool class
class CustomSearchTool(BaseTool):
    name: str = "custom_search"
    description: str = "Useful for searching the company database"
def _run(self, query: str) -> str:
# Your custom logic here
results = search_company_db(query)
return results
def _arun(self, query: str):
raise NotImplementedError("Async not supported")
# Use custom tools with agent
tools = [get_word_length, CustomSearchTool()]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
📄 Document Loaders
Loading Data from Various Sources
File Loaders
from langchain.document_loaders import (
PyPDFLoader,
TextLoader,
CSVLoader,
UnstructuredWordDocumentLoader
)
# Load PDF
pdf_loader = PyPDFLoader("file.pdf")
docs = pdf_loader.load()
# Load Text
text_loader = TextLoader("file.txt")
docs = text_loader.load()
Web & API Loaders
from langchain.document_loaders import (
WebBaseLoader,
NotionDBLoader,
GoogleDriveLoader
)
# Load webpage
web_loader = WebBaseLoader("https://example.com")
docs = web_loader.load()
# Load from Notion
notion_loader = NotionDBLoader(integration_token=token, database_id=database_id)
docs = notion_loader.load()
📦 80+ Built-in Loaders: PDF, Word, Excel, HTML, Markdown, JSON, SQL databases, Cloud storage, APIs, and more!
✂️ Text Splitting: Why & How
Why Split Text?
LLMs have token limits. Large documents must be split into chunks.
Character-based Splitting
from langchain.text_splitter import CharacterTextSplitter
splitter = CharacterTextSplitter(
chunk_size=1000, # Characters per chunk
chunk_overlap=200, # Overlap between chunks
separator="\n"
)
chunks = splitter.split_text(long_text)
Recursive Splitting (Smart)
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200,
separators=["\n\n", "\n", " ", ""]
)
chunks = splitter.split_documents(docs)
💡 Best Practice: Use RecursiveCharacterTextSplitter - it tries to keep paragraphs together!
🎯 Chunking Strategy Matters
| Chunk Size | Pros | Cons | Use Case |
|---|---|---|---|
| Small (200-500) | Precise retrieval | May lose context | FAQ, definitions |
| Medium (500-1500) | Good balance | - | Most applications |
| Large (1500-3000) | More context | Less precise, costly | Long documents, narratives |
Overlap is Crucial
❌ No Overlap
Chunk 1: "...important concept is"
Chunk 2: "called machine learning..."
Sentence split across chunks! 😱
✅ With Overlap (200)
Chunk 1: "...important concept is called machine learning..."
Chunk 2: "is called machine learning which..."
Context preserved! ✨
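Overlapping chunks are easy to sketch: advance by chunk_size minus overlap, so each chunk repeats the tail of the previous one. A simplified stand-in for the real splitters (assumes overlap < chunk_size):

```python
def split_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Fixed-size character chunks where each chunk repeats the
    last `overlap` characters of the previous chunk."""
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "the important concept is called machine learning"
for chunk in split_text(text, chunk_size=30, overlap=10):
    print(repr(chunk))
```

The real splitters add the smarter part: preferring to cut at paragraph and sentence boundaries instead of mid-word.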
💬 Building a Complete Chatbot
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import (
ChatPromptTemplate,
SystemMessagePromptTemplate,
HumanMessagePromptTemplate,
MessagesPlaceholder
)
# 1. Create memory (keeps last 5 exchanges)
memory = ConversationBufferWindowMemory(
k=5,
return_messages=True
)
# 2. Create prompt with memory placeholder
prompt = ChatPromptTemplate.from_messages([
SystemMessagePromptTemplate.from_template(
"You are a helpful AI assistant for a tech company."
),
MessagesPlaceholder(variable_name="history"),
HumanMessagePromptTemplate.from_template("{input}")
])
# 3. Create conversation chain
conversation = ConversationChain(
llm=ChatOpenAI(temperature=0.7),
memory=memory,
prompt=prompt,
verbose=True
)
# 4. Chat loop
while True:
user_input = input("You: ")
if user_input.lower() == "quit":
break
response = conversation.predict(input=user_input)
print(f"AI: {response}")
⚡ Streaming Responses (Like ChatGPT)
Why Streaming?
Users see responses appear word-by-word instead of waiting for the entire response
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# Enable streaming
llm = ChatOpenAI(
temperature=0.7,
streaming=True,
callbacks=[StreamingStdOutCallbackHandler()]
)
# Use with chain
chain = LLMChain(llm=llm, prompt=prompt)
response = chain.run("Write a story about AI")
# Text appears word by word!
✨ Result: Professional user experience with real-time feedback
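Mechanically, streaming is just consuming a generator of tokens and printing them as they arrive. A sketch with a fake token stream standing in for a real streaming client:

```python
def fake_stream(text: str):
    """Yield a response token by token, as a streaming LLM client would."""
    for token in text.split():
        yield token

def display(stream) -> str:
    """Print tokens as they arrive, then return the full response."""
    parts = []
    for token in stream:
        print(token, end=" ", flush=True)  # appears incrementally
        parts.append(token)
    print()
    return " ".join(parts)

full = display(fake_stream("Once upon a time an AI learned to stream"))
```

The callback handler in the LangChain snippet plays the role of `display`: it gets invoked once per token as the model emits it.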
📄 Case Study 1: Document Q&A System
Build a "Chat with Your PDFs" App
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
# 1. Load document
loader = PyPDFLoader("company_handbook.pdf")
documents = loader.load()
# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
# 3. Create embeddings and store in vector DB
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
# 4. Create retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
llm=ChatOpenAI(temperature=0),
chain_type="stuff",
retriever=vectorstore.as_retriever()
)
# 5. Ask questions!
result = qa_chain.run("What's the vacation policy?")
print(result)
🎯 Result: A working document Q&A system in ~15 lines of code!
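The retrieval step boils down to scoring chunks against the question and stuffing the winner into the prompt. A toy sketch using word overlap instead of embeddings (the handbook snippets are invented; real systems use embedding similarity):

```python
import re

def words(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def score(query: str, doc: str) -> int:
    """Toy relevance: count shared words (real systems compare embeddings)."""
    return len(words(query) & words(doc))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

handbook = [
    "Vacation policy: employees accrue 15 days of paid vacation per year.",
    "Expense reports must be filed within 30 days of purchase.",
    "Office dress code: casual, except for client meetings.",
]
question = "What is the vacation policy?"
context = retrieve(question, handbook)[0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(context)
```

The `chain_type="stuff"` option in RetrievalQA does exactly this last step: it "stuffs" the retrieved chunks into the prompt before calling the LLM.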
🔬 Case Study 2: Research Assistant Agent
Multi-Tool Agent for Research Tasks
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import Tool
from langchain.utilities import WikipediaAPIWrapper, PythonREPL
# Define tools
wikipedia = WikipediaAPIWrapper()
python_repl = PythonREPL()
tools = [
Tool(
name="Wikipedia",
func=wikipedia.run,
description="Search Wikipedia for factual information"
),
Tool(
name="Python_REPL",
func=python_repl.run,
description="Execute Python code for calculations"
)
]
# Create agent
agent = initialize_agent(
tools,
ChatOpenAI(temperature=0),
agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)
# Complex research query
result = agent.run("""
What was the GDP of Japan in 2020, and what's 15% of that amount?
""")
💡 The agent will: Search Wikipedia for Japan's GDP → Extract the number → Use Python to calculate 15%
🔧 Case Study 3: Customer Support Bot
With Memory + Document Retrieval + Structured Output
from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryMemory
from langchain.prompts import PromptTemplate
# System prompt for support agent
template = """You are a customer support agent for TechCorp.
Current conversation:
{history}
Customer: {input}
Guidelines:
- Be helpful and empathetic
- Check knowledge base before answering
- Escalate complex issues to human agents
- Always maintain professional tone
Response:"""
prompt = PromptTemplate(
input_variables=["history", "input"],
template=template
)
# Use summary memory for long conversations
memory = ConversationSummaryMemory(
    llm=ChatOpenAI(temperature=0)
)
# Create support chain
support_chain = ConversationChain(
llm=ChatOpenAI(temperature=0.7),
prompt=prompt,
memory=memory,
verbose=True
)
# Handle customer queries
response = support_chain.predict(input="My order hasn't arrived yet")
🚀 Advanced LangChain Patterns
1. Router Chains
Route to different chains based on input type
from langchain.chains.router import MultiPromptChain
# One entry per topic: name, description, and prompt template
prompt_infos = [
    {"name": "tech", "description": "technical questions", "prompt_template": tech_template},
    {"name": "sales", "description": "sales questions", "prompt_template": sales_template},
    {"name": "hr", "description": "HR questions", "prompt_template": hr_template},
]
router = MultiPromptChain.from_prompts(llm, prompt_infos)
2. MapReduce Chains
Process multiple documents in parallel
from langchain.chains.summarize import load_summarize_chain
# Summarize each doc (map), then combine the summaries (reduce)
chain = load_summarize_chain(llm, chain_type="map_reduce")
3. Sequential Chains
Multi-step workflows with complex logic
from langchain.chains import SequentialChain
# Chain outputs feed into next chain
overall_chain = SequentialChain(
    chains=[analyze, summarize, recommend],
    input_variables=["text"]  # must match the first chain's input keys
)
4. Fallback Chains
Try multiple strategies if one fails
# LCEL runnables and chains support .with_fallbacks()
# Try GPT-4 first, fall back to GPT-3.5
chain = expensive_chain.with_fallbacks([cheap_chain])
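The fallback pattern itself is a few lines of plain Python: try each option in order until one succeeds. A sketch, not LangChain's implementation (the `flaky` function simulates a failing model call):

```python
def with_fallbacks(primary, fallbacks):
    """Return a callable that tries each function until one succeeds."""
    def run(x):
        errors = []
        for fn in [primary, *fallbacks]:
            try:
                return fn(x)
            except Exception as e:
                errors.append(e)  # remember failures, try the next option
        raise errors[-1]  # everything failed: surface the last error
    return run

def flaky(_):
    raise TimeoutError("model overloaded")  # simulated API failure

chain = with_fallbacks(flaky, [lambda x: f"cheap answer to {x}"])
print(chain("hello"))  # cheap answer to hello
```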
✨ LangChain Best Practices
🎯 Design Patterns
- Start simple: LLMChain first, then add complexity
- Use templates: Separate prompts from code
- Add memory wisely: Balance context vs cost
- Cache embeddings: Don't recompute unnecessarily
- Use verbose mode: Debug with visibility
⚡ Performance Tips
- Async operations: Use async chains for speed
- Batch processing: Process multiple inputs together
- Smaller chunks: Balance retrieval quality vs cost
- Token tracking: Monitor usage to control costs
- Model selection: Use cheaper models where possible
⚠️ Common Pitfalls
- Too much memory: Hitting token limits, high costs
- No error handling: Apps crash on API failures
- Poor chunking: Context lost, bad retrieval
- Verbose in production: Exposing system prompts to users
- No rate limiting: Getting blocked by API providers
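A client-side rate limiter guards against the last pitfall. A minimal sliding-window sketch (real apps might use a library or honor the provider's rate-limit headers instead):

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most max_calls per window of `window` seconds."""
    def __init__(self, max_calls: int, window: float):
        self.max_calls, self.window = max_calls, window
        self.calls = deque()  # timestamps of recent calls
    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # drop timestamps that have aged out of the window
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

limiter = RateLimiter(max_calls=2, window=1.0)
print([limiter.allow(now=t) for t in (0.0, 0.1, 0.2, 1.5)])  # [True, True, False, True]
```

Check `allow()` before each LLM call and sleep or queue when it returns False.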
🚀 Taking LangChain to Production
What You Need to Consider
🔒 Security
- API key management
- Input sanitization
- Output filtering
- User data privacy
📊 Monitoring
- Token usage tracking
- Error logging
- Performance metrics
- User feedback
💰 Cost Management
- Rate limiting
- Caching results
- Model selection
- Budget alerts
Deployment Architecture
# Example: FastAPI + LangChain
from fastapi import FastAPI
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
app = FastAPI()
# Initialize chain once
# Note: one shared memory means one shared conversation for ALL users;
# real apps keep per-session memory (e.g., keyed by a session ID)
chain = ConversationChain(llm=ChatOpenAI(), memory=ConversationBufferMemory())
@app.post("/chat")
async def chat(message: str):
    try:
        response = chain.predict(input=message)
        return {"response": response}
    except Exception as e:
        return {"error": str(e)}
🎯 Key Takeaways
🧩 Core Concepts
- Models: LLM abstraction
- Prompts: Reusable templates
- Chains: Composable workflows
- Memory: Conversation context
- Agents: Autonomous reasoning
- Tools: External capabilities
🏗️ What You Can Build
- Chatbots with memory
- Document Q&A systems
- Research assistants
- Customer support bots
- Data analysis tools
- Multi-step workflows
💡 Key Principles
- Start simple, add complexity
- Compose from building blocks
- Test iteratively
- Monitor in production
- Balance cost vs performance
- Handle errors gracefully
🔥 The Power of LangChain
From Prompt → Production App in Minutes
LangChain handles the plumbing so you can focus on building amazing AI applications
📝 Homework Assignment
Assignment: Build a Complete LangChain Application
Due: Next class
Choose ONE Project:
Project 1: Smart Document Assistant
Requirements:
- Load and process 3+ PDF documents
- Implement chunking strategy
- Create Q&A chain with vector store
- Add conversational memory
- Handle errors gracefully
Project 2: Multi-Tool Agent
Requirements:
- Create 2+ custom tools
- Build ReAct agent
- Demonstrate multi-step reasoning
- Add conversation history
- Include verbose logging
Project 3: Chatbot with Personality
Requirements:
- Custom system prompt with role
- Conversation memory (buffer or summary)
- Streaming responses
- Chat history export
- Command-line or web interface
Project 4: Sequential Workflow
Requirements:
- 3+ chained operations
- Each chain with different purpose
- Use prompt templates
- Process complex input → output
- Example: analyze → summarize → recommend
Deliverables (100 points):
- Code (50 pts): Clean, commented, working LangChain application
- Documentation (20 pts): README with setup instructions and usage examples
- Demo Video (20 pts): 3-5 min showing your app in action
- Reflection (10 pts): What worked, what was challenging, what you learned
📚 Resources & Next Steps
📖 Official Resources
- LangChain Docs: python.langchain.com
- API Reference: Comprehensive component docs
- GitHub: github.com/langchain-ai/langchain
- Discord: Active community support
🎥 Learning Materials
- LangChain Cookbook: Example recipes
- YouTube Tutorials: Sam Witteveen, Greg Kamradt
- Blog Posts: LangChain blog updates
- Course Labs: Practice notebooks
🚀 Coming Next
📚 Unit 5: Retrieval Augmented Generation (RAG)
- Vector embeddings and similarity search
- Building production RAG systems
- Advanced retrieval techniques
- Evaluation and optimization
💡 Practice Tip: Build small, working examples before tackling your assignment. The LangChain cookbook has tons of examples!
📋 Quick Reference Cheat Sheet
Basic Setup
from langchain.chat_models import ChatOpenAI
from langchain import LLMChain, PromptTemplate
llm = ChatOpenAI(temperature=0.7)
prompt = PromptTemplate.from_template("...")
chain = LLMChain(llm=llm, prompt=prompt)
result = chain.run(input)
With Memory
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
memory = ConversationBufferMemory()
chain = ConversationChain(llm=llm, memory=memory)
result = chain.predict(input="...")
Agent
from langchain.agents import initialize_agent, AgentType
tools = [...]
agent = initialize_agent(
tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
)
result = agent.run("...")
Document Q&A
from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(
llm=llm,
retriever=vectorstore.as_retriever()
)
result = qa.run("...")
❓ Questions?
Let's Discuss!
Any questions about:
- LangChain architecture and components?
- Chains and how to compose them?
- Memory systems for conversations?
- Building agents with tools?
- Document loading and processing?
- Your project ideas?
Thank You! 🙏
You now have the power to build production AI apps!
LangChain is your toolkitβstart building!
📧 Share your projects in class Discord!
💻 Start with the cookbook examples!
🏗️ Build something amazing for your assignment!
🚀 Next: RAG - the secret sauce of modern AI!