🐍 Python Fundamentals for GenAI
Unit 1: Building Your Foundation
Topics 1.2 - 1.7
📋 Today's Journey
What We'll Cover
- 🤖 Python Syntax & Data Types
- 🔁 Control Structures & Functions
- 🔢 NumPy for Numerical Operations
- 📊 Pandas for Data Manipulation
- 📈 Matplotlib for Visualization
- 🚀 FastAPI for API Development
💡 Goal: Master the essential Python tools that power GenAI applications. These aren't just basics; they're the building blocks of every AI system you'll create!
🤖 Python Basics
Why Python for AI?
✅ Advantages
- Simple, readable syntax
- Massive AI/ML libraries
- Large community support
- Industry standard
🎯 Use Cases
- Machine Learning
- Data Analysis
- API Development
- Automation
Basic Syntax
# Python is readable and clean
print("Hello, GenAI World!")
# Variables - no declaration needed
name = "Claude"
age = 2
is_ai = True
📦 Core Data Types
| Type | Example | Use in GenAI |
|---|---|---|
| int | 42 | Token counts, dimensions |
| float | 3.14 | Probabilities, embeddings |
| str | "prompt" | Text, prompts, responses |
| bool | True | Flags, conditions |
| list | [1, 2, 3] | Token sequences, batches |
| dict | {"key": "value"} | JSON, API responses |
# Examples
tokens = ["Hello", "world", "!"] # list
response = {"role": "assistant", "content": "Hi!"} # dict
temperature = 0.7 # float - controls randomness in LLMs
✍️ String Operations (Critical for AI!)
# String manipulation - essential for prompt engineering
prompt = "Explain AI in simple terms"
# Common operations
print(prompt.upper()) # EXPLAIN AI IN SIMPLE TERMS
print(prompt.lower()) # explain ai in simple terms
print(prompt.split()) # ['Explain', 'AI', 'in', 'simple', 'terms']
print(len(prompt)) # 26 - character count
# String formatting - building prompts dynamically
topic = "quantum computing"
prompt = f"Explain {topic} in simple terms"
print(prompt) # Explain quantum computing in simple terms
# Multi-line strings - for complex prompts
system_prompt = """
You are a helpful AI assistant.
Be concise and accurate.
"""๐ Control Flow: If-Else
# Decision making - essential for AI logic
temperature = 0.8
if temperature > 1.0:
    print("Too random - outputs will be chaotic")
elif temperature > 0.7:
    print("Good for creative tasks")
elif temperature > 0.3:
    print("Balanced creativity and accuracy")
else:
    print("Very deterministic - good for factual tasks")
# Real GenAI example: filtering responses
response_length = 150
if response_length < 10:
    print("Response too short, regenerate")
elif response_length > 500:
    print("Response too long, truncate")
else:
    print("Response length acceptable")
🔁 Loops: For & While
# For loop - iterate over sequences
prompts = ["Hello", "How are you?", "Tell me a joke"]
for prompt in prompts:
    print(f"Processing: {prompt}")
    # In real AI: send_to_llm(prompt)
# Range - for numerical iterations
for i in range(5):
    print(f"Batch {i+1} processing...")
# While loop - condition-based iteration
tokens_used = 0
max_tokens = 1000
while tokens_used < max_tokens:
    print(f"Generating... Tokens: {tokens_used}")
    tokens_used += 100 # simulate token generation
    if tokens_used >= max_tokens:
        print("Token limit reached!")
⚙️ Functions: Reusable Code
# Function definition
def create_prompt(topic, style="simple"):
    """Creates a prompt for the LLM"""
    if style == "simple":
        return f"Explain {topic} in simple terms."
    elif style == "technical":
        return f"Provide a technical explanation of {topic}."
    else:
        return f"Discuss {topic}."
# Function calls
prompt1 = create_prompt("neural networks")
prompt2 = create_prompt("transformers", "technical")
print(prompt1)
print(prompt2)
# Output:
# Explain neural networks in simple terms.
# Provide a technical explanation of transformers.
💡 Best Practice: Functions make your code modular and reusable, which is critical when building complex AI pipelines!
⚡ Lambda Functions: Quick & Simple
# Lambda = anonymous function (one-liner)
# Regular function
def count_tokens(text):
    return len(text.split())
# Same as lambda
count_tokens = lambda text: len(text.split())
# Common use: transforming data
prompts = ["hi", "how are you", "tell me about AI"]
token_counts = list(map(lambda p: len(p.split()), prompts))
print(token_counts) # [1, 3, 4]
# Filtering data
long_prompts = list(filter(lambda p: len(p.split()) > 2, prompts))
print(long_prompts) # ['how are you', 'tell me about AI']
Use Case: Lambdas are perfect for data preprocessing in AI pipelines: quick transformations without cluttering your code!
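Lambdas also shine as `key` functions. A small sketch (the prompt list here is illustrative):

```python
# Sort prompts from shortest to longest by token count,
# using a lambda as the sort key
prompts = ["tell me about AI", "hi", "how are you"]
by_tokens = sorted(prompts, key=lambda p: len(p.split()))
print(by_tokens)  # ['hi', 'how are you', 'tell me about AI']
```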
🔢 NumPy: The Foundation of AI Math
Why NumPy?
- Fast numerical computing
- Array operations
- Matrix math
- Foundation for AI libraries
AI Applications
- Vector embeddings
- Matrix operations
- Similarity calculations
- Data preprocessing
# Installing NumPy
# pip install numpy
import numpy as np
# Create arrays
arr = np.array([1, 2, 3, 4, 5])
print(arr) # [1 2 3 4 5]
print(type(arr)) # <class 'numpy.ndarray'>
# 2D array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape) # (2, 3) - 2 rows, 3 columns
➕ NumPy Operations
import numpy as np
# Basic operations
arr = np.array([1, 2, 3, 4])
print(arr * 2) # [2 4 6 8] - element-wise multiplication
print(arr + 10) # [11 12 13 14] - broadcasting
print(np.sum(arr)) # 10
print(np.mean(arr)) # 2.5
# AI Example: Cosine Similarity (for text embeddings)
embedding1 = np.array([0.5, 0.8, 0.2])
embedding2 = np.array([0.6, 0.7, 0.3])
# Calculate similarity
dot_product = np.dot(embedding1, embedding2)
norm1 = np.linalg.norm(embedding1)
norm2 = np.linalg.norm(embedding2)
similarity = dot_product / (norm1 * norm2)
print(f"Similarity: {similarity:.3f}") # 0.989 (very similar!)๐ฏ Real World: This is exactly how RAG systems calculate which documents are most relevant to your query!
📊 Pandas: Data Manipulation Powerhouse
# pip install pandas
import pandas as pd
# Create DataFrame (like Excel spreadsheet)
data = {
'prompt': ['Explain AI', 'Write code', 'Summarize'],
'tokens': [150, 300, 200],
'model': ['GPT-4', 'GPT-4', 'Claude']
}
df = pd.DataFrame(data)
print(df)
# Output:
#        prompt  tokens   model
# 0  Explain AI     150   GPT-4
# 1  Write code     300   GPT-4
# 2   Summarize     200  Claude
💡 Use Case: Track LLM usage, analyze prompt performance, manage training data!
📊 Pandas Data Operations
# Basic operations
print(df['tokens']) # Select column
print(df[df['tokens'] > 150]) # Filter rows
print(df['tokens'].mean()) # 216.67 - average
# Add new column
df['cost'] = df['tokens'] * 0.0001 # Calculate cost
# Group by model
print(df.groupby('model')['tokens'].sum())
# Read/Write files
df.to_csv('llm_usage.csv', index=False)
df_loaded = pd.read_csv('llm_usage.csv')
📈 Matplotlib: Visualize Your Data
Why Visualize?
- Understand patterns
- Debug issues
- Communicate insights
- Track model performance
Common Plots
- Line charts (trends)
- Bar charts (comparisons)
- Scatter plots (relationships)
- Histograms (distributions)
# pip install matplotlib
import matplotlib.pyplot as plt
# Simple line plot
epochs = [1, 2, 3, 4, 5]
loss = [0.8, 0.6, 0.4, 0.3, 0.25]
plt.plot(epochs, loss, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Over Time')
plt.grid(True)
plt.show()
📊 More Visualization Examples
# Bar chart - compare model performance
models = ['GPT-3.5', 'GPT-4', 'Claude']
scores = [85, 92, 90]
plt.bar(models, scores, color=['blue', 'green', 'purple'])
plt.ylabel('Accuracy Score')
plt.title('Model Performance Comparison')
plt.ylim(80, 95)
plt.show()
# Scatter plot - token usage vs cost
tokens = [100, 250, 500, 750, 1000]
cost = [0.01, 0.025, 0.05, 0.075, 0.1]
plt.scatter(tokens, cost, s=100, alpha=0.6)
plt.xlabel('Tokens')
plt.ylabel('Cost ($)')
plt.title('Token Usage vs Cost')
plt.grid(True, alpha=0.3)
plt.show()
💡 Pro Tip: Always visualize your training metrics! It helps you spot problems early.
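One practical note: in scripts and servers there is often no display for `plt.show()`, so saving the figure to a file is the usual alternative. A minimal sketch (the filename and dpi are arbitrary choices):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works on headless machines
import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4, 5]
loss = [0.8, 0.6, 0.4, 0.3, 0.25]
plt.plot(epochs, loss, marker="o")
plt.title("Training Loss Over Time")
plt.savefig("training_loss.png", dpi=150, bbox_inches="tight")  # write a PNG instead of showing
```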
🚀 FastAPI: Build AI APIs Fast!
What is an API?
API (Application Programming Interface) = A way for programs to talk to each other
Example: When you use ChatGPT in your app, you're calling OpenAI's API
Why FastAPI?
- Super fast performance
- Easy to learn
- Automatic documentation
- Type checking built-in
GenAI Use Cases
- Expose LLM to web apps
- Build chatbot backends
- Create RAG APIs
- Model serving
⚡ Your First FastAPI App
# pip install fastapi uvicorn
from fastapi import FastAPI
app = FastAPI()
# Simple endpoint
@app.get("/")
def read_root():
    return {"message": "Welcome to GenAI API!"}
# Endpoint with parameters
@app.get("/generate/{prompt}")
def generate_text(prompt: str, max_tokens: int = 100):
    # In real app: call LLM here
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "response": f"Generated text for: {prompt}"
    }
# Run: uvicorn filename:app --reload
📮 POST Requests: Sending Data
from fastapi import FastAPI
from pydantic import BaseModel
app = FastAPI()
# Define data structure
class PromptRequest(BaseModel):
    prompt: str
    temperature: float = 0.7
    max_tokens: int = 150
# POST endpoint
@app.post("/chat")
def chat_completion(request: PromptRequest):
    # Process the prompt
    response = {
        "prompt": request.prompt,
        "settings": {
            "temperature": request.temperature,
            "max_tokens": request.max_tokens
        },
        "completion": "This is where LLM response goes"
    }
    return response
💡 Real World: This is exactly how you'd build a backend for your chatbot!
🎯 Putting It All Together
Building a Complete GenAI Pipeline
import numpy as np
import pandas as pd
from fastapi import FastAPI
app = FastAPI()
# Load prompt templates
def load_templates():
    df = pd.read_csv('prompt_templates.csv')
    return df
# Calculate similarity (NumPy)
def find_best_template(query_embedding, template_embeddings):
    similarities = np.array([
        np.dot(query_embedding, t) / (np.linalg.norm(query_embedding) * np.linalg.norm(t))
        for t in template_embeddings
    ])
    return np.argmax(similarities)
# API endpoint
@app.post("/find-template")
def get_template(query: str):
    # This is a simplified example showing integration
    templates = load_templates()
    return {"best_template": templates.iloc[0]['template']}
🎯 Key Takeaways
🐍 Python Basics
- Clean, readable syntax
- Dynamic typing
- Functions & lambdas
🔢 NumPy
- Fast array operations
- Vector math
- Essential for embeddings
📊 Pandas
- DataFrames for data
- Easy manipulation
- File I/O
📈 Matplotlib
- Visualize data
- Track metrics
- Debug models
🚀 FastAPI
- Build APIs quickly
- Serve AI models
- Auto documentation
🎯 Together
- Complete AI stack
- Production-ready
- Industry standard
🔥 These aren't just libraries; they're your GenAI toolkit!
Master these, and you can build anything from chatbots to RAG systems to full AI applications.
💪 Practice Exercise
Challenge: Build a Prompt Analyzer
Create a Python program that:
- Takes a list of prompts (use Pandas to load from CSV)
- Counts tokens in each prompt (use string operations)
- Calculates average, min, max tokens (use NumPy)
- Creates a bar chart of prompt lengths (use Matplotlib)
- Exposes the analysis via FastAPI endpoint
Starter Code Hint:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from fastapi import FastAPI
# Your code here!
def count_tokens(text):
    return len(text.split())
# Load data
# Analyze
# Visualize
# Create API
🚀 Next Steps & Resources
📚 Learn More
- Python: python.org/docs
- NumPy: numpy.org/doc
- Pandas: pandas.pydata.org
- Matplotlib: matplotlib.org
- FastAPI: fastapi.tiangolo.com
🎯 Practice Platforms
- Kaggle (datasets + notebooks)
- LeetCode (Python practice)
- Google Colab (free notebooks)
- GitHub (example projects)
📅 Coming Next Week
🧠 Unit 2: Foundations of Large Language Models
- Transformer Architecture
- Attention Mechanisms
- GPT, BERT, LLaMA explained
❓ Questions?
Let's Discuss!
Any questions about:
- Python syntax or data types?
- NumPy or Pandas operations?
- Matplotlib visualizations?
- FastAPI endpoints?
- How these fit into GenAI?
Thank You! 🙏
Practice these fundamentals
They're the foundation of everything we'll build!
📧 Questions? Reach out anytime!
💻 Start coding today!
👋 See you in the next class!