🐍 Python Fundamentals for GenAI
Unit 1: Building Your Foundation
Topics 1.2 - 1.7
📋 Today's Journey
What We'll Cover
- 🤖 Python Syntax & Data Types
- 🔁 Control Structures & Functions
- 🔢 NumPy for Numerical Operations
- 📊 Pandas for Data Manipulation
- 📈 Matplotlib for Visualization
- 🚀 FastAPI for API Development
💡 Goal: Master the essential Python tools that power GenAI applications. These aren't just basics; they're the building blocks of every AI system you'll create!
🤖 Python Basics
Why Python for AI?
✅ Advantages
- Simple, readable syntax
- Massive AI/ML libraries
- Large community support
- Industry standard
🎯 Use Cases
- Machine Learning
- Data Analysis
- API Development
- Automation
Basic Syntax
# Python is readable and clean
print("Hello, GenAI World!")
# Variables - no declaration needed
name = "Claude"
age = 2
is_ai = True
📦 Core Data Types
| Type | Example | Use in GenAI |
|---|---|---|
| int | 42 | Token counts, dimensions |
| float | 3.14 | Probabilities, embeddings |
| str | "prompt" | Text, prompts, responses |
| bool | True | Flags, conditions |
| list | [1, 2, 3] | Token sequences, batches |
| dict | {"key": "value"} | JSON, API responses |
# Examples
tokens = ["Hello", "world", "!"] # list
response = {"role": "assistant", "content": "Hi!"} # dict
temperature = 0.7 # float - controls randomness in LLMs
✍️ String Operations (Critical for AI!)
# String manipulation - essential for prompt engineering
prompt = "Explain AI in simple terms"
# Common operations
print(prompt.upper()) # EXPLAIN AI IN SIMPLE TERMS
print(prompt.lower()) # explain ai in simple terms
print(prompt.split()) # ['Explain', 'AI', 'in', 'simple', 'terms']
print(len(prompt)) # 26 - character count
# String formatting - building prompts dynamically
topic = "quantum computing"
prompt = f"Explain {topic} in simple terms"
print(prompt) # Explain quantum computing in simple terms
# Multi-line strings - for complex prompts
system_prompt = """
You are a helpful AI assistant.
Be concise and accurate.
"""๐ Control Flow: If-Else
# Decision making - essential for AI logic
temperature = 0.8
if temperature > 1.0:
    print("Too random - outputs will be chaotic")
elif temperature > 0.7:
    print("Good for creative tasks")
elif temperature > 0.3:
    print("Balanced creativity and accuracy")
else:
    print("Very deterministic - good for factual tasks")
# Real GenAI example: filtering responses
response_length = 150
if response_length < 10:
    print("Response too short, regenerate")
elif response_length > 500:
    print("Response too long, truncate")
else:
    print("Response length acceptable")
🔁 Loops: For & While
# For loop - iterate over sequences
prompts = ["Hello", "How are you?", "Tell me a joke"]
for prompt in prompts:
    print(f"Processing: {prompt}")
    # In real AI: send_to_llm(prompt)
# Range - for numerical iterations
for i in range(5):
    print(f"Batch {i+1} processing...")
# While loop - condition-based iteration
tokens_used = 0
max_tokens = 1000
while tokens_used < max_tokens:
    print(f"Generating... Tokens: {tokens_used}")
    tokens_used += 100 # simulate token generation
    if tokens_used >= max_tokens:
        print("Token limit reached!")
⚙️ Functions: Reusable Code
# Function definition
def create_prompt(topic, style="simple"):
    """Creates a prompt for the LLM"""
    if style == "simple":
        return f"Explain {topic} in simple terms."
    elif style == "technical":
        return f"Provide a technical explanation of {topic}."
    else:
        return f"Discuss {topic}."
# Function calls
prompt1 = create_prompt("neural networks")
prompt2 = create_prompt("transformers", "technical")
print(prompt1)
print(prompt2)
# Output:
# Explain neural networks in simple terms.
# Provide a technical explanation of transformers.
💡 Best Practice: Functions make your code modular and reusable, which is critical when building complex AI pipelines!
⚡ Lambda Functions: Quick & Simple
# Lambda = anonymous function (one-liner)
# Regular function
def count_tokens(text):
    return len(text.split())
# Same as lambda
count_tokens = lambda text: len(text.split())
# Common use: transforming data
prompts = ["hi", "how are you", "tell me about AI"]
token_counts = list(map(lambda p: len(p.split()), prompts))
print(token_counts) # [1, 3, 4]
# Filtering data
long_prompts = list(filter(lambda p: len(p.split()) > 2, prompts))
print(long_prompts) # ['how are you', 'tell me about AI']
Use Case: Lambdas are perfect for data preprocessing in AI pipelines: quick transformations without cluttering your code!
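Lambdas also shine as `key` functions. A small sketch (the prompt list here is illustrative):

```python
# Sort prompts from shortest to longest by token count,
# using a lambda as the sort key
prompts = ["tell me about AI", "hi", "how are you"]
by_tokens = sorted(prompts, key=lambda p: len(p.split()))
print(by_tokens)  # ['hi', 'how are you', 'tell me about AI']
```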
🔢 NumPy: The Foundation of AI Math
Why NumPy?
- Fast numerical computing
- Array operations
- Matrix math
- Foundation for AI libraries
AI Applications
- Vector embeddings
- Matrix operations
- Similarity calculations
- Data preprocessing
# Installing NumPy
# pip install numpy
import numpy as np
# Create arrays
arr = np.array([1, 2, 3, 4, 5])
print(arr) # [1 2 3 4 5]
print(type(arr)) # <class 'numpy.ndarray'>
# 2D array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape) # (2, 3) - 2 rows, 3 columns
➕ NumPy Operations
import numpy as np
# Basic operations
arr = np.array([1, 2, 3, 4])
print(arr * 2) # [2 4 6 8] - element-wise multiplication
print(arr + 10) # [11 12 13 14] - broadcasting
print(np.sum(arr)) # 10
print(np.mean(arr)) # 2.5
# AI Example: Cosine Similarity (for text embeddings)
embedding1 = np.array([0.5, 0.8, 0.2])
embedding2 = np.array([0.6, 0.7, 0.3])
# Calculate similarity
dot_product = np.dot(embedding1, embedding2)
norm1 = np.linalg.norm(embedding1)
norm2 = np.linalg.norm(embedding2)
similarity = dot_product / (norm1 * norm2)
print(f"Similarity: {similarity:.3f}") # 0.989 (very similar!)๐ฏ Real World: This is exactly how RAG systems calculate which documents are most relevant to your query!
📊 Pandas: Data Manipulation Powerhouse
# pip install pandas
import pandas as pd
# Create DataFrame (like Excel spreadsheet)
data = {
'prompt': ['Explain AI', 'Write code', 'Summarize'],
'tokens': [150, 300, 200],
'model': ['GPT-4', 'GPT-4', 'Claude']
}
df = pd.DataFrame(data)
print(df)
# Output:
#        prompt  tokens   model
# 0  Explain AI     150   GPT-4
# 1  Write code     300   GPT-4
# 2   Summarize     200  Claude
💡 Use Case: Track LLM usage, analyze prompt performance, manage training data!
📊 Pandas Data Operations
# Basic operations
print(df['tokens']) # Select column
print(df[df['tokens'] > 150]) # Filter rows
print(df['tokens'].mean()) # 216.67 - average
# Add new column
df['cost'] = df['tokens'] * 0.0001 # Calculate cost
# Group by model
print(df.groupby('model')['tokens'].sum())
# Read/Write files
df.to_csv('llm_usage.csv', index=False)
df_loaded = pd.read_csv('llm_usage.csv')
📈 Matplotlib: Visualize Your Data
Why Visualize?
- Understand patterns
- Debug issues
- Communicate insights
- Track model performance
Common Plots
- Line charts (trends)
- Bar charts (comparisons)
- Scatter plots (relationships)
- Histograms (distributions)
# pip install matplotlib
import matplotlib.pyplot as plt
# Simple line plot
epochs = [1, 2, 3, 4, 5]
loss = [0.8, 0.6, 0.4, 0.3, 0.25]
plt.plot(epochs, loss, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Over Time')
plt.grid(True)
plt.show()
📊 More Visualization Examples
# Bar chart - compare model performance
models = ['GPT-3.5', 'GPT-4', 'Claude']
scores = [85, 92, 90]
plt.bar(models, scores, color=['blue', 'green', 'purple'])
plt.ylabel('Accuracy Score')
plt.title('Model Performance Comparison')
plt.ylim(80, 95)
plt.show()
# Scatter plot - token usage vs cost
tokens = [100, 250, 500, 750, 1000]
cost = [0.01, 0.025, 0.05, 0.075, 0.1]
plt.scatter(tokens, cost, s=100, alpha=0.6)
plt.xlabel('Tokens')
plt.ylabel('Cost ($)')
plt.title('Token Usage vs Cost')
plt.grid(True, alpha=0.3)
plt.show()
💡 Pro Tip: Always visualize your training metrics! It helps you spot problems early.
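One practical note: in scripts and servers there is often no display for `plt.show()`, so saving the figure to a file is the usual alternative. A minimal sketch (the filename and dpi are arbitrary choices):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works on headless machines
import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4, 5]
loss = [0.8, 0.6, 0.4, 0.3, 0.25]
plt.plot(epochs, loss, marker="o")
plt.title("Training Loss Over Time")
plt.savefig("training_loss.png", dpi=150, bbox_inches="tight")  # write a PNG instead of showing
```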
🚀 FastAPI: Build AI APIs Fast!
What is an API?
API (Application Programming Interface) = A way for programs to talk to each other
Example: When you use ChatGPT in your app, you're calling OpenAI's API
Why FastAPI?
- Super fast performance
- Easy to learn
- Automatic documentation
- Type checking built-in
GenAI Use Cases
- Expose LLM to web apps
- Build chatbot backends
- Create RAG APIs
- Model serving
⚡ Your First FastAPI App
# pip install fastapi uvicorn
from fastapi import FastAPI
app = FastAPI()
# Simple endpoint
@app.get("/")
def read_root():
    return {"message": "Welcome to GenAI API!"}
# Endpoint with parameters
@app.get("/generate/{prompt}")
def generate_text(prompt: str, max_tokens: int = 100):
    # In real app: call LLM here
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "response": f"Generated text for: {prompt}"
    }
# Run: uvicorn filename:app --reload
📮 POST Requests: Sending Data
from fastapi import FastAPI
from pydantic import BaseModel
app = FastAPI()
# Define data structure
class PromptRequest(BaseModel):
    prompt: str
    temperature: float = 0.7
    max_tokens: int = 150
# POST endpoint
@app.post("/chat")
def chat_completion(request: PromptRequest):
    # Process the prompt
    response = {
        "prompt": request.prompt,
        "settings": {
            "temperature": request.temperature,
            "max_tokens": request.max_tokens
        },
        "completion": "This is where LLM response goes"
    }
    return response
💡 Real World: This is exactly how you'd build a backend for your chatbot!
🎯 Putting It All Together
Building a Complete GenAI Pipeline
import numpy as np
import pandas as pd
from fastapi import FastAPI
app = FastAPI()
# Load prompt templates
def load_templates():
    df = pd.read_csv('prompt_templates.csv')
    return df
# Calculate similarity (NumPy)
def find_best_template(query_embedding, template_embeddings):
    similarities = np.array([
        np.dot(query_embedding, t) / (np.linalg.norm(query_embedding) * np.linalg.norm(t))
        for t in template_embeddings
    ])
    return np.argmax(similarities)
# API endpoint
@app.post("/find-template")
def get_template(query: str):
    # This is a simplified example showing integration
    templates = load_templates()
    return {"best_template": templates.iloc[0]['template']}
🎯 Key Takeaways
🐍 Python Basics
- Clean, readable syntax
- Dynamic typing
- Functions & lambdas
🔢 NumPy
- Fast array operations
- Vector math
- Essential for embeddings
📊 Pandas
- DataFrames for data
- Easy manipulation
- File I/O
📈 Matplotlib
- Visualize data
- Track metrics
- Debug models
🚀 FastAPI
- Build APIs quickly
- Serve AI models
- Auto documentation
🎯 Together
- Complete AI stack
- Production-ready
- Industry standard
🔥 These aren't just libraries; they're your GenAI toolkit!
Master these, and you can build anything from chatbots to RAG systems to full AI applications.
💪 Practice Exercise
Challenge: Build a Prompt Analyzer
Create a Python program that:
- Takes a list of prompts (use Pandas to load from CSV)
- Counts tokens in each prompt (use string operations)
- Calculates average, min, max tokens (use NumPy)
- Creates a bar chart of prompt lengths (use Matplotlib)
- Exposes the analysis via FastAPI endpoint
Starter Code Hint:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from fastapi import FastAPI
# Your code here!
def count_tokens(text):
    return len(text.split())
# Load data
# Analyze
# Visualize
# Create API
🚀 Next Steps & Resources
📚 Learn More
- Python: python.org/docs
- NumPy: numpy.org/doc
- Pandas: pandas.pydata.org
- Matplotlib: matplotlib.org
- FastAPI: fastapi.tiangolo.com
🎯 Practice Platforms
- Kaggle (datasets + notebooks)
- LeetCode (Python practice)
- Google Colab (free notebooks)
- GitHub (example projects)
📅 Coming Next Week
🧠 Unit 2: Foundations of Large Language Models
- Transformer Architecture
- Attention Mechanisms
- GPT, BERT, LLaMA explained
❓ Questions?
Let's Discuss!
Any questions about:
- Python syntax or data types?
- NumPy or Pandas operations?
- Matplotlib visualizations?
- FastAPI endpoints?
- How these fit into GenAI?
Thank You! 🙏
Practice these fundamentals
They're the foundation of everything we'll build!
📧 Questions? Reach out anytime!
💻 Start coding today!
👋 See you in the next class!