๐Ÿ Python Fundamentals for GenAI

Unit 1: Building Your Foundation

Topics 1.2 - 1.7

📚 Today's Journey

What We'll Cover

  • 🔤 Python Syntax & Data Types
  • 🔄 Control Structures & Functions
  • 🔢 NumPy for Numerical Operations
  • 📊 Pandas for Data Manipulation
  • 📈 Matplotlib for Visualization
  • 🚀 FastAPI for API Development

💡 Goal: Master the essential Python tools that power GenAI applications. These aren't just basics - they're the building blocks of every AI system you'll create!

1.2 Python Syntax & Data Types

🔤 Python Basics

Why Python for AI?

✅ Advantages

  • Simple, readable syntax
  • Massive AI/ML libraries
  • Large community support
  • Industry standard

🎯 Use Cases

  • Machine Learning
  • Data Analysis
  • API Development
  • Automation

Basic Syntax

# Python is readable and clean
print("Hello, GenAI World!")

# Variables - no declaration needed
name = "Claude"
age = 2
is_ai = True

📦 Core Data Types

Type    Example             Use in GenAI
int     42                  Token counts, dimensions
float   3.14                Probabilities, embeddings
str     "prompt"            Text, prompts, responses
bool    True                Flags, conditions
list    [1, 2, 3]           Token sequences, batches
dict    {"key": "value"}    JSON, API responses
# Examples
tokens = ["Hello", "world", "!"]  # list
response = {"role": "assistant", "content": "Hi!"}  # dict
temperature = 0.7  # float - controls randomness in LLMs
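Dicts matter in GenAI because they map one-to-one onto JSON, the format most LLM APIs speak. A minimal sketch using the standard-library json module (the message fields are illustrative, modeled on common chat APIs):

```python
import json

# A chat message as a Python dict; the key names mirror common LLM APIs
message = {"role": "assistant", "content": "Hi!", "tokens": 42}

# Serialize to a JSON string - this is what travels over the network
payload = json.dumps(message)
print(payload)

# Parse it back into a dict on the receiving side
parsed = json.loads(payload)
print(parsed["content"])  # Hi!
```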

โœ‚๏ธ String Operations (Critical for AI!)

# String manipulation - essential for prompt engineering
prompt = "Explain AI in simple terms"

# Common operations
print(prompt.upper())        # EXPLAIN AI IN SIMPLE TERMS
print(prompt.lower())        # explain ai in simple terms
print(prompt.split())        # ['Explain', 'AI', 'in', 'simple', 'terms']
print(len(prompt))           # 26 - character count

# String formatting - building prompts dynamically
topic = "quantum computing"
prompt = f"Explain {topic} in simple terms"
print(prompt)  # Explain quantum computing in simple terms

# Multi-line strings - for complex prompts
system_prompt = """
You are a helpful AI assistant.
Be concise and accurate.
"""
1.3 Control Structures & Functions

🔄 Control Flow: If-Else

# Decision making - essential for AI logic
temperature = 0.8

if temperature > 1.0:
    print("Too random - outputs will be chaotic")
elif temperature > 0.7:
    print("Good for creative tasks")
elif temperature > 0.3:
    print("Balanced creativity and accuracy")
else:
    print("Very deterministic - good for factual tasks")

# Real GenAI example: filtering responses
response_length = 150
if response_length < 10:
    print("Response too short, regenerate")
elif response_length > 500:
    print("Response too long, truncate")
else:
    print("Response length acceptable")

๐Ÿ” Loops: For & While

# For loop - iterate over sequences
prompts = ["Hello", "How are you?", "Tell me a joke"]

for prompt in prompts:
    print(f"Processing: {prompt}")
    # In real AI: send_to_llm(prompt)

# Range - for numerical iterations
for i in range(5):
    print(f"Batch {i+1} processing...")

# While loop - condition-based iteration
tokens_used = 0
max_tokens = 1000

while tokens_used < max_tokens:
    print(f"Generating... Tokens: {tokens_used}")
    tokens_used += 100  # simulate token generation
    if tokens_used >= max_tokens:
        print("Token limit reached!")

โš™๏ธ Functions: Reusable Code

# Function definition
def create_prompt(topic, style="simple"):
    """Creates a prompt for the LLM"""
    if style == "simple":
        return f"Explain {topic} in simple terms."
    elif style == "technical":
        return f"Provide a technical explanation of {topic}."
    else:
        return f"Discuss {topic}."

# Function calls
prompt1 = create_prompt("neural networks")
prompt2 = create_prompt("transformers", "technical")

print(prompt1)  # Explain neural networks in simple terms.
print(prompt2)  # Provide a technical explanation of transformers.

💡 Best Practice: Functions make your code modular and reusable - critical when building complex AI pipelines!
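Functions also pair well with the control flow from earlier in this unit. As a sketch, the temperature check can be wrapped into a reusable function (the thresholds are copied from that example):

```python
def describe_temperature(temperature):
    """Classify an LLM temperature setting (thresholds from the earlier example)."""
    if temperature > 1.0:
        return "Too random - outputs will be chaotic"
    elif temperature > 0.7:
        return "Good for creative tasks"
    elif temperature > 0.3:
        return "Balanced creativity and accuracy"
    else:
        return "Very deterministic - good for factual tasks"

print(describe_temperature(0.8))  # Good for creative tasks
print(describe_temperature(0.2))  # Very deterministic - good for factual tasks
```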


⚡ Lambda Functions: Quick & Simple

# Lambda = anonymous function (one-liner)
# Regular function
def count_tokens(text):
    return len(text.split())

# Same as lambda
count_tokens = lambda text: len(text.split())

# Common use: transforming data
prompts = ["hi", "how are you", "tell me about AI"]
token_counts = list(map(lambda p: len(p.split()), prompts))
print(token_counts)  # [1, 3, 4]

# Filtering data
long_prompts = list(filter(lambda p: len(p.split()) > 2, prompts))
print(long_prompts)  # ['how are you', 'tell me about AI']

Use Case: Lambdas are perfect for data preprocessing in AI pipelines - quick transformations without cluttering your code!
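Another everyday lambda job is acting as a sort key. A quick sketch ranking prompts by token count:

```python
prompts = ["tell me about AI", "hi", "how are you"]

# sorted() accepts a key function - a lambda is perfect here
by_length = sorted(prompts, key=lambda p: len(p.split()))
print(by_length)  # ['hi', 'how are you', 'tell me about AI']

# Longest prompt
longest = max(prompts, key=lambda p: len(p.split()))
print(longest)  # tell me about AI
```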

1.4 NumPy for Numerical Operations

🔢 NumPy: The Foundation of AI Math

Why NumPy?

  • Fast numerical computing
  • Array operations
  • Matrix math
  • Foundation for AI libraries

AI Applications

  • Vector embeddings
  • Matrix operations
  • Similarity calculations
  • Data preprocessing
# Installing NumPy
# pip install numpy

import numpy as np

# Create arrays
arr = np.array([1, 2, 3, 4, 5])
print(arr)           # [1 2 3 4 5]
print(type(arr))     # <class 'numpy.ndarray'>

# 2D array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)  # (2, 3) - 2 rows, 3 columns

➗ NumPy Operations

import numpy as np

# Basic operations
arr = np.array([1, 2, 3, 4])
print(arr * 2)       # [2 4 6 8] - element-wise multiplication
print(arr + 10)      # [11 12 13 14] - broadcasting
print(np.sum(arr))   # 10
print(np.mean(arr))  # 2.5

# AI Example: Cosine Similarity (for text embeddings)
embedding1 = np.array([0.5, 0.8, 0.2])
embedding2 = np.array([0.6, 0.7, 0.3])

# Calculate similarity
dot_product = np.dot(embedding1, embedding2)
norm1 = np.linalg.norm(embedding1)
norm2 = np.linalg.norm(embedding2)
similarity = dot_product / (norm1 * norm2)

print(f"Similarity: {similarity:.3f}")  # 0.984 (very similar!)

🎯 Real World: This is exactly how RAG systems calculate which documents are most relevant to your query!
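The same similarity math scales to many documents at once with a single matrix operation, which is closer to what a RAG retrieval step actually does. A sketch with made-up embedding values:

```python
import numpy as np

query = np.array([0.5, 0.8, 0.2])

# One row per document embedding (illustrative values)
docs = np.array([
    [0.6, 0.7, 0.3],
    [-0.9, 0.1, 0.4],
    [0.5, 0.8, 0.1],
])

# Cosine similarity of the query against every row at once
sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
best = np.argmax(sims)
print(sims.round(3))
print(f"Most relevant document: {best}")
```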

1.5 Pandas for Data Manipulation

📊 Pandas: Data Manipulation Powerhouse

# pip install pandas
import pandas as pd

# Create DataFrame (like Excel spreadsheet)
data = {
    'prompt': ['Explain AI', 'Write code', 'Summarize'],
    'tokens': [150, 300, 200],
    'model': ['GPT-4', 'GPT-4', 'Claude']
}

df = pd.DataFrame(data)
print(df)

# Output:
#        prompt  tokens   model
# 0  Explain AI     150   GPT-4
# 1  Write code     300   GPT-4
# 2   Summarize     200  Claude

💡 Use Case: Track LLM usage, analyze prompt performance, manage training data!


๐Ÿ” Pandas Data Operations

# Basic operations
print(df['tokens'])           # Select column
print(df[df['tokens'] > 150]) # Filter rows
print(df['tokens'].mean())    # 216.67 - average

# Add new column
df['cost'] = df['tokens'] * 0.0001  # Calculate cost

# Group by model
print(df.groupby('model')['tokens'].sum())

# Read/Write files
df.to_csv('llm_usage.csv', index=False)
df_loaded = pd.read_csv('llm_usage.csv')
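Building on the operations above, sort_values and named aggregation give quick per-model summaries. A sketch using the same illustrative usage data:

```python
import pandas as pd

df = pd.DataFrame({
    'prompt': ['Explain AI', 'Write code', 'Summarize'],
    'tokens': [150, 300, 200],
    'model': ['GPT-4', 'GPT-4', 'Claude'],
})
df['cost'] = df['tokens'] * 0.0001

# Sort by token usage, largest first
print(df.sort_values('tokens', ascending=False))

# Several statistics per model in one call (named aggregation)
summary = df.groupby('model').agg(
    total_tokens=('tokens', 'sum'),
    avg_cost=('cost', 'mean'),
)
print(summary)
```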
1.6 Data Visualization with Matplotlib

📈 Matplotlib: Visualize Your Data

Why Visualize?

  • Understand patterns
  • Debug issues
  • Communicate insights
  • Track model performance

Common Plots

  • Line charts (trends)
  • Bar charts (comparisons)
  • Scatter plots (relationships)
  • Histograms (distributions)
# pip install matplotlib
import matplotlib.pyplot as plt

# Simple line plot
epochs = [1, 2, 3, 4, 5]
loss = [0.8, 0.6, 0.4, 0.3, 0.25]

plt.plot(epochs, loss, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Over Time')
plt.grid(True)
plt.show()

📊 More Visualization Examples

# Bar chart - compare model performance
models = ['GPT-3.5', 'GPT-4', 'Claude']
scores = [85, 92, 90]

plt.bar(models, scores, color=['blue', 'green', 'purple'])
plt.ylabel('Accuracy Score')
plt.title('Model Performance Comparison')
plt.ylim(80, 95)
plt.show()

# Scatter plot - token usage vs cost
tokens = [100, 250, 500, 750, 1000]
cost = [0.01, 0.025, 0.05, 0.075, 0.1]

plt.scatter(tokens, cost, s=100, alpha=0.6)
plt.xlabel('Tokens')
plt.ylabel('Cost ($)')
plt.title('Token Usage vs Cost')
plt.grid(True, alpha=0.3)
plt.show()

💡 Pro Tip: Always visualize your training metrics! It helps you spot problems early.
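Histograms, the distribution plot from the list above, are just as short. A sketch plotting made-up response lengths (plt.hist also returns the bin counts, which is handy for further analysis):

```python
import matplotlib.pyplot as plt

# Illustrative response lengths in tokens (made-up numbers for the demo)
lengths = [120, 95, 240, 180, 150, 310, 90, 200, 170, 130]

# plt.hist draws the plot and returns the bin counts and edges
counts, bins, _ = plt.hist(lengths, bins=5, edgecolor='black')
plt.xlabel('Response Length (tokens)')
plt.ylabel('Frequency')
plt.title('Distribution of Response Lengths')
plt.show()
```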

1.7 FastAPI for API Development

🚀 FastAPI: Build AI APIs Fast!

What is an API?

API (Application Programming Interface) = A way for programs to talk to each other

Example: When you use ChatGPT in your app, you're calling OpenAI's API

Why FastAPI?

  • Super fast performance
  • Easy to learn
  • Automatic documentation
  • Type checking built-in

GenAI Use Cases

  • Expose LLM to web apps
  • Build chatbot backends
  • Create RAG APIs
  • Model serving

⚡ Your First FastAPI App

# pip install fastapi uvicorn
from fastapi import FastAPI

app = FastAPI()

# Simple endpoint
@app.get("/")
def read_root():
    return {"message": "Welcome to GenAI API!"}

# Endpoint with parameters
@app.get("/generate/{prompt}")
def generate_text(prompt: str, max_tokens: int = 100):
    # In real app: call LLM here
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "response": f"Generated text for: {prompt}"
    }

# Run: uvicorn filename:app --reload
Visit http://127.0.0.1:8000/docs for automatic interactive documentation! 🎉

📮 POST Requests: Sending Data

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Define data structure
class PromptRequest(BaseModel):
    prompt: str
    temperature: float = 0.7
    max_tokens: int = 150

# POST endpoint
@app.post("/chat")
def chat_completion(request: PromptRequest):
    # Process the prompt
    response = {
        "prompt": request.prompt,
        "settings": {
            "temperature": request.temperature,
            "max_tokens": request.max_tokens
        },
        "completion": "This is where LLM response goes"
    }
    return response

💡 Real World: This is exactly how you'd build a backend for your chatbot!

🎯 Putting It All Together

Building a Complete GenAI Pipeline

import numpy as np
import pandas as pd
from fastapi import FastAPI

app = FastAPI()

# Load prompt templates
def load_templates():
    df = pd.read_csv('prompt_templates.csv')
    return df

# Calculate similarity (NumPy)
def find_best_template(query_embedding, template_embeddings):
    similarities = np.array([
        np.dot(query_embedding, t) / (np.linalg.norm(query_embedding) * np.linalg.norm(t))
        for t in template_embeddings
    ])
    return np.argmax(similarities)

# API endpoint
@app.post("/find-template")
def get_template(query: str):
    # This is a simplified example showing integration
    templates = load_templates()
    return {"best_template": templates.iloc[0]['template']}

🎯 Key Takeaways

๐Ÿ Python Basics

  • Clean, readable syntax
  • Dynamic typing
  • Functions & lambdas

🔢 NumPy

  • Fast array operations
  • Vector math
  • Essential for embeddings

📊 Pandas

  • DataFrames for data
  • Easy manipulation
  • File I/O

📈 Matplotlib

  • Visualize data
  • Track metrics
  • Debug models

🚀 FastAPI

  • Build APIs quickly
  • Serve AI models
  • Auto documentation

🎯 Together

  • Complete AI stack
  • Production-ready
  • Industry standard

🔥 These aren't just libraries - they're your GenAI toolkit!

Master these, and you can build anything from chatbots to RAG systems to full AI applications.

💪 Practice Exercise

Challenge: Build a Prompt Analyzer

Create a Python program that:

  1. Takes a list of prompts (use Pandas to load from CSV)
  2. Counts tokens in each prompt (use string operations)
  3. Calculates average, min, max tokens (use NumPy)
  4. Creates a bar chart of prompt lengths (use Matplotlib)
  5. Exposes the analysis via FastAPI endpoint

Starter Code Hint:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from fastapi import FastAPI

# Your code here!
def count_tokens(text):
    return len(text.split())

# Load data
# Analyze
# Visualize
# Create API

📚 Next Steps & Resources

📖 Learn More

  • Python: python.org/docs
  • NumPy: numpy.org/doc
  • Pandas: pandas.pydata.org
  • Matplotlib: matplotlib.org
  • FastAPI: fastapi.tiangolo.com

🎯 Practice Platforms

  • Kaggle (datasets + notebooks)
  • LeetCode (Python practice)
  • Google Colab (free notebooks)
  • GitHub (example projects)

📅 Coming Next Week

🧠 Unit 2: Foundations of Large Language Models

  • Transformer Architecture
  • Attention Mechanisms
  • GPT, BERT, LLaMA explained

โ“ Questions?

Let's Discuss!

Any questions about:

  • Python syntax or data types?
  • NumPy or Pandas operations?
  • Matplotlib visualizations?
  • FastAPI endpoints?
  • How these fit into GenAI?

Thank You! 🎉

Practice these fundamentals

They're the foundation of everything we'll build!

📧 Questions? Reach out anytime!

💻 Start coding today!

🚀 See you in the next class!
