Final Project Guidelines
EE 541: A Computational Introduction to Deep Learning
Project Topics
Select one of the following topics:
- Urban Sound Classification — Classify environmental sounds from city recordings using audio spectrograms.
- Intent Classification — Determine user intent from natural language queries in a travel booking domain.
- Remaining Useful Life Prediction — Predict equipment failure from multivariate sensor time-series data.
- Image Colorization — Generate plausible colors for grayscale images of pets.
See project deliverables for submission requirements and deadlines.
Overview
The final project requires teams of two students to apply deep learning techniques to a problem selected from the topics above. Each option provides a dataset and problem statement. Your task is to determine how to approach the problem, design experiments, implement models, and analyze results systematically.
This project demonstrates understanding of course concepts through hands-on application. You will make architectural decisions, explore hyperparameters, experiment with data augmentation, and document what you learn through this process.
Project Structure
Problem Selection
You will select from a set of instructor-defined project options. Each option specifies a dataset and problem but leaves the approach to you. You must determine:
- What modeling approach is appropriate
- What architectures to try
- How to preprocess and augment data
- What constitutes strong performance for this problem
- How to evaluate your results
Problems are designed to require thoughtful application of course concepts while limiting opportunities to simply reproduce published results.
Your Approach
Your approach is entirely your design. You will make decisions about:
- Network architecture selection and design
- Hyperparameter configuration
- Data preprocessing and augmentation strategies
- Training procedures and optimization techniques
- Evaluation methodology and baseline establishment
The value of your project lies in systematic exploration of these choices and analysis of what works, what doesn’t, and why.
What This Project Is
- A systematic experimental investigation
- You will train multiple models, vary architectural and hyperparameter choices, and document what you learn. Strong projects show hypothesis-driven experimentation rather than undirected trial and error.
- A demonstration of course concepts
- Apply techniques from the entire semester—data handling, architecture design, optimization, evaluation, and analysis.
- An exercise in critical analysis
- Understanding why approaches succeed or fail, where models struggle, and what results reveal about the problem and methods matters as much as achieving strong performance.
- A foundation-building experience
- Reinforce understanding of how deep learning works from gradient descent through backpropagation to architecture design.
What This Project Is Not
- Not open-ended research
- You demonstrate understanding through application, not by developing novel techniques.
- Not a literature survey
- While you should understand related work, your project is experimental, not a review.
- Not a framework tutorial
- Demonstrate understanding of deep learning fundamentals, not just PyTorch proficiency.
- Not a competition for highest accuracy
- Understanding your process, making informed decisions, and analyzing results thoroughly matter more than maximizing performance.
Experimental Methodology
Demonstrate systematic experimental practice through your project.
Architecture Exploration
Investigate different network architectures appropriate for your problem. Try architectures of varying depth and complexity. Compare different layer types and architectural patterns. Explore how specific design choices affect performance.
Document what you tried, why you tried it, and what you learned. Show progression from simple baselines to more sophisticated designs with clear rationale.
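As one illustration of such a progression, the sketch below pairs a deliberately simple baseline with one meaningfully different variant for an image-style task. The class names, layer widths, and depths are illustrative assumptions, not required choices:

```python
import torch.nn as nn

# Hypothetical baseline: a deliberately simple model that later,
# deeper architectures must beat to justify their complexity.
class BaselineMLP(nn.Module):
    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)

# One "meaningfully different" variant: a small CNN that adds spatial
# structure. The channel widths and depth are illustrative choices.
class SmallCNN(nn.Module):
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)
```

Comparing these two under identical training conditions tests a specific hypothesis: that spatial structure matters for your data.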
Hyperparameter Investigation
Explore how hyperparameters affect model behavior:
- Learning rate selection and scheduling
- Batch size effects on training and generalization
- Regularization strategies
- Optimizer choices
Understand how these decisions affect training dynamics and final performance.
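A minimal sketch of an informed learning-rate sweep is shown below; the candidate rates, the StepLR schedule, and the stand-in linear model and data are all illustrative assumptions to replace with your own setup:

```python
import torch
import torch.nn as nn

# Sketch: sweep a few candidate learning rates under identical
# conditions. The rates, scheduler settings, and toy model/data
# are assumptions -- substitute your own.
def run_sweep(num_epochs: int = 20):
    results = {}
    for lr in [1e-2, 1e-3, 1e-4]:
        torch.manual_seed(0)                     # same init for every run
        model = nn.Linear(10, 2)                 # stand-in for your model
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        sched = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)
        loss_fn = nn.CrossEntropyLoss()
        x = torch.randn(64, 10)                  # stand-in training batch
        y = torch.randint(0, 2, (64,))
        for _ in range(num_epochs):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            sched.step()
        results[lr] = loss.item()                # final training loss
    return results
```

Fixing the seed before each run isolates the effect of the learning rate from initialization and data-order noise.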
Data Handling
Demonstrate thoughtful data practices:
- Appropriate preprocessing and normalization
- Data augmentation suited to your problem
- Proper train/validation/test splits
- Analysis of data characteristics affecting modeling
Data work should reflect understanding of how data quality and characteristics affect outcomes.
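For image-based topics, one way to keep augmentation out of evaluation is sketched below using torchvision; audio, text, and time-series topics need analogous but different choices. The crop size and normalization statistics are placeholders; compute the statistics from your own training split:

```python
from torchvision import transforms

# Sketch: random augmentation applies only to training data, while
# evaluation uses deterministic preprocessing. The 32x32 crop and the
# mean/std values are placeholders for your dataset's statistics.
train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25]),
])
eval_tf = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25]),
])
```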
Baseline Establishment
Create meaningful baselines that contextualize your results:
- Simple models (logistic regression, shallow networks)
- Standard architectures before optimization
- Ablated versions showing what components contribute
Baselines demonstrate what architectural or training choices actually matter.
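A minimal sketch of a non-deep baseline, assuming scikit-learn and flattened feature vectors; the random arrays below stand in for your own preprocessed train and test splits:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Sketch: a logistic-regression baseline. The synthetic arrays are
# placeholders for your real features and labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(500, 64)), rng.integers(0, 4, 500)
X_test, y_test = rng.normal(size=(100, 64)), rng.integers(0, 4, 100)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("baseline accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

If a deep model only marginally beats this number, that gap itself is a finding worth analyzing.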
Evaluation and Analysis
Evaluate rigorously:
- Use metrics appropriate for your task
- Report performance on held-out test data
- Analyze failure modes with examples
- Investigate what models learned
- Compare against baselines honestly
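For classification-style topics, per-class metrics and a confusion matrix expose failure modes that a single accuracy number hides. A minimal sketch, with placeholder labels standing in for your held-out test predictions:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Sketch: placeholder labels -- replace with true labels and model
# predictions from your held-out test set.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, zero_division=0))
```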
Technical Expectations
Implementation Quality
Implement models carefully:
- Use PyTorch while demonstrating understanding of underlying operations
- Organize code clearly (data handling, models, training, evaluation)
- Handle numerical stability and common pitfalls
Implementation should reflect understanding of mathematics and algorithms, not just API usage.
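One concrete stability pitfall worth knowing: taking the log of an explicit softmax can underflow, whereas passing raw logits to nn.CrossEntropyLoss computes log-softmax internally in a numerically stable way. A small sketch contrasting the two:

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 5)              # stand-in model outputs
targets = torch.randint(0, 5, (8,))

# Stable: CrossEntropyLoss expects raw logits and applies a
# numerically stable log-softmax internally.
stable_loss = nn.CrossEntropyLoss()(logits, targets)

# Fragile (avoid): log of an explicit softmax can underflow to
# log(0) = -inf when probabilities are very small.
probs = torch.softmax(logits, dim=1)
fragile_loss = nn.NLLLoss()(torch.log(probs), targets)
```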
Training Practices
Apply sound training methodology:
- Monitor metrics to diagnose issues
- Use appropriate loss functions
- Implement proper evaluation without data leakage
- Document procedures for reproducibility
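A minimal sketch of a monitored training loop; the model, data loaders, optimizer, and loss function are assumed to be defined elsewhere in your project:

```python
import torch

# Sketch: track validation loss each epoch to catch overfitting or
# divergence early, rather than discovering it after training ends.
def train(model, train_loader, val_loader, optimizer, loss_fn, epochs):
    history = []
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():               # no gradients during evaluation
            val_loss = sum(loss_fn(model(x), y).item()
                           for x, y in val_loader) / len(val_loader)
        history.append(val_loss)
        print(f"epoch {epoch}: val loss {val_loss:.4f}")
    return history
```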
Data Hygiene
Practice proper data handling:
- Understand dataset characteristics and limitations
- Preprocess appropriately for your problem
- Split data correctly
- Avoid contamination and leakage
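One common source of leakage is fitting normalization statistics on the full dataset. A minimal sketch of the correct order, with placeholder arrays standing in for your data: split first, then compute statistics on the training split only:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Sketch: placeholder data -- replace with your own features/labels.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 20)), rng.integers(0, 2, 1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)  # train stats only
X_train = (X_train - mu) / sigma
X_test = (X_test - mu) / sigma   # reuse train stats; never refit on test
```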
Experimental Documentation
Document experiments systematically:
- Track architectures and hyperparameters tried
- Record performance for significant experiments
- Note successes, failures, and hypotheses about why
- Maintain logs supporting your final analysis
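Even a lightweight log is enough to trace every reported number back to its configuration. A minimal sketch, assuming a CSV file and illustrative field names:

```python
import csv
import json
from datetime import datetime

# Sketch: append one row per run so every result in the report can be
# traced back to its configuration. Field names are an assumption.
def log_run(config: dict, val_metric: float, path: str = "runs.csv"):
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([datetime.now().isoformat(),
                         json.dumps(config), val_metric])

log_run({"arch": "SmallCNN", "lr": 1e-3, "batch_size": 64}, 0.87)
```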
Analysis and Understanding
Strong projects develop understanding beyond reporting numbers.
Performance Understanding
Analyze why models perform as they do:
- What architectural choices matter most?
- How do hyperparameters affect training and performance?
- What data characteristics drive success or failure?
- Where do models make mistakes and why?
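One way to ground the last question, sketched below, is to rank held-out examples by per-example loss and inspect the worst cases; the model and test loader are assumed to be defined elsewhere:

```python
import torch

# Sketch: rank test examples by per-example loss to see where the
# model struggles most, then inspect those inputs by hand.
def hardest_examples(model, test_loader, k: int = 10):
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
    model.eval()
    records = []
    with torch.no_grad():
        for i, (x, y) in enumerate(test_loader):
            losses = loss_fn(model(x), y)
            records += [(l.item(), i, j) for j, l in enumerate(losses)]
    return sorted(records, reverse=True)[:k]  # (loss, batch, index-in-batch)
```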
Fair Comparisons
When comparing approaches:
- Ensure fair comparisons (same data, same evaluation)
- Understand what each comparison tests
- Report training behavior, not just final performance
- Use appropriate statistical measures
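A minimal sketch of seed control for fair comparisons; as a simple statistical measure, run each configuration under several seeds and report the mean and spread rather than a single number:

```python
import random

import numpy as np
import torch

# Sketch: fix all random seeds so two architectures see the same data
# order and initialization noise. Full determinism may additionally
# require torch.use_deterministic_algorithms(True), at some speed cost.
def set_seed(seed: int = 0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
```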
Learning from Failures
Document what didn't work:
- Models that failed to train
- Architectures that underperformed
- Hyperparameters causing instability
- Data strategies that hurt performance
Failures often teach more than successes.
Project Scope
Projects should be substantial yet focused enough for thorough completion.
Appropriate Scope
Well-scoped projects include:
- Multiple architectural variations (3-4+ meaningfully different approaches)
- Systematic hyperparameter exploration (informed investigation, not exhaustive search)
- Thoughtful data augmentation experiments
- Comprehensive evaluation with baselines
- Thorough result analysis
Managing Scope
Focus on depth over breadth:
- Thoroughly understand why architectural choices matter rather than trying everything superficially
- Emphasize analysis over additional experiments
- Document process continuously
If problems prove more difficult than expected:
- Focus on understanding the best approaches deeply
- Analyze failures systematically
- Ensure reports reflect deep understanding even if performance is modest
Working in Teams
Teams of two students collaborate throughout the project.
Both members should contribute substantially to experimental design, implementation, analysis, and reporting.
Divide work productively while ensuring both members understand all aspects of the project. Your report must include a contributions statement documenting how work was divided.
Academic Integrity
All work must be completed by your team. Use standard libraries (PyTorch, NumPy, scikit-learn), course materials, and documentation. Consult tutorials for implementation details as needed.
Do not copy substantial code from online sources, other teams, or previous offerings. Do not use pre-trained models unless explicitly permitted. Do not share code or results with other teams.
Document sources for code snippets and ensure you understand what code does. Your implementation should demonstrate understanding, not successful copying.
Evaluation Criteria
Projects are evaluated on experimental rigor, implementation quality, analysis depth, and reporting clarity.
Experimental Quality: Systematic methodology, thoughtful exploration, hypothesis-driven experiments.
Implementation Quality: Well-organized correct code reflecting understanding of fundamentals and appropriate application of course concepts.
Analysis Depth: Understanding of results, investigation of why approaches work, analysis of failures and successes.
Reporting Quality: Clear documentation of process, explanation of decisions and rationale, demonstration of learning beyond achieving performance.
The goal is demonstrating mastery of foundational deep learning through systematic application to a meaningful problem.