Homework #5

EE 541: Fall 2025

Assignment Details

Assigned: 06 September
Due: Sunday, 12 October at 23:59

BrightSpace Assignment: Homework 5


Problem 1: Logistic Regression

The MNIST dataset of handwritten digits is one of the earliest and most widely used datasets for benchmarking machine learning classifiers. Each datapoint contains 784 input features – the pixel values of a \(28 \times 28\) image – and belongs to one of 10 output classes, represented by the digits 0–9.

In this problem you will use numpy to classify input images with logistic regression.

Requirements

Use only Python standard library modules, numpy, and matplotlib for this problem.

Part (a): Logistic “2” detector

In this part you will use the provided MNIST handwritten-digit data to build and train a logistic “2” detector:

\[ y = \begin{cases} 1 & \mathbf{x} \textrm{ is a "2"} \\ 0 & \textrm{else}. \end{cases} \]

A logistic classifier uses a learned weight vector \(\mathbf{w} = [w_1, w_2, \ldots, w_L]^T\) and an unregularized offset bias \(b \triangleq w_0\) to estimate the probability that an input vector \(\mathbf{x} = [x_1, x_2, \ldots, x_L]^T\) is a “2”:

\[ p(\mathbf{x}) = P[Y = 1 \mid \mathbf{x}, \mathbf{w}] = \frac{1}{1 + \exp\left(-\left(\sum_{k=1}^{L} w_k x_k + w_0\right)\right)} = \frac{1}{1 + \exp\left(-\left(\mathbf{w}^T \mathbf{x} + w_0\right)\right)}. \]
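
For concreteness, here is a minimal numpy sketch of this model evaluated on a batch of inputs. The helper names sigmoid and predict_proba are illustrative, not part of the assignment:

import numpy as np

def sigmoid(z):
    # Numerically stable logistic function: branch on the sign of z so
    # np.exp never receives a large positive argument.
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def predict_proba(X, w, b):
    # X: (N, 784) flattened images; w: (784,) weights; b: scalar bias.
    return sigmoid(X @ w + b)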

Train a logistic classifier to find weights that minimize the binary log-loss (also called the binary cross-entropy loss):

\[ \ell(\mathbf{w}) = - \frac{1}{N} \sum_{i=1}^N \left(y_i \log p(\mathbf{x}_i) + \left(1 - y_i\right) \log\left(1 - p(\mathbf{x}_i)\right)\right) \]

where the sum is over the \(N\) samples in the training set.

Train your model until convergence according to a metric of your choosing. Experiment with variations of \(\ell_1\)- and/or \(\ell_2\)-regularization to stabilize training and improve generalization.
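
A training loop along these lines could look like the following sketch, which uses plain batch gradient descent with an \(\ell_2\) penalty on \(\mathbf{w}\) (leaving the bias unregularized). The hyperparameters lr, lam, max_iters, and tol are placeholder values you would tune, and sigmoid refers to the sketch above:

def train_logistic(X, y, lr=0.1, lam=1e-4, max_iters=5000, tol=1e-7):
    # Batch gradient descent on binary cross-entropy plus lam * ||w||^2.
    N, L = X.shape
    w = np.zeros(L)
    b = 0.0
    prev_loss = np.inf
    for it in range(max_iters):
        p = sigmoid(X @ w + b)                    # P[Y = 1 | x] per sample
        eps = 1e-12                               # guard against log(0)
        loss = -np.mean(y * np.log(p + eps)
                        + (1 - y) * np.log(1 - p + eps)) + lam * np.dot(w, w)
        grad_w = X.T @ (p - y) / N + 2 * lam * w  # BCE gradient plus l2 term
        grad_b = np.mean(p - y)                   # bias carries no penalty
        w -= lr * grad_w
        b -= lr * grad_b
        if abs(prev_loss - loss) < tol:           # one possible convergence test
            break
        prev_loss = loss
    return w, b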

Submit answers to the following:

  1. How did you determine a learning rate? What values did you try? What was your final value?

  2. Describe the method you used to establish model convergence.

  3. What regularizers did you try? Specifically, how did each impact your model or improve its performance?

  4. Plot the log-loss (i.e., learning curve) of the training set and test set on the same figure, as a function of iteration number. On a separate figure, plot your model's accuracy on the training set and test set, also against iteration number.

  5. Classify each input to the binary output “digit is a 2” by thresholding \(p(\mathbf{x})\) at 0.5. Compute the final loss and final accuracy for both your training set and test set (see the evaluation sketch after this list).
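
For item 5, an evaluation helper along these lines may be useful; evaluate is an illustrative name, and sigmoid again refers to the earlier sketch:

def evaluate(X, y, w, b, thresh=0.5):
    # Hard decisions at the given threshold, plus the final log-loss.
    p = sigmoid(X @ w + b)
    y_hat = (p >= thresh).astype(int)
    accuracy = np.mean(y_hat == y)
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return loss, accuracy

Calling it once on your training arrays and once on your test arrays yields the four numbers requested above.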

Submit your trained weights to Gradescope. Save your weights and bias to an HDF5 file using keys w and b for the weights and bias, respectively. w should be a length-784 numpy array and b should be a numpy scalar. Use the following as guidance:

import h5py
import numpy as np

with h5py.File(outFile, 'w') as hf:
    hf.create_dataset('w', data=np.asarray(weights))
    hf.create_dataset('b', data=np.asarray(bias))
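
Before uploading, you may want to read the file back and confirm the stored shapes; here outFile is the same path used above:

with h5py.File(outFile, 'r') as hf:
    w_check = hf['w'][:]     # should have shape (784,)
    b_check = hf['b'][()]    # should be a scalar
print(w_check.shape, b_check)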

Note: you will not be scored on your model’s overall accuracy, but a low score may indicate errors in training or poor optimization.


Submission Instructions

Problem 1: Logistic Regression

Submit answers to questions, learning curves, accuracy plots, and final loss and accuracy values to BrightSpace. Submit your Python code and trained weights HDF5 file (w: length-784 vector, b: scalar) to Gradescope. A suitably annotated Jupyter notebook with inline analysis is sufficient.

Gradescope: Problem 1