I’ve made a linear regression model that fits the line y = 2x + 1. The purpose of this was to create a foundation on which to build everything else in the neural network and Kaggle world. I couldn’t have made this code simpler if I tried.

Step 1: Setting Everything Up

import torch
import torch.nn as nn
import numpy as np

torch.manual_seed(0) # just to keep things consistent

# We're going to just hard-code data in for this example
X = np.array([[i] for i in range(50)])
X = torch.Tensor(X)

Y = np.array([[2 * i + 1] for i in range(50)])
Y = torch.Tensor(Y)

This is as straightforward as it gets for our toy example. We begin by creating our inputs X and targets Y. X holds all of the x-values we want to train our model on (for now, the numbers 0-49) and Y holds the outputs we want the model to learn (2x + 1 for each of those x-values). To feed them into torch.nn modules, we have to turn them into torch.Tensors.

A careful observer will note that X and Y are initialized with each number in its own array. This is intentional. torch.nn modules expect their data as a 2D tensor of dimensions (m_examples, n_features). In concrete terms, they want their data to look like [[x1_feature1, x1_feature2, x1_feature3], [x2_feature1, x2_feature2, x2_feature3], ...]. Since we only have one feature - the number itself - and 50 examples, we need to construct our array like [[0], [1], [2], ...]. This corresponds to a 50 x 1 tensor.
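
If you want to double-check that the shapes came out right, you can print them directly. The snippet below (X_alt is just an illustrative name, not part of the model) also shows an equivalent way to build the same 50 x 1 tensor without numpy:

print(X.shape)  # torch.Size([50, 1]) - 50 examples, 1 feature each
print(Y.shape)  # torch.Size([50, 1])

# equivalent construction: make a flat tensor of 0-49,
# then add the feature dimension with unsqueeze
X_alt = torch.arange(50, dtype=torch.float32).unsqueeze(1)
print(X_alt.shape)  # torch.Size([50, 1])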

Step 2: Making the Model

class LinearRegression(nn.Module):
	def __init__(self):
		super().__init__()
		# one input feature in, one output number out
		self.L1 = nn.Linear(1, 1, bias=True)

	def forward(self, X):
		# run the whole batch through the linear layer
		Y = self.L1(X)
		return Y

This is how you make models in PyTorch. The main thing to note here is the way modules work. A module acts like a function: you call it on your data, and it transforms that data according to its specification. In this example, we pass our data X through the nn.Linear module we defined (self.L1) to get the final output.
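
To make that concrete, here is a minimal sketch of what calling the model looks like - note that we call the model itself rather than forward() directly, and PyTorch routes the call to forward() for us:

model = LinearRegression()
Y_pred = model(X)       # equivalent to running forward(X), plus PyTorch's hooks
print(Y_pred.shape)     # torch.Size([50, 1]) - one prediction per example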

The dimensions of L1 are very important. Because X is 50 x 1, we have to give L1 dimensions that are compatible with X (remember, linear modules perform the operation Ax + b under the hood). As a consequence, the first argument to nn.Linear - the number of input features - must be 1, matching X's single feature column. The second argument - the number of output features - needs to be 1 as well, since the final output we want from this model is a single number per example.
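
Continuing with the model instance from the sketch above, you can see those dimensions reflected in the parameters nn.Linear creates:

print(model.L1.weight.shape)  # torch.Size([1, 1]) - the A in Ax + b
print(model.L1.bias.shape)    # torch.Size([1])    - the b in Ax + b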

The forward() function is where the magic happens. In forward(), we define the computations our model is going to make. This is where model architecture innovations happen. In this example, our model is extremely simple - we apply one linear layer and call it a day - so that’s all we’ll put here.


*Hypothetically, if we wanted to make a more complex model, we would set the second dimension to something greater than 1, chain this module together with other modules, and so on. The final module’s output dimension would still have to be 1 so the model ends up outputting a single number.
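
As a purely hypothetical sketch of that idea (TwoLayerRegression is made up for illustration, not something we use later): a hidden layer of width 16, followed by a second linear layer that brings things back down to one number.

class TwoLayerRegression(nn.Module):
	def __init__(self):
		super().__init__()
		self.L1 = nn.Linear(1, 16)    # 1 input feature -> 16 hidden units
		self.L2 = nn.Linear(16, 1)    # 16 hidden units -> 1 output number

	def forward(self, X):
		H = torch.relu(self.L1(X))    # nonlinearity between the layers
		Y = self.L2(H)
		return Y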

Step 3: The Training Loop

N_STEPS = 2000

model = LinearRegression()
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(params=model.parameters(), lr=0.1)

for i in range(N_STEPS):
	Y_pred = model(X)                # forward pass - calls forward() for us
	loss = loss_function(Y_pred, Y)  # how far off are the predictions?
	model.zero_grad()                # clear gradients from the previous step
	loss.backward()                  # compute fresh gradients
	optimizer.step()                 # nudge the weights to reduce the loss

	if (i + 1) % 200 == 0:
		print(f'Iteration {i + 1}: {loss.item()}')
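
As a quick sanity check after training (not part of the original loop), you can look at the parameters the model learned. Since the data came from y = 2x + 1, the weight should end up close to 2 and the bias close to 1:

print(model.L1.weight.item())                 # should be close to 2
print(model.L1.bias.item())                   # should be close to 1
print(model(torch.Tensor([[100.0]])).item())  # should be close to 201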