Autodiff¶
This module shows how to build the first version of MiniTorch (miniMiniTorch?) using only Scalar values. This covers key aspects of autodifferentiation: the key technique in the system. Then you will use your code to train a preliminary model.
All starter code is available in https://github.com/minitorch/Module1 .
To begin, remember to activate your virtual environment first, and then clone your assignment:
>>> git clone {{STUDENT_ASSIGNMENT1_URL}}
>>> cd {{STUDENT_ASSIGNMENT_NAME}}
>>> pip install Ue .
Module 1 is built upon the previous Module 0, so make sure to pull your files from Assignment 0 to your new repo.
Please continue to follow the Contributing guideline.
Tasks¶
Task 1.1: Numerical Derivatives¶
Note
This task requires basic familiarity with derivatives. Be sure to review differentiation rules and the notation for derivatives. Then carefully read the Guide on Derivatives.
Todo
Complete the following function in minitorch/scalar.py and pass tests marked as task1_1.

minitorch.scalar.
central_difference
(f, *vals, arg=0, epsilon=1e06)¶ Computes an approximation to the derivative of f with respect to one arg.
See Derivatives or https://en.wikipedia.org/wiki/Finite_difference for more details.
 Parameters
f  arbitrary function from nscalar args to one value
*vals (list of floats)  nfloat values \(x_0 \ldots x_{n1}\)
arg (int)  the number \(i\) of the arg to compute the derivative
epsilon (float)  a small constant
 Returns
An approximation of \(f'_i(x_0, \ldots, x_{n1})\)
 Return type
float
Task 1.2: Scalars¶
Note
This task requires familiarity with the minitorch.Scalar
class.
Be sure to first carefully read the Guide on
Tracking Variables and to refresh your memory on Python numerical overrides.
Implement the overridden mathematical functions required for the
minitorch.Scalar
class.
Each of these requires wiring the internal Python operator to the correct
minitorch.Function.forward()
call.

minitorch.scalar.ScalarFunction.
forward
(ctx, *inputs)¶ Forward call, compute \(f(x_0 \ldots x_{n1})\).
 Parameters
ctx (
Context
)  A container object to save any information that may be needed for the call to backward.*inputs (list of floats)  nfloat values \(x_0 \ldots x_{n1}\).
Should return float the computation of the function \(f\).
Read the example ScalarFunctions that we have implemented for guidelines. You may find it useful to reuse the operators from Module 0.
We have built a debugging tool for you to observe the workings of your expressions to see how the graph is built. You can run it in the Autodiff Sandbox. You can alter the expression at the top of the file and then run the code to create a graph in Streamlit:
>>> streamlit run app.py  1
Todo
Complete the following functions in minitorch/scalar.py.

minitorch.scalar.Mul.
forward
(ctx, a, b)¶

minitorch.scalar.Inv.
forward
(ctx, a)¶

minitorch.scalar.Neg.
forward
(ctx, a)¶

minitorch.scalar.Sigmoid.
forward
(ctx, a)¶

minitorch.scalar.ReLU.
forward
(ctx, a)¶

minitorch.scalar.Exp.
forward
(ctx, a)¶

minitorch.scalar.LT.
forward
(ctx, a, b)¶

minitorch.scalar.EQ.
forward
(ctx, a, b)¶
Todo
Complete the following function in minitorch/scalar.py, and pass
tests marked as task1_2.
See Python numerical overrides
for the interface of these methods. All of these functions should return
minitorch.Scalar
arguments.

minitorch.Scalar.
__lt__
(self, b)¶

minitorch.Scalar.
__gt__
(self, b)¶

minitorch.Scalar.
__sub__
(self, b)¶

minitorch.Scalar.
__neg__
(self)¶

minitorch.Scalar.
__add__
(self, b)¶

minitorch.Scalar.
log
(self)¶

minitorch.Scalar.
exp
(self)¶

minitorch.Scalar.
sigmoid
(self)¶

minitorch.Scalar.
relu
(self)¶
Task 1.3: Chain Rule¶
Note
This task is quite tricky, so be sure you understand the chain rule, Variables, and Functions. Be sure to first read the Guide on Autodifferentiation very carefully and read the code for other ScalarFunctions.
Implement the chain_rule function on FunctionBase for functions of arbitrary arguments. This function should be able to backward process a function by passing it in a context and \(d_{out}\) and then collecting the local derivatives. It should then pair these with the right variables and return them. This function is also where we filter out constants that were used on the forward pass, but do not need derivatives.
Todo
Complete the following function in minitorch/autodiff.py, and pass tests marked as task1_3.

minitorch.FunctionBase.
chain_rule
(ctx, inputs, d_output)¶ Implement the derivative chainrule.
 Parameters
ctx (
Context
)  The context from running forwardinputs (list of args)  The args that were passed to
FunctionBase.apply()
(e.g. \(x, y\))d_output (number)  The d_output value in the chain rule.
 Returns
A list of nonconstant variables with their derivatives (see is_constant to remove unneeded variables)
 Return type
list of (Variable, number)
Task 1.4: Backpropagation¶
Note
Be sure to first read the Guide on Backpropagation very carefully and read the code for other ScalarFunctions.
Implement backpropagation. Each of these requires wiring the internal Python
operator to the correct
minitorch.Function.backward()
call.

minitorch.scalar.ScalarFunction.
backward
(ctx, d_out)¶ Backward call, computes \(f'_{x_i}(x_0 \ldots x_{n1}) \times d_{out}\).
 Parameters
ctx (Context)  A container object holding any information saved during in the corresponding forward call.
d_out (float)  \(d_out\) term in the chain rule.
Should return the computation of the derivative function \(f'_{x_i}\) for each input \(x_i\) times d_out.
Read the example ScalarFunctions that we have implemented for guidelines. Feel free to also consult differentiation rules if you forget how these identities work.
Todo
Complete the following functions in minitorch/autodiff.py and minitorch/scalar.py, and pass tests marked as task1_4.

minitorch.
topological_sort
(variable)¶ Computes the topological order of the computation graph.
 Parameters
variable (
Variable
)  The rightmost variable Returns
 Nonconstant Variables in topological order
starting from the right.
 Return type
list of Variables

minitorch.
backpropagate
(variable, deriv)¶ Runs backpropagation on the computation graph in order to compute derivatives for the leave nodes.
See Backpropagation for details on the algorithm.
 Parameters
variable (
Variable
)  The rightmost variablederiv (number)  Its derivative that we want to propagate backward to the leaves.
No return. Should write to its results to the derivative values of each leaf through accumulate_derivative.

minitorch.scalar.Mul.
backward
(ctx, d_output)¶

minitorch.scalar.Inv.
backward
(ctx, d_output)¶

minitorch.scalar.Neg.
backward
(ctx, d_output)¶

minitorch.scalar.Sigmoid.
backward
(ctx, d_output)¶

minitorch.scalar.ReLU.
backward
(ctx, d_output)¶

minitorch.scalar.Exp.
backward
(ctx, d_output)¶
Task 1.5: Training¶
If your code works, you should now be able to run the training script. Study the code in project/run_scalar.py carefully to understand what the neural network is doing.
You will also need Module code to implement the parameters Network and for Linear. You can modify the dataset and the module with the parameters at the bottom of the file. Start with this simple config:
PTS = 50
DATASET = minitorch.datasets["Simple"](PTS)
HIDDEN = 2
RATE = 0.5
You can then move up to something more complex, for instance:
PTS = 50
DATASET = minitorch.datasets["Xor"](PTS)
HIDDEN = 10
RATE = 0.5
If your code is successful, you should be able to run the full visualization:
>>> streamlit run app.py  1
Todo
Train a scalar model for each of the 4 main datasets.
Add the output training logs and final images to your README file.