+{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyNll7YpGezWZT1pJueXqmyy"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["# Week 2: Implementing Callbacks in TensorFlow using the MNIST Dataset\n","\n","In the lectures you learned how to do classification using Fashion MNIST, a dataset containing items of clothing. There's another, similar dataset called MNIST which has items of handwriting -- the digits 0 through 9.\n","\n","In this assignment you will code a classifier for the MNIST dataset, that trains until it reaches 98% accuracy and stops once this threshold is achieved. In the lectures you saw how this was done for the loss but here you will be using accuracy instead.\n","\n","Some notes:\n","1. Your network should succeed in less than 10 epochs.\n","2. When it reaches 98% or greater it should print out the string \"Reached 98% accuracy so cancelling training!\" and stop training."],"metadata":{"id":"0D-cIwmCf17G"}},{"cell_type":"markdown","source":["#### TIPS FOR SUCCESSFUL GRADING OF YOUR ASSIGNMENT:\n","\n","- All cells are frozen except for the ones where you need to submit your solutions or when explicitly mentioned you can interact with it.\n","\n","- You can add new cells to experiment but these will be omitted by the grader, so don't rely on newly created cells to host your solution code, use the provided places for this.\n","\n","- You can add the comment # grade-up-to-here in any graded cell to signal the grader that it must only evaluate up to that point. This is helpful if you want to check if you are on the right track even if you are not done with the whole assignment. Be sure to remember to delete the comment afterwards!\n","\n","- Avoid using global variables unless you absolutely have to. The grader tests your code in an isolated environment without running all cells from the top. 
As a result, global variables may be unavailable when scoring your submission. Global variables that are meant to be used will be defined in UPPERCASE.\n","\n","- To submit your notebook, save it and then click on the blue submit button at the beginning of the page."],"metadata":{"id":"qAwlSZt0gEL7"}},{"cell_type":"code","source":["import os\n","import base64\n","import tensorflow as tf"],"metadata":{"id":"65-WFaJ5gHCn"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["import unittests"],"metadata":{"id":"BxEboQvsgI9E"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## Load and inspect the data\n","\n","Begin by loading the data. A couple of things to notice:\n","\n","- The file `mnist.npz` is already included in the current workspace under the `data` directory. By default the `load_data` from Keras accepts a path relative to `~/.keras/datasets` but in this case it is stored somewhere else, as a result of this, you need to specify the full path.\n","\n","- `tf.keras.datasets.mnist.load_data` returns the train and test sets in the form of the tuples `(training_images, training_labels), (testing_images, testing_labels)` but in this exercise you will be needing only the train set so you can ignore the second tuple."],"metadata":{"id":"LeuSD0eagMEx"}},{"cell_type":"code","source":["# Load data (discard test set)\n","(training_images, training_labels), _ = tf.keras.datasets.mnist.load_data()\n","\n","print(f\"training_images is of type {type(training_images)}.\\ntraining_labels is of type {type(training_labels)}\\n\")\n","\n","# Inspect shape of the data\n","data_shape = training_images.shape\n","\n","print(f\"There are {data_shape[0]} examples with shape ({data_shape[1]}, {data_shape[2]})\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"pDD8-JZZgRao","executionInfo":{"status":"ok","timestamp":1741618017350,"user_tz":240,"elapsed":590,"user":{"displayName":"Luis Alfredo Hung 
Araque","userId":"00964424177241549147"}},"outputId":"f216373c-08c7-4151-a4c1-eaba643af337"},"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n","\u001b[1m11490434/11490434\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m0s\u001b[0m 0us/step\n","training_images is of type <class 'numpy.ndarray'>.\n","training_labels is of type <class 'numpy.ndarray'>\n","\n","There are 60000 examples with shape (28, 28)\n"]}]},{"cell_type":"markdown","source":["One important step is to normalize the pixel values. The dataset includes black and white images and the pixel values for these kinds of images usually range from 0 to 255 but the network will have an easier time learning if these values range from 0 to 1.\n","\n","The data comes as numpy arrays so you can easily normalize the pixel values by using vectorization:"],"metadata":{"id":"UDftfQ1KgV4J"}},{"cell_type":"code","source":["# Normalize pixel values\n","training_images = training_images / 255.0"],"metadata":{"id":"qc_czQY_gVlz"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## Exercise 1: create_and_compile_model\n","\n","Your first task is to create and compile the model that you will later train to recognize handwritten digits.\n","\n","Feel free to try the architecture for the neural network that you see fit but in case you need extra help you can check out an architecture that works pretty well at the end of this notebook. 
Notice that the part where the model is compiled is already provided (and the `accuracy` metric is defined so it can be accessed by your callback later on) so you only need to specify the layers of the model.\n","\n","Hints:\n","- The first layer should take into consideration the `input_shape` of the data, which in this case is the size of each image\n","- The last layer should take into account the number of classes you are trying to predict"],"metadata":{"id":"WEFFlJwhgbfz"}},{"cell_type":"code","source":["# GRADED FUNCTION: create_and_compile_model\n","\n","def create_and_compile_model():\n","    \"\"\"Returns the compiled (but untrained) model.\n","\n","    Returns:\n","        tf.keras.Model: The model that will be trained to predict handwritten digits.\n","    \"\"\"\n","\n","    ### START CODE HERE ###\n","\n","    # Define the model\n","    model = tf.keras.models.Sequential([\n","        tf.keras.layers.Input(shape=(28, 28)),\n","        tf.keras.layers.Flatten(),\n","        tf.keras.layers.Dense(512, activation=tf.nn.relu),\n","        tf.keras.layers.Dense(10, activation=tf.nn.softmax)\n","    ])\n","\n","    ### END CODE HERE ###\n","\n","    # Compile the model\n","    model.compile(\n","        optimizer='adam',\n","        loss='sparse_categorical_crossentropy',\n","        metrics=['accuracy']\n","    )\n","\n","    return model"],"metadata":{"id":"M9Qbpa4zgf4g"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["The next cell lets you check the total and trainable parameter counts of your model and prints a warning if they exceed those of a reference solution. This serves the following three purposes, listed in order of priority:\n","\n","- Helps you prevent crashing the kernel during training.\n","- Helps you avoid longer-than-necessary training times.\n","- Provides a reasonable estimate of the size of your model. 
In general, you should prefer smaller models, provided they accomplish their goal.\n","\n","**Notice that this is just informative**: the reference may be well below the model size that would actually crash the kernel, so even if you exceed it you are probably fine. However, **if the kernel crashes during training or it is taking a very long time and your model is larger than the reference, come back here and try to get the number of parameters closer to the reference.**"],"metadata":{"id":"CRtGPntGgklD"}},{"cell_type":"code","source":["# Save untrained model in a variable\n","untrained_model = create_and_compile_model()\n","\n","# Check parameter count against a reference solution\n","unittests.parameter_count(untrained_model)"],"metadata":{"id":"c6HoXHibgooL"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# Use it to predict the first 5 images in the train set\n","predictions = untrained_model.predict(training_images[:5], verbose=False)\n","\n","print(f\"predictions have shape: {predictions.shape}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"fR54Yye3grTg","executionInfo":{"status":"ok","timestamp":1741618291128,"user_tz":240,"elapsed":539,"user":{"displayName":"Luis Alfredo Hung Araque","userId":"00964424177241549147"}},"outputId":"d8c9acd8-d71b-41d5-a6fe-ab36d22f495d"},"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["predictions have shape: (5, 10)\n"]}]},{"cell_type":"markdown","source":["**Expected Output:**\n","\n","```\n","predictions have shape: (5, 10)\n","```"],"metadata":{"id":"q52hN0Pjgu9o"}},{"cell_type":"markdown","source":["## Exercise 2: EarlyStoppingCallback\n","\n","Now it is time to create your own custom callback. To do this, complete the `EarlyStoppingCallback` class and its `on_epoch_end` method in the cell below. 
If you need some guidance on how to proceed, check out this [link](https://www.tensorflow.org/guide/keras/writing_your_own_callbacks)."],"metadata":{"id":"iaLy_5IGgxuB"}},{"cell_type":"code","source":["# GRADED CLASS: EarlyStoppingCallback\n","\n","### START CODE HERE ###\n","\n","# Remember to inherit from the correct class\n","class EarlyStoppingCallback(tf.keras.callbacks.Callback):\n","\n","    # Define the correct function signature for on_epoch_end method\n","    def on_epoch_end(self, epoch, logs=None):\n","\n","        # Read the accuracy defensively: logs may be None or lack the key\n","        accuracy = (logs or {}).get('accuracy')\n","\n","        # Check if the accuracy is greater than or equal to 0.98\n","        if accuracy is not None and accuracy >= 0.98:\n","\n","            # Stop training once the above condition is met\n","            self.model.stop_training = True\n","\n","            print(\"\\nReached 98% accuracy so cancelling training!\")\n","\n","### END CODE HERE ###"],"metadata":{"id":"MzBMHFVEgyjx"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# Test your code!\n","unittests.test_EarlyStoppingCallback(EarlyStoppingCallback)"],"metadata":{"id":"sI1M2uT0g3ch"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## Exercise 3: train_mnist\n","\n","Now that you have defined your callback it is time to complete the `train_mnist` function below. This function will receive the training data (features and targets encoded as numpy arrays) and should use it to train the model you defined earlier. It should also return the training history of the model. 
This object is returned when using the `fit` method of a `tf.keras.Model` as explained in the [docs](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit).\n","\n","**You must set your model to train for 10 epochs, and the callback should fire before the 10th epoch for you to pass this part of the assignment.**"],"metadata":{"id":"w4FAR_jAg7RA"}},{"cell_type":"code","source":["# GRADED FUNCTION: train_mnist\n","\n","def train_mnist(training_images, training_labels):\n","    \"\"\"Trains a classifier of handwritten digits.\n","\n","    Args:\n","        training_images (numpy.ndarray): The images of handwritten digits\n","        training_labels (numpy.ndarray): The labels of each image\n","\n","    Returns:\n","        tf.keras.callbacks.History : The training history.\n","    \"\"\"\n","\n","    ### START CODE HERE ###\n","\n","    # Create a compiled (but untrained) version of the model\n","    # Hint: Remember you already coded a function that does this!\n","    model = create_and_compile_model()\n","\n","    # Fit the model for 10 epochs adding the callbacks and save the training history\n","    history = model.fit(training_images, training_labels, epochs=10, callbacks=[EarlyStoppingCallback()])\n","\n","    ### END CODE HERE ###\n","\n","    return history"],"metadata":{"id":"kGFF19bCg_Ah"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["Now train the model and get the training history by calling the `train_mnist` function, passing in the appropriate parameters:"],"metadata":{"id":"nTHskHWohCiw"}},{"cell_type":"code","source":["training_history = train_mnist(training_images, training_labels)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"rn3sIYlHhE2Y","executionInfo":{"status":"ok","timestamp":1741618417003,"user_tz":240,"elapsed":104137,"user":{"displayName":"Luis Alfredo Hung Araque","userId":"00964424177241549147"}},"outputId":"67653463-5b97-436d-8a19-19fef6b352e8"},"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["Epoch 
1/10\n","\u001b[1m1875/1875\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m21s\u001b[0m 11ms/step - accuracy: 0.9016 - loss: 0.3390\n","Epoch 2/10\n","\u001b[1m1875/1875\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m21s\u001b[0m 11ms/step - accuracy: 0.9748 - loss: 0.0836\n","Epoch 3/10\n","\u001b[1m1872/1875\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m━\u001b[0m \u001b[1m0s\u001b[0m 11ms/step - accuracy: 0.9848 - loss: 0.0508\n","Reached 98% accuracy so cancelling training!\n","\u001b[1m1875/1875\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m41s\u001b[0m 11ms/step - accuracy: 0.9848 - loss: 0.0508\n"]}]},{"cell_type":"markdown","source":["**Expected Output:**\n","\n","`Reached 98% accuracy so cancelling training!` printed out before reaching 10 epochs."],"metadata":{"id":"v1mr6xn1hHSQ"}},{"cell_type":"code","source":["# Test your code!\n","unittests.test_training_history(training_history)"],"metadata":{"id":"k42okkzrhJsv"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## Need more help?\n","\n","Run the following cell to see an architecture that works well for the problem at hand:"],"metadata":{"id":"oO-o5RRfhNv4"}},{"cell_type":"code","source":["# WE STRONGLY RECOMMEND YOU TO TRY YOUR OWN ARCHITECTURES FIRST\n","# AND ONLY RUN THIS CELL IF YOU WISH TO SEE AN ANSWER\n","\n","encoded_answer = \"CiAgIC0gQSB0Zi5rZXJhcy5JbnB1dCB3aXRoIHRoZSBzYW1lIHNoYXBlIGFzIHRoZSBpbWFnZXMKICAgLSBBIEZsYXR0ZW4gbGF5ZXIKICAgLSBBIERlbnNlIGxheWVyIHdpdGggNTEyIHVuaXRzIGFuZCBSZUxVIGFjdGl2YXRpb24gZnVuY3Rpb24KICAgLSBBIERlbnNlIGxheWVyIHdpdGggMTAgdW5pdHMgYW5kIHNvZnRtYXggYWN0aXZhdGlvbiBmdW5jdGlvbgo==\"\n","encoded_answer = encoded_answer.encode('ascii')\n","answer = base64.b64decode(encoded_answer)\n","answer = 
answer.decode('ascii')\n","\n","print(answer)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"7e-GY8rGhQWw","executionInfo":{"status":"ok","timestamp":1741618417130,"user_tz":240,"elapsed":34,"user":{"displayName":"Luis Alfredo Hung Araque","userId":"00964424177241549147"}},"outputId":"afc19dd8-f4be-488f-dd0a-901ccdf9d015"},"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["\n"," - A tf.keras.Input with the same shape as the images\n"," - A Flatten layer\n"," - A Dense layer with 512 units and ReLU activation function\n"," - A Dense layer with 10 units and softmax activation function\n","\n"]}]},{"cell_type":"markdown","source":["**Congratulations on finishing this week's assignment!**\n","\n","You have successfully implemented a callback that gives you more control over the training loop for your model. Nice job!\n","\n","**Keep it up!**"],"metadata":{"id":"tcs1oGKPhS7B"}}]}