Simple Neural Network with Keras
This snippet demonstrates a small convolutional neural network (CNN) built with Keras to classify handwritten digits from the MNIST dataset. It covers data preparation, model definition, training, and evaluation. It's a starting point for understanding neural network implementation using TensorFlow's Keras API.
Import Necessary Libraries
This section imports the required libraries. `numpy` is used for numerical operations, `tensorflow` is the core library, `keras` provides a high-level API for building neural networks, and the `layers` module defines the individual layers of the network.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Prepare the Data
This section prepares the MNIST dataset. The data is loaded, pixel values are normalized from [0, 255] to the range [0, 1], each image is given a trailing channel dimension (the convolutional layers expect inputs shaped height x width x channels), and the integer labels are converted to one-hot encoded vectors.
num_classes = 10
input_shape = (28, 28, 1)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")
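The three preparation steps above can be tried in isolation without downloading MNIST. The following is a minimal sketch that assumes a small random array as a stand-in for the real images; the names `images`, `labels`, `scaled`, `with_channel`, and `one_hot` are illustrative and not part of the original snippet. The identity-matrix lookup is one common NumPy way to one-hot encode integer labels; it is shown as an illustration of the idea, not as how `to_categorical` is implemented.

```python
import numpy as np

num_classes = 10

# Hypothetical stand-in for MNIST: 4 grayscale 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 3])

# Scale pixel values from [0, 255] to [0, 1].
scaled = images.astype("float32") / 255

# Add a trailing channel axis: (4, 28, 28) -> (4, 28, 28, 1).
with_channel = np.expand_dims(scaled, -1)

# One-hot encode: row i of the identity matrix is the encoding of class i.
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(with_channel.shape)  # (4, 28, 28, 1)
print(one_hot.shape)       # (4, 10)
print(one_hot[0])          # a single 1.0 at index 3, zeros elsewhere
```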
Build the Model
This section defines the neural network model. It's a simple Convolutional Neural Network (CNN) with two convolutional layers, each followed by a max pooling layer, and a final dense layer for classification. The 'relu' activation function is used for the convolutional layers, and 'softmax' is used for the output layer to produce a probability for each class. Dropout is applied before the dense layer to reduce overfitting.
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
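To see how the image size changes as it passes through the layers, the arithmetic can be traced by hand. This is a rough sketch that assumes the Keras defaults used above: "valid" padding and stride 1 for `Conv2D`, and a pooling stride equal to `pool_size` for `MaxPooling2D`. The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel):
    # "valid" padding with stride 1: the kernel must fit entirely inside the image.
    return size - kernel + 1


def pool_out(size, pool):
    # Non-overlapping pooling windows; a leftover row or column is discarded.
    return size // pool


side, channels = 28, 1
side, channels = conv_out(side, 3), 32   # 26 x 26 x 32
side = pool_out(side, 2)                 # 13 x 13 x 32
side, channels = conv_out(side, 3), 64   # 11 x 11 x 64
side = pool_out(side, 2)                 # 5 x 5 x 64

flat = side * side * channels            # length of the vector produced by Flatten
print(side, channels, flat)              # 5 64 1600
```

Calling `model.summary()` on the model should report the same shapes, which is a quick way to check the trace.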
Compile and Train the Model
This section compiles the model and trains it on the training data. The 'categorical_crossentropy' loss function is used because it's a multi-class classification problem. The 'adam' optimizer is used for updating the model's weights. The model is trained for a specified number of epochs with a given batch size. A portion of the training data is used for validation during training.
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
batch_size = 128
epochs = 15
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
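To make the loss concrete, here is a minimal NumPy sketch of softmax followed by categorical cross-entropy for a toy 3-class problem. The values in `logits` and `y_true` are made up for illustration, and the `eps` clipping is an assumption added to avoid `log(0)`; Keras applies its own numerical safeguards and averages the per-sample losses over each batch.

```python
import numpy as np


def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # With one-hot labels this reduces to -log(probability of the true class).
    y_pred = np.clip(y_pred, eps, 1.0)
    return -(y_true * np.log(y_pred)).sum(axis=-1)


logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])  # illustrative raw scores
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # one-hot labels

probs = softmax(logits)
losses = categorical_crossentropy(y_true, probs)

print(probs.sum(axis=-1))  # each row sums to 1
print(losses)              # second entry is log(3): uniform guess over 3 classes
print(losses.mean())       # batch-averaged loss
```

A confident, correct prediction drives the loss toward 0, while a uniform guess over `k` classes costs `log(k)`.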
Evaluate the Model
This section evaluates the trained model on the test data and prints the test loss and accuracy.
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
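The accuracy reported here is, in essence, the fraction of samples whose highest-probability class matches the true class. A small NumPy sketch of that calculation, using made-up predictions for a 3-class problem (`y_true` and `y_pred` are illustrative values, not model output):

```python
import numpy as np

# One-hot ground truth and hypothetical predicted probabilities for 4 samples.
y_true = np.array(
    [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32"
)
y_pred = np.array(
    [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
)

# A prediction counts as correct when the most probable class is the true class.
correct = y_pred.argmax(axis=-1) == y_true.argmax(axis=-1)
accuracy = correct.mean()

print(correct)   # [ True  True False  True]
print(accuracy)  # 0.75
```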
Concepts Behind the Snippet
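This code demonstrates fundamental concepts of neural networks, including convolutional and max pooling layers, the 'relu' and 'softmax' activation functions, dropout regularization, the categorical cross-entropy loss, and training with the 'adam' optimizer.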
Real-Life Use Case
This type of neural network can be adapted for various image classification tasks, such as:
Best Practices
Interview Tip
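When discussing this code in an interview, be prepared to explain the role of each layer, why the data is normalized and one-hot encoded, and why 'categorical_crossentropy' and 'softmax' are used together for multi-class classification.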
When to Use Them
Use CNNs for tasks involving image data, particularly when spatial relationships between pixels are important. They are suitable for classification, object detection, and image segmentation tasks.
Memory Footprint
The memory footprint of this model depends on the number of layers and the number of filters or neurons per layer (which determine the parameter count), the size of the input images (which determines the size of each activation map), and the batch size (activation memory grows roughly linearly with it). Larger models and larger input images require more memory. To reduce memory consumption, consider using fewer or narrower layers, smaller images, or a smaller batch size.
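A back-of-the-envelope estimate for this particular model can be sketched as follows. It assumes float32 values (4 bytes each) and the layer output shapes that follow from 3x3 "valid" convolutions and 2x2 pooling on 28x28 inputs. Treat the result as a rough lower bound: it ignores gradients, the extra per-parameter state that Adam keeps, and framework overhead. The helper `conv_params` is a hypothetical name.

```python
BYTES_PER_FLOAT32 = 4


def conv_params(kernel, in_channels, out_channels):
    # One kernel x kernel x in_channels filter per output channel, plus one bias each.
    return kernel * kernel * in_channels * out_channels + out_channels


dense_params = 1600 * 10 + 10  # 5*5*64 flattened inputs -> 10 classes, plus biases
params = conv_params(3, 1, 32) + conv_params(3, 32, 64) + dense_params

# Output elements per image for each layer that produces a new tensor.
activations_per_image = [
    26 * 26 * 32,  # first Conv2D
    13 * 13 * 32,  # first MaxPooling2D
    11 * 11 * 64,  # second Conv2D
    5 * 5 * 64,    # second MaxPooling2D
    10,            # Dense output
]

batch_size = 128
param_bytes = params * BYTES_PER_FLOAT32
activation_bytes = sum(activations_per_image) * batch_size * BYTES_PER_FLOAT32

print(params)       # 34826
print(param_bytes)  # 139304
print(activation_bytes)
```

For this model the activations for one batch take far more memory than the weights, which is why lowering `batch_size` is often the quickest way to reduce memory use.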
Alternatives
Pros
Cons
FAQ
- What is the purpose of the `Flatten` layer?
The `Flatten` layer converts the multi-dimensional output of the convolutional layers into a one-dimensional vector per sample, which can be fed into the dense layer.
- What is the role of the `Dropout` layer?
The `Dropout` layer randomly sets a fraction of its input units to 0 during training (50% in this model) and is inactive at inference time. This helps to reduce overfitting by limiting the model's reliance on specific features.
- Why is the data normalized?
Normalizing the pixel values to the range [0, 1] keeps the inputs on a small, consistent scale, which generally makes gradient-based training more stable and faster to converge.
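The `Flatten` and `Dropout` answers can be illustrated with plain NumPy. This is a sketch of the behaviour, not the Keras implementation: it assumes a random tensor shaped like the output of the last pooling layer and uses "inverted" dropout, where the surviving units are scaled by 1 / (1 - rate) so the expected sum is unchanged. The names `features`, `keep_mask`, and `dropped` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the last pooling output: 2 samples of shape 5 x 5 x 64.
features = rng.random((2, 5, 5, 64)).astype("float32")

# Flatten: keep the batch axis, collapse everything else into one vector.
flat = features.reshape(features.shape[0], -1)
print(flat.shape)  # (2, 1600)

# Dropout at training time: zero roughly `rate` of the units, rescale the rest.
rate = 0.5
keep_mask = rng.random(flat.shape) >= rate
dropped = np.where(keep_mask, flat / (1.0 - rate), 0.0)

print(keep_mask.mean())  # close to 0.5, the fraction of units kept
```

At inference time no mask is applied and the activations pass through unchanged.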