You should be able to do this exercise after Lecture 8.
In this exercise we use the IMDb dataset to perform a sentiment analysis. The code below assumes that the data is placed in the same folder as this notebook. The reviews are loaded as a pandas DataFrame, and we print the beginning of the first few reviews.
%matplotlib inline
import pandas as pd
import numpy as np
from numpy.random import seed
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers, optimizers
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
reviews = pd.read_csv('reviews.txt', header=None)
labels = pd.read_csv('labels.txt', header=None)
Y = (labels=='positive').astype(np.int_)
print(type(reviews))
print(reviews.head())
<class 'pandas.core.frame.DataFrame'>
                                                   0
0  bromwell high is a cartoon comedy . it ran at ...
1  story of a man who has unnatural feelings for ...
2  homelessness or houselessness as george carli...
3  airport starts as a brand new luxury pla...
4  brilliant over acting by lesley ann warren . ...
First we load the reviews and labels, and convert the labels from positive and negative to numerical values 0 and 1.
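As a quick sanity check of this conversion (a minimal sketch, assuming the labels and Y objects defined above), we can print the label distribution and the first few converted values:
# Sketch: verify the conversion from 'positive'/'negative' to 1/0
print(labels[0].value_counts())  # counts of 'positive' and 'negative' labels
print(Y.head())                  # 1 for positive reviews, 0 for negative ones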
(a) Split the reviews and labels in test, train and validation sets. The train and validation sets will be used to train your model and tune hyperparameters, the test set will be saved for testing.
X_train_val, X_test, Y_train_val, Y_test = train_test_split(reviews, labels, random_state=69, stratify=labels)
X_train, X_val, Y_train, Y_val = train_test_split(X_train_val, Y_train_val, random_state=69)
print("Size of training set:{}".format(X_train.shape[0]))
print("Size of validation set:{}".format(X_val.shape[0]))
print("Size of test set:{}".format(X_test.shape[0]))
X_train.head()
Size of training set:14062
Size of validation set:4688
Size of test set:6250
                                                       0
3720   i always follow the dakar so when my husband ...
21332  as with that film we follow the implausible if...
16618  one of several musicals about sailors on leave...
9428   whereas the hard boiled detective stories of ...
3067   when i first watched robotboy i found it fres...
(b) Use the CountVectorizer from sklearn.feature_extraction.text to create a Bag-of-Words representation of the reviews. (See an example of how to do this in chapter 7 of "Muller and Guido".) Only use the 10,000 most frequent words (use the max_features parameter of CountVectorizer).
vect = CountVectorizer(max_features=10_000).fit(X_train[0])
X_train = vect.transform(X_train[0])
X_val = vect.transform(X_val[0])
X_test = vect.transform(X_test[0])
(c) Explore the representation of the reviews. How is a single word represented? How about a whole review?
Each review is represented as a vector of length 10,000, where entry i counts how many times the i-th word of the vocabulary occurs in that review. For example, a vector starting [0, 1, 1, ...] means the 2nd and 3rd words of the vocabulary each occur once. A single word therefore corresponds to a fixed index (column) in this representation; e.g. 'abandoned' is the word at index 2 of the vocabulary shown below. Below we display the vector for the first training review together with the vocabulary learned by the vectorizer.
display(X_train[0].toarray())
display(vect.get_feature_names_out())
array([[0, 0, 0, ..., 0, 0, 0]], dtype=int64)
array(['aaron', 'abandon', 'abandoned', ..., 'zoom', 'zorro', 'zu'], dtype=object)
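To make the mapping concrete, one can look up a word's column index in the fitted vectorizer and read off its count in a given review (a minimal sketch, assuming the vect and transformed X_train from the cells above):
# Sketch: where does a word live in the BoW matrix, and how often does it appear?
word = 'abandoned'
idx = vect.vocabulary_[word]                 # column index assigned to the word
first_review = X_train[0].toarray().ravel()  # dense vector for the first training review
print(idx, first_review[idx])                # count of 'abandoned' in that review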
(d) Train a neural network with a single hidden layer on the dataset, tuning the relevant hyperparameters to optimize accuracy.
X_train, X_test, Y_train, Y_test = train_test_split(reviews, labels, random_state=69, stratify=labels)
# Transform Y_train and Y_test from 'positive'/'negative' labels to 0/1 values
Y_train_dummies = pd.get_dummies(Y_train)
Y_train = Y_train_dummies['0_positive'].values
Y_test_dummies = pd.get_dummies(Y_test)
Y_test = Y_test_dummies['0_positive'].values
vect = CountVectorizer(max_features=10_000).fit(X_train[0])
X_train = vect.transform(X_train[0])
X_test = vect.transform(X_test[0])
We already split the dataset into train, validation and test sets above; however, this is not strictly necessary, because Keras' fit method has a validation_split parameter we can use instead. In the code above we therefore only split the data into train and test sets, and later pass validation_split=0.2 to fit so that 20% of the training data is held out for validation.
We also use early stopping, which halts training once the monitored loss has stopped improving for a given number of epochs (the patience), rather than always running all 50 epochs.
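The run below monitors the training loss, as in the original setup. A variant that instead watches the validation loss and rolls back to the best weights would stop closer to the point where validation performance peaks; a minimal sketch using standard Keras arguments:
# Sketch: alternative early-stopping callback (not used in the run below)
val_callback = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',         # watch the held-out split created by validation_split
    patience=3,                 # stop after 3 epochs without improvement
    restore_best_weights=True,  # keep the weights from the best epoch
)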
seed(69)
tf.random.set_seed(69)
input_size = 10_000
callback = tf.keras.callbacks.EarlyStopping(monitor = 'loss', patience = 3)
#initialize a neural network
model = Sequential()
# hidden layer
model.add(Dense(units=512, activation='tanh', input_dim=input_size, kernel_regularizer=regularizers.l2(0.001)))
# output layer
model.add(Dense(units=2, activation='softmax'))
sgd = optimizers.SGD(learning_rate = 0.1)
model.compile(loss='sparse_categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])
history = model.fit(X_train.toarray(), Y_train, epochs=50, batch_size=50, verbose=1, validation_split=0.2, callbacks=[callback])
Epoch 1/50 300/300 [==============================] - 8s 25ms/step - loss: 1.6220 - accuracy: 0.6623 - val_loss: 1.3302 - val_accuracy: 0.7987 Epoch 2/50 300/300 [==============================] - 8s 25ms/step - loss: 1.3492 - accuracy: 0.7482 - val_loss: 1.3220 - val_accuracy: 0.7208 Epoch 3/50 300/300 [==============================] - 8s 26ms/step - loss: 1.2026 - accuracy: 0.7842 - val_loss: 1.1175 - val_accuracy: 0.8069 Epoch 4/50 300/300 [==============================] - 8s 25ms/step - loss: 1.1001 - accuracy: 0.7996 - val_loss: 1.0199 - val_accuracy: 0.8237 Epoch 5/50 300/300 [==============================] - 8s 25ms/step - loss: 1.0133 - accuracy: 0.8127 - val_loss: 0.9309 - val_accuracy: 0.8565 Epoch 6/50 300/300 [==============================] - 8s 25ms/step - loss: 0.9249 - accuracy: 0.8304 - val_loss: 0.8541 - val_accuracy: 0.8563 Epoch 7/50 300/300 [==============================] - 8s 25ms/step - loss: 0.8530 - accuracy: 0.8393 - val_loss: 0.7861 - val_accuracy: 0.8688 Epoch 8/50 300/300 [==============================] - 8s 25ms/step - loss: 0.7947 - accuracy: 0.8444 - val_loss: 0.7355 - val_accuracy: 0.8691 Epoch 9/50 300/300 [==============================] - 8s 26ms/step - loss: 0.7427 - accuracy: 0.8494 - val_loss: 0.8121 - val_accuracy: 0.8000 Epoch 10/50 300/300 [==============================] - 8s 25ms/step - loss: 0.7248 - accuracy: 0.8411 - val_loss: 0.6676 - val_accuracy: 0.8723 Epoch 11/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6724 - accuracy: 0.8550 - val_loss: 0.6582 - val_accuracy: 0.8539 Epoch 12/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6526 - accuracy: 0.8472 - val_loss: 0.6464 - val_accuracy: 0.8520 Epoch 13/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6080 - accuracy: 0.8577 - val_loss: 0.5917 - val_accuracy: 0.8661 Epoch 14/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5883 - accuracy: 0.8577 - val_loss: 0.5784 - val_accuracy: 0.8603 Epoch 15/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5689 - accuracy: 0.8595 - val_loss: 0.9543 - val_accuracy: 0.7096 Epoch 16/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5372 - accuracy: 0.8664 - val_loss: 0.5799 - val_accuracy: 0.8341 Epoch 17/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5048 - accuracy: 0.8739 - val_loss: 0.6544 - val_accuracy: 0.8024 Epoch 18/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4976 - accuracy: 0.8717 - val_loss: 0.5281 - val_accuracy: 0.8520 Epoch 19/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4847 - accuracy: 0.8724 - val_loss: 0.5001 - val_accuracy: 0.8627 Epoch 20/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4676 - accuracy: 0.8788 - val_loss: 0.4791 - val_accuracy: 0.8725 Epoch 21/50 300/300 [==============================] - 8s 26ms/step - loss: 0.4680 - accuracy: 0.8729 - val_loss: 0.5696 - val_accuracy: 0.8179 Epoch 22/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4678 - accuracy: 0.8718 - val_loss: 0.4728 - val_accuracy: 0.8688 Epoch 23/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4312 - accuracy: 0.8828 - val_loss: 0.5651 - val_accuracy: 0.8189 Epoch 24/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4337 - accuracy: 0.8763 - val_loss: 0.6795 - val_accuracy: 0.7277 Epoch 25/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4389 - accuracy: 0.8766 - 
val_loss: 0.5237 - val_accuracy: 0.8291 Epoch 26/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4091 - accuracy: 0.8854 - val_loss: 0.4616 - val_accuracy: 0.8643 Epoch 27/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4230 - accuracy: 0.8821 - val_loss: 0.5265 - val_accuracy: 0.8363 Epoch 28/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4086 - accuracy: 0.8835 - val_loss: 0.6940 - val_accuracy: 0.7435 Epoch 29/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4169 - accuracy: 0.8786 - val_loss: 0.4600 - val_accuracy: 0.8621 Epoch 30/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4122 - accuracy: 0.8794 - val_loss: 0.4711 - val_accuracy: 0.8515 Epoch 31/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4059 - accuracy: 0.8809 - val_loss: 0.4623 - val_accuracy: 0.8557 Epoch 32/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4188 - accuracy: 0.8763 - val_loss: 0.5176 - val_accuracy: 0.8325 Epoch 33/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3928 - accuracy: 0.8855 - val_loss: 0.4537 - val_accuracy: 0.8629 Epoch 34/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3862 - accuracy: 0.8921 - val_loss: 0.4812 - val_accuracy: 0.8528 Epoch 35/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3885 - accuracy: 0.8867 - val_loss: 0.5025 - val_accuracy: 0.8400 Epoch 36/50 300/300 [==============================] - 8s 26ms/step - loss: 0.3968 - accuracy: 0.8845 - val_loss: 0.4552 - val_accuracy: 0.8571 Epoch 37/50 300/300 [==============================] - 8s 26ms/step - loss: 0.3920 - accuracy: 0.8866 - val_loss: 0.4860 - val_accuracy: 0.8493
As we can see above, the model stopped early after 37 epochs instead of the 50 we specified.
plt.figure()
plt.title("Loss curves")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.plot(history.history['loss'], label = 'train')
plt.plot(history.history['val_loss'], label = 'valid')
plt.legend()
plt.show()
plt.figure()
plt.title("Accuracy curves")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.plot(history.history['accuracy'], label = 'train')
plt.plot(history.history['val_accuracy'], label = 'valid')
plt.legend()
plt.show()
After training we can inspect the loss and accuracy. The model stops with a final validation accuracy of about 85% and a validation loss of 0.4860. As the plots show, the validation scores vary and spike quite a lot; lowering the learning rate or adding more layers could help smooth out the curves.
We do another run with a learning rate of 0.05 and the same 50 epochs to see the effect.
input_size = 10_000
callback = tf.keras.callbacks.EarlyStopping(monitor = 'loss', patience = 3)
#initialize a neural network
model2 = Sequential()
# hidden layer
model2.add(Dense(units=512, activation='tanh', input_dim=input_size, kernel_regularizer=regularizers.l2(0.001)))
# output layer
model2.add(Dense(units=2, activation='softmax'))
sgd = optimizers.SGD(learning_rate = 0.05)
model2.compile(loss='sparse_categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])
history = model2.fit(X_train.toarray(), Y_train, epochs=50, batch_size=50, verbose=1, validation_split=0.2, callbacks=[callback])
Epoch 1/50 300/300 [==============================] - 8s 26ms/step - loss: 1.6119 - accuracy: 0.6753 - val_loss: 1.3679 - val_accuracy: 0.8187 Epoch 2/50 300/300 [==============================] - 7s 25ms/step - loss: 1.3867 - accuracy: 0.7697 - val_loss: 1.3731 - val_accuracy: 0.7459 Epoch 3/50 300/300 [==============================] - 7s 25ms/step - loss: 1.2794 - accuracy: 0.8019 - val_loss: 1.2478 - val_accuracy: 0.7968 Epoch 4/50 300/300 [==============================] - 7s 25ms/step - loss: 1.2034 - accuracy: 0.8205 - val_loss: 1.1508 - val_accuracy: 0.8365 Epoch 5/50 300/300 [==============================] - 7s 25ms/step - loss: 1.1304 - accuracy: 0.8341 - val_loss: 1.1724 - val_accuracy: 0.8083 Epoch 6/50 300/300 [==============================] - 7s 25ms/step - loss: 1.0713 - accuracy: 0.8465 - val_loss: 1.0245 - val_accuracy: 0.8640 Epoch 7/50 300/300 [==============================] - 7s 25ms/step - loss: 1.0253 - accuracy: 0.8471 - val_loss: 1.0172 - val_accuracy: 0.8448 Epoch 8/50 300/300 [==============================] - 7s 25ms/step - loss: 0.9791 - accuracy: 0.8576 - val_loss: 0.9375 - val_accuracy: 0.8699 Epoch 9/50 300/300 [==============================] - 8s 25ms/step - loss: 0.9314 - accuracy: 0.8665 - val_loss: 0.9276 - val_accuracy: 0.8581 Epoch 10/50 300/300 [==============================] - 8s 25ms/step - loss: 0.8871 - accuracy: 0.8679 - val_loss: 0.8680 - val_accuracy: 0.8763 Epoch 11/50 300/300 [==============================] - 8s 25ms/step - loss: 0.8454 - accuracy: 0.8738 - val_loss: 0.8394 - val_accuracy: 0.8739 Epoch 12/50 300/300 [==============================] - 8s 25ms/step - loss: 0.8242 - accuracy: 0.8731 - val_loss: 0.8431 - val_accuracy: 0.8595 Epoch 13/50 300/300 [==============================] - 8s 25ms/step - loss: 0.7797 - accuracy: 0.8809 - val_loss: 0.8114 - val_accuracy: 0.8616 Epoch 14/50 300/300 [==============================] - 8s 25ms/step - loss: 0.7570 - accuracy: 0.8831 - val_loss: 0.7686 - val_accuracy: 0.8683 Epoch 15/50 300/300 [==============================] - 8s 25ms/step - loss: 0.7235 - accuracy: 0.8875 - val_loss: 0.9167 - val_accuracy: 0.8011 Epoch 16/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6963 - accuracy: 0.8883 - val_loss: 0.7884 - val_accuracy: 0.8288 Epoch 17/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6533 - accuracy: 0.8994 - val_loss: 0.7225 - val_accuracy: 0.8637 Epoch 18/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6355 - accuracy: 0.9010 - val_loss: 0.7082 - val_accuracy: 0.8613 Epoch 19/50 300/300 [==============================] - 8s 25ms/step - loss: 0.6068 - accuracy: 0.9058 - val_loss: 0.6745 - val_accuracy: 0.8707 Epoch 20/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5923 - accuracy: 0.9067 - val_loss: 0.6519 - val_accuracy: 0.8749 Epoch 21/50 300/300 [==============================] - 8s 26ms/step - loss: 0.5794 - accuracy: 0.9029 - val_loss: 1.0427 - val_accuracy: 0.7077 Epoch 22/50 300/300 [==============================] - 7s 25ms/step - loss: 0.5613 - accuracy: 0.9049 - val_loss: 0.6366 - val_accuracy: 0.8701 Epoch 23/50 300/300 [==============================] - 8s 25ms/step - loss: 0.5329 - accuracy: 0.9115 - val_loss: 1.2642 - val_accuracy: 0.6667 Epoch 24/50 300/300 [==============================] - 7s 25ms/step - loss: 0.5265 - accuracy: 0.9107 - val_loss: 0.7618 - val_accuracy: 0.7971 Epoch 25/50 300/300 [==============================] - 7s 25ms/step - loss: 0.5054 - accuracy: 0.9150 - 
val_loss: 0.6086 - val_accuracy: 0.8632 Epoch 26/50 300/300 [==============================] - 7s 25ms/step - loss: 0.4705 - accuracy: 0.9231 - val_loss: 0.6847 - val_accuracy: 0.8344 Epoch 27/50 300/300 [==============================] - 7s 25ms/step - loss: 0.4829 - accuracy: 0.9193 - val_loss: 0.6838 - val_accuracy: 0.8376 Epoch 28/50 300/300 [==============================] - 7s 25ms/step - loss: 0.4573 - accuracy: 0.9212 - val_loss: 0.5926 - val_accuracy: 0.8691 Epoch 29/50 300/300 [==============================] - 7s 25ms/step - loss: 0.4543 - accuracy: 0.9197 - val_loss: 0.5779 - val_accuracy: 0.8688 Epoch 30/50 300/300 [==============================] - 7s 25ms/step - loss: 0.4422 - accuracy: 0.9246 - val_loss: 0.5949 - val_accuracy: 0.8547 Epoch 31/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4240 - accuracy: 0.9297 - val_loss: 0.6040 - val_accuracy: 0.8571 Epoch 32/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4287 - accuracy: 0.9242 - val_loss: 0.6082 - val_accuracy: 0.8501 Epoch 33/50 300/300 [==============================] - 8s 25ms/step - loss: 0.4125 - accuracy: 0.9308 - val_loss: 0.5575 - val_accuracy: 0.8680 Epoch 34/50 300/300 [==============================] - 8s 26ms/step - loss: 0.3842 - accuracy: 0.9361 - val_loss: 0.5787 - val_accuracy: 0.8557 Epoch 35/50 300/300 [==============================] - 7s 25ms/step - loss: 0.3846 - accuracy: 0.9360 - val_loss: 0.5532 - val_accuracy: 0.8525 Epoch 36/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3767 - accuracy: 0.9374 - val_loss: 0.5956 - val_accuracy: 0.8461 Epoch 37/50 300/300 [==============================] - 8s 26ms/step - loss: 0.3985 - accuracy: 0.9277 - val_loss: 0.6476 - val_accuracy: 0.8224 Epoch 38/50 300/300 [==============================] - 8s 26ms/step - loss: 0.3588 - accuracy: 0.9405 - val_loss: 0.5793 - val_accuracy: 0.8467 Epoch 39/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3535 - accuracy: 0.9430 - val_loss: 0.5505 - val_accuracy: 0.8621 Epoch 40/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3542 - accuracy: 0.9433 - val_loss: 0.5214 - val_accuracy: 0.8584 Epoch 41/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3617 - accuracy: 0.9377 - val_loss: 0.5226 - val_accuracy: 0.8653 Epoch 42/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3518 - accuracy: 0.9439 - val_loss: 0.6717 - val_accuracy: 0.8069 Epoch 43/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3387 - accuracy: 0.9417 - val_loss: 0.6040 - val_accuracy: 0.8341 Epoch 44/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3298 - accuracy: 0.9434 - val_loss: 0.5819 - val_accuracy: 0.8475 Epoch 45/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3256 - accuracy: 0.9422 - val_loss: 0.5307 - val_accuracy: 0.8408 Epoch 46/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3106 - accuracy: 0.9499 - val_loss: 0.6801 - val_accuracy: 0.8144 Epoch 47/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3113 - accuracy: 0.9480 - val_loss: 0.7106 - val_accuracy: 0.7659 Epoch 48/50 300/300 [==============================] - 8s 25ms/step - loss: 0.3168 - accuracy: 0.9469 - val_loss: 0.4893 - val_accuracy: 0.8643 Epoch 49/50 300/300 [==============================] - 8s 25ms/step - loss: 0.2864 - accuracy: 0.9549 - val_loss: 0.5341 - val_accuracy: 0.8619 Epoch 50/50 300/300 [==============================] - 8s 
25ms/step - loss: 0.3178 - accuracy: 0.9475 - val_loss: 0.5115 - val_accuracy: 0.8643
We can see that with the lower learning rate, the final validation accuracy was about 86% and the validation loss 0.5115, which is very close to the previous run.
plt.figure()
plt.title("Loss curves")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.plot(history.history['loss'], label = 'train')
plt.plot(history.history['val_loss'], label = 'valid')
plt.legend()
plt.show()
plt.figure()
plt.title("Accuracy curves")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.plot(history.history['accuracy'], label = 'train')
plt.plot(history.history['val_accuracy'], label = 'valid')
plt.legend()
plt.show()
The second run behaves much like the first: the curves are still spiky and the accuracy is roughly the same. The first run stopped early, but the second did not, so running it for longer would probably give slightly better results, since its accuracy and loss had not yet fully levelled off.
(e) Test your sentiment-classifier on the test set.
We can evaluate both models on the training and test data to compare their loss and accuracy. First we evaluate the initial run and then the second run.
print("Loss + accuracy on train data: {}".format(model.evaluate(X_train, Y_train)))
print("Loss + accuracy on test data: {}".format(model.evaluate(X_test, Y_test)))
print()
print("Loss + accuracy on train data: {}".format(model2.evaluate(X_train, Y_train)))
print("Loss + accuracy on test data: {}".format(model2.evaluate(X_test, Y_test)))
586/586 [==============================] - 5s 8ms/step - loss: 0.3414 - accuracy: 0.9101
Loss + accuracy on train data: [0.3413655161857605, 0.9100800156593323]
196/196 [==============================] - 2s 8ms/step - loss: 0.4860 - accuracy: 0.8478
Loss + accuracy on test data: [0.48597320914268494, 0.8478400111198425]
586/586 [==============================] - 5s 8ms/step - loss: 0.2785 - accuracy: 0.9603
Loss + accuracy on train data: [0.27854329347610474, 0.960319995880127]
196/196 [==============================] - 2s 8ms/step - loss: 0.5076 - accuracy: 0.8693
Loss + accuracy on test data: [0.5076231360435486, 0.8692799806594849]
Comparing these final evaluations on the training and test sets, the second model does slightly better on the test set (86.9% vs. 84.8% accuracy), although it also fits the training data more closely.
(h) Use the classifier to classify a few sentences you write yourselves.
my_reviews = [
"I don't like whatever the thing I am reviewing is about, it is simply terrible. Nothing about it is good it's all bad.",
"I like whatever I am reviewing as it is very nice and fun. It has a good theme and pacing i believe."
]
my_reviews = vect.transform(my_reviews)
# Expects first review to be negative and second one positive.
model.predict(my_reviews)
1/1 [==============================] - 0s 73ms/step
array([[0.99781275, 0.00218725],
       [0.05843586, 0.94156414]], dtype=float32)
As the predictions show, the model assigns the first review a probability of about 99% of being negative and the second a probability of about 94% of being positive, which matches our expectations.
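To turn the softmax outputs into readable labels, we can take the argmax over the two classes; index 1 corresponds to the positive class because of how the labels were encoded above (a minimal sketch, assuming the model and the transformed my_reviews from the previous cell):
# Sketch: convert the predicted probabilities to 'negative'/'positive' labels
probs = model.predict(my_reviews)
pred_labels = ['positive' if i == 1 else 'negative' for i in probs.argmax(axis=1)]
print(pred_labels)  # expected: ['negative', 'positive']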