5. Sign Language Translator

Github: https://github.com/mavericks-angelhack2019/angelhack2019

Media Press: Team of Lương Văn Chánh High School for the Gifted alumni wins first prize

Award: First prize in AngelHack Hackathon 2019 + AWS Challenge from Amazon

Sign Language Translator is a Deep Learning solution for people who cannot speak, created by Christopher Le (Le Dam Hong Loc). The software detects the user's sign language gestures and translates them into letters of the alphabet on screen. This project is still under development for further optimization.

Processing steps (a minimal sketch follows the list):

  1. ROI (Region of Interest): capture only a specific region of the frame
  2. Image thresholding: transform the recording into a binary image
  3. Contours: remove the background
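For reference, here is a minimal sketch of these three steps applied to a single webcam frame. The ROI coordinates, threshold value, and contour handling are illustrative assumptions (OpenCV 4 API), not the project's exact settings:

import cv2

def preprocess(frame):
    # 1. ROI: keep only the top-right quarter of the frame (illustrative coordinates)
    h, w = frame.shape[:2]
    roi = frame[10:h // 2, w // 2:w - 10]

    # 2. Image thresholding: grayscale, then binary
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY)

    # 3. Contours: keep only the largest contour (assumed to be the hand),
    #    discarding smaller background blobs
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    mask = binary.copy()
    if contours:
        mask[:] = 0
        hand = max(contours, key=cv2.contourArea)
        cv2.drawContours(mask, [hand], -1, 255, thickness=-1)
    return mask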

Simple Sign Language Translator (SLT) Tutorial

Team: Mavericks

WE WON!

A. Dataset: https://drive.google.com/open?id=1U0LI3hXbc5-lAfjpjDRqtPVzmEzD3GR7

  1. Explanation: The dataset’s images are mainly black and white and captured without background noise

Example:

  1. Test data: in supervised learning like this, test data is data whose labels are hidden from the model; the model relies on what it learned from the training data to predict them
  2. Training data: training data comes with known labels, so the model can learn from it; the training set is typically larger than the test set
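The training script in section B reads these sets with Keras’s flow_from_directory, which expects one sub-folder per class. A plausible layout for mydata.zip (the folder names below are an assumption inferred from the 27 classes used in the code) is:

mydata/
    training_set/
        A/  B/  C/  ...  Z/  space/      # one folder of black-and-white images per letter
    test_set/
        A/  B/  C/  ...  Z/  space/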

**B. Train model:**

  1. Set up: Note: please run this step on Google Colab with the GPU setting enabled, because your local computer may be strained (or even damaged) by this training.

    Steps:
    a. Create a new Python 3 notebook on Google Colab
    b. Go to Edit -> Notebook Settings -> choose Hardware accelerator: GPU
    c. Go to Files and upload the zip file “mydata.zip” (remember: the zip file, not the whole folder)
    d. Type the first line: “!unzip mydata.zip”
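    Optionally, before training, you can confirm the runtime actually has a GPU attached (an extra sanity check, not one of the original steps):

import tensorflow as tf
print(tf.test.gpu_device_name())  # prints something like '/device:GPU:0'; an empty string means no GPU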

  2. Code: first, create a file named “train.py”. In the next line after the unzip, paste this:
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense, Dropout
from keras import optimizers

import matplotlib.pyplot as plt

# Step 1 - Building the CNN
classifier = Sequential()

# Adding first convolution layer
classifier.add(Convolution2D(32, (3, 3), input_shape = (64, 64, 1), activation = 'relu'))

#step 2 - Pooling
classifier.add(MaxPooling2D(pool_size =(2,2)))

# Adding second convolution layer
classifier.add(Convolution2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size =(2,2)))

# Adding third convolution layer
classifier.add(Convolution2D(64, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size =(2,2)))


#Step 3 - Flattening
classifier.add(Flatten())

#Step 4 - Full Connection
classifier.add(Dense(256, activation = 'relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(27, activation = 'softmax'))

#Compiling The CNN
classifier.compile(
                   optimizer = optimizers.SGD(lr = 0.01),
                   loss = 'categorical_crossentropy',
                   metrics = ['accuracy'])

# Step 2 - Preparing the train/test data and training the model

# Code copied from - https://keras.io/preprocessing/image/
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory('mydata/training_set',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 color_mode='grayscale',
                                                 class_mode='categorical')

test_set = test_datagen.flow_from_directory('mydata/test_set',
                                            target_size=(64, 64),
                                            batch_size=5,
                                            color_mode='grayscale',
                                            class_mode='categorical')
model = classifier.fit_generator(
        training_set,
        steps_per_epoch=10000 // 32,  # batches per epoch: training images (10000) / batch size (32)
        epochs=27,
        validation_data=test_set,
        validation_steps=6750 // 5)   # validation batches: test images (6750) / batch size (5)



# Saving the model
model_json = classifier.to_json()
with open("model-bw.json", "w") as json_file:
    json_file.write(model_json)
classifier.save_weights('model-bw.h5') #Model is saved with the name "model-bw.h5"


#Plot the result
# (on newer Keras versions the history keys are 'accuracy'/'val_accuracy' instead of 'acc'/'val_acc')
plt.plot(model.history['acc'])
plt.plot(model.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss

plt.plot(model.history['loss'])
plt.plot(model.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

This will take several hours, so please run it overnight and be patient. Afterwards, download the model files that were created (“model-bw.json” and “model-bw.h5”).
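On Google Colab, one way to download the two saved files (assuming they were written to the notebook’s working directory, as in the script above) is the built-in files helper:

# Download the saved model architecture and weights from the Colab runtime
from google.colab import files
files.download('model-bw.json')
files.download('model-bw.h5')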

**C. Implement the model with OpenCV through the webcam:** Create a new Python file named “translator.py” with the following code:

import numpy as np
from keras.models import model_from_json
import operator
import cv2
import sys, os

# Loading the model
json_file = open("model-bw.json", "r")
model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(model_json)
# load weights into new model
loaded_model.load_weights("model-bw.h5")
print("Loaded model from disk")

cap = cv2.VideoCapture(0)
bgRemover = cv2.createBackgroundSubtractorMOG2()

# Category dictionary
categories = {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E', 5: 'F', 6: 'G', 7: 'H', 8: 'I', 9: 'J', 10: 'K', 11: 'L', 12: 'M', 13: 'N', 14: 'O', 15: 'P', 16: 'Q', 17: 'R', 18: 'S', 19: 'space', 20: 'T', 21: 'U', 22: 'V', 23: 'W', 24: 'X', 25: 'Y', 26: 'Z'}

while True:
    _, frame = cap.read()
    bgRemoveMask = bgRemover.apply(frame)

    # Simulating mirror image
    frame = cv2.flip(frame, 1)

    # Got this from collect-data.py
    # Coordinates of the ROI
    x1 = int(0.5*frame.shape[1])
    y1 = 10
    x2 = frame.shape[1]-10
    y2 = int(0.5*frame.shape[1])
    # Drawing the ROI
    # The increment/decrement by 1 is to compensate for the bounding box
    cv2.rectangle(frame, (x1-1, y1-1), (x2+1, y2+1), (255,0,0) ,1)
    # Extracting the ROI
    roi = frame[y1:y2, x1:x2]

    # Resizing the ROI so it can be fed to the model for prediction
    roi = cv2.resize(roi, (64, 64))
    roi = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    _, test_image = cv2.threshold(roi, 120, 255, cv2.THRESH_BINARY)
    cv2.imshow("test", test_image)
    cv2.imshow('original', frame)
    cv2.imshow('bgRemover', bgRemoveMask)


    # Batch of 1; rescale pixels to [0, 1] to match the rescale=1./255 used during training
    result = loaded_model.predict(test_image.reshape(1, 64, 64, 1) / 255.0)
    prediction = {'A': result[0][0],
        'B': result[0][1],
        'C': result[0][2],
        'D': result[0][3],
        'E': result[0][4],
        'F': result[0][5],
        'G': result[0][6],
        'H': result[0][7],
        'I': result[0][8],
        'J': result[0][9],
        'K': result[0][10],
        'L': result[0][11],
        'M': result[0][12],
        'N': result[0][13],
        'O': result[0][14],
        'P': result[0][15],
        'Q': result[0][16],
        'R': result[0][17],
        'S': result[0][18],
        'space': result[0][19],
        'T': result[0][20],
        'U': result[0][21],
        'V': result[0][22],
        'W': result[0][23],
        'X': result[0][24],
        'Y': result[0][25],
        'Z': result[0][26]}

    # Sorting based on top prediction
    prediction = sorted(prediction.items(), key=operator.itemgetter(1), reverse=True)

    # Displaying the predictions
    cv2.putText(frame, prediction[0][0], (10, 120), cv2.FONT_HERSHEY_PLAIN, 1, (0,255,255), 1)
    cv2.imshow("Frame", frame)

    interrupt = cv2.waitKey(10)
    if interrupt & 0xFF == 27: # esc key
        break


cap.release()
cv2.destroyAllWindows()

D. Running the files: In the terminal, type: python translator.py

_Problems:_

  1. Background subtraction - Solution: https://www.youtube.com/watch?v=8-3vl71TjDs (a small sketch follows after this list)
  2. Converting this program into a website - Solution: Reference 1: https://www.codepool.biz/web-camera-recorder-oepncv-flask.html

    Reference 2: https://webrtchacks.com/webrtc-cv-tensorflow/

  3. AWS SageMaker implementation with TensorFlow - Solution: AWS documentation: https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-tensorflow.html
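For problem 1, here is a minimal sketch of one possible approach: apply the MOG2 foreground mask (already computed in translator.py but only displayed) to the ROI before thresholding, so that static background pixels are zeroed out. This is an illustrative idea, not the project’s final solution:

import cv2

def masked_roi(frame, bg_subtractor, x1, y1, x2, y2):
    # Foreground mask for the whole frame (moving pixels become non-zero)
    fg_mask = bg_subtractor.apply(frame)
    roi = frame[y1:y2, x1:x2]
    roi_mask = fg_mask[y1:y2, x1:x2]
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    # Keep only foreground pixels inside the ROI, then threshold as before
    gray = cv2.bitwise_and(gray, gray, mask=roi_mask)
    _, binary = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY)
    return binary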

Quick fix if the final product is not okay:

  1. Clone and use this Github: https://github.com/evilport2/sign-language
  2. Open a terminal and run: python recognize_gesture.py
  3. Record a demo video with a clean background and as little noise as possible

Other references:

OpenCV’s documentation: https://docs.opencv.org/master/d9/df8/tutorial_root.html

TensorFlow’s documentation: https://www.tensorflow.org/api_docs

CNN (Convolutional Neural Networks) explained - if someone ever asks: https://www.youtube.com/watch?v=YRhxdVk_sIs
