AI_Research
Entry: 005
Gesture Recognition_CNN
Published: 2024.09.05
Read_Time: 10 min
Computer Vision in Industry
As part of my M.Sc. thesis, I researched the application of Convolutional Neural Networks (CNNs) to real-time gesture detection in noisy industrial environments.
Architectural Approach
We moved away from standard VGG models towards a lighter MobileNet-v2-based architecture with custom attention heads to handle the occlusion and lighting variations common on factory floors.
from tensorflow.keras import layers, models

num_classes = 10  # placeholder value; set to the number of gesture classes

# Simplified classifier excerpt (not the full MobileNet-v2 backbone)
model = models.Sequential()
# Convolutional stem on 224x224 RGB frames
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(layers.MaxPooling2D((2, 2)))
# Collapse the spatial dimensions, then classify
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(num_classes, activation='softmax'))
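For context on why a MobileNet-v2 style backbone is lighter than a VGG-style stack: MobileNet-v2 is built on depthwise separable convolutions, which split a standard convolution into a per-channel spatial filter followed by a 1x1 pointwise mix. The sketch below counts the weights of a single layer both ways. The layer sizes (3x3 kernel, 32 input channels, 64 output channels) and the function names are illustrative assumptions, not figures from the thesis, and biases are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the thesis)
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
print(f"standard: {standard} weights, separable: {separable} weights")
```

For these sizes the separable layer needs roughly one eighth of the weights (2,336 versus 18,432), which is the kind of saving that makes CPU-only inference plausible.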
Initial Performance
The prototype achieved a consistent 96% accuracy on the test set while sustaining 24 FPS on an embedded CPU, showing that high-performance vision does not always require a dedicated GPU.
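A throughput figure such as 24 FPS is typically obtained by timing repeated single-frame inference after a short warm-up. The following is a minimal sketch of that kind of measurement, not the benchmark used for the thesis. `measure_fps`, `infer`, and the warm-up and run counts are hypothetical names and values, and the dummy workload stands in for a real inference call.

```python
import time
from typing import Callable


def measure_fps(infer: Callable[[], object], warmup: int = 5, runs: int = 50) -> float:
    """Return the average number of calls per second for a single-frame inference callable."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    # Warm-up calls are excluded from timing (caches, lazy initialisation)
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    return runs / elapsed


def dummy_inference() -> int:
    """Stand-in workload; replace with a real per-frame model call."""
    return sum(i * i for i in range(10_000))


fps = measure_fps(dummy_inference)
print(f"{fps:.1f} frames per second")
```

On an embedded target the same loop would wrap the model call on one preprocessed frame, so that the reported rate reflects inference on the device rather than on a development machine.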