Computer Vision: Python OCR & Object Detection Quick Starter

Quick Starter for Optical Character Recognition, Image Recognition, Object Detection and Object Recognition using Python


Course Overview

Image Recognition, Object Detection, Object Recognition and Optical Character Recognition (OCR) are among the most widely used applications of Computer Vision.

Using these techniques, a computer can recognize and classify either a whole image or multiple objects inside a single image, predicting the class of each object along with a confidence score expressed as a percentage. Using OCR, it can also recognize the text in images and convert it to a machine-readable format such as plain text or a document.

Object Detection and Object Recognition are widely used in many simple applications and also in complex ones like self-driving cars.

This course will be a quick starter for people who want to dive into Optical Character Recognition, Image Recognition and Object Detection using Python without having to deal with all the complexities and mathematics associated with a typical Deep Learning process.

At first, we will have an introductory theory session about Optical Character Recognition technology.

After that, we will prepare our computer for Python coding by downloading and installing the Anaconda package, and then check that everything is installed correctly.

Many of you may not be coming from a Python-based programming background. The next few sessions and examples will help you gain the basic Python programming skills needed for the sessions included in this course. The topics include Python assignment, flow control, functions and data structures.
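As a small taste of those four topics, here is a minimal sketch in plain Python. The variable names and the numbers in it are made up for illustration and are not taken from the course material.

```python
# Assignment: bind names to values.
course = "OCR Quick Starter"
minutes = [9, 9, 12, 4]  # a list (data structure); illustrative values only


# Function: package reusable logic behind a name.
def total_minutes(durations):
    """Return the sum of the positive durations in a list."""
    total = 0
    for d in durations:   # flow control: loop over the list
        if d > 0:         # flow control: condition
            total += d
    return total


# Dictionary: another data structure, mapping keys to values.
summary = {"course": course, "minutes": total_minutes(minutes)}
print(summary)
```

If you can read this snippet comfortably, you already have most of the Python needed for the later sessions.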

Then we will install the dependencies and libraries that we require for Optical Character Recognition. We use the Tesseract library for OCR. First we will install the library itself and then its Python bindings. We will also install OpenCV, the Open Source Computer Vision library, which we will use from Python.

We will install the Pillow library, the maintained fork of the Python Imaging Library (PIL). Then we will have an introduction to the steps involved in Optical Character Recognition and later proceed with coding and implementing the OCR program. We will use a few example images to test Character Recognition and verify the results.
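The OCR steps described above can be sketched roughly as follows. This is only an outline under assumptions: it uses the pytesseract bindings together with Pillow, the greyscale conversion is an optional preprocessing choice, and the helper names ocr_image and clean_ocr_text are hypothetical rather than the course's own code.

```python
def ocr_image(path, lang="eng"):
    """Read an image from disk and return the text Tesseract finds in it."""
    # Imported lazily so clean_ocr_text works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        # Converting to greyscale is a common, optional preprocessing step.
        grey = img.convert("L")
        return pytesseract.image_to_string(grey, lang=lang)


def clean_ocr_text(raw):
    """Strip blank lines and surrounding whitespace from raw OCR output."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Example call (the file name is a placeholder):
# print(clean_ocr_text(ocr_image("sample.png")))
```

Tesseract itself must be installed separately from the Python bindings, which is why the setup is split into two steps.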

Then we will have an introduction to Convolutional Neural Networks, which we will be using to do the Image Recognition. Here we will be classifying a full image based on the single primary object in it.

We will then proceed with installing the Keras library, which we will use for Image Recognition. We will use the built-in, pre-trained models that are included in Keras. The base code in Python is also provided in the Keras documentation.

At first, we will use the popular pre-trained model architecture called VGGNet. We will have an introductory session about the architecture of VGGNet. Then we will use the pre-trained VGGNet 16 model included in Keras for Image Recognition and classification, trying a few sample images to check the predictions. After that, we will move on to the deeper VGGNet 19 model, also included in Keras.
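A minimal sketch of classifying one image with the pre-trained VGG16 model is shown below, modelled on the pattern in the Keras documentation. The import paths assume the tensorflow.keras packaging and may differ between Keras versions; classify_image and format_predictions are hypothetical helper names.

```python
def classify_image(path, top=3):
    """Return the top ImageNet predictions for one image using VGG16."""
    # Lazy imports: TensorFlow is heavy and only needed when this is called.
    import numpy as np
    from tensorflow.keras.applications.vgg16 import (
        VGG16, decode_predictions, preprocess_input)
    from tensorflow.keras.preprocessing import image

    model = VGG16(weights="imagenet")  # downloads the weights on first use
    img = image.load_img(path, target_size=(224, 224))  # VGG expects 224x224
    batch = np.expand_dims(image.img_to_array(img), axis=0)
    preds = model.predict(preprocess_input(batch))
    return decode_predictions(preds, top=top)[0]  # [(class_id, label, score)]


def format_predictions(predictions):
    """Turn (class_id, label, score) tuples into readable lines."""
    return [f"{label}: {score * 100:.1f}%" for _, label, score in predictions]


# Example call (the file name is a placeholder):
# print(format_predictions(classify_image("cat.jpg")))
```

Switching to VGG19 should only require importing from the vgg19 module instead, since both models share the same input size and helper functions.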

Then we will try the ResNet pre-trained model included with the Keras library. We will include the model in the code and then we will try with a few sample images to check the predictions.

After that, we will try the Inception pre-trained model. We will include the model in the code and then try a few sample images to check the predictions. Then we will go ahead with the Xception pre-trained model. Here also, we will include the model in the code and try a few sample images.
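Because the models share the same Keras interface, swapping one for another mostly means changing the import and the input size. The sketch below illustrates that idea with a small lookup table. It assumes the ResNet50 and InceptionV3 variants (the course text only says "ResNet" and "Inception"), and load_pretrained is a hypothetical helper.

```python
# name: (submodule of tensorflow.keras.applications, default input size)
MODELS = {
    "VGG16": ("vgg16", (224, 224)),
    "VGG19": ("vgg19", (224, 224)),
    "ResNet50": ("resnet50", (224, 224)),         # assumed ResNet variant
    "InceptionV3": ("inception_v3", (299, 299)),  # assumed Inception variant
    "Xception": ("xception", (299, 299)),
}


def input_size(name):
    """Return the (height, width) a model expects with ImageNet weights."""
    return MODELS[name][1]


def load_pretrained(name):
    """Return (model, preprocess_input, input_size) for a model name."""
    # Lazy import so the table above can be used without TensorFlow.
    from tensorflow.keras import applications

    module_name, size = MODELS[name]
    module = getattr(applications, module_name)
    model = getattr(module, name)(weights="imagenet")
    return model, module.preprocess_input, size
```

Note that Inception and Xception expect 299x299 inputs rather than 224x224, and each model family has its own preprocess_input function, so these should be switched together with the model.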

Those were the Image Recognition pre-trained models, which can only label and classify a complete image based on the primary object in it. Now we will proceed with Object Recognition, in which we can detect and label multiple objects in a single image.

At first, we will have an introduction to the MobileNet-SSD pre-trained model, a single-shot detector that is capable of detecting multiple objects in a scene. We will also have a quick discussion about the dataset that was used to train this model.

Later we will be implementing the MobileNet-SSD Pre-trained Model in our code and will get the predictions and bounding box coordinates for every object detected. We will draw the bounding box around the objects in the image and write the label along with the confidence value.
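To illustrate how predictions and bounding box coordinates can be extracted, here is a sketch that assumes the model is loaded through OpenCV's dnn module (for example with cv2.dnn.readNetFromCaffe). The output row layout, the 300x300 input size and the scale and mean values are those commonly used with the Caffe MobileNet-SSD and are assumptions here; parse_detections and detect are hypothetical names.

```python
def parse_detections(rows, width, height, min_confidence=0.5):
    """Convert raw SSD rows into (class_id, confidence, (x1, y1, x2, y2)).

    Each row is assumed to be [image_id, class_id, confidence, x1, y1, x2, y2]
    with box coordinates normalised to the 0..1 range.
    """
    results = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence < min_confidence:
            continue  # skip weak detections
        box = (int(x1 * width), int(y1 * height),
               int(x2 * width), int(y2 * height))
        results.append((int(class_id), float(confidence), box))
    return results


def detect(image, net, min_confidence=0.5):
    """Run a loaded cv2.dnn network on a BGR image and parse its output."""
    import cv2  # lazy import; requires opencv-python

    height, width = image.shape[:2]
    # Scale factor and mean below are assumptions for the Caffe MobileNet-SSD.
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()  # expected shape: (1, 1, N, 7)
    return parse_detections(detections[0, 0], width, height, min_confidence)
```

Each returned box can then be drawn with cv2.rectangle and labelled with cv2.putText, using the class id to look up the label name.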

Then we will go ahead with object detection from a live video. We will be streaming the real-time live video from the computer's webcam and will try to detect objects from it. We will draw a rectangle around each object detected in the live video along with the label and confidence.

In the next session, we will go ahead with object detection from a pre-saved video. We will be streaming the saved video from our folder and will try to detect objects from it. We will draw a rectangle around each object detected along with the label and confidence.
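Live and pre-saved video can share the same loop, because OpenCV's cv2.VideoCapture accepts either a camera index or a file path. The sketch below shows one way to arrange this; parse_source and process_video are hypothetical helpers, and handle_frame stands for whatever detection and drawing function is applied to each frame.

```python
def parse_source(source):
    """Return a webcam index (int) for digit strings, else a file path."""
    text = str(source)
    return int(text) if text.isdigit() else text


def process_video(source, handle_frame):
    """Read frames from a webcam or file and pass each one to handle_frame."""
    import cv2  # lazy import; requires opencv-python

    capture = cv2.VideoCapture(parse_source(source))
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file, or the camera is unavailable
                break
            cv2.imshow("Detections", handle_frame(frame))
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()


# Example calls (the file name is a placeholder):
# process_video("0", lambda frame: frame)         # default webcam
# process_video("traffic.mp4", lambda frame: frame)
```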

Later we will go ahead with the Mask-RCNN pre-trained model. With the previous model, we were only able to get a bounding box around the object, but with Mask-RCNN, we can get both the box coordinates and the mask over the exact shape of each object detected. We will have an introduction to this model and its details.

Later we will be implementing the Mask-RCNN Pre-trained Model in our code and as the first step, we will get the predictions and bounding box coordinates for every object detected. We will draw the bounding box around the objects in the image and write the label along with the confidence value.

Later we will be getting the mask returned for each object predicted. We will process that data and use it to draw translucent multi-coloured masks over every object detected and write the label along with the confidence value.
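The translucent effect comes from blending the mask colour with the original pixel instead of overwriting it. The sketch below shows that arithmetic in plain Python on a tiny image so it can be followed by hand. In practice the same formula would normally be applied to NumPy arrays, and the function names and the default opacity of 0.5 are assumptions.

```python
def blend_pixel(pixel, colour, alpha):
    """Mix a mask colour into one pixel; alpha is the mask opacity (0..1)."""
    return tuple(round(alpha * c + (1 - alpha) * p)
                 for p, c in zip(pixel, colour))


def overlay_mask(image, mask, colour, alpha=0.5):
    """Return a copy of image with colour blended in wherever mask is true.

    image is a list of rows of (b, g, r) tuples; mask is a grid of booleans
    with the same dimensions.
    """
    return [
        [blend_pixel(px, colour, alpha) if on else px
         for px, on in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
```

Choosing a different random colour per detected object gives the multi-coloured result, while pixels outside the mask are left untouched.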

Then we will go ahead with object detection from a live video using Mask-RCNN. We will be streaming the real-time live video from the computer's webcam and will try to detect objects from it. We will draw the mask over the perimeter of each object detected in the live video along with the label and confidence.

And like we did for our previous model, we will go ahead with object detection from a pre-saved video using Mask-RCNN. We will be streaming the saved video from our folder and will try to detect objects from it. We will draw coloured masks for each object detected along with the label and confidence.

Mask-RCNN is very accurate and supports a large list of classes, but it is very slow at processing images on low-power, CPU-based computers. MobileNet-SSD is fast but less accurate and supports fewer classes. We need a better balance of speed and accuracy, which takes us to Object Detection and Recognition using the YOLO pre-trained model. We will have an overview of the YOLO model in the next session and then implement YOLO object detection from a single image.
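YOLO-style detectors typically return many overlapping candidate boxes for the same object, so a common post-processing step is non-maximum suppression (NMS). The following is a plain-Python sketch of that step, not the course's code. OpenCV offers cv2.dnn.NMSBoxes for the same purpose, and the 0.4 threshold is just an illustrative default.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two boxes covering nearly the same area are reduced to the one with the higher confidence, while a distant box is kept.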

Using that as the base, we will try the YOLO model for object detection from a real-time webcam video and check the performance. Later we will use it for object recognition from a pre-saved video file.

To further improve the number of frames processed per second, we will use a model called Tiny YOLO, which is a lightweight version of the full YOLO model. We will use Tiny YOLO first on the pre-saved video and analyse the accuracy as well as the speed. Then we will try the same on real-time video from the webcam and see the difference in performance compared to the full YOLO model.
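One simple way to compare the speed of the two models is to count frames and divide by the elapsed time. The small class below is a sketch of that idea using only the standard library. FpsCounter is a hypothetical name, and the clock parameter exists only to make the class easy to test.

```python
import time


class FpsCounter:
    """Track the average frames per second over a processing run."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()  # time at which counting began
        self.frames = 0

    def tick(self):
        """Call once for every processed frame."""
        self.frames += 1

    def fps(self):
        """Return the average frames per second since creation."""
        elapsed = self._clock() - self._start
        return self.frames / elapsed if elapsed > 0 else 0.0
```

Calling tick() inside the video loop and printing fps() at the end gives a figure that can be compared directly between the two models on the same video.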

Also after completing this course, you will be provided with a course completion certificate which will add value to your portfolio.

 

What are the requirements?

  • A decent configuration computer (preferably Windows) and an enthusiasm to dive into the world of OCR, Image and Object Recognition using Python

What am I going to get from this course?

  • Optical Character Recognition with Tesseract Library, Image Recognition using Keras, Object Recognition using MobileNet SSD, Mask R-CNN, YOLO, Tiny YOLO from static

What is the target audience?

  • Beginners, or anyone who wants to start with Python-based OCR, Image Recognition and Object Recognition

About the Author

I am a pioneering, talented and security-oriented Android/iOS mobile and PHP/Python web application developer with more than eight years’ overall IT experience, which involves designing, implementing, integrating, testing and supporting impactful web and mobile applications. I hold a postgraduate Master's degree in Computer Science and Engineering. My experience with PHP/Python programming is an added advantage for server-based Android and iOS client applications. I am currently serving full time as a Senior Solution Architect, managing my clients' projects from start to finish to ensure high-quality, innovative and functional design.

Course Curriculum

Course Introduction and Table of Contents
1 Video Lectures | 00:09:41

  • Course Introduction and Table of Contents
    09:41
     

Introduction to OCR Concepts and Libraries
1 Video Lectures | 00:04:51

  • Introduction to OCR Concepts and Libraries
    04:51
     

Setting up Environment - Anaconda
1 Video Lectures | 00:08:16

  • Setting up Environment - Anaconda
    08:16
     

Python Basics
4 Video Lectures | 00:34:23

  • Python Basics - Part 1 - Assignment
    09:01
     
  • Python Basics - Part 2 - Flow Control
    09:27
     
  • Python Basics - Part 3 - Data Structures
    11:55
     
  • Python Basics - Part 4 - Functions
    04:00
     

Tesseract OCR Setup
2 Video Lectures | 00:09:38

  • Tesseract OCR Setup - Part 1
    05:08
     
  • Tesseract OCR Setup - Part 2
    04:30
     

OpenCV Setup
1 Video Lectures | 00:04:19

  • OpenCV Setup
    04:19
     

Tesseract Image OCR Implementation
2 Video Lectures | 00:15:44

  • Tesseract Image OCR Implementation - Part 1
    08:06
     
  • Tesseract Image OCR Implementation - Part 2
    07:38
     

Optional: cv2.imshow() Not Responding Issue Fix
1 Video Lectures | 00:01:18

  • Optional: cv2.imshow() Not Responding Issue Fix
    01:18
     

Introduction to CNN - Convolutional Neural Networks - Theory Session
1 Video Lectures | 00:10:09

  • Introduction to CNN - Convolutional Neural Networks - Theory Session
    10:09
     

Installing Additional Dependencies for CNN
1 Video Lectures | 00:03:41

  • Installing Additional Dependencies for CNN
    03:41
     

Introduction to VGGNet Architecture
1 Video Lectures | 00:04:50

  • Introduction to VGGNet Architecture
    04:50
     

Image Recognition using Pre-Trained VGGNet16 Model
1 Document Lectures | 2 Video Lectures | 00:15:31

  • Image Recognition using Pre-Trained VGGNet16 Model - Part 1
    09:42
     
  • Image Recognition using Pre-Trained VGGNet16 Model - Part 2
    05:49
     
  • TensorFlow "Module Not Found" Error Fix (Optional) - Do ONLY if you have error
    1 Page

Image Recognition using Pre-Trained VGGNet19 Model
1 Video Lectures | 00:04:20

  • Image Recognition using Pre-Trained VGGNet19 Model
    04:20
     

Image Recognition using Pre-Trained ResNet Model
1 Video Lectures | 00:05:19

  • Image Recognition using Pre-Trained ResNet Model
    05:19
     

Image Recognition using Pre-Trained Inception Model
1 Video Lectures | 00:06:23

  • Image Recognition using Pre-Trained Inception Model
    06:23
     

Image Recognition using Pre-Trained Xception Model
1 Video Lectures | 00:03:57

  • Image Recognition using Pre-Trained Xception Model
    03:57
     

Introduction to MobileNet-SSD Pretrained Model
1 Video Lectures | 00:06:02

  • Introduction to MobileNet-SSD Pretrained Model
    06:02
     

Mobilenet SSD Object Detection
2 Video Lectures | 00:20:47

  • Mobilenet SSD Object Detection - Part 1
    11:04
     
  • Mobilenet SSD Object Detection - Part 2
    09:43
     

Mobilenet SSD Realtime Video
1 Video Lectures | 00:08:21

  • Mobilenet SSD Realtime Video
    08:21
     

Mobilenet SSD Pre-saved Video
1 Video Lectures | 00:04:07

  • Mobilenet SSD Pre-saved Video
    04:07
     

Mask RCNN Pre-trained model Introduction
1 Video Lectures | 00:05:45

  • Mask RCNN Pre-trained model Introduction
    05:45
     

MaskRCNN Bounding Box Implementation
2 Video Lectures | 00:12:55

  • MaskRCNN Bounding Box Implementation - Part 1
    07:00
     
  • MaskRCNN Bounding Box Implementation - Part 2
    05:55
     

MaskRCNN Object Mask Implementation
2 Video Lectures | 00:12:12

  • MaskRCNN Object Mask Implementation - Part 1
    07:24
     
  • MaskRCNN Object Mask Implementation - Part 2
    04:48
     

MaskRCNN Realtime Video
2 Video Lectures | 00:12:01

  • MaskRCNN Realtime Video - Part 1
    06:22
     
  • MaskRCNN Realtime Video - Part 2
    05:39
     

MaskRCNN Pre-saved Video
1 Video Lectures | 00:02:28

  • MaskRCNN Pre-saved Video
    02:28
     

YOLO Pre-trained Model Introduction
1 Video Lectures | 00:06:48

  • YOLO Pre-trained Model Introduction
    06:48
     

YOLO Implementation
2 Video Lectures | 00:17:47

  • YOLO Implementation - Part 1
    09:37
     
  • YOLO Implementation - Part 2
    08:10
     

YOLO Real-time Video
1 Video Lectures | 00:05:47

  • YOLO Real-time Video
    05:47
     

YOLO Pre-saved Video
1 Video Lectures | 00:02:27

  • YOLO Pre-saved Video
    02:27
     

Tiny YOLO Pre-saved Video
1 Video Lectures | 00:04:33

  • Tiny YOLO Pre-saved Video
    04:33
     

Tiny YOLO Real-time Video
1 Video Lectures | 00:03:44

  • Tiny YOLO Real-time Video
    03:44
     

YOLOv4 - Step 1 - Updating OpenCV Version
1 Video Lectures | 00:07:01

  • YOLOv4 - Step 1 - Updating OpenCV version
    07:01
     

YOLOv4 - Step 2 - Object Recognition Implementation
1 Video Lectures | 00:05:14

  • YOLOv4 - Step 2 - Object Recognition Implementation
    05:14
     

SOURCE CODE AND FILES ATTACHED
1 Document Lectures

  • SOURCE CODE AND FILES ATTACHED
    371 Page
