How to segment handwritten and printed digits without losing information in OpenCV?

I've written an algorithm that detects printed and handwritten digits and segments them, but when I remove the outer rectangle with clear_border from the scikit-image package, the handwritten digit is lost as well. Any suggestions for preventing this loss of information?

Sample:
[image]

How can I get all 5 characters separately?

edited Jan 16 at 12:58

asked Oct 25 '18 at 18:11

Zara


If I understand your question, you have two problems: 1) the bottom part of the digits may be cropped, and 2) how to segment digits (objects) from BW images. Right?

– Y.AL
Oct 31 '18 at 9:34

Yes, you are right.

– Zara
Nov 13 '18 at 6:48

python-3.x opencv image-processing computer-vision digits

2 Answers

Segmenting characters from the image

Approach:

1. Threshold the image (convert it to black and white).
2. Dilate the result.
3. Find the contours.
4. Discard contours that are too small.
5. Compute the bounding rectangle of each remaining contour.
6. Crop the ROI and save the character.

Python code:

# import the necessary packages
import cv2
import imutils

# load the image, convert it to grayscale, and blur it to remove noise
image = cv2.imread("sample1.jpg")
if image is None:
    raise FileNotFoundError("could not read sample1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)

# threshold the image (inverted: characters become white, background black)
ret, thresh1 = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

# dilate the white portions
dilate = cv2.dilate(thresh1, None, iterations=2)

# find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
# grab_contours handles the different return values of OpenCV 2, 3 and 4
cnts = imutils.grab_contours(cnts)

orig = image.copy()
i = 0

for cnt in cnts:
    # check the area of the contour; if it is very small, ignore it as noise
    if cv2.contourArea(cnt) < 100:
        continue

    # bounding rectangle of the contour that passed the filter
    x, y, w, h = cv2.boundingRect(cnt)

    # take the ROI of the contour
    roi = image[y:y+h, x:x+w]

    # mark it on the image if you want
    cv2.rectangle(orig, (x, y), (x+w, y+h), (0, 255, 0), 2)

    # save the character
    cv2.imwrite("roi" + str(i) + ".png", roi)

    i = i + 1

cv2.imshow("Image", orig)
cv2.waitKey(0)
cv2.destroyAllWindows()

First, I thresholded the image to convert it to black and white, with an inverted threshold so that the characters are white and the background is black. Then I dilated the image to thicken the characters (the white portions), which makes it easier to find appropriate contours. Next, the findContours method finds the contours. Each contour is checked for size; if it is not large enough, it is ignored because it is noise. The boundingRect method then gives the bounding rectangle of each remaining contour. Finally, the detected characters are drawn on the image and saved.
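cv2.findContours does not guarantee that contours come back in reading order, so the saved files may be numbered out of sequence. A minimal sketch of one way to handle this, assuming the characters sit on a single line, is to sort the bounding boxes by their left edge before saving. The helper name sort_boxes_left_to_right and the sample boxes are hypothetical, not part of the answer's code.

```python
def sort_boxes_left_to_right(boxes):
    """Return (x, y, w, h) bounding boxes ordered by their left edge."""
    return sorted(boxes, key=lambda box: box[0])


# Hypothetical boxes, in the arbitrary order a contour search might return them
boxes = [(120, 10, 30, 40), (10, 12, 28, 38), (60, 11, 29, 39)]
ordered = sort_boxes_left_to_right(boxes)
print(ordered)
```

In the loop above, this would mean collecting the boundingRect results first and cropping the ROIs only after sorting.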

Images (omitted): input image, thresholded image, dilated image, detected contours, and the saved characters (char2, char0, char1, char3).

edited Nov 14 '18 at 7:08

answered Nov 1 '18 at 5:28

Devashish Prasad


Can you please remove images from your answer because of privacy issues?

– Zara
Nov 13 '18 at 6:48

1

Okay, I will replace those images with different ones.

– Devashish Prasad
Nov 14 '18 at 6:44


Problem of eroded/cropped handwritten digits:
You can address this either in the recognition step or in an image-improvement step before recognition.

If only a very small part of a digit is cropped (as in your example image), it is enough to pad the image by 1 or 2 pixels on each side to make segmentation easier. A morphological filter (dilation) applied after padding can improve the digit further. Both operations are available in OpenCV.

If a substantial part of a digit is cropped, add degraded/cropped versions of the digits to the training dataset used by the digit-recognition algorithm (for example, the digit 3 in all plausible cropping cases).

Problem of character separation:

OpenCV offers a blob detection algorithm (cv2.SimpleBlobDetector) that may work well for this; choose suitable values for its filter parameters, such as convexity.

OpenCV also offers edge and contour detection (cv2.Canny() and cv2.findContours()), which helps locate the outline of each character. You can then fit a bounding box around each contour with cv2.boundingRect(contour); cv2.approxPolyDP(contour,..,..) is available if you need a polygonal approximation instead.

answered Oct 31 '18 at 10:33

Y.AL

