How to segment handwritten and printed digits without losing information in OpenCV?
I've written an algorithm that detects printed and handwritten digits and segments them, but when I remove the outer rectangle with clear_border from the scikit-image package, the handwritten digit is lost. Any suggestions to prevent this loss of information?
Sample:
How to get all 5 characters separately?
python-3.x opencv image-processing computer-vision digits
If I understand your question, you have two problems: 1) the bottom part of the digits may be cropped, and 2) how to segment digits (objects) from BW images. Right?
– Y.AL
Oct 31 '18 at 9:34
yes, you are right.
– Zara
Nov 13 '18 at 6:48
Question asked Oct 25 '18 at 18:11 by Zara; edited Jan 16 at 12:58
2 Answers
Segmenting characters from the image -
Approach -
- Threshold the image (convert it to black and white)
- Perform dilation
- Find the contours
- Check that the contours are large enough
- Find the bounding rectangle of each contour
- Take the ROI and save the characters
Python Code -
# import the necessary packages
import cv2
import imutils

# load the image, convert it to grayscale, and blur it to remove noise
image = cv2.imread("sample1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)

# threshold the image (inverted, so the characters become white on black)
ret, thresh1 = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

# dilate the white portions
dilate = cv2.dilate(thresh1, None, iterations=2)

# find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
# grab_contours copes with the different return values of OpenCV 2.4, 3 and 4
cnts = imutils.grab_contours(cnts)

orig = image.copy()
i = 0
for cnt in cnts:
    # Check the area of the contour; if it is very small, ignore it
    if cv2.contourArea(cnt) < 100:
        continue
    # Bounding rectangle of the filtered contour
    x, y, w, h = cv2.boundingRect(cnt)
    # Take the ROI of the contour
    roi = image[y:y+h, x:x+w]
    # Mark it on the image if you want
    cv2.rectangle(orig, (x, y), (x+w, y+h), (0, 255, 0), 2)
    # Save the character
    cv2.imwrite("roi" + str(i) + ".png", roi)
    i = i + 1

cv2.imshow("Image", orig)
cv2.waitKey(0)
First, I thresholded the image to convert it to black and white, so that the characters are white and the background is black. Then I dilated the image to thicken the characters (white portions), which makes it easier to find the appropriate contours. Next, the findContours method is used to find the contours. Each contour is checked for size; if it is not large enough, it is ignored (because that contour is noise). Then the boundingRect method is used to find the bounding rectangle of each contour. Finally, the detected contours are drawn and saved.
[Images omitted: Input Image, Threshold, Dilated, Contours, Saved characters]
Can you please remove images from your answer because of privacy issues?
– Zara
Nov 13 '18 at 6:48
Okay, I will replace those images with different ones.
– Devashish Prasad
Nov 14 '18 at 6:44
Problem of eroded/cropped handwritten digits:
You may solve this problem in the recognition step, or even in an image-improvement step (before recognition).
- If only a very small part of the digit is cropped (as in your example image), it is enough to pad the image by 1 or 2 pixels on each side to make segmentation easier. A morphological filter (dilate) can further improve the digit after padding. (Both operations are available in OpenCV.)
- If a substantial part of the digit is cropped, you need to add degraded/cropped digit patterns to the training dataset used by the digit recognition algorithm (e.g. the digit 3 with all possible cropping cases, etc.).
Problem of character separation:
OpenCV offers a blob detection algorithm (cv2.SimpleBlobDetector) that works well for this issue (choose suitable values for its filter parameters, such as convexity).
OpenCV also offers edge and contour detection (cv2.Canny() for edges, cv2.findContours() for contours) to detect the outlines of your characters; you can then find the bounding box around each character with cv2.boundingRect(contour), or approximate its outline with cv2.approxPolyDP(contour,..,..).
Answer on segmenting characters: answered Nov 1 '18 at 5:28 by Devashish Prasad; edited Nov 14 '18 at 7:08
Answer on cropped digits and character separation: answered Oct 31 '18 at 10:33 by Y.AL