Calculating distance between two images
I'm trying to make an image recognition program that can recognize the digits 0 to 9. I feed the program a black-and-white picture like the following:

It scales each of the digits down to 9px by 9px, then analyzes the nine 3x3 regions and computes the ratio of black to white pixels for each region. Those 9 ratios are saved into an array. In the end, 10 of these arrays (one per digit, each holding 9 ratios) are generated and saved to a file.
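For illustration, the feature step described above could be sketched like this in Python (the post does not say which language is used, and `region_features` is a hypothetical name). One assumption is labeled in the code: each value is the fraction of black pixels in the region rather than a literal black/white ratio, because a literal ratio is undefined when a region contains no white pixels.

```python
from typing import List


def region_features(glyph: List[List[int]]) -> List[float]:
    """Return 9 values, one per 3x3 region of a 9x9 binary glyph.

    glyph[y][x] is 1 for a black pixel and 0 for a white pixel.
    Assumption: each value is black pixels / all pixels in the region,
    so it always lies in [0, 1]. A literal black/white ratio would
    divide by zero for an all-black region.
    """
    if len(glyph) != 9 or any(len(row) != 9 for row in glyph):
        raise ValueError("expected a 9x9 glyph")
    features = []
    for region_row in range(3):
        for region_col in range(3):
            # Count the black pixels inside this 3x3 region.
            black = sum(
                glyph[y][x]
                for y in range(region_row * 3, region_row * 3 + 3)
                for x in range(region_col * 3, region_col * 3 + 3)
            )
            features.append(black / 9.0)
    return features


# Made-up test glyph: a one-pixel vertical bar in the middle column.
bar = [[1 if x == 4 else 0 for x in range(9)] for _ in range(9)]
print(region_features(bar))
```

For the bar glyph, only the three regions in the middle column contain black pixels (3 of 9 each), so the other six values are 0.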
I then load another file and scale its digits down to 9x9; it is the same kind of image, with the numbers 0 to 9 in black and white. At this point I run a nested for loop: for each digit in the new image, I calculate the Euclidean distance to every symbol in the saved file by subtracting the corresponding 3x3 regions' ratios and squaring the differences. After adding up all 9 squares, I take the square root of the sum. After all the loops, it returns the lowest Euclidean distance it found out of the 10, together with the index where that was found. It does this for all 10 digits, from 0 to 9.
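The matching loop described above amounts to a nearest-neighbour lookup. A minimal sketch, assuming each symbol is already a list of 9 numbers (`euclidean`, `classify` and the reference values below are hypothetical and made up for the example):

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    """Square root of the summed squared differences, region by region."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features: List[float],
             references: Dict[int, List[float]]) -> Tuple[int, float]:
    """Return (label, distance) of the saved symbol closest to `features`."""
    best_label, best_distance = -1, math.inf
    for label, reference in references.items():
        distance = euclidean(features, reference)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance


# Made-up reference vectors for two symbols (not taken from the post).
references = {
    0: [0.5, 0.3, 0.5, 0.6, 0.0, 0.6, 0.5, 0.3, 0.5],
    1: [0.0, 0.3, 0.0, 0.0, 0.3, 0.0, 0.0, 0.3, 0.0],
}
print(classify(references[1], references))  # (1, 0.0)
```

Comparing a symbol with its own saved vector always gives 0.0, which matches the self-test output, so that test only confirms the loop runs and says nothing about how well other fonts will match.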
But here I have run into an issue, and I'm not sure if I'm doing something incorrectly. When I test this against the same image, sure enough, I get a minimum Euclidean distance of 0 for each of the numbers when compared to themselves. Here is the output when the image is compared to itself:
0: min:0.0,closest to symbol 0.0
1: min:0.0,closest to symbol 1.0
2: min:0.0,closest to symbol 2.0
3: min:0.0,closest to symbol 3.0
4: min:0.0,closest to symbol 4.0
5: min:0.0,closest to symbol 5.0
6: min:0.0,closest to symbol 6.0
7: min:0.0,closest to symbol 7.0
8: min:0.0,closest to symbol 8.0
9: min:0.0,closest to symbol 9.0
But when I compare it to another picture, such as one with fat digits or one with handwritten digits, the program performs terribly and only matches 1 or 2 of the digits correctly.
The output for testing the second picture (fat letters):
0: min:1.8506293555082927,closest to symbol 2.0
1: min:1.564875407093958,closest to symbol 1.0
2: min:0.3639905193866784,closest to symbol 2.0
3: min:1.1955040828800994,closest to symbol 2.0
4: min:1.3529365858707012,closest to symbol 3.0
5: min:2.898762101870034,closest to symbol 3.0
6: min:1.5830312225733887,closest to symbol 3.0
7: min:0.8423801045588752,closest to symbol 2.0
8: min:0.5368578842642693,closest to symbol 2.0
9: min:0.7954891148284288,closest to symbol 2.0
The output for testing the third picture (handwritten letters):
0: min:0.9028763024523015,closest to symbol 0.0
1: min:1.4312693941385868,closest to symbol 2.0
2: min:0.9545516809617107,closest to symbol 3.0
3: min:1.254754527423458,closest to symbol 5.0
4: min:0.9153443316713837,closest to symbol 6.0
5: min:1.7914458590530422,closest to symbol 0.0
6: min:1.3450158998859059,closest to symbol 0.0
7: min:1.077083815334289,closest to symbol 6.0
8: min:0.725648713927017,closest to symbol 6.0
9: min:0.6018180093870922,closest to symbol 3.0
I get that the digits look different in different fonts, and it probably needs more than just one image to recognize other fonts accurately, but the accuracy is so terrible that it makes me think I must be doing something wrong. The handwritten one does look pretty different, but the fat one looks mostly the same as mine except that it's thicker. It's only recognizing one or two out of the 10 numbers for both images, and I feel like it's by luck, like how a broken clock is right twice a day. When I printed out the Euclidean distances for the fat 9, the saved 9 literally gave the highest Euclidean distance, which tells me something has to be wrong.
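Printing the full ranking for one symbol, as done for the fat 9, can be sketched as below. `rank_references` is a hypothetical helper, `math.dist` requires Python 3.8 or newer, and the vectors are made up purely to show how a uniformly thicker stroke can move a sample away from its own thin reference; they are not the values from the images in this question.

```python
import math
from typing import Dict, List, Tuple


def rank_references(features: List[float],
                    references: Dict[int, List[float]]) -> List[Tuple[int, float]]:
    """Return (label, distance) pairs sorted from closest to farthest."""
    distances = [(label, math.dist(features, reference))
                 for label, reference in references.items()]
    return sorted(distances, key=lambda pair: pair[1])


# Made-up thin reference vectors for three symbols.
references = {
    1: [0.0, 0.3, 0.0, 0.0, 0.3, 0.0, 0.0, 0.3, 0.0],
    7: [0.3, 0.3, 0.3, 0.0, 0.3, 0.0, 0.0, 0.3, 0.0],
    8: [0.6] * 9,
}
# Made-up "fat 1": the same shape as the thin 1, but every region is darker.
fat_one = [0.2, 0.7, 0.2, 0.2, 0.7, 0.2, 0.2, 0.7, 0.2]

for label, distance in rank_references(fat_one, references):
    print(label, round(distance, 3))
```

With these numbers the fat 1 ends up slightly closer to the thin 7 than to the thin 1, even though its shape is unchanged, because every region's value shifted upward together.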
Here is the formula I'm using for my distance:
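Written out from the description above, this is the ordinary Euclidean distance over the nine regions. The symbol names are assumed here: $a_i$ is the ratio of region $i$ in the new symbol and $b_i$ is the ratio of the same region in the saved symbol.

```latex
d(a, b) = \sqrt{\sum_{i=1}^{9} \left(a_i - b_i\right)^2}
```

If each ratio lies in $[0, 1]$, this distance can never exceed $\sqrt{9} = 3$.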

image image-processing image-recognition
And what distance measure are you using? Btw, topological distance measures are not really suitable for such tasks.
– ZorgoZ
Nov 16 '18 at 18:54
The distance measure I'm using simply sums the squared differences of the ratios between the 2 pictures across all nine 3x3 regions and takes the square root of the sum. If this is bad, what type of distance measure should I be using for this task?
– dshawn
Nov 16 '18 at 18:57
Beware that the characters are rather different. The topology on its own is not discriminative enough. You can recognize only slightly different glyphs, but the ones you are trying are too different from the reference.
– ZorgoZ
Nov 16 '18 at 19:03
So would you say what I'm experiencing is completely normal and I just need to compare more features than just black/white pixel ratio to determine the number?
– dshawn
Nov 16 '18 at 19:14
Yes, that's my point. If you wanted to recognize license plates, for example, where there is some sort of standard font (in many countries at least), you could use this approach to compare standard glyphs with real-life license plates. But a general OCR (even with a restricted character set) needs more "intelligent" algorithms.
– ZorgoZ
Nov 16 '18 at 20:17
asked Nov 16 '18 at 18:45 by dshawn, edited Nov 16 '18 at 19:01