Calculating distance between two images




















I'm trying to write an image recognition program that can recognize the digits 0 to 9. I feed the program a black-and-white picture like the following:
[image]



It scales each of the digits down to 9px by 9px, then analyzes the nine 3x3 regions and computes a ratio of black to white pixels for each region. Those 9 ratios are saved into an array. In the end, 10 of these arrays (one per digit, 9 ratios each) are generated and saved to a file.
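For reference, a minimal sketch of this feature step in Python. It assumes each scaled digit is a 9x9 grid of 0/1 values (1 = black), and it assumes the "ratio" is the fraction of black pixels in a region rather than black divided by white, since the latter is undefined for an all-black region. The names `Grid` and `region_ratios` are made up for illustration.

```python
from typing import List

# Assumed encoding: 9x9 grid, 1 = black pixel, 0 = white pixel.
Grid = List[List[int]]


def region_ratios(grid: Grid, size: int = 9, block: int = 3) -> List[float]:
    """Return the black-pixel fraction of each block x block region, row-major."""
    ratios = []
    for top in range(0, size, block):
        for left in range(0, size, block):
            # Count black pixels inside the current region.
            black = sum(
                grid[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            )
            ratios.append(black / (block * block))
    return ratios


# Example input: a one-pixel-wide vertical bar in the middle column.
bar = [[1 if c == 4 else 0 for c in range(9)] for _ in range(9)]
features = region_ratios(bar)
```

With this input, only the three regions in the middle column of the 3x3 layout get a non-zero value.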



I then load another file and scale its digits down to 9x9; note that it is the same kind of image, with the digits 0 to 9 in black and white. At this point I run a nested for loop: for each digit in the new image, I calculate the Euclidean distance to every symbol in the saved file by subtracting the corresponding 3x3 region's ratio and squaring the difference. After adding up all 9 squared differences, I take the square root of the sum. After all the loops, it returns the lowest Euclidean distance found among the 10 saved symbols, along with the index where it was found. It does this for all 10 digits, from 0 to 9.
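The matching loop described above can be sketched roughly as follows. This is an illustration under assumptions, not my exact code: `references` stands for the 10 saved arrays of 9 ratios, and the function names `euclidean` and `closest_symbol` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Square root of the summed squared differences of the region ratios."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def closest_symbol(
    features: Sequence[float], references: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, distance) of the saved symbol nearest to `features`."""
    best_index, best_dist = -1, math.inf
    for index, ref in enumerate(references):
        d = euclidean(features, ref)
        if d < best_dist:
            best_index, best_dist = index, d
    return best_index, best_dist


# Toy data: two made-up reference vectors instead of the 10 real ones.
references = [[0.0] * 9, [1.0] * 9]
index, dist = closest_symbol([0.9] * 9, references)
```

Comparing a reference vector against the list it came from gives a distance of 0 at its own index, which matches the self-comparison output.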



But here I have run into an issue, and I'm not sure whether I'm doing something incorrectly. When I test this against the same image, sure enough, I get a minimum Euclidean distance of 0 for each digit when compared to itself. Here is the output when the image is compared to itself:



0: min:0.0,closest to symbol 0.0
1: min:0.0,closest to symbol 1.0
2: min:0.0,closest to symbol 2.0
3: min:0.0,closest to symbol 3.0
4: min:0.0,closest to symbol 4.0
5: min:0.0,closest to symbol 5.0
6: min:0.0,closest to symbol 6.0
7: min:0.0,closest to symbol 7.0
8: min:0.0,closest to symbol 8.0
9: min:0.0,closest to symbol 9.0


But when I compare this to another picture, such as:
[image] or [image]



The program performs terribly and matches only 1 or 2 of the digits correctly.



The output for testing the second picture (fat digits):



0: min:1.8506293555082927,closest to symbol 2.0
1: min:1.564875407093958,closest to symbol 1.0
2: min:0.3639905193866784,closest to symbol 2.0
3: min:1.1955040828800994,closest to symbol 2.0
4: min:1.3529365858707012,closest to symbol 3.0
5: min:2.898762101870034,closest to symbol 3.0
6: min:1.5830312225733887,closest to symbol 3.0
7: min:0.8423801045588752,closest to symbol 2.0
8: min:0.5368578842642693,closest to symbol 2.0
9: min:0.7954891148284288,closest to symbol 2.0


The output for testing the third picture (handwritten digits):



0: min:0.9028763024523015,closest to symbol 0.0
1: min:1.4312693941385868,closest to symbol 2.0
2: min:0.9545516809617107,closest to symbol 3.0
3: min:1.254754527423458,closest to symbol 5.0
4: min:0.9153443316713837,closest to symbol 6.0
5: min:1.7914458590530422,closest to symbol 0.0
6: min:1.3450158998859059,closest to symbol 0.0
7: min:1.077083815334289,closest to symbol 6.0
8: min:0.725648713927017,closest to symbol 6.0
9: min:0.6018180093870922,closest to symbol 3.0


I get that the digits look different in different fonts, and that it probably needs more than one reference image to recognize other fonts accurately, but the accuracy is so terrible that it makes me think I must be doing something wrong. The handwritten one does look pretty different, but the fat one looks mostly the same as mine except that it's thicker. It's only recognizing one or two of the 10 digits in both images, and I feel like that's by luck, like how a broken clock is right twice a day. When I printed out the Euclidean distances for the fat 9, the reference 9 gave the highest distance of all, which tells me something has to be wrong.



Here is the formula I'm using for my distance:
[image]
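The formula image is not available here. Going by the description above (sum the squared differences of the 9 region ratios, then take the square root), it is presumably the standard Euclidean distance. The symbols $p_i$ and $q_i$ are names assumed for this sketch: the ratio of region $i$ in the saved image and in the new image, respectively.

```latex
d(p, q) = \sqrt{\sum_{i=1}^{9} \left(p_i - q_i\right)^2}
```

If every ratio lies between 0 and 1, each squared difference is at most 1, so this distance can never exceed $\sqrt{9} = 3$.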



































  • And what distance measure are you using? Btw, topological distance measures are not really suitable for such tasks.

    – ZorgoZ
    Nov 16 '18 at 18:54













  • The distance measure I'm using simply sums the squared differences of the ratios between the 2 pictures over all nine 3x3 regions and takes the square root of the sum. If this is bad, what type of distance measure should I be using for this task?

    – dshawn
    Nov 16 '18 at 18:57













  • Beware that the characters are rather different. The topology on its own is not discriminative enough. You can recognize only slightly different glyphs, but the ones you are trying are too different from the reference.

    – ZorgoZ
    Nov 16 '18 at 19:03











  • So would you say what I'm experiencing is completely normal and I just need to compare more features than just black/white pixel ratio to determine the number?

    – dshawn
    Nov 16 '18 at 19:14











  • Yes, that's my point. If you wanted to recognize license plates, for example, where there is some sort of standard font (in many countries at least), you could use this approach to compare standard glyphs with real-life license plates. But general OCR (even with a restricted character set) needs more "intelligent" algorithms.

    – ZorgoZ
    Nov 16 '18 at 20:17



























image image-processing image-recognition














asked Nov 16 '18 at 18:45 by dshawn (edited Nov 16 '18 at 19:01)


























