How to implement an efficient im2col function in C++ using OpenCV?























I've been trying to implement the im2col function found in MATLAB and GNU Octave. I found the implementation in Octave's source code hard to understand, so I ran the function on a few matrices to work out the logic behind it. Using that, I've implemented the same thing in C++ using OpenCV, and although the result seems to be the same, it's awfully slow.



#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char** argv)
{
    Mat input = Mat::eye(100,100,CV_32FC1);
    input.at<float>(1,2) = 2; // Makes it easier to verify the correct solution
    int rowBlock = 7;
    int colBlock = 5;

    int m = input.rows;
    int n = input.cols;

    int x = m - rowBlock + 1;
    int y = n - colBlock + 1;

    Mat result = Mat::zeros(1,rowBlock*colBlock,CV_32FC1);

    for(int i = 0; i< y; i++)
    {
        for (int j = 0; j< x; j++)
        {
            Mat temp2 = input.rowRange(j,j+rowBlock).colRange(i,i+colBlock).t();
            temp2 = temp2.reshape(1,1);
            vconcat(result,temp2,result);
        }
    }
    result = result.rowRange(1,result.rows);
    cout << result << endl;

    return 0;
}


Is there any way to improve on it? I'm sure I might be doing a lot of things very inefficiently here.
































  • I don't know what im2col should do, but I guess its result would be a 9024 x 35 matrix in your example? Are the elements in each block just written to a row in the result? If yes, what's the ordering? Does the first row of the block go into the first elements of the result row, with the second row of the block just after that?
    – Micka
    Jan 4 '15 at 16:13










  • You have rows and cols mixed unintuitively with x and y. Maybe your problem is that your outer loop is over columns and your inner loop is over rows, which goes against the data ordering and leads to caching problems.
    – Micka
    Jan 4 '15 at 16:39










  • @Micka - im2col is a MATLAB function that takes every possible pixel neighbourhood of a known size, converts each one into a stacked 1D column, and concatenates all of these columns into a matrix. I use it all the time when I want to implement a filter that doesn't have a built-in equivalent in MATLAB.
    – rayryeng
    Jan 4 '15 at 20:57










  • Is the ordering of the stacked neighborhood important? I would have assumed col-first ordering, but your code gives row-first ordering. Those things matter if you want the most efficiency. In any case, my posted answer should give the same results as your code, but should be much faster.
    – Micka
    Jan 4 '15 at 21:18










  • @Micka - It isn't my code or my post, but yes, it uses col-first ordering. It grabs pixel neighbourhoods column-wise, and unrolls each neighbourhood so that its columns come first. For example, the pixel neighbourhood {{1,2,3}, {4,5,6}, {7,8,9}} would become {1,4,7,2,5,8,3,6,9}.
    – rayryeng
    Jan 5 '15 at 0:42
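
The col-first unrolling described in the comment above can be illustrated with a small sketch. It uses a plain row-major std::vector<int> instead of cv::Mat so that it compiles without OpenCV, and the function name unrollColumnMajor is a hypothetical helper, not part of MATLAB, Octave, or OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: takes a rows x cols block stored row-major and
// returns its elements unrolled column by column.
std::vector<int> unrollColumnMajor(const std::vector<int>& block,
                                   std::size_t rows, std::size_t cols)
{
    assert(block.size() == rows * cols);
    std::vector<int> out;
    out.reserve(block.size());
    for (std::size_t c = 0; c < cols; ++c)       // walk the columns first
        for (std::size_t r = 0; r < rows; ++r)   // then go down each column
            out.push_back(block[r * cols + c]);
    return out;
}
```

Called on the 3 x 3 neighbourhood {1,2,3,4,5,6,7,8,9} from the comment, this returns {1,4,7,2,5,8,3,6,9}.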




















































c++ matlab opencv image-processing computer-vision
















asked Jan 4 '15 at 15:25 by Koushik S


























1 Answer




















accepted










This one is much faster for me:



#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat input = cv::Mat::eye(100,100,CV_32FC1);
    input.at<float>(1,2) = 2; // Makes it easier to verify the correct solution
    int rowBlock = 7;
    int colBlock = 5;

    int m = input.rows;
    int n = input.cols;

    // using the usual convention: x = col, y = row
    int yB = m - rowBlock + 1;
    int xB = n - colBlock + 1;

    // you know the size of the result in the beginning, so allocate it all at once
    cv::Mat result2 = cv::Mat::zeros(xB*yB,rowBlock*colBlock,CV_32FC1);
    for(int i = 0; i< yB; i++)
    {
        for (int j = 0; j< xB; j++)
        {
            // here yours is in different order than I first thought:
            //int rowIdx = j + i*xB; // my intuition how to index the result
            int rowIdx = i + j*yB;

            // int (not unsigned) avoids signed/unsigned comparison warnings
            for(int yy = 0; yy < rowBlock; ++yy)
                for(int xx = 0; xx < colBlock; ++xx)
                {
                    // here take care of the transpose in the original method
                    //int colIdx = xx + yy*colBlock; // this would be not transposed
                    int colIdx = xx*rowBlock + yy;

                    result2.at<float>(rowIdx,colIdx) = input.at<float>(i+yy, j+xx);
                }
        }
    }
    // check your output here...
    return 0;
}


I added it to your code to test that both versions give the same result (it would be better to encapsulate each one in its own function, though ;) )



#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::Mat input = cv::Mat::eye(100,100,CV_32FC1);
    input.at<float>(1,2) = 2; // Makes it easier to verify the correct solution
    int rowBlock = 7;
    int colBlock = 5;

    int m = input.rows;
    int n = input.cols;

    // here, your naming of x and y is counterintuitive for me, since I normally see x linked to cols (e.g. direction of the x-axis)
    int x = m - rowBlock + 1;
    int y = n - colBlock + 1;

    cv::Mat result = cv::Mat::zeros(1,rowBlock*colBlock,CV_32FC1);

    for(int i = 0; i< y; i++)
    {
        for (int j = 0; j< x; j++)
        {
            cv::Mat temp2 = input.rowRange(j,j+rowBlock).colRange(i,i+colBlock).t();
            temp2 = temp2.reshape(1,1);
            cv::vconcat(result,temp2,result);
        }
    }
    result = result.rowRange(1,result.rows);

    std::cout << result.rows << " x " << result.cols << std::endl;

    char w;
    std::cin >> w; // wait for a key before running the second version

    // using the usual convention: x = col, y = row
    int yB = m - rowBlock + 1;
    int xB = n - colBlock + 1;

    // you know the size of the result in the beginning, so allocate it all at once
    cv::Mat result2 = cv::Mat::zeros(xB*yB,rowBlock*colBlock,CV_32FC1);
    for(int i = 0; i< yB; i++)
    {
        for (int j = 0; j< xB; j++)
        {
            // here yours is in different order than I first thought:
            //int rowIdx = j + i*xB; // my intuition how to index the result
            int rowIdx = i + j*yB;

            for(int yy = 0; yy < rowBlock; ++yy)
                for(int xx = 0; xx < colBlock; ++xx)
                {
                    // here take care of the transpose in the original method
                    //int colIdx = xx + yy*colBlock; // this would be not transposed
                    int colIdx = xx*rowBlock + yy;

                    result2.at<float>(rowIdx,colIdx) = input.at<float>(i+yy, j+xx);
                }
        }
    }

    std::cout << result2.rows << " x " << result2.cols << std::endl;
    std::cin >> w;

    // test whether both results are the same:
    bool allGood = true;
    for(int j=0; j<result.rows; ++j)
        for(int i=0; i<result.cols; ++i)
        {
            if(result.at<float>(j,i) != result2.at<float>(j,i))
            {
                std::cout << "("<<j<<","<<i<<") = " << result.at<float>(j,i) << " != " << result2.at<float>(j,i) << std::endl;
                allGood = false;
            }
        }
    if(allGood) std::cout << "matrices are equal" << std::endl;

    std::cin >> w;

    return 0;
}
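
As the answer notes, each version would be easier to test if it were wrapped in a function. The following is only a sketch of that idea, under the assumption that the image is a plain row-major std::vector<float> rather than a cv::Mat, so that it builds without OpenCV. The name im2colRows and its parameters are hypothetical. It uses the same rowIdx and colIdx formulas as the answer, but orders the loops so that the output buffer is filled front to back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sliding-window im2col on a row-major m x n buffer.
// The result is row-major with (yB * xB) rows and (rowBlock * colBlock)
// columns; each row holds one neighbourhood unrolled column by column.
std::vector<float> im2colRows(const std::vector<float>& img,
                              int m, int n, int rowBlock, int colBlock)
{
    assert(m > 0 && n > 0);
    assert(img.size() == static_cast<std::size_t>(m) * n);
    assert(rowBlock >= 1 && rowBlock <= m && colBlock >= 1 && colBlock <= n);

    const int yB = m - rowBlock + 1;        // block positions along the rows
    const int xB = n - colBlock + 1;        // block positions along the columns
    const int width = rowBlock * colBlock;  // elements per neighbourhood

    // The final size is known up front, so allocate once.
    std::vector<float> out(static_cast<std::size_t>(yB) * xB * width);
    for (int j = 0; j < xB; ++j) {
        for (int i = 0; i < yB; ++i) {
            // Same output row as in the answer: rowIdx = i + j*yB.
            const std::size_t rowIdx =
                static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * yB;
            float* dst = &out[rowIdx * width];
            for (int xx = 0; xx < colBlock; ++xx)
                for (int yy = 0; yy < rowBlock; ++yy)
                    // Same output column as in the answer: xx*rowBlock + yy.
                    dst[xx * rowBlock + yy] =
                        img[static_cast<std::size_t>(i + yy) * n + (j + xx)];
        }
    }
    return out;
}
```

For a 100 x 100 input with a 7 x 5 block this produces 9024 rows of 35 values, the size mentioned in the comments. With cv::Mat the same loops could read through row pointers instead of at<float>, though whether that is measurably faster would need to be timed.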


























  • This is excellent! I only had to transpose the resultant matrix for my application, but thanks a lot for your time and help!
    – Koushik S
    Jan 5 '15 at 17:36










  • Nice to hear! The code might be even more efficient if you take the transpose into account during the computation itself, but if it is fast enough I guess it is not worth the work :)
    – Micka
    Jan 5 '15 at 18:32
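
The idea in the last comment, producing the transposed layout directly instead of transposing afterwards, could look roughly like the sketch below. It assumes a plain row-major std::vector<float> in place of cv::Mat so that it builds without OpenCV, and the name im2colColumns is hypothetical. Each neighbourhood ends up as one column of the result, which is the layout the comments describe for MATLAB's im2col.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sliding-window im2col on a row-major m x n buffer.
// The result is row-major with (rowBlock * colBlock) rows and (yB * xB)
// columns; each column holds one neighbourhood unrolled column by column.
std::vector<float> im2colColumns(const std::vector<float>& img,
                                 int m, int n, int rowBlock, int colBlock)
{
    assert(m > 0 && n > 0);
    assert(img.size() == static_cast<std::size_t>(m) * n);
    assert(rowBlock >= 1 && rowBlock <= m && colBlock >= 1 && colBlock <= n);

    const int yB = m - rowBlock + 1;  // block positions along the rows
    const int xB = n - colBlock + 1;  // block positions along the columns
    const std::size_t numBlocks = static_cast<std::size_t>(yB) * xB;

    std::vector<float> out(numBlocks * rowBlock * colBlock);
    std::size_t k = 0;  // next output element; the buffer is filled in order
    for (int xx = 0; xx < colBlock; ++xx)
        for (int yy = 0; yy < rowBlock; ++yy)   // output row = xx*rowBlock + yy
            for (int j = 0; j < xB; ++j)
                for (int i = 0; i < yB; ++i)    // output column = i + j*yB
                    out[k++] = img[static_cast<std::size_t>(i + yy) * n + (j + xx)];
    return out;
}
```

For the 2 x 3 image {1,2,3,4,5,6} with a 2 x 2 block, the result is the 4 x 2 matrix {1,2, 4,5, 2,3, 5,6}. Whether this beats computing the row layout and transposing it afterwards depends on the image and block sizes, so it would need to be measured.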














































up vote
4
down vote



accepted







up vote
4
down vote



accepted






This one is much faster for me:



int main()
{
cv::Mat input = cv::Mat::eye(100,100,CV_32FC1);
input.at<float>(1,2) = 2; //Makes it easier to verify the correct solution
int rowBlock = 7;
int colBlock = 5;

int m = input.rows;
int n = input.cols;

// using right x = col; y = row
int yB = m - rowBlock + 1;
int xB = n - colBlock + 1;

// you know the size of the result in the beginning, so allocate it all at once
cv::Mat result2 = cv::Mat::zeros(xB*yB,rowBlock*colBlock,CV_32FC1);
for(int i = 0; i< yB; i++)
{
for (int j = 0; j< xB; j++)
{
// here yours is in different order than I first thought:
//int rowIdx = j + i*xB; // my intuition how to index the result
int rowIdx = i + j*yB;

for(unsigned int yy =0; yy < rowBlock; ++yy)
for(unsigned int xx=0; xx < colBlock; ++xx)
{
// here take care of the transpose in the original method
//int colIdx = xx + yy*colBlock; // this would be not transposed
int colIdx = xx*rowBlock + yy;

result2.at<float>(rowIdx,colIdx) = input.at<float>(i+yy, j+xx);
}

}
}
// check your output here...
}


I added it to your code, to test the equality (it would be better to write a function for each one and encapsulate, though ;) )



#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::Mat input = cv::Mat::eye(100, 100, CV_32FC1);
    input.at<float>(1, 2) = 2; // makes it easier to verify the correct solution
    int rowBlock = 7;
    int colBlock = 5;

    int m = input.rows;
    int n = input.cols;

    // your naming of x and y is counterintuitive to me, since I normally link x to the columns (the direction of the x-axis)
    int x = m - rowBlock + 1;
    int y = n - colBlock + 1;

    // original method: start with a dummy row and append one row per block
    cv::Mat result = cv::Mat::zeros(1, rowBlock*colBlock, CV_32FC1);

    for (int i = 0; i < y; i++)
    {
        for (int j = 0; j < x; j++)
        {
            cv::Mat temp2 = input.rowRange(j, j+rowBlock).colRange(i, i+colBlock).t();
            temp2 = temp2.reshape(1, 1);
            cv::vconcat(result, temp2, result); // reallocates and copies the whole result every time
        }
    }
    result = result.rowRange(1, result.rows); // drop the dummy first row

    std::cout << result.rows << " x " << result.cols << std::endl;

    // pause until a character is entered
    char w;
    std::cin >> w;


    // using the usual convention: x = col, y = row
    int yB = m - rowBlock + 1;
    int xB = n - colBlock + 1;

    // you know the size of the result in the beginning, so allocate it all at once
    cv::Mat result2 = cv::Mat::zeros(xB*yB, rowBlock*colBlock, CV_32FC1);
    for (int i = 0; i < yB; i++)
    {
        for (int j = 0; j < xB; j++)
        {
            // your ordering is different from what I first thought:
            //int rowIdx = j + i*xB; // my intuition of how to index the result
            int rowIdx = i + j*yB;

            // int instead of unsigned int avoids signed/unsigned comparisons
            for (int yy = 0; yy < rowBlock; ++yy)
                for (int xx = 0; xx < colBlock; ++xx)
                {
                    // take care of the transpose in the original method
                    //int colIdx = xx + yy*colBlock; // this would be the non-transposed order
                    int colIdx = xx*rowBlock + yy;

                    result2.at<float>(rowIdx, colIdx) = input.at<float>(i+yy, j+xx);
                }
        }
    }

    std::cout << result2.rows << " x " << result2.cols << std::endl;
    std::cin >> w;

    // test whether both results are the same:
    bool allGood = true;
    for (int j = 0; j < result.rows; ++j)
        for (int i = 0; i < result.cols; ++i)
        {
            if (result.at<float>(j, i) != result2.at<float>(j, i))
            {
                std::cout << "(" << j << "," << i << ") = " << result.at<float>(j, i) << " != " << result2.at<float>(j, i) << std::endl;
                allGood = false;
            }
        }
    if (allGood) std::cout << "matrices are equal" << std::endl;

    std::cin >> w;

    return 0;
}
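
As a rough illustration of the "write a function and encapsulate" idea, here is a minimal sketch that is not part of the original answer. The name im2colRows and its parameters are made up for this example, and it works on a plain row-major float buffer instead of a cv::Mat so that it compiles without OpenCV. It uses the same indexing as the loop above (rowIdx = i + j*yB, colIdx = xx*rowBlock + yy).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): sliding-block im2col on a
// row-major float buffer. Each output row holds one block, flattened column by
// column; blocks are ordered column-major over the image, as in the answer.
std::vector<float> im2colRows(const std::vector<float>& img, int rows, int cols,
                              int rowBlock, int colBlock)
{
    assert(rows > 0 && cols > 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    assert(rowBlock >= 1 && rowBlock <= rows);
    assert(colBlock >= 1 && colBlock <= cols);

    const int yB = rows - rowBlock + 1;        // block positions along the rows
    const int xB = cols - colBlock + 1;        // block positions along the columns
    const std::size_t blockSize = static_cast<std::size_t>(rowBlock) * colBlock;

    // The final size is known up front, so allocate once.
    std::vector<float> out(static_cast<std::size_t>(yB) * xB * blockSize, 0.0f);

    for (int i = 0; i < yB; ++i)
    {
        for (int j = 0; j < xB; ++j)
        {
            const std::size_t rowIdx = static_cast<std::size_t>(i)
                                     + static_cast<std::size_t>(j) * yB;
            float* dst = &out[rowIdx * blockSize];
            for (int yy = 0; yy < rowBlock; ++yy)
            {
                // Start of the block's current image row.
                const float* src = &img[static_cast<std::size_t>(i + yy) * cols + j];
                for (int xx = 0; xx < colBlock; ++xx)
                    dst[xx * rowBlock + yy] = src[xx];   // transposed block order
            }
        }
    }
    return out;
}
```

For a 3 x 3 image holding 1..9 and a 2 x 2 block, the first output row is 1 4 2 5. With OpenCV, the same loops should carry over to a CV_32FC1 cv::Mat by reading each source row through input.ptr<float>(i + yy) and writing through result.ptr<float>(rowIdx); whether that is noticeably faster than at<float> in a release build would need measuring.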

































answered Jan 4 '15 at 16:55









Micka













  • This is excellent! I only had to transpose the resultant matrix for my application, but thanks a lot for your time and help!
    – Koushik S
    Jan 5 '15 at 17:36










  • Nice to hear! The code might be even more efficient if you consider the transposing already during/before computation, but if it is fast enough I guess it is not worth the work :)
    – Micka
    Jan 5 '15 at 18:32
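
One way to read "consider the transposing already during computation" is to write every block straight into a column of the output, so no separate transpose is needed afterwards. The sketch below is only an illustration under that assumption: im2colColumns is a hypothetical name, it uses a plain row-major float buffer so it compiles without OpenCV, and it has not been benchmarked against the version in the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: same sliding blocks as in the answer, but each block is
// stored as a COLUMN of the output. The result is a row-major matrix with
// (rowBlock*colBlock) rows and (yB*xB) columns.
std::vector<float> im2colColumns(const std::vector<float>& img, int rows, int cols,
                                 int rowBlock, int colBlock)
{
    assert(rows > 0 && cols > 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    assert(rowBlock >= 1 && rowBlock <= rows);
    assert(colBlock >= 1 && colBlock <= cols);

    const int yB = rows - rowBlock + 1;        // block positions along the rows
    const int xB = cols - colBlock + 1;        // block positions along the columns
    const std::size_t numBlocks = static_cast<std::size_t>(yB) * xB;
    const std::size_t blockSize = static_cast<std::size_t>(rowBlock) * colBlock;

    std::vector<float> out(blockSize * numBlocks, 0.0f);

    for (int i = 0; i < yB; ++i)
    {
        for (int j = 0; j < xB; ++j)
        {
            // Block number, column-major over the image (rowIdx in the answer).
            const std::size_t b = static_cast<std::size_t>(i)
                                + static_cast<std::size_t>(j) * yB;
            for (int yy = 0; yy < rowBlock; ++yy)
            {
                for (int xx = 0; xx < colBlock; ++xx)
                {
                    // Position inside the block (colIdx in the answer).
                    const std::size_t k = static_cast<std::size_t>(xx) * rowBlock + yy;
                    out[k * numBlocks + b] =
                        img[static_cast<std::size_t>(i + yy) * cols + (j + xx)];
                }
            }
        }
    }
    return out;
}
```

For the 2 x 3 image [1 2 3; 4 5 6] and a 2 x 2 block, this gives the 4 x 2 matrix [1 2; 4 5; 2 3; 5 6]. The writes are strided by numBlocks here, so the loop order may need rearranging before it beats the row layout; that trade-off is untested in this sketch.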

















