To get from the image you have with the nearly touching horizontal and vertical lines to just the rectangles:
1. Convert to binary (i.e. all lines are white, the rest is black).
2. Perform a binary dilation: make every pixel white that either is white in the source image or touches a white pixel in the source image. "Touch" means straight neighbours only (each pixel touches the pixels to its left, right, above and below it); this is called "4-connected".
3. Repeat step 2 a few times if the gaps between the line ends are more than 2 pixels wide, but not too often!
4. Perform a skeleton operation: make a pixel black in the output image if it is a white pixel in the source image that touches at least one black pixel and the white pixels it touches (in the source image) all touch each other. Again, "touch" is defined with 4-connectedness. See the sample below.
5. Repeat step 4 until the image doesn't change after a repeat (all white pixels are line ends or connectors).
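The dilation and its repetition can be sketched in plain Python as below. This is only a minimal illustration, assuming the binary image is stored as a list of rows with 0 for black and 1 for white; the names `dilate4` and `dilate4_repeated` are made up for this example, and pixels outside the image are simply ignored.

```python
def dilate4(img):
    """One pass of 4-connected binary dilation (0 = black, 1 = white)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # white pixels of the source stay white
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Whiten the straight neighbours: above, below, left, right.
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out


def dilate4_repeated(img, repeats):
    """Apply dilate4 the given number of times."""
    for _ in range(repeats):
        img = dilate4(img)
    return img


# A horizontal line with a 2-pixel gap in the middle row.
line_with_gap = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
closed = dilate4_repeated(line_with_gap, 1)
for row in closed:
    print("".join(str(p) for p in row))
```

Each pass grows both line ends by one pixel, so a gap of width g needs roughly g/2 passes (rounded up) to close. The lines also get thicker by two pixels per pass, which is why the number of repeats should stay small.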
With a bit of luck, this will first show the boxes with thick lines, leaving thick artifacts all over the image (after step 3); then, after step 5, all the thick artifacts will have been removed while all the boxes remain. You need to tweak the number of repeats in step 3 for best results. If you're interested in image morphology, this is the book of a really good introductory course I took.
Sample: (0=black, 1=white, pixels in the center of each 3x3 block are being considered, input left, output right)
011 => 011
011 => 001 all other white pixels touch, so eliminate
011 => 011

010 => 010
010 => 010 top pixel would become disconnected, so leave
010 => 010

010 => 010
010 => 000 touches only one white pixel, so remove
000 => 000

010 => 010
111 => 111 does not touch black pixels, leave
010 => 010

010 => 010
011 => 011 other pixels do not touch, so leave
000 => 000
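The five samples can be checked with a small Python sketch of the removal rule for a single 3x3 block. It rests on one assumption that the text does not spell out: "all touch each other" is read as "are connected to each other through white pixels of the 3x3 block, not counting the centre", since in the first sample the pixels above and below the centre are only linked via the right-hand column. The names `should_remove` and `to_block` are hypothetical helpers for this illustration.

```python
STRAIGHT = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connected offsets


def should_remove(block):
    """Decide whether the centre of a 3x3 block (0 = black, 1 = white) turns black."""
    if not block[1][1]:
        return False  # centre is already black
    white = [(1 + dy, 1 + dx) for dy, dx in STRAIGHT if block[1 + dy][1 + dx]]
    if len(white) == 4:
        return False  # touches no black pixel, leave
    if len(white) <= 1:
        return True  # line end or isolated pixel, remove
    # Flood fill from one white neighbour through the white pixels
    # around the centre (the centre itself is excluded).
    seen = {white[0]}
    stack = [white[0]]
    while stack:
        y, x = stack.pop()
        for dy, dx in STRAIGHT:
            ny, nx = y + dy, x + dx
            if (0 <= ny < 3 and 0 <= nx < 3 and (ny, nx) != (1, 1)
                    and block[ny][nx] and (ny, nx) not in seen):
                seen.add((ny, nx))
                stack.append((ny, nx))
    # Remove only if every white neighbour was reached.
    return all(p in seen for p in white)


def to_block(rows):
    """Turn three strings such as '011' into a 3x3 list of ints."""
    return [[int(c) for c in row] for row in rows]


samples = [
    ["011", "011", "011"],
    ["010", "010", "010"],
    ["010", "010", "000"],
    ["010", "111", "010"],
    ["010", "011", "000"],
]
results = [should_remove(to_block(s)) for s in samples]
print(results)
```

The list printed at the end holds one decision per sample, in the same order as above, where `True` means the centre pixel is turned black.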