We want to build a classifier for a simplistic OCR system. It should map any grayscale image of an ‚a‘ to the discrete symbol ‚a‘. The grayscale image to the left is already a member of a discrete, finite set X: There are 17×20 pixels, and each pixel takes one of 256 values. The number of possible images is small enough to be written down on this page: 6.33E818 needs less than 1K digits. On the other hand, there is no chance to store the classifying function X → Y as a table.

The a-ness of the image survives under a wide range of linear or nonlinear transformations. Most notable are combinations of scaling and threshold operations. First, the image is scaled down to 5×6 pixels. You may need to squint to see more clearly, but the ‚a‘ is still recognizable. The threshold operation separates the thirty pixels into two sets. Each pixel in the dark  set is darker than any pixel in the white set and vice versa.  Note how small the image space X1 has become. It has only binomial(30,15) ≈ 1.55e8 points. The seriphs are still visible. ( see also Phenomena )

Let’s pretend for the moment that we have a good classifier f, perhaps sitting as a black box in front of the screen. The classifier f : X  → Y  still factors over this small space X1. At least, this is true for the fiber over ‚a‘ ∈ Y. There may be more difficult points in X, where the factorization becomes impossible. Let’s define U1 ⊂ X as the subset of X where the factorization of f through h1 : X → X1 remains possible. If we choose several maps hi: X → Xi, the open sets Ui may form an open cover of X! We may recover our classifier although the individual spaces are very small.

    Some remarks on the choice of hi: X → Xi:

  • If it is large, perhaps equal to X, factorization is possible everywhere and U = X. This is easy.
  • If we choose a small,stupid space, for example { 0 }, U becomes empty. This is useless.
  • A perfect choice would be X1 = Y, h1 = f. This is cheating.

The useful spaces X1 lie somewhere between these extremes.

But are there enough maps?

Yes, certainly. In our simplistic OCR example we’ve ignored character segmentation. The bounding box of our ‚a‘ is not given, we have to compute it, and we should expect some variability in this process. Likewise, there may be small changes in page orientation. Our helper functions hi are locally constant on pitifully small domains. Even very slight changes of bounding box and orientation will be reflected in the bitmaps.

The original ‚a‘ is a 17×20 image. We can crop off any of four borders. That will give us 16 slightly different images.

The scaled-down versions are very similar.

The threshold operation separates images that are close together in R30.

In this special case we may identify the spaces Xi. The pairs (hi, Xi) are different, though. In general, the Xi will have nothing in common.

We can use small rotations instead of clipping:

Or scale down to different grid resolutions:

Or fix the geometry and vary the threshold value:

The range of locally constant maps is in no way exhausted with these graphic examples. The threshold operation is just a very special case. We may, for example, calculate the rank order of all pixel values. This gives a map from R6×5 into a discrete set S30 of permutations. It is large, but the number of frequently occuring permutations is much smaller. This bold idea works remarkably well with tiny images of 5×4 pixels each! (See here.)In the same vein, every patch of 2×2 pixels in our 6×5 area determines a permutation of four pixel values. It encodes the local gradient. Taken together, these codes give a map from R6×5 into {1,…,24}5×4.
Next: On Hamming Distance

 Date Posted: 14 Mrz 2009 @ 05 14 PM
Last Modified: 01 Mai 2011 @ 10 18 PM
Posted By: Hardy

Responses to this post » (None)


Post a Comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

\/ More Options ...
Change Theme...
  • Users » 3
  • Posts/Pages » 40
  • Comments » 3
Change Theme...
  • VoidVoid
  • LifeLife « Default
  • EarthEarth
  • WindWind
  • WaterWater
  • FireFire
  • LightLight

On Digital Memory

    No Child Pages.