We divide a document image into two parts: text and background. Because a document image can usually be converted to a binary image, the text area and the background area can be regarded as successive observations of two mutually independent sequences of random variables: the text area follows a 0-1 (two-point) distribution, and the background area follows a degenerate (single-point) distribution. Because the median filter is robust for the degenerate distribution, it can reduce the noise.
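The effect of a median filter on a degenerate (constant) background can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function name `median_filter`, the 3x3 window, the clipping of the window at image borders, and the use of `median_low` to break ties in even-sized border windows.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Median-filter a binary image given as a list of rows of 0/1 pixels.

    Assumptions (illustrative only): a square window of side 2*radius + 1,
    windows clipped at the image border, ties resolved toward the lower value.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the in-bounds neighbours of pixel (x, y).
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            result[y][x] = median_low(window)
    return result


# A constant (degenerate) background with one isolated noise pixel.
noisy_background = [[0] * 5 for _ in range(5)]
noisy_background[2][2] = 1
cleaned_background = median_filter(noisy_background)
```

In this sketch the isolated pixel is outvoted by its constant neighbours, so the filtered background is constant again, while a region made up entirely of one value passes through unchanged.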
It includes five parts: preprocessing, multi-scale analysis, feature extraction, classification, and text area extraction. The size of text in a scene image varies widely, so to locate text areas of different sizes, the multi-scale analysis decomposes the image into three sub-images at 1/1, 1/2, and 1/4 of the original resolution and then applies the same text area localization algorithm to each sub-image.
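The three-resolution decomposition can be sketched as follows. This is a rough illustration under stated assumptions, not the method from the source: the names `downsample`, `build_pyramid`, and `locate_text_multiscale` are hypothetical, block averaging is one possible way to reduce resolution, and `detector` stands in for whatever text area localization algorithm is used, assumed here to return boxes as `(x, y, width, height)` tuples.

```python
def downsample(image, factor):
    """Shrink a grayscale image (list of rows) by averaging factor x factor blocks."""
    if factor == 1:
        return [row[:] for row in image]
    height = len(image) // factor
    width = len(image[0]) // factor
    area = factor * factor
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / area
            for x in range(width)
        ]
        for y in range(height)
    ]


def build_pyramid(image, factors=(1, 2, 4)):
    """Return sub-images at 1/1, 1/2, and 1/4 of the original resolution."""
    return {factor: downsample(image, factor) for factor in factors}


def locate_text_multiscale(image, detector, factors=(1, 2, 4)):
    """Apply the same detector to every sub-image.

    `detector` is a hypothetical callable returning (x, y, w, h) boxes in the
    coordinates of the sub-image; boxes are scaled back to the original image.
    """
    boxes = []
    for factor, level in build_pyramid(image, factors).items():
        for (x, y, w, h) in detector(level):
            boxes.append((x * factor, y * factor, w * factor, h * factor))
    return boxes


# Usage with a placeholder detector that always reports one 1x1 box.
image = [[4] * 8 for _ in range(8)]
pyramid = build_pyramid(image)
boxes = locate_text_multiscale(image, lambda level: [(0, 0, 1, 1)])
```

Because the detector runs unchanged on each level, a box of fixed size found at 1/4 resolution corresponds to text four times larger in the original image, which is how a single algorithm can cover a range of text sizes.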