In this article, we examine the architecture of a detector for groups of small objects located close to one another, with distances between them as short as a couple of pixels. A common obstacle to detecting such small objects with a neural network is a pooling-based architecture, which leads to loss of spatial information. We propose a convolutional network model based on a fully convolutional network such as Network in Network (NiN). The accuracy of the detector is measured on a license plate recognition problem in which the license plate images are produced by road and highway video surveillance systems.
Our aim is to present a solution to a specific problem without relying on use-case specifics such as license plate edge detection, segmentation, and binarization of symbols. We focus on symbol detection and process raw grayscale data. Furthermore, we avoid license plate pattern detection and matching. Despite the narrow conditions we impose on the problem, the result is useful because it can be applied to many kinds of real-world problems: it is invariant to orientation in space and places low requirements on image quality. There are no particular requirements on the size of the image being processed, but scaling may be needed so that symbols fit within a predefined size range; in most commonly used systems this is achievable because the positions of the cameras and the surveilled objects are known in advance. In our benchmarking, we achieved a mean Average Precision (mAP) of 90.25%, which is on par with modern automatic license plate recognition systems.
Keywords: Object Detection, Region Proposal, CNN, NiN, License Plates.