===Using the gradient direction to reduce the number of votes===
An improvement suggested by O'Gorman and Clowes can be used to detect lines if one takes into account that the local gradient of the image intensity will necessarily be orthogonal to the edge. Since edge detection generally involves computing the intensity gradient magnitude, the gradient direction is often found as a side effect. If a given point of coordinates (x, y) happens to be on a line, then the local direction of the gradient gives the θ parameter corresponding to that line, and the r parameter is then immediately obtained (Shapiro and Stockman, 305). In practice the gradient direction can be estimated only to within about 20°, which shortens the sinusoid trace from the full 180° to roughly 45°. This reduces the computation time and has the interesting effect of reducing the number of useless votes, thus enhancing the visibility of the spikes corresponding to real lines in the image.
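The gradient-directed voting can be sketched as follows. This is a minimal NumPy sketch, not the authors' published implementation: the function name, the Sobel-style gx/gy inputs, and the exact ±20° window are illustrative assumptions.

```python
import numpy as np

def hough_lines_gradient(edges, gx, gy, num_theta=180, window_deg=20):
    """Line Hough transform that votes only near each pixel's gradient direction.

    edges  : 2-D boolean array of edge pixels
    gx, gy : image gradients (e.g. from a Sobel filter); the gradient is
             orthogonal to the edge, so its angle gives the line's theta.
    """
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(num_theta) - 90)         # theta in [-90, 90)
    acc = np.zeros((2 * diag, num_theta), dtype=np.int64)  # rows index r + diag

    half = int(round(window_deg * num_theta / 180.0))      # bins in a 20-degree window
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # Gradient angle, folded into the same [-90, 90) range as theta.
        g = np.arctan2(gy[y, x], gx[y, x])
        if g >= np.pi / 2:
            g -= np.pi
        elif g < -np.pi / 2:
            g += np.pi
        t0 = int(round(np.rad2deg(g) + 90))                # central theta bin
        # Trace only ~45 degrees of the sinusoid instead of the full 180.
        for t in range(t0 - half, t0 + half + 1):
            ti = t % num_theta                             # wrap around
            r = x * np.cos(thetas[ti]) + y * np.sin(thetas[ti])
            acc[int(round(r)) + diag, ti] += 1
    return acc, thetas
```

Peaks in the returned accumulator are sharper than in the unrestricted transform because each pixel casts only a narrow band of votes.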
===Kernel-based Hough transform (KHT)===
Fernandes and Oliveira suggested an improved voting scheme for the Hough transform that allows a software implementation to achieve real-time performance even on relatively large images (e.g., 1280×960). The kernel-based Hough transform uses the same (r, θ) parameterization proposed by Duda and Hart but operates on clusters of approximately collinear pixels. For each cluster, votes are cast using an oriented elliptical-Gaussian kernel that models the uncertainty of the best-fitting line with respect to that cluster. The approach not only significantly improves the performance of the voting scheme, but also produces a much cleaner accumulator and makes the transform more robust to the detection of spurious lines.
===3-D kernel-based Hough transform for plane detection (3DKHT)===
Limberger and Oliveira suggested a deterministic technique for plane detection in unorganized point clouds whose cost is O(n log n) in the number of samples, achieving real-time performance for relatively large datasets (up to 10^5 points on a 3.4 GHz CPU). It is based on a fast Hough-transform voting strategy for planar regions, inspired by the kernel-based Hough transform (KHT). This 3-D kernel-based Hough transform (3DKHT) uses a fast and robust algorithm to segment clusters of approximately coplanar samples, and casts votes for individual clusters (instead of for individual samples) on a (θ, φ, ρ) spherical accumulator using a trivariate Gaussian kernel. The approach is several orders of magnitude faster than existing (non-deterministic) techniques for plane detection in point clouds, such as RHT and RANSAC, and scales better with the size of the datasets. It can be used with any application that requires fast detection of planar features on large datasets.
===Hough transform of curves, and its generalization for analytical and non-analytical shapes===
Although the version of the transform described above applies only to finding straight lines, a similar transform can be used for finding any shape which can be represented by a set of parameters. A circle, for instance, can be transformed into a set of three parameters representing its center and radius, so that the Hough space becomes three-dimensional. Arbitrary ellipses and curves can also be found this way, as can any shape easily expressed as a set of parameters. The generalization of the Hough transform for detecting analytical shapes in spaces of any dimensionality was proposed by Fernandes and Oliveira. In contrast to other Hough transform-based approaches for analytical shapes, Fernandes' technique does not depend on the shape one wants to detect nor on the input data type. The detection can be driven to a type of analytical shape by changing the assumed model of geometry where data have been encoded (e.g., Euclidean space, projective space, conformal geometry, and so on), while the proposed formulation remains unchanged. It also guarantees that the intended shapes are represented with the smallest possible number of parameters, and it allows the concurrent detection of different kinds of shapes that best fit an input set of entries with different dimensionalities and different geometric definitions (e.g., the concurrent detection of planes and spheres that best fit a set of points, straight lines and circles). For more complicated shapes in the plane (i.e., shapes that cannot be represented analytically in some 2D space), the generalised Hough transform is used, which allows a feature to vote for a particular position, orientation, and/or scaling of the shape using a predefined look-up table. The Hough transform accumulates contributions from all pixels in the detected edge.
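The look-up-table idea can be illustrated with a minimal, translation-only generalised Hough transform. The function names, the 36-bin orientation quantization, and the centroid reference point are assumptions made for this sketch; the full method also votes over orientation and scale.

```python
import numpy as np
from collections import defaultdict

def build_r_table(model_pts, model_grads, num_bins=36):
    """Build the look-up (R-)table for a translation-only generalised Hough
    transform: displacements from each model edge point to a reference point,
    indexed by the gradient orientation at that point."""
    ref = model_pts.mean(axis=0)                   # reference point (centroid)
    table = defaultdict(list)
    for (x, y), g in zip(model_pts, model_grads):
        b = int((g % (2 * np.pi)) / (2 * np.pi) * num_bins) % num_bins
        table[b].append((ref[0] - x, ref[1] - y))  # displacement to reference
    return table

def ght_accumulate(edge_pts, edge_grads, table, shape, num_bins=36):
    """Each image edge point looks up its orientation bin and votes for every
    reference position the table allows; a peak marks a shape instance."""
    h, w = shape
    acc = np.zeros(shape, dtype=np.int64)
    for (x, y), g in zip(edge_pts, edge_grads):
        b = int((g % (2 * np.pi)) / (2 * np.pi) * num_bins) % num_bins
        for dx, dy in table.get(b, []):
            cx, cy = int(round(x + dx)), int(round(y + dy))
            if 0 <= cx < w and 0 <= cy < h:
                acc[cy, cx] += 1                   # vote at (row=cy, col=cx)
    return acc
```

A translated copy of the model produces an accumulator peak at the translated reference point.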
===Circle detection process===
Altering the algorithm to detect circular shapes instead of lines is relatively straightforward.
• First, create the accumulator space, which is made up of a cell for each pixel; initially, each cell is set to 0.
• For each edge point (i, j) in the image, increment all cells which, according to the equation of a circle (i - a)^2 + (j - b)^2 = r^2, could be the center of a circle. These candidate centers correspond to the values of a in the equation.
• For each possible value of a found in the previous step, find all possible values of b which satisfy the equation.
• Search for local maxima in the accumulator space. These cells represent circles that were detected by the algorithm.
If the radius of the circle being sought is not known beforehand, a three-dimensional accumulator space can be used to search for circles with an arbitrary radius. Naturally, this is more computationally expensive. This method can also detect circles that are partially outside of the accumulator space, as long as enough of the circle's area is still present within it.
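The steps above can be sketched in NumPy for a single, known radius. The function name, the 360-angle sampling of each circle, and the one-vote-per-centre-per-pixel deduplication are choices made for this sketch, not part of a fixed specification.

```python
import numpy as np

def hough_circle(edges, radius, num_angles=360):
    """Accumulate centre votes for circles of a single, known radius.

    Every edge pixel (i, j) increments the cells (a, b) that could be the
    centre of a circle through it, i.e. (i - a)^2 + (j - b)^2 = radius^2.
    """
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=np.int64)        # one cell per candidate centre
    t = np.linspace(0, 2 * np.pi, num_angles, endpoint=False)
    ca, sa = np.cos(t), np.sin(t)
    for i, j in zip(*np.nonzero(edges)):
        a = np.round(i - radius * ca).astype(int)  # candidate centre rows
        b = np.round(j - radius * sa).astype(int)  # candidate centre cols
        ok = (a >= 0) & (a < h) & (b >= 0) & (b < w)
        # De-duplicate so each pixel votes at most once per centre cell.
        centres = np.unique(np.stack([a[ok], b[ok]], axis=1), axis=0)
        acc[centres[:, 0], centres[:, 1]] += 1
    return acc
```

For an unknown radius, the same voting is repeated over a range of radii into a 3-D accumulator, at correspondingly higher cost.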
===Detection of 3D objects (planes and cylinders)===
The Hough transform can also be used for the detection of 3D objects in range data or 3D point clouds. The extension of the classical Hough transform for plane detection is quite straightforward. A plane is represented by its explicit equation z = a_x x + a_y y + d, for which we can use a 3D Hough space corresponding to a_x, a_y, and d. This extension suffers from the same problems as its 2D counterpart: nearly horizontal planes can be reliably detected, while performance deteriorates as the planar direction becomes vertical (large values of a_x and a_y amplify the noise in the data). This formulation of the plane has been used for detecting planes in point clouds acquired from airborne laser scanning, and it works very well in that domain because all planes are nearly horizontal. For generalized plane detection using the Hough transform, the plane can instead be parametrized by its normal vector n (using spherical coordinates) and its distance from the origin ρ, resulting in a three-dimensional Hough space. Each point in the input data then votes for a sinusoidal surface in the Hough space, and the intersection of these sinusoidal surfaces indicates the presence of a plane. A more general approach for more than 3 dimensions requires search heuristics to remain feasible. The Hough transform has also been used to find cylindrical objects in point clouds using a two-step approach: the first step finds the orientation of the cylinder, and the second step finds the position and radius.
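A minimal sketch of the normal-form plane voting just described, with unit normal n = (sin θ cos φ, sin θ sin φ, cos θ) and plane equation n · p = ρ. The accumulator resolutions and the ρ binning are arbitrary choices for illustration, not prescribed by the method.

```python
import numpy as np

def hough_planes(points, n_theta=45, n_phi=90, n_rho=100):
    """Vote in a (theta, phi, rho) accumulator for planes n . p = rho.

    Every input point votes for the sinusoidal surface of all planes
    passing through it; peaks mark planes supported by many points.
    """
    pts = np.asarray(points, dtype=float)
    rho_max = np.linalg.norm(pts, axis=1).max()
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    phis = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    # Unit normal for every (theta, phi) cell, shape (n_theta, n_phi, 3).
    st, ct = np.sin(thetas)[:, None], np.cos(thetas)[:, None]
    normals = np.stack([st * np.cos(phis),
                        st * np.sin(phis),
                        np.broadcast_to(ct, (n_theta, n_phi))], axis=-1)
    acc = np.zeros((n_theta, n_phi, n_rho), dtype=np.int64)
    tt, ff = np.indices((n_theta, n_phi))
    for p in pts:
        rho = normals @ p                # signed distance for every candidate normal
        # Map rho in [-rho_max, rho_max] onto the n_rho bins.
        k = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        np.add.at(acc, (tt, ff, k), 1)
    return acc, thetas, phis, rho_max
```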
===Using weighted features===
One common variation is to weight each feature's vote, for example by the gradient magnitude at the corresponding edge pixel, so that strong edges contribute more to the accumulator than weak ones. Another common variation is to search in stages: finding the bins with the highest count in one stage can be used to constrain the range of values searched in the next.
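The staged search can be sketched as a two-pass line transform in which the coarse θ peak constrains the fine pass. The resolutions, window width, and function names below are illustrative assumptions.

```python
import numpy as np

def hough_line_acc(points, thetas, r_res, r_max):
    """Vote (r, theta) line parameters for the given theta samples."""
    n_r = int(np.round(2 * r_max / r_res)) + 1
    acc = np.zeros((n_r, len(thetas)), dtype=np.int64)
    cs, sn = np.cos(thetas), np.sin(thetas)
    for x, y in points:
        r = x * cs + y * sn                       # one r per theta sample
        k = np.round((r + r_max) / r_res).astype(int)
        np.add.at(acc, (k, np.arange(len(thetas))), 1)
    return acc

def coarse_to_fine_theta(points, r_res=1.0,
                         coarse_step=np.deg2rad(10), fine_step=np.deg2rad(0.5)):
    """Two-stage search: the coarse theta peak constrains the fine pass."""
    r_max = max(np.hypot(x, y) for x, y in points)
    coarse = np.arange(-np.pi / 2, np.pi / 2, coarse_step)
    acc = hough_line_acc(points, coarse, r_res, r_max)
    _, t = np.unravel_index(np.argmax(acc), acc.shape)
    # Fine pass only within one coarse bin either side of the coarse peak.
    fine = np.arange(coarse[t] - coarse_step, coarse[t] + coarse_step, fine_step)
    acc2 = hough_line_acc(points, fine, r_res, r_max)
    k, t2 = np.unravel_index(np.argmax(acc2), acc2.shape)
    return k * r_res - r_max, fine[t2]
```

The fine pass evaluates only a narrow θ window instead of the full 180°, which is where the saving comes from.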
===Carefully chosen parameter space===
A high-dimensional parameter space for the Hough transform is not only slow, but if implemented without forethought it can easily overrun the available memory. Even if the programming environment allows the allocation of an array larger than the available memory space through virtual memory, the number of page swaps required will be very demanding, because the accumulator array is accessed in an essentially random fashion, rarely staying in contiguous memory as it skips from index to index. Consider the task of finding ellipses in an 800×600 image. Assuming that the radii of the ellipses are oriented along the principal axes, the parameter space is four-dimensional: (x, y) defines the center of the ellipse, and a and b denote the two radii. Allowing the center to be anywhere in the image adds the constraints 0 < x < 800 and 0 < y < 600. Even with modest bounds on the two radii, the resulting accumulator is far too large to allocate naively, and the cost of voting grows rapidly with the number of non-zero points in the image.

==Limitations==