Data Structures: Shell sort

Description

Shellsort, also known as Shell sort or Shell's method is an in-place comparison sort. It generalizes an exchanging sort, such as insertion or bubble sort, by allowing the comparison and exchange of elements that lie far apart. Its first version was published by Donald Shell in 1959. The running time of Shellsort is heavily dependent on the gap sequence it uses. For many practical variants, determining their time complexity remains an open problem.It is a multi-pass algorithm. Each pass is an insertion sort of the sequences consisting of every h-th element for a fixed gap h (also known as the increment). This is referred to as h-sorting. It is an unstable: it may change the relative order of elements with equal values. It has "natural" behavior, in that it executes faster when the input is partially sorted.

Gap sequences

Every gap sequence that contains 1 yields a correct sort; however, the properties of thus obtained versions of Shellsort may be very different.
The table below compares most proposed gap sequences published so far. Some of them have decreasing elements that depend on the size of the sorted array (N). Others are increasing infinite sequences, whose elements less than N should be used in reverse order.

General term (k ≥ 1)	Concrete gaps	Worst-case time complexity	Author and year of publication
$\lfloor N / 2^k \rfloor$	$\left\lfloor\frac{N}{2}\right\rfloor, \left\lfloor\frac{N}{4}\right\rfloor, \ldots, 1$	$Θ(N 2)$ [when N=2^p]	Shell, 1959^[1]
$2 \lfloor N / 2^{k+1} \rfloor + 1$	$2 \left\lfloor\frac{N}{4}\right\rfloor + 1, \ldots, 3, 1$	$Θ(N 3 / 2)$	Frank & Lazarus, 1960^[3]
$2 k - 1$	$1, 3, 7, 15, 31, 63, \ldots$	$Θ(N 3 / 2)$	Hibbard, 1963^[4]
$2 k + 1$ , with 1 prepended	$1, 3, 5, 9, 17, 33, 65, \ldots$	$Θ(N 3 / 2)$	Papernov & Stasevich, 1965^[5]
successive numbers of the form $2 p 3 q$	$1, 2, 3, 4, 6, 8, 9, 12, \ldots$	$Θ(N log 2 N)$	Pratt, 1971^[6]
$(3 k - 1) / 2$ , not greater than $\lceil N / 3 \rceil$	$1, 4, 13, 40, 121, \ldots$	$Θ(N 3 / 2)$	Knuth, 1973^[7]
$\prod \limits_{\scriptscriptstyle 0\le q<r\atop \scriptscriptstyle q\neq(r^2+r)/2-k}a_q, \hbox{where}$ $r = \left\lfloor \sqrt{2k+\sqrt{2k}} \right\rfloor,$ $a_q=\min\{n\in\mathbb{N}\colon n\ge(5/2)^{q+1},$ $\forall p\colon0\le p<q\Rightarrow\gcd(a_p,n)=1\}$	$1, 3, 7, 21, 48, 112, \ldots$	$O(N e^\sqrt{8\ln(5/2)\ln N})$	Incerpi & Sedgewick, 1985^[8]
$4^k + 3\cdot2^{k-1} + 1$ , with 1 prepended	$1, 8, 23, 77, 281, \ldots$	$O (N 4 / 3)$	Sedgewick, 1986^[9]
$9(4^{k-1}-2^{k-1})+1, 4^{k+1}-6\cdot2^k+1$	$1, 5, 19, 41, 109, \ldots$	$O (N 4 / 3)$	Sedgewick, 1986^[9]
$h_k = \max\left\{\left\lfloor 5h_{k-1}/11 \right\rfloor, 1\right\}, h_0 = N$	$\left\lfloor \frac{5N}{11} \right\rfloor, \left\lfloor \frac{5}{11}\left\lfloor \frac{5N}{11} \right\rfloor\right\rfloor, \ldots, 1$	?	Gonnet & Baeza-Yates, 1991^[10]
$\left\lceil \frac{9^k-4^k}{5\cdot4^{k-1}} \right\rceil$	$1, 4, 9, 20, 46, 103, \ldots$	?	Tokuda, 1992^[11]
unknown	$1,4,10,23,57,132,301,701$	?	Ciura, 2001^[12]

When the binary representation of N contains many consecutive zeroes, Shellsort using Shell's original gap sequence makes Θ(N²) comparisons in the worst case. For instance, this case occurs for N equal to a power of two when elements greater and smaller than the median occupy odd and even positions respectively, since they are compared only in the last pass.
Although it has higher complexity than the O(NlogN) that is optimal for comparison sorts, Pratt's version lends itself to sorting networks and has the same asymptotic gate complexity as Batcher's bitonic sorter.
Gonnet and Baeza-Yates observed that Shellsort makes the fewest comparisons on average when the ratios of successive gaps are roughly equal to 2.2.^[10] This is why their sequence with ratio 2.2 and Tokuda's sequence with ratio 2.25 prove efficient. However, it is not known why this is so. Sedgewick recommends to use gaps that have low greatest common divisors or are pairwise coprime.^[13]
With respect to the average number of comparisons, the best known gap sequences are 1, 4, 10, 23, 57, 132, 301, 701 and similar, with gaps found experimentally. Optimal gaps beyond 701 remain unknown, but good results can be obtained by extending the above sequence according to the recursive formula $h_k = \lfloor 2.25 h_{k-1} \rfloor$ .
Tokuda's sequence, defined by the simple formula $h_k = \lceil h'_k \rceil$ , where

h' k = 2.25 h' k - 1 + 1

h' 1 = 1

, can be recommended for practical applications.

Data Structures

Lunes, Enero 30, 2012

Shell sort

Description

Gap sequences

Walang komento:

Mag-post ng isang Komento