One of the most widely known approaches in the field of compression algorithms issliding window compression. It works as follows:
- Consider characters one by one. Let the current character index be
i
. - Take the last
width
characters before the current one (i.e.s[i - width, i - 1]
, wheres[i, j]
means the substring ofs
from indexi
to indexj
, both inclusive), and call it the window. - Find such
startIndex
andlength
thats[i, i + length - 1] = s[startIndex, startIndex + length - 1]
ands[startIndex, startIndex + length - 1]
is contained within the window. If there are several such pairs, choose the one with the largestlength
. If there still remains more than one option, choose the one with the smalleststartIndex
. - If the search was successful, append
"(startIndex,length)"
to the result and move to the indexi + length
. - Otherwise, append the current character to the result and move on to the next one.
Given a string, apply sliding window compression to it.
Example
- For
inputString = "abacabadabacaba"
andwidth = 7
, the output should belosslessDataCompression(inputString, width) = "ab(0,1)c(0,3)d(4,3)c(8,3)"
.- Step 1:
i = 0, inputString[i] = 'a', window = ""
.'a'
is not contained withinthe window, so it is appended to the result. - Step 2:
i = 1, inputString[i] = 'b', window = "a"
.'b'
is not contained within the window, so it is appended to the result. - Step 3:
i = 2, inputString[i] = 'a', window = "ab"
.'a'
can be found in the window.'a'
in the window corresponds to theinputString[0]
, so(0,1)
representing the substring"a"
is appended to the result. - Step 4:
i = 3, inputString[i] = 'c', window = "aba"
. The same situation as in the first two steps. - Step 5:
i = 4, inputString[i] = 'a', window = "abac"
. ConsiderstartIndex = 0, length = 3
.inputString[startIndex, startIndex + length - 1] = "aba"
and it is contained within the window,inputString[i, i + length - 1] = "aba"
. Therefore,"(0,3)"
should be added to the result.i += length
. - Step 6:
i = 7, inputString[i] = 'd', window = inputString[0, 6] = "abacaba"
. The same situation as in the first two steps. - Step 7:
i = 8, inputString[i] = 'a', window = inputString[1, 7] = "bacabad"
. Considerlength = 3
again.inputString[i, i + b - 1] = "aba"
,window[3, 5] = "aba"
, and it corresponds toinputString[4, 6]
sinceinputString[0, 2]
is no longer within the window. So,"(4,3)"
should be appended.i += length
. - Step 8:
i = 11, inputString[i] = 'c', window = "abadaba"
. The same situation as at the first two steps. - Step 9:
i = 12, inputString[i] = 'a', window = "badabac"
.length = 3, inputString[i, i + length - 1] = "aba"
,window[3, 5] = "aba"
, and it corresponds toinputString[8, 10]
. So,"(8,3)"
should be appended.i += length
.
- Step 1:
- For
inputString = "abacabadabacaba"
andwidth = 8
, the output should belosslessDataCompression(inputString, width) = "ab(0,1)c(0,3)d(0,7)"
.In both of the above examples the resulting "compressed" string becomes even longer than the initial one. In fact, sliding window compression proves to be efficient for longer inputs. E.g.: - For
inputString = "aaaaaaaaaaaaaaaaaaaaaaaaaaaa"
andwidth = 12
, the output should belosslessDataCompression(inputString, width) = "a(0,1)(0,2)(0,4)(0,8)(4,12)"
.In the last example the resulting string is one character shorter than the input one. It is the shortest possible example of the efficient work of sliding window compression. If the input contained even more letters'a'
, then the effect of this approach would be even more considerable.
- [input] string inputStringA non-empty string of lowercase letters.
- [input] integer widthA positive integer.
- [output] stringCompressed
inputString
.
0 comments :
Post a Comment