Sliding window I

In arrays and strings, there are a subset of problems which deal with window of subarrays or substrings. In such problems the idea is to search a subarray matching a particular condition. A brute-force way of checking all the sub-arrays or substrings gives a runtime complexity of O(N^2). Sliding Window technique can be really helpful in such problems to reduce the time complexity to O(N) in such problems. In this post I would be discussing some common code templates which can be used for solving using sliding window techniques.

A sliding window can be of two types.

  • Fixed size sliding window, where the length of the window is fixed as it slides over the array/string.
  • Variable size sliding window, where the length of the window increases/decreases as it slides over the array/string.

In this article would be covering Fixed size sliding window and code template around that.

Example Problems:

Problems in this category are easier to recognize, as the problem statements itself defines the window length and is given as part of the parameter of the question.

A window of size k in an array of size N will have N - k + 1 sub arrays of size k. Ex. In an array like [0, 1, 2, 3, 4, 5, 6, 7] will have windows of size k=3 as [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]. As it can be seen, after the first window is initialized subsequent windows are created by adding next number and removing a number from the previous window. As [0, 1, 2] can be moved to next window by removing 0 and adding 3 to the window. So a sliding window works like a double ended queue deque where in each iteration an element is added from the end and an element is removed from the front thus maintaining the size k after initialization.

Even though the operation of the sliding window gives an impression of using extra space of a deque, but for scalar values like total of subarray can be maintained in constant space. Example of such a problem.

  • Find if there is a sub-array of size k in array of integer nums with the total as target value t.
# array `nums`, window size `k`, subarray sum = target
def subArrayEqualsTarget(nums: List[int], k: int, t: int):
    s, n = 0, len(nums)
    # Initialization Phase 
    # Work on the first window of k elements.
    for i in range(k): # Accumulate sum of left most window [0..k] in s
        s += nums[i]
    if s == t: 
        return True # Found first window sum as target
    # Iteration Phase 
    # Iterate from k..n to work on each window of k size.
    for i in range(k, n):
        s += nums[i] - nums[i-k] # Add a new right value nums[i] to window and remove left nums[i-k] value from window.
        if s == t:
            return True # found subsequent window sum as target
    return False

The above problem, is a generic template where the sum function can be replaced by other functions like dividing by k to get rolling mean or removing and adding left and right values respectively to hash map for checking duplicates and even getting rolling hash.

Let’s go over some of the problems mentioned above, to see how this template is used.

Given an array of integers arr and two integers k and threshold, return the number of sub-arrays of size k and average greater than or equal to threshold.

Example 1:

Input: arr = [2,2,2,2,5,5,5,8], k = 3, threshold = 4
Output: 3
Explanation: Sub-arrays [2,5,5],[5,5,5] and [5,5,8] have averages 4, 5 and 6 respectively. All other sub-arrays of size 3 have averages less than 4 (the threshold).

This is a counting problem, where every time a average of k numbers is greater than or equal to threshold is counted as a solution.

A brute force way to solve the problem would be iterate over each n - k windows and calculate average for each window and compare it with threshold., this will result in O(nk) solution. It is not hard to see that the calculation of the average for each window from start is redundant, and if can reuse the average calculated in the previous window to get the average of the current window then only constant work is done at every index in the array. Based on that algorithm the run time is O(n) and space is O(1)

def numOfSubarrays(nums: List[int], k: int, threshold: int) -> int:
    s, n, t = 0, len(nums), threshold * k # sum should be greater than threshold * k for average to be greater than threshold.
    sol = 0
    # Intialization on first window.
    for i in range(k):
        s += nums[i]
    if s >= t:
        sol += 1
    # Working with rest of the windows.
    for i in range(k, n):
        s += nums[i] - nums[i - k] # Add the nums[i] to the window, remove nums[i - k] form the window.
        if s >= t: # Check if the sum is greater than threshold * k
            sol +=1
    
    return sol
Given a string s and an integer k, return the number of substrings in s of length k with no repeated characters.

Example:

Input: s = "havefunonleetcode", k = 5
Output: 6
Explanation: There are 6 substrings they are: 'havef','avefu','vefun','efuno','etcod','tcode'.

This is again a counting problem, where the sub-strings of size k need to be counted with no repeated characters. A brute force approach again would be to go over the n-k sub strings and iterate over each substring to check if there are repeated characters. Such an alorithm will take O(nk) algorithm. To make a linear alogirthm need to avoid repeated work of checking substrings again. To do so after the first window in each iteration an item is added to the window and item is removed from the window if the number of unique items in the window are equal to k then that is valid solution. A map can be used to keep track of the items added in the window and it’s frequencies. This algorithm takes O(n) time and O(1) space as there are only lower case characters stored in map.

def numKLenSubstrNoRepeats(s: str, k: int) -> int:
    mp, n = defaultdict(int), len(s)
    sol = 0
    # Intialization phase for first window
    for i in range(min(n, k)):
        mp[s[i]] += 1
    if len(mp) == k:
        sol += 1
    # Iterating over each window
    for i in range(k, n):
        mp[s[i - k]] -= 1 # decreasing frequency count of the element
        if mp[s[i - k]] == 0: # If the element is not present anymore remove from array.
            del mp[s[i - k]]
        mp[s[i]] += 1
        if len(mp) == k:
            sol += 1
    return sol

There are few other problems, to practice this pattern more.

Now once we know fixed window basic template lets take a look at a harder problem.

You are given an array of integers nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.

Return the max sliding window.
Example 1:

Input: nums = [1,3,-1,-3,5,3,6,7], k = 3
Output: [3,3,5,5,6,7]
Explanation: 
Window position                Max
---------------               -----
[1  3  -1] -3  5  3  6  7       3
 1 [3  -1  -3] 5  3  6  7       3
 1  3 [-1  -3  5] 3  6  7       5
 1  3  -1 [-3  5  3] 6  7       5
 1  3  -1  -3 [5  3  6] 7       6
 1  3  -1  -3  5 [3  6  7]      7

Based on the problem statement it is not difficult to identify that this is again a fixed sliding window problem. The new number can be added to the right side of the window and old number can be deleted from the left side of the window as we have done earlier, the difficult part is finding the maximum number from the elements in the window. A naive way would be again to look at all the n - k windows and find max in each window which will take k iterations giving O(nk) time.

If someone is aware of the heap data structure it is quite easy to see that max heap can give the maximum number in O(1) time. The complexity of using heap is, it is not easy to delete an element from the heap, when a number moves out of the window and is not the maximum. Also adding and removing an element from heap takes O(log n) complexity, giving the overall time complexity as O(nlogn). So to keep track of the elements in the heap, index of the element can also be added to the number, and can be lazy removed when fetching the max in every iteration.

def maxSlidingWindow(nums: List[int], k: int) -> List[int]:
    hp, n = [], len(nums)
    sol = []
    #Initialization for first window
    for i in range(min(n, k)):
        heapq.heappush(hp, (-nums[i], i)) # Add (-element, index) to heap. Negative element is added as python heaps are min heap.
    
    # Append max of the first window [0..k]
    sol.append(-hp[0][0])
    # Iterate for each window of size k
    for i in range(k, n):
        # Remove if the outside window element is on top of heap
        while hp and hp[0][1] <= i - k:
            heapq.heappop(hp)
        # Add the current window element to heap
        heapq.heappush(hp, (-nums[i], i))
        # Add max to the solution
        sol.append(-hp[0][0])
    return sol

In languages like Java or C++ a sortedMap or TreeMap can also be used which are based on balanced binary search tree. These can easily find max in O(logn) and can also remove elements in O(logn). Still the complexity will O(nlogn)

To reduce the complexity further, needs to look at a way to find the max, delete and insert in constant time for each window. The way to think about it is.

Every time an element is removed from the window, two cases can occur.

  1. If it is not the max, the max would be either the old max or the new added element.
  2. If it is the max, then a new max needs to be determined, which would be the second max or next smaller element after the removed element.

If you have seen monotonic stack/queues, those data structures can help to tell that. Read more: Monotonic Stacks

A decreasing monotonic stack can find the next smaller element. A deque can be used, so that the max is the element being removed from the right of the window then it can be popped from the left of the deque. Basically deque contains the element in the window in decreasing order of their values, thus when the element from left is removed than left most element becomes the max.

def maxSlidingWindow(nums: List[int], k: int) -> List[int]:
    dq, n = Deque(), len(nums)
    sol = []
    # Initialize first window. Add [0..k] elements in the decreasing monotonic deque.
    for i in range(min(n, k)):
        while dq and dq[-1] < nums[i]: # Pop all the values on top of the stack smaller than current value pop
            dq.pop()
        dq.append(nums[i]) # Insert current value to the top of the stack.
    # Append max of the first window [0..k]
    sol.append(dq[0])
    
    for i in range(k, n):
        if dq and dq[0] == nums[i - k]: # If the value being removed from window is the max, pop from the left.
            dq.popleft()
        while dq and dq[-1] < nums[i]: # Again pop all the smaller values from top of the stack than current value
            dq.pop()
        dq.append(nums[i])
        sol.append(dq[0]) # Insert max to the solution to the solution.
    return sol

The above alorithm runs O(N) time complexity and O(k) space.

comments powered by Disqus