longest substring with at most k distinct characters
2 min readJan 21, 2021
Given a string you need to print longest possible substring that has exactly M unique characters. If there are more than one substring of longest possible length, then print any one of them.
Examples:
"aabbcc", k = 1
Max substring can be any one from {"aa" , "bb" , "cc"}."aabbcc", k = 2
Max substring can be any one from {"aabb" , "bbcc"}."aabbcc", k = 3
There are substrings with exactly 3 unique characters
{"aabbcc" , "abbcc" , "aabbc" , "abbc" }
Max is "aabbcc" with length 6."aaabbb", k = 3
There are only two unique characters, thus show error message.
Analysis:
Two pointers + map to remember the count of each char
if map has more than k unique chars, start removing the left char.
# define character rangeCHAR_RANGE = 128def longestSubstr(str, k):# stores longest substring boundaries end = begin = 0# set to store distinct characters in a window window = set()# count list to store frequency of characters present in current window# we can also use a dictionary instead of count list freq = [0] * CHAR_RANGE# [low..high] maintain sliding window boundaries low = high = 0 while high < len(str): window.add(str[high]) freq[ord(str[high])] += 1# if window size is more than k, remove characters from the left while len(window) > k:# if the frequency of leftmost character becomes 0 after# removing it in the window, remove it from set as well freq[ord(str[low])] = freq[ord(str[low])] — 1 if freq[ord(str[low])] == 0: window.remove(str[low]) low = low + 1 # reduce window size# update maximum window size if necessary if end — begin < high — low: end = high begin = low high = high + 1# return longest substring found at str[begin..end] return str[begin:end + 1]