longest substring with at most k distinct characters

Newbie Developer
2 min readJan 21, 2021

Given a string you need to print longest possible substring that has exactly M unique characters. If there are more than one substring of longest possible length, then print any one of them.

Examples:

"aabbcc", k = 1
Max substring can be any one from {"aa" , "bb" , "cc"}.
"aabbcc", k = 2
Max substring can be any one from {"aabb" , "bbcc"}.
"aabbcc", k = 3
There are substrings with exactly 3 unique characters
{"aabbcc" , "abbcc" , "aabbc" , "abbc" }
Max is "aabbcc" with length 6.
"aaabbb", k = 3
There are only two unique characters, thus show error message.

Analysis:

Two pointers + map to remember the count of each char

if map has more than k unique chars, start removing the left char.

# define character rangeCHAR_RANGE = 128def longestSubstr(str, k):# stores longest substring boundaries  end = begin = 0# set to store distinct characters in a window  window = set()# count list to store frequency of characters present in current window# we can also use a dictionary instead of count list  freq = [0] * CHAR_RANGE# [low..high] maintain sliding window boundaries  low = high = 0  while high < len(str):    window.add(str[high])    freq[ord(str[high])] += 1# if window size is more than k, remove characters from the left    while len(window) > k:# if the frequency of leftmost character becomes 0 after# removing it in the window, remove it from set as well        freq[ord(str[low])] = freq[ord(str[low])] — 1        if freq[ord(str[low])] == 0:          window.remove(str[low])          low = low + 1 # reduce window size# update maximum window size if necessary    if end — begin < high — low:      end = high      begin = low      high = high + 1# return longest substring found at str[begin..end]  return str[begin:end + 1]

--

--