Longest Substring without Repeating Characters
Longest Substring without Repeating Characters
My first attempt was to remove duplicate characters from the string, then generate all substrings from index i to the end of this cleaned (non-duplicate) string. For each substring, I counted its length, starting from a default of 0. The idea was that removing duplicates up front would simplify the problem.
However, this approach is fundamentally flawed. For example, in the case of "pwwkew", my solution produced "pwke", which is not a substring of the original string, but a subsequence.
def remove_duplicates(self, s): # convert string s to dict, then join with empty string "" # when converting s to dict, it removes duplicated characters. return "".join(dict.fromkeys(s)) def lengthOfLongestSubstring(self, s): word_counts = {} clean_str = self.remove_duplicates(s) for i in range(0, len(clean_str)): substr = clean_str[i:] # create a new key with default 0 if not exists, otherwise, increase by one. word_counts[substr] = word_counts.get(substr, 0) + len(substr) print(word_counts)
The key realization was that substrings must remain contiguous, which means I cannot preprocess the string by removing characters. Instead, the solution is to operate directly on the original string and dynamically maintain a valid window.
The correct approach uses two pointers (left and right) that move along the string, forming a sliding window that always represents a substring with no duplicate characters.
def lengthOfLongestSubstring(s): seen = set() left = 0 max_len = 0 for right in range(len(s)): while s[right] in seen: seen.remove(s[left]) left += 1 seen.add(s[right]) max_len = max(max_len, right - left + 1) return max_len
On each iteration, we check whether the character at the right pointer has already been seen. If it has, the window is no longer valid, so we advance the left pointer and remove characters from the set until the duplicate is resolved. Once the window is valid again, we expand it by moving right forward. Throughout this process, we keep track of the maximum window size encountered. This guarantees that every window is a valid substring, and the algorithm runs in linear time.
Here is an illustration generated by ChatGPT.