Skip to content

Latest commit

 

History

History
112 lines (84 loc) · 6.43 KB

File metadata and controls

112 lines (84 loc) · 6.43 KB

Find Longest Self-Contained Substring

You are given a string, s, consisting of lowercase English letters. Your task is to find the length of the longest self-contained substring of s.

A substring t of s is called self-contained if:

  • t is not equal to the entire string s.
  • Every character in t does not appear anywhere else in s (outside of t).

In other words, all characters in t are completely unique to that substring within the string s. Return the length of the longest self-contained substring. If no such substring exists, return -1.

Constraints:

  • 2 ≤ s.length ≤ 1000
  • s consists only of lowercase English letters.

Examples

Example 1 Example 2 Example 3 Example 4 Example 5 Example 6 Example 7 Example 8


Solution

We iterate through the string once to record where each character first appears and where it last appears. Two separate hash maps store each character’s first and last occurrence indices. Each unique character serves as a potential starting point, defining an initial window from its first to last occurrence. The window is adjusted based on the following conditions:

  • As we iterate within this window, if we encounter a character whose last occurrence is further to the right than the current window’s end, we expand the window to include it. This ensures that the substring remains self-contained, meaning all occurrences of each character within it are included. The process continues until no more characters extend the window’s boundary.

  • As we expand the window, every character inside it must have its first occurrence within the current window’s start boundary. If we encounter a character whose first occurrence index is before the current window’s starting position, it means that an earlier part of the string contains an instance of that character, violating the self-contained property. When this happens, the current substring is invalid, and we discard it from consideration.

The maximum valid window length is tracked and updated accordingly, ensuring the longest valid substring is returned.

The steps of the algorithm are as follows:

  1. Create two hashmaps, first, and last, to store the first and last occurrence index of each character in the string.

  2. Iterate through the string once to populate these hash maps with each character’s first and last occurrence.

  3. Initialize max_len = -1 to keep track of the maximum length of a valid self-contained substring found.

  4. For each unique character c1 (processed once), start from its first occurrence index:

    • Initialize the start and end of the window by setting the starting point to the character’s first occurrence and the ending point to its last occurrence in the string.

    • Iterate through the string from the starting position start, extending the endpoint end whenever a character’s last occurrence is beyond the current endpoint end.

    • If a character c2 inside the window has its first occurrence before the window's start, the window is invalid.

  5. Validate the substring:

    • When the current index j reaches the end, check if the window is valid and its length is less than the total string length.

    • If the window is valid, update max_len with the maximum of its current value and the window’s length (end - start + 1).

  6. After checking all potential starting characters, return the maximum valid length found. If no valid substring exists, return -1.

Let’s look at the illustration below to better understand the solution.

Solution 1 Solution 2 Solution 3 Solution 4 Solution 5 Solution 6 Solution 7 Solution 8 Solution 9 Solution 10 Solution 11 Solution 12 Solution 13 Solution 14 Solution 15 Solution 16 Solution 17 Solution 18 Solution 19 Solution 20 Solution 21 Solution 22 Solution 23 Solution 24 Solution 25 Solution 26 Solution 27 Solution 28

Time Complexity

The time complexity of the above solution is O(n), where n is the number of characters in the string.

Space Complexity

The space complexity of the above solution is O(1) because of the fixed character set size, 26 lowercase English letters.