Commit fad094e

feat(strings): lexicographically largest string in box 2
1 parent 51e2782 commit fad094e

4 files changed: +125 -0 lines changed


pystrings/lexicographically_largest_string/README.md

Lines changed: 76 additions & 0 deletions
@@ -28,3 +28,79 @@ string among all the substrings in the box.
![Example 2](./images/examples/lexicographically_largest_string_from_box_example_2.png)
![Example 3](./images/examples/lexicographically_largest_string_from_box_example_3.png)
![Example 4](./images/examples/lexicographically_largest_string_from_box_example_4.png)

## Solution

In the game, each round splits the string into `num` non-empty parts. Because each of the other `num - 1` parts needs
at least one character, the longest contiguous segment that can land in the box in any round has length:

```text
length = len(word) - num + 1
```
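
As a sanity check on this formula, the sketch below enumerates every way to cut a short word into `num` non-empty parts and records the longest part seen. The helper name `all_splits` is made up for this illustration and is not part of the commit.

```python
from itertools import combinations


def all_splits(word: str, num: int):
    """Yield every split of word into num non-empty contiguous parts (illustrative helper)."""
    n = len(word)
    # Choose num - 1 cut positions strictly between characters.
    for cuts in combinations(range(1, n), num - 1):
        bounds = (0, *cuts, n)
        yield [word[a:b] for a, b in zip(bounds, bounds[1:])]


word, num = "dbca", 2
box = {part for split in all_splits(word, num) for part in split}
longest = max(len(part) for part in box)

print(sorted(box))
print(longest == len(word) - num + 1)  # True
```

Enumerating splits like this grows combinatorially, so it is only useful for checking small inputs by hand.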

We begin by observing that we don’t need to collect all substrings from all game rounds. For each starting index, only
the longest segment that can begin there matters, because a string is never lexicographically smaller than one of its
own prefixes. That segment has the length above, or runs to the end of the word if fewer characters remain. A
straightforward approach would be to generate this candidate for every starting index, compare them individually, and
return the lexicographically largest. However, this costs up to `O(n * length)` character comparisons, which is
inefficient for longer strings.
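
The straightforward approach can be sketched as follows. This is a minimal illustration under a hypothetical name, `largest_from_box_brute_force`, not necessarily how the repository's existing function is written.

```python
def largest_from_box_brute_force(word: str, num: int) -> str:
    """Compare one candidate per starting index (hypothetical name, for illustration)."""
    if num == 1:
        return word  # a single part is the whole word

    length = len(word) - num + 1
    best = ""
    for start in range(len(word)):
        # Slicing clamps at the end of the word, so late candidates are shorter.
        candidate = word[start:start + length]
        if candidate > best:
            best = candidate
    return best


print(largest_from_box_brute_force("dbca", 2))  # dbc
print(largest_from_box_brute_force("acbd", 2))  # d
```

Each slice and each comparison can touch up to `length` characters, which is where the extra factor over a single pass comes from.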

To improve on this, we use the two-pointer technique to scan the string in a single pass. This allows us to track and
update the best candidate substring on the fly without explicitly generating every possible option.

The algorithm is based on the principle that lexicographical order is determined by the first differing character
between two substrings. It begins by handling a simple case: if there is only one friend (`num == 1`), no splitting is
needed, and the entire word is returned immediately as the output.
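
The first-differing-character rule is the same one Python's built-in string comparison follows, so it can be checked directly. The sample strings below are arbitrary.

```python
# Python compares strings lexicographically, character by character.
first_difference = "zx" > "zb"   # decided at index 1: 'x' vs 'b'
prefix_case = "abc" > "ab"       # "ab" is a proper prefix, so the longer string wins
length_ignored = "b" > "abc"     # decided at index 0; length plays no role

print(first_difference, prefix_case, length_ignored)  # True True True
```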

Otherwise, the algorithm uses two pointers:
- The first pointer marks the beginning of the current best substring.
- The second pointer explores potential alternatives.

To compare the two, the characters at the same offset from each pointer are examined one at a time. The comparison moves
forward as long as the characters match, skipping over any shared prefix. This is a key optimization: common prefixes
don’t need to be rechecked. When a difference is found, or when the candidate reaches the end of the string, the
algorithm decides which substring is lexicographically larger. If the new candidate is better, it replaces the current
best. Then, instead of resuming from the next character, the algorithm uses the length of the matched prefix to jump
the candidate pointer forward, avoiding redundant overlapping comparisons.

This skipping logic is important. It ensures that regions already matched are not rescanned from scratch, which makes a
linear-time solution possible. So, by comparing character by character and skipping ahead, the algorithm finds the
start of the lexicographically largest suffix. Truncating that suffix to the required length gives the answer.

Now, let’s look at the solution steps below:

1. We handle the edge case where `num == 1`. The entire string is returned in this case because no splitting is
   necessary.
2. We initialize two pointers: `i = 0` marks the starting index of the current best substring, and `j = 1` is the
   starting index of the candidate substring.
3. We also use an offset `k`, reset to `0` for each new candidate, to compare characters one by one.
4. While `j` is within the bounds of the string, we compare the characters at positions `i + k` and `j + k`:
   - If they are equal, we increment `k` by 1 to continue the comparison.
   - If the character at `j + k` is greater, the candidate substring is better. We store the value of `i` in a
     temporary variable `temp_index`, update `i = j`, and shift `j` forward using
     `j = max(j + 1, temp_index + k + 1)`.
   - If the character at `i + k` is greater, or the candidate runs off the end of the string, we skip past the common
     prefix using `j = j + k + 1` to avoid redundant comparisons.
5. After finishing all comparisons, the best substring starts at index `i` and has a length of at most
   `len(word) - num + 1` (it is shorter if it reaches the end of the word first).
6. We return this substring as the result.
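
The steps above can be traced with a small instrumented sketch. It follows the same pointer updates but records `(i, j, k)` after each comparison round. The function name `trace_pointers` and the recorded tuple layout are illustrative choices, not part of the commit.

```python
def trace_pointers(word: str):
    """Return the final best start index and the (i, j, k) state after each round."""
    n = len(word)
    i, j = 0, 1
    states = []
    while j < n:
        k = 0
        # Extend the shared prefix of the suffixes starting at i and j.
        while j + k < n and word[i + k] == word[j + k]:
            k += 1
        if j + k < n and word[i + k] < word[j + k]:
            # Candidate is larger: it becomes the best. The right-hand side uses the old i and j.
            i, j = j, max(j + 1, i + k + 1)
        else:
            # Best stays; jump the candidate past the compared region.
            j = j + k + 1
        states.append((i, j, k))
    return i, states


word, num = "acbd", 2
best_start, states = trace_pointers(word)
for state in states:
    print(state)
print(word[best_start:best_start + len(word) - num + 1])  # d
```

For `"acbd"` the best index moves from 0 to 1 (`'c'` beats `'a'`) and then to 3 (`'d'` beats `'c'`), so the result is the single character `"d"`, shorter than the maximum length of 3.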

Let’s look at the following illustration to get a better understanding of the solution:

![Solution 1](./images/solutions/lexicographically_largest_string_from_box_solution_1.png)
![Solution 2](./images/solutions/lexicographically_largest_string_from_box_solution_2.png)
![Solution 3](./images/solutions/lexicographically_largest_string_from_box_solution_3.png)
![Solution 4](./images/solutions/lexicographically_largest_string_from_box_solution_4.png)
![Solution 5](./images/solutions/lexicographically_largest_string_from_box_solution_5.png)
![Solution 6](./images/solutions/lexicographically_largest_string_from_box_solution_6.png)
![Solution 7](./images/solutions/lexicographically_largest_string_from_box_solution_7.png)
![Solution 8](./images/solutions/lexicographically_largest_string_from_box_solution_8.png)
![Solution 9](./images/solutions/lexicographically_largest_string_from_box_solution_9.png)

## Time Complexity

The overall time complexity of the solution is `O(n)`, where `n` is the length of `word`, because:
- The string is scanned using two pointers (`i` and `j`) and an offset (`k`) for character comparisons.
- Each character is compared at most a constant number of times.
- The `j` pointer skips over regions that have already been matched and never moves backward.
- Total comparisons across the entire string therefore remain linear in `n`.
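
One way to see this empirically is to count character comparisons on a few inputs. The sketch below is an illustration rather than a proof: the helper `count_char_comparisons` and the chosen sample strings are assumptions made for this example, and the loop mirrors the two-pointer scan with a counter added.

```python
import random


def count_char_comparisons(word: str) -> int:
    """Count equality checks made by the two-pointer scan (illustrative helper)."""
    n = len(word)
    i, j = 0, 1
    comparisons = 0
    while j < n:
        k = 0
        while j + k < n:
            comparisons += 1
            if word[i + k] != word[j + k]:
                break
            k += 1
        if j + k < n and word[i + k] < word[j + k]:
            i, j = j, max(j + 1, i + k + 1)
        else:
            j = j + k + 1
    return comparisons


random.seed(0)
for n in (1_000, 10_000):
    samples = {
        "all equal": "a" * n,
        "alternating": "ab" * (n // 2),
        "random": "".join(random.choice("ab") for _ in range(n)),
    }
    for label, text in samples.items():
        # A roughly constant ratio across sizes is what linear growth looks like.
        print(n, label, count_char_comparisons(text) / n)
```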

## Space Complexity

The space complexity of the solution is `O(1)` because only a constant number of pointer variables is used, not
counting the returned substring.

pystrings/lexicographically_largest_string/__init__.py

Lines changed: 25 additions & 0 deletions
@@ -1,4 +1,7 @@
def lexicographically_largest_string_from_box(word: str, num: int) -> str:
    if num == 1:
        return word

    n = len(word)
    max_substring = ""

@@ -35,3 +38,25 @@ def lexicographically_largest_string_from_box(word: str, num: int) -> str:
            max_substring = candidate

    return max_substring
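
# Function to find the lexicographically largest string
def lexicographically_largest_string_from_box_2(word: str, num: int) -> str:
    # With a single part, the only string in the box is the whole word.
    if num == 1:
        return word

    n = len(word)
    # i: start of the best suffix so far; j: start of the candidate suffix.
    i, j = 0, 1

    while j < n:
        # k: length of the common prefix of the suffixes starting at i and j.
        k = 0
        while j + k < n and word[i + k] == word[j + k]:
            k += 1

        if j + k < n and word[i + k] < word[j + k]:
            # The candidate wins: adopt it and skip the starts already ruled out.
            temp_index = i
            i = j
            j = max(j + 1, temp_index + k + 1)
        else:
            # The current best wins (or the candidate ran out): skip past the match.
            j = j + k + 1

    # Truncate the best suffix to the longest length a single part can have.
    return word[i : min(n, i + n - num + 1)]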

pystrings/lexicographically_largest_string/test_lexicographically_largest_string.py

Lines changed: 24 additions & 0 deletions
@@ -2,6 +2,7 @@
from parameterized import parameterized
from pystrings.lexicographically_largest_string import (
    lexicographically_largest_string_from_box,
    lexicographically_largest_string_from_box_2
)


@@ -29,6 +30,29 @@ def test_lexicographically_largest_string_from_box(
        actual = lexicographically_largest_string_from_box(word, num)
        self.assertEqual(expected, actual)

    @parameterized.expand(
        [
            ("acbd", 2, "d"),
            ("zzzzz", 5, "z"),
            ("aazz", 1, "aazz"),
            ("yxa", 2, "yx"),
            ("dbca", 2, "dbc"),
            ("gggg", 4, "g"),
            ("zxya", 3, "zx"),
            ("mnopqr", 3, "r"),
            (
                "ajhpnonbogfhxanqumtrgosqwjubctnwxcjxgvophccygoomigxlkhsxnnitsyqkiwmrrlekphwpgtsvjsbokakzpdkdzzdnbgmaepkhuohdlvzsfiokivnhybybinedxbkgdpjdktxkezfyvcxegobkfdnmdiupsfjsobwpseucjbzkvqoqxbnsoqjlldqjpjqkjvzdmpnyrwtkbezhmrsdzfgsjmycaydczpiotkvjzwcoiihunqmzylccuekkdebhgiifwrlbramdqsbwzbuoyluqqtdaroa",
                1,
                "ajhpnonbogfhxanqumtrgosqwjubctnwxcjxgvophccygoomigxlkhsxnnitsyqkiwmrrlekphwpgtsvjsbokakzpdkdzzdnbgmaepkhuohdlvzsfiokivnhybybinedxbkgdpjdktxkezfyvcxegobkfdnmdiupsfjsobwpseucjbzkvqoqxbnsoqjlldqjpjqkjvzdmpnyrwtkbezhmrsdzfgsjmycaydczpiotkvjzwcoiihunqmzylccuekkdebhgiifwrlbramdqsbwzbuoyluqqtdaroa",
            ),
        ]
    )
    def test_lexicographically_largest_string_from_box_2(
        self, word: str, num: int, expected: str
    ):
        actual = lexicographically_largest_string_from_box_2(word, num)
        self.assertEqual(expected, actual)


if __name__ == "__main__":
    unittest.main()
