Commit 2aaabbe

feat(algorithms, sliding-window): permutation in string)
1 parent dc19fc4 commit 2aaabbe

14 files changed (+270, -1 lines changed)

algorithms/graphs/min_cost_valid_path/test_min_cost_to_make_valid_path.py

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
     min_cost_dijkstra,
     min_cost_0_1_bfs,
     min_cost_0_1_bfs_2,
-    min_cost_dfs_and_bfs
+    min_cost_dfs_and_bfs,
 )

 MIN_COST_TO_MAKE_VALID_PATH_IN_GRID_TEST_CASES = [
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# Permutation in String

Given two strings s1 and s2, return true if s2 contains a permutation of s1, or false otherwise.

In other words, return true if one of s1's permutations is a substring of s2.

## Examples

Example 1:

```text
Input: s1 = "ab", s2 = "eidbaooo"
Output: true
Explanation: s2 contains one permutation of s1 ("ba").
```

Example 2:

```text
Input: s1 = "ab", s2 = "eidboaoo"
Output: false
```

## Constraints

- 1 <= s1.length, s2.length <= 10^4
- s1 and s2 consist of lowercase English letters.

## Topics

- Hash Table
- Two Pointers
- String
- Sliding Window

## Hints

- Brute force will exceed the time limit (TLE). Think of something else.
- How will you check whether one string is a permutation of another string?
- One way is to sort the strings and then compare them. But is there a better way?
- If one string is a permutation of another string, then they must have one common metric. What is that?
- Both strings must have the same character frequencies if one is a permutation of the other. Which data structure
  should be used to store frequencies?
- What about a hash table? An array of size 26?

## Solution(s)

1. [Sliding Window](#sliding-window)
2. [Optimized Sliding Window](#optimized-sliding-window)

50+
### Sliding Window
51+
52+
A naive approach would be to generate all possible rearrangements of s1 and check whether any appear in s2. But that
53+
would quickly become inefficient for longer strings, because the number of permutations grows extremely fast. For a
54+
three-letter string like "abc", there are already six permutations; for "abcdef", there are hundreds. So, instead of
55+
trying to rearrange s1, we look for an efficient alternative.
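
To make the growth concrete, here is a minimal sketch of the brute-force idea. The function name `check_inclusion_brute_force` is hypothetical and is not part of this commit; it is only meant to show why this route does not scale.

```python
from itertools import permutations
from math import factorial


def check_inclusion_brute_force(s1: str, s2: str) -> bool:
    """Hypothetical baseline: test every distinct rearrangement of s1 against s2."""
    return any("".join(p) in s2 for p in set(permutations(s1)))


# The number of rearrangements of n distinct letters is n!
for length in (3, 6, 10):
    print(length, factorial(length))  # 3 6, then 6 720, then 10 3628800
```

With s1 allowed to be up to 10^4 characters long, enumerating rearrangements is out of the question, which is what motivates counting characters instead.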
56+
57+
Two strings are permutations of each other if and only if they contain the same characters the same number of times. For
58+
example, "abc" and "bca" are permutations because both have one a, one b, and one c. This means that if we find any
59+
substring of s2 that is the same length as s1 and has the same character counts, then that substring must be a
60+
permutation of s1.
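
The counting test can be sketched in a few lines. The helpers `letter_counts` and `is_permutation` are illustrative names, not functions from this commit, and the 26-slot array assumes lowercase English letters as the constraints state.

```python
from collections import Counter


def letter_counts(text: str) -> list[int]:
    """Count occurrences of each lowercase letter in a 26-slot array."""
    counts = [0] * 26
    for ch in text:
        counts[ord(ch) - ord("a")] += 1
    return counts


def is_permutation(a: str, b: str) -> bool:
    """Two strings are permutations of each other iff their letter counts are equal."""
    return letter_counts(a) == letter_counts(b)


print(is_permutation("abc", "bca"))      # True
print(is_permutation("ab", "acb"))       # False: the lengths (and counts) differ
print(Counter("abc") == Counter("bca"))  # True: the same check with a hash table
```

The fixed-size array and `Counter` give the same answer; the array is convenient here because comparing two 26-element lists is constant work.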
61+
62+
> Quick recall
63+
> A substring is a contiguous sequence of characters within a larger string. So, if s1 = "ab" and s2 = "acb", even
64+
> though both strings contain the same counts of letters a and b, there’s no two-letter block in s2 that contains both
65+
> together so that the result would be FALSE.
66+
67+
The same idea is used in this solution, keeping track of the character counts of s1 inside s2. To do this efficiently,
68+
the algorithm uses a sliding window over s2. At any given moment, this window represents a substring of s2 that’s the
69+
same length as s1 and could potentially be one of its permutations. To check that, compare the frequency of each
70+
character in s1 with the frequency of the characters inside this window.
71+
72+
If the counts match, it means the current window contains the same letters as s1, just possibly in a different order,
73+
so a valid permutation is found. In that case, return TRUE. If the counts don’t match, and there are still characters
74+
left to examine in s2, slide the window forward by one character. That means one new character from the right side of
75+
s2 is added to the window, and one old character from the left side is removed. This is important because the window
76+
must always stay the same length as s1. Keep doing this until either a match is found or the end of s2 is reached. If
77+
all possible windows have been checked and none match the character frequencies of s1, then return FALSE.
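
As an illustration only, the sketch below lists every window of Example 1 and whether its counts match s1. The helper `window_report` is a hypothetical name, and it deliberately recounts each window from scratch, which costs O(n1) per window; the incremental update described above avoids that recount.

```python
from collections import Counter


def window_report(s1: str, s2: str) -> list[tuple[str, bool]]:
    """Pair every len(s1)-sized window of s2 with whether its counts match s1."""
    target = Counter(s1)
    size = len(s1)
    return [
        (s2[start:start + size], Counter(s2[start:start + size]) == target)
        for start in range(len(s2) - size + 1)
    ]


for window, matches in window_report("ab", "eidbaooo"):
    print(window, matches)
```

Only the window "ba" matches, which is why Example 1 returns true.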
78+
79+
Let's look at the algorithm steps:
80+
81+
1. Store the lengths of s1 and s2 in n1 and n2.
82+
2. If n1 > n2, return FALSE, as a longer string can’t fit as a substring in some other string.
83+
3. Create two arrays, s1Counts and windowCounts, of length 26. The former stores the frequencies of characters in s1,
84+
and the latter stores the frequencies in the current window of s2.
85+
4. If the two frequency arrays are identical at this point, s1Counts == windowCounts, the function returns TRUE because
86+
the first window is already a permutation of s1.
87+
5. If they do not match, the window begins sliding across s2 one character at a time. For each position i from n1 to
88+
n2-1 in s2:
89+
- Add the count of characters at the right end of s2 (s2[i]) in the current window.
90+
- Remove the count of characters at the right end of s2 (s2[i - n1]) from the current window.
91+
- If s1Counts == windowCounts, return TRUE.
92+
6. Once the window has moved across the entire s2, and no match is found, return FALSE.
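
The steps above can be traced with a small variant that reports where the first matching window starts instead of returning a boolean. This is a sketch for illustration; `first_match_start` is a hypothetical helper and not the function added in this commit.

```python
def first_match_start(s1: str, s2: str) -> int:
    """Return the start index of the first window of s2 that is a permutation of s1, or -1."""
    n1, n2 = len(s1), len(s2)
    if n1 > n2:                      # step 2: s1 cannot fit inside s2
        return -1

    s1_counts = [0] * 26             # step 3: frequency arrays
    window_counts = [0] * 26
    for c in s1:
        s1_counts[ord(c) - ord("a")] += 1
    for c in s2[:n1]:
        window_counts[ord(c) - ord("a")] += 1

    if s1_counts == window_counts:   # step 4: the first window already matches
        return 0

    for i in range(n1, n2):          # step 5: slide one character at a time
        window_counts[ord(s2[i]) - ord("a")] += 1        # character entering on the right
        window_counts[ord(s2[i - n1]) - ord("a")] -= 1   # character leaving on the left
        if s1_counts == window_counts:
            return i - n1 + 1
    return -1                        # step 6: no window matched


print(first_match_start("ab", "eidbaooo"))  # 3, the window "ba"
print(first_match_start("ab", "eidboaoo"))  # -1
```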
93+
94+
![Example 1](./images/solutions/permutation_in_string_solution_1.png)
95+
![Example 2](./images/solutions/permutation_in_string_solution_2.png)
96+
![Example 3](./images/solutions/permutation_in_string_solution_3.png)
97+
![Example 4](./images/solutions/permutation_in_string_solution_4.png)
98+
![Example 5](./images/solutions/permutation_in_string_solution_5.png)
99+
![Example 6](./images/solutions/permutation_in_string_solution_6.png)
100+
![Example 7](./images/solutions/permutation_in_string_solution_7.png)
101+
![Example 8](./images/solutions/permutation_in_string_solution_8.png)
102+
![Example 9](./images/solutions/permutation_in_string_solution_9.png)
103+
![Example 10](./images/solutions/permutation_in_string_solution_10.png)
104+
105+
#### Complexity Analysis
106+
107+
##### Time Complexity
108+
109+
The function runs in linear time because it builds character-frequency counts once, then slides a fixed-size window
110+
across s2 while doing only constant work per step. Each step contributes to the time complexity as follows:
111+
112+
- Counting frequencies in s1 takes O(n1) time (one pass over s1).
113+
- Building the first window in s2 (size of s1.length) takes O(n1) time.
114+
- Sliding the window across s2 takes O(n2). This is because each slide:
115+
- Adds one character and removes one character (O(1) updates).
116+
- Compares two frequency arrays of size 26 (O(1), as 26 is constant).
117+
118+
This makes the total time complexity O(n1 + n2). As the length of s2 is greater than or equal to that of s1 in all valid
119+
cases, the overall time complexity is dominated by the length of s2, with a special O(1) case when s1 is longer than s2.
120+
Therefore, the final time complexity is O(n2).
121+
122+
##### Space Complexity
123+
124+
The space complexity of this solution is O(1), meaning it uses constant extra memory. This is because it only maintains
125+
two fixed-size arrays (of size 26) to store letter frequencies and a few simple integer variables.
126+
127+
### Optimized Sliding Window
128+
129+
The last approach can be optimized, if instead of comparing all the elements of the s1arr for every updated s2arr
130+
corresponding to every window of s2 considered, we keep a track of the number of elements which were already matching
131+
in the s1arr and update just the count of matching elements when we shift the window towards the right.
132+
133+
To do so, we maintain a count variable, which stores the number of characters(out of the 26 alphabets), which have the
134+
same frequency of occurence in s1 and the current window in s2. When we slide the window, if the deduction of the last
135+
element and the addition of the new element leads to a new frequency match of any of the characters, we increment the
136+
count by 1. If not, we keep the count intact. But, if a character whose frequency was the same earlier(prior to addition
137+
and removal) is added, it now leads to a frequency mismatch which is taken into account by decrementing the same count
138+
variable. If, after the shifting of the window, the count evaluates to 26, it means all the characters match in frequency
139+
totally. So, we return a True in that case immediately.
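
The bookkeeping for a single shift can be sketched as follows. The helper names `letter_counts` and `shift` are hypothetical, and the check-before and check-after formulation used here is one possible way to express the update; it is not necessarily how the committed function phrases it.

```python
A = ord("a")


def letter_counts(text: str) -> list[int]:
    """Count occurrences of each lowercase letter in a 26-slot array."""
    counts = [0] * 26
    for ch in text:
        counts[ord(ch) - A] += 1
    return counts


def shift(target: list[int], window: list[int], entering: str, leaving: str, matches: int) -> int:
    """Apply one window shift in place and return the updated number of matching letters."""
    for ch, delta in ((entering, 1), (leaving, -1)):
        idx = ord(ch) - A
        if window[idx] == target[idx]:
            matches -= 1   # this letter matched before the update and is about to change
        window[idx] += delta
        if window[idx] == target[idx]:
            matches += 1   # this letter matches after the update
    return matches


# Moving the window over "eidbaooo" from "db" to "ba" for s1 = "ab"
target = letter_counts("ab")
window = letter_counts("db")
matches = sum(t == w for t, w in zip(target, window))
print(matches)  # 24: only 'a' and 'd' disagree
matches = shift(target, window, entering="a", leaving="d", matches=matches)
print(matches)  # 26: every letter now matches, so "ba" is a permutation of "ab"
```

Each shift touches at most two array slots, so the per-window cost no longer depends on the alphabet size.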
140+
141+
#### Complexity Analysis
142+
143+
Let l1 be the length of string s1 and l2 be the length of string s2.
144+
145+
##### Time complexity: O(l1 + (l2 −l1))≈O(l2)
146+
147+
Populating s1arr and s2arr takes O(l1) time since we iterate over the first l1 characters of both strings.
148+
149+
The outer loop runs l2 −l1 times. In each iteration, we update two characters (one entering and one leaving the window)
150+
in constant time O(1), and we maintain a count of matches. This step takes O(l2 −l1).
151+
152+
Checking if count == 26 also happens in O(1), since it's a constant comparison.
153+
154+
Thus, the total time complexity is: O(l1 +(l2 −l1))≈O(l2)
155+
156+
##### Space complexity: O(1)
157+
158+
Two fixed-size arrays (s1arr and s2arr) of size 26 are used for counting character frequencies. No additional space that
159+
grows with the input size is used.
Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
def check_inclusion_optimized_sliding_window(s1: str, s2: str) -> bool:
    s1_len, s2_len = len(s1), len(s2)
    if s1_len > s2_len:
        return False

    s1_arr = [0] * 26
    s2_arr = [0] * 26

    # Count s1 and the first window of s2 in a single pass
    for i in range(s1_len):
        x = ord(s1[i]) - ord("a")
        y = ord(s2[i]) - ord("a")
        s1_arr[x] += 1
        s2_arr[y] += 1

    # Number of letters whose frequency matches between s1 and the window
    count = 0
    for i in range(26):
        if s1_arr[i] == s2_arr[i]:
            count += 1

    for i in range(s2_len - s1_len):
        right = ord(s2[i + s1_len]) - ord("a")  # letter entering the window
        left = ord(s2[i]) - ord("a")  # letter leaving the window

        if count == 26:
            return True

        s2_arr[right] += 1
        if s2_arr[right] == s1_arr[right]:
            count += 1
        elif s2_arr[right] == s1_arr[right] + 1:
            # The letter matched before this increment
            count -= 1

        s2_arr[left] -= 1
        if s2_arr[left] == s1_arr[left]:
            count += 1
        elif s2_arr[left] == s1_arr[left] - 1:
            # The letter matched before this decrement
            count -= 1

    # Check the last window, which the loop updates but does not test
    return count == 26


def check_inclusion_sliding_window(s1: str, s2: str) -> bool:
    n1 = len(s1)
    n2 = len(s2)

    # If s1 is longer than s2, a permutation of s1 cannot be a substring of s2
    if n1 > n2:
        return False

    # Initialize frequency arrays for s1 and the current sliding window in s2
    # Use 26-element arrays for lowercase English letters 'a' through 'z'
    s1_counts = [0] * 26
    window_counts = [0] * 26

    # Populate s1_counts with character frequencies from s1
    for c in s1:
        s1_counts[ord(c) - ord("a")] += 1

    # Populate window_counts for the initial sliding window (first n1 characters of s2)
    for i in range(n1):
        window_counts[ord(s2[i]) - ord("a")] += 1

    # Check if the initial window is a permutation
    # This can be done by comparing the two frequency arrays
    if s1_counts == window_counts:
        return True

    # Slide the window across the rest of s2
    for i in range(n1, n2):
        # Character entering the window (at index i)
        char_added = ord(s2[i]) - ord("a")
        window_counts[char_added] += 1

        # Character leaving the window (at index i - n1)
        char_removed = ord(s2[i - n1]) - ord("a")
        window_counts[char_removed] -= 1

        # After updating the window, check if the frequencies match
        if s1_counts == window_counts:
            return True

    # If no permutation is found after checking all windows
    return False