# Longest Sub-string with at most K Distinct Characters

Posted: 11 Sep, 2020

Difficulty: Moderate

#### You are given string S of length N, and an integer K. Your task is to find the length of the longest substring that contains at most K distinct characters.

##### Input Format:

```
The first line contains an Integer 'T' which denotes the number of test cases/queries to be run.
Then the test cases follow.
The first line of input for each test case/query contains an integer K.
The second line of input for each test case/query contains a string S.
```

##### Output Format:

```
For each test case, print the length of the longest substring that contains at most K distinct characters.
Output for every test case will be printed in a separate line.
```

##### Note:

```
You do not need to print anything, it has already been taken care of. Just implement the function.
```

##### Constraints:

```
1 <= T <= 10
1 <= K <= 26
1 <= N <= 10^4
Time Limit: 1sec
```

Approach 1

- We will generate all the substrings using 2 nested for loops and we will have a ‘CHECK’ function which returns true if the number of distinct character in the substring is less than equal to K otherwise false.
- We will have an ans variable initialize to 0. We will call the ‘CHECK’ function with every substring and if it returns true then
- ANS = MAX(ANS , CURRENT_SUBSTRING.SIZE())

- To implement the check function we will have K and substring S as inputs and we will create an array of size 26 let’s call this ‘FREQ’ and initialize all position with 0.
- Increase the frequency of every character in the substring S.
- FREQ[S[i]-’a’]++

- Have a variable ‘DISTINCT’ = 0
- Then run a loop from i = 0 to i = 25 and increase the ‘DISTINCT’ if FREQ[i] is non-zero.

- Return true if DISTINCT <= K otherwise return false

- Increase the frequency of every character in the substring S.
- Finally, return ANS which have our required answer.

Approach 2

- We will run a loop from i = 0 to i = N - 1 where i will be starting position of the substring and j = i to j = N - 1 where j will be the end position of the substring. Before starting the second loop we will create a hash table or an array of size 26 let’s call this ‘FREQ’ array.
- Inside inner loop increase the frequency of the j’th character. Now freq array will have the frequency of the substring starting from the index i and ending at the index j. Now we will check the number of distinct characters which will be simply the number of the non-zero valued position in the frequency array.
- Now initialize a variable ANS = 0 and check if the number of distinct characters is less than equal to K then
- ANS = MAX(ANS, j - i + 1)

- Now return the ans which has the required answer.

Approach 3

- We will be doing the binary search on the answer that is the length of the substring where LOW = 1 and HIGH = N will be the range of our answer and we will have a variable ANS = 0 to store the required answer.
- Now, as usual, we will write a while loop till LOW <= HIGH and we will find the mid
- For the mid length, we will try to find if there is a substring of length equal to MID which has the distinct characters less than equal to K.
- To check the number of distinct characters in a substring of length equal to mid in O(N) we will use the sliding window with a hash table or an array of size 26 and initialize all the position with 0.
- We will run a loop from i = 0 to i = MID - 1 which has a length equal to mid and increase the count of the characters in the array let’s say FREQ i.e FREQ[S[i]-’a’]++.
- Now the number of distinct characters will be equal to the number of non-zero value in the freq array. We will have a variable let’s say possible = false we will make it to true of the number of distinct characters is less than equal to K for this first substring of length equal to MID.
- Now run a loop from i = MID to i = N - mid and increase the count of i’th character i.e
- FREQ[S[i]-’a’]++

- And decrease the count of i-mid character i.e
- FREQ[S[i-MID]-’a’]--

- It will slide our window by 1 length towards the right side so will need to decrease the count of the character which is removed from the window which i - MID character and we need to increase the count of the new character that is at index i.
- Now, will do the same procedure to count the distinct characters and update the variable possible.
- If the possible is true then we can try for bigger length substring so
- LOW = MID + 1
- ANS = MAX(ANS, MID)

- Otherwise, we need to decrease the substring size so
- HIGH = MID - 1

- Now after the binary search we will have our required answer in the variable ans so return it.

Approach 4

- Create a hash table or a frequency array lets name it ‘FREQ’ and initialize all the position with 0.
- We will run a loop from i = 0 to i = N - 1 which will be the right boundary of the sliding windows and we will have a variable start = 0 which will be the left boundary of the sliding window.
- Increase the frequency of i’th character by 1 i.e
- FREQ[S[i]-’a’]++

- Now our window has expanded by 1 length but we need to check if the substring has less than equal to K distinct characters.
- So we will run a while loop and check if FREQ array has greater than K distinct character then
- FREQ[S[START]-’a’]--

- Now once we come out of the while loop current window is bounded by start and i have less than equal to K distinct characters.
- We will maintain a variable ‘ANS’ which will be our required answer now since this substring is valid we can take a maximum of ans and current substring length.
- ANS = MAX(ANS, i - START +1)

- Increase the frequency of i’th character by 1 i.e
- Now we have our required answer in ‘ANS’ so simply return it.

SIMILAR PROBLEMS

# Matching Prefix

Posted: 21 Aug, 2021

Difficulty: Moderate

# Batch Photography

Posted: 10 Sep, 2021

Difficulty: Hard

# Page Faults

Posted: 12 Nov, 2021

Difficulty: Moderate

# Subarrays With Zero Sum

Posted: 1 Dec, 2021

Difficulty: Easy

# Smaller Than Triplet Sum

Posted: 1 Dec, 2021

Difficulty: Moderate