3. Longest Substring Without Repeating Characters
Given a string, find the length of the longest substring without repeating characters.
First time i get a very tedious algorithm, because of i did not thinking about the problem carefully in the beginning, it causes that i hava to fix bug again and again because of the situation i haven’t think about . this makes code become very redundant .
The solution for the first time
i have thinking about did i have problems in the way i tried before, the twists before let me realize to think carefully and then write the code,i am not going to analyze this code, because i almost can’t understand what i writen, I even embarrassed to write the annotate, just leave A lesson to me. The next step i will optimize and rethink this approach.
The solution for the second time
After the optimize ,the code is very clear. I haven’t thought that there is a Math.max () method can be used util i re-conceived the code, still need more sensitivity of the api.
The principle of this method is to traverse each character, in order to avoid nest loop, we use the hashmap to replace the time cost of space costs,determine whether there is a character in front by hashmap, and store his index number information. How do we get the longest substring? In fact, it can be divided into the following situations:
We use (p) to represent the previous repeat character, (n) to represent the latter repeat character.
- The character a is the first repeat character, and the length should be from the beginning to the location a(n)-1.
- the character a is not the first repeat character, the last repeat character is b, if a(p) is behind b(p) then the length is between two a, if a(p) is in front of b(p), then the length is between a(n) and b(p) (because a(n) to a(p) contains two b, which is irregular)
- This character does not repeat, the length is the character to location of the last repeat character a(p)+1.
Sort out the current situation util now .in fact, the conditions above can be sorted into one way to calculate, because the sub-string must be continuous, so it is to calculate the length between current character and the repeat character previous one ,and then get the largest length.
Time complexity: O(n).
Space complexity: O(n).
and then we look some type solution of this kind problem.
Approach 1：Brute Force
The principle of this method is very simple, the outermost loop L1 traverse of every character a, nest a loop L2 traverse of b (all the characters which is after a), use another loop with the hashset to traverse all the string between a and b which is beginning with a, and determine that if it is a string with no repeat character. Nested a three-tier loop! it lead to the results of Time Limit Exceeded, that means your method is too time-consuming, is not a good method.
Time complexity: O(n^3) . The first is the innermost [i, j) times cycle, the time complexity is O(j-i). And then is every cycle of j which takes [i + 1, n),
Finally count the outermost loop [0, n). The time complexity is
This transformation process is actually the use of the sum of the first n terms of an arithmetic progression, is too difficult to print the formula on my computer, i am not going to print the detail of the steps .
Space complexity: O(min(n,m)) . n is the length of the string,m is the set of the size of charset.Because there are m didn’t repeat value at most,so the hashset’s size will be m at most.
This method can make a little optimization logically , the main idea is that if from a to b is already containing repeated characters, then a to the characters after b must also contain repeated characters. You can omit those comparisons.
Approach 2：Sliding Window（K-Nearest Neighbor algorithm）
Method 1 is repeatly to check each sub-string has duplicate character,but in fact, it is unnecessary, the second method is avoid repeated checks, when i ~ j has not repeat, we only need to check s[j + 1] is already in i ~ j characters . Until the characters of j + 1 is already in i~j, we get the longest non-repetitive substring which start with i character. we can stop the calculation When j is not less than n , because at this time [i, j) must be longer than [i + 1, j).
Time complexity: O(2n) = O(n)
Space complexity: O(min(m,n)) n is the length of the string,m is the size of the charset.Because there are m didn’t repeat value at most,so the hashset’s size will be m at most.
####Approach 3:Sliding Window Optimized
This method is the improvement of the method two, we use the HashMap to save the characters and his position, when there are duplicate characters appeared, we can directly locate the location of the repeated characters (Note that the location here and the program’s index number is different , it’s first is 1 rather than our program is 0.), compared to the previous i and take a larger value assigned to the new i. Why should we take a large value can see the above solution for the second time.
Time complexity： O(n)
Space complexity： O(min(m,n)) the same to the method 2
The most efficient one, the reason of this method can run more fast than the previous hashmap while they have the same complexity, simply said, in fact of most collection is composed of arrays, hashmap is actually composed by the array and the list , So the array’s search speed will be better than the hashmap search speed.
This method’s principle is almost the same as the previous method which is using hashmap, but this method uses int to store key-value pairs, using characters to represent subscripts (yes, characters can be used as subscripts, exactly is Integer character constants can be used as subscript, it will be parsed to ASCII value), the location use as value.
Time complexity： O(n)
Space complexity： O(m) m is the charset’s size.
If you have a better method or have other opinions on the description of me here, please contact me. Thank you.
My blog leonchen1024.com
My Github https://github.com/LeonChen1024