Pages

Tuesday, 2 February 2016

Algorithm to check that String has all unique character

Implement an algorithm to determine if a string has all unique characters without using any additional data structures?

Time complexity is O(n), where n is the length of the string, and space complexity is O(n).
package crack.coding.interview;

public class NoCharRptInStr {
     /**
      * @param str
      * @return
      */
     private static boolean isStringWithUniqueChar(String str) {
           /** Checking for null. */
           if(str == null) {
                return false;
           }
          
           boolean result = true;
           char[] array = str.toCharArray();
           boolean[] isCharExt = new boolean[256];

           for (int i = 0; i < array.length; i++) {
                int charASCII = array[i];
               
                /** Return false for repeating character. */
                if(isCharExt[charASCII]) {
                     return false;
                }
                isCharExt[charASCII] = true;
           }
           return result;
     }

     /**
      * Driver Method
      * @param args
      */
     public static void main(String[] args) {
           String str = "geksfor";
           boolean res = isStringWithUniqueChar(str);
           System.out.println("String contains unique count: "+res);
     }
}

Using Bit Vector:
We can reduce our space usage a little bit by using a bit vector. We will assume, in the below code, that the string is only lower case ‘a’ through ‘z’. This will allow us to use just a single int.
package crack.coding.interview;
public class NoCharRptInStr1 {

     public static boolean isUniqueChars(String str) {
           /**
            * Checker is used as a bitmap to indicate
            * which characters have been seen already.
            **/
           int checker = 0;
          
           for (int i = 0; i < str.length(); i++) {
                /**
                 * Set val to be the difference between the char at i and 'a'
                 * unicode 'a' is 97 if you have an upper-case letter e.g.
                 * 'A' you will get a negative 'val' which is illegal
                 */
                int val = str.charAt(i) - 'a';
               
                /**
                 * if this lowercase letter has been seen before, then the corresponding
                 * bit in checker will have been set and we can exit immediately.
                 **/
                int bitPos = 1 << val;
                if ((checker & bitPos)> 0) {
                     return false;
                }
                /**
                 * set the bit to indicate we have now seen the letter.
                 */
                checker = checker | bitPos;
           }
           /** none of the characters has been seen more than once. */
           return true;
     }

     /**
      * Driver Method
      * @param args
      */
     public static void main(String[] args) {
           String str = "geksfor";
           boolean res = isUniqueChars(str);
           System.out.println("String contains unique count: "+res);
     }
}



Imagine that you have an array of 26(=ASCII value of Character-‘a’) Booleans, each one tracking whether a particular character has appeared in the string already. You start with all of them false (=0).
Now iterate all the characters of the string and each time you see a character you look into the array slot for that character.

If it's false, this is the first time you've seen the character and you can set the slot to true.

If it's true, you've already seen this character and you can immediately report that there's a duplicate.

We have been opted a clever trick to mark the appearance of the character. Since there are only 26 different characters possible and there are 32 bits in an int, the solution creates an int variable where each bit of the variable corresponds to one of the characters in the string.

Instead of reading and writing an array, the solution reads and writes the bits of the number.
For example, look at this line:
int bitPos = 1 << val;
if ((checker & bitPos)> 0) {
       return false;
}

What does checker & bitPos do?

Well, 1 << val creates an int value that has all bits zero except for the valth bit. It then uses bitwise AND to AND this value with checker. If the bit at position val in checker is already set, then this evaluates to a nonzero value (meaning we've already seen the number) and we can return false. Otherwise, it evaluates to 0, and we haven't seen the number.

The next line is this:
checker = checker | bitPos;

This uses the "bitwise OR with assignment" operator, which is equivalent to
checker = checker | bitPos;

This ORs checker with a value that has a 1 bit set only at position val, which turns the bit on. It's equivalent to setting the valth bit of the number to 1.

No comments:

Post a Comment