Wednesday, November 23, 2011

How String comparison works in Java

"How does String comparison work in Java?" is one of the most common questions interviewers ask these days, and this post explains the answer in detail. String is one of Java's most admired features and surely one of the most used classes in the whole language; I hope none of you will deny that. Sadly, most programmers fail to answer this question, either because they don't take String seriously or because they don't take Java seriously. So, let's start learning.

There are two things to consider when we talk about String comparison:
1. "==", the normal Java comparison operator
2. ".equals()", a method declared in the Object class and overridden by String

"==" in Java is used to compare two values. When I say values, I mean either primitive values or object references. Since our focus is specifically on the String class, we take objects alone into consideration. So, what really happens when you compare two objects using the == operator? It's simple: Java checks whether both references point to the same object, that is, the same memory location. If they do, it returns boolean true; if they point to different objects, it returns boolean false, even when the contents are identical. Don't worry about the details right now, as I will explain them with an example.

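To see the reference check in isolation, here is a minimal sketch. The class name ReferenceComparisonDemo and the helper sameObject are made up for this illustration, and StringBuilder is used only because every "new" is guaranteed to give a separate object.

```java
// Hypothetical demo class; the names are illustrative only.
class ReferenceComparisonDemo {

    // Returns true only when both references point to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        int x = 5;
        int y = 5;
        // For primitives, == compares the values themselves.
        System.out.println("x == y : " + (x == y));

        StringBuilder first = new StringBuilder("AAA");
        StringBuilder second = new StringBuilder("AAA"); // same text, separate object
        StringBuilder alias = first;                      // second name for the first object

        // For objects, == compares references, not contents.
        System.out.println("first == second : " + sameObject(first, second));
        System.out.println("first == alias  : " + sameObject(first, alias));
    }
}
```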
".equals()" in Java on the other hand is actually used to compare the contents of two objects. When I say contents of the objects, I mean value within the string and not the memory location to which the objects point to. In this kind of comparison if Java finds that the value in two different string objects are same, they return boolean true else boolean false. As simple as that isn't it..! Just give this below example a try in your favorite editor,

public class StringTestOne {
    public static void main(String[] args) {
        String s1 = "AAA";
        String s2 = "AAA";
        String s3 = "BBB";
       
        if(s1==s2){
            System.out.println("s1 == s2 : TRUE");
        } else {
            System.out.println("s1 == s2 : FALSE");
        }
        if(s1.equals(s2)){
            System.out.println("s1.equals(s2) : TRUE");
        } else {
            System.out.println("s1.equals(s2) : FALSE");
        }
       
        if(s1==s3){
            System.out.println("s1 == s3 : TRUE");
        } else {
            System.out.println("s1 == s3 : FALSE");
        }
        if(s1.equals(s3)){
            System.out.println("s1.equals(s3) : TRUE");
        } else {
            System.out.println("s1.equals(s3) : FALSE");
        }
    }
}
Now let us trace the program:

-First, the fourth if statement: it results in FALSE because the .equals() method compares the contents of the two objects, i.e. "AAA" and "BBB" (no human on earth can prove that "AAA" is equal to "BBB"), so it's very clear why it returned FALSE.

-Then the third if statement: it returns FALSE because the == operator compares the memory locations that s1 and s3 refer to, not their values. "AAA" and "BBB" are two different values, so Java has to keep two different String objects for them, and s1 and s3 point to two different places in memory. So, expected result FALSE. Hmm, you are smart..!

-Then the second if statement: again as expected, the values of s1 and s2 are the same, and hence the .equals() method will and should return TRUE.

-Now comes the first if statement: why on earth does it return TRUE when there appear to be two different objects, s1 and s2? Yes, I agree that they both have the same value, but Java compares the object handles when we use the == operator. Isn't that what we have learned till now? Very much true, and that's the reason we have this post today.

This TRUE in the first if statement comes from the way Java stores string literals, which is closely tied to the String class being immutable. You have certainly heard your college professors say that String is immutable. That is what makes Strings so special in Java, and it is also why some programmers still find String comparison confusing. Don't worry, it is explained clearly below.

Internally Java uses the concept of a String pool in order to use strings efficiently. Whenever a string literal (text written in double quotes) appears in your program, Java first searches the String pool to see whether a string with the same value already exists. If it does, Java takes no permission from you (String is final and immutable, so sharing is safe) and points your new variable to the same memory location the earlier one was pointing to. If there is no such value in the pool, it makes a new entry. Note that this applies to literals; a string created explicitly with new String("AAA") is always a separate object. Let's take the earlier examples,

String s1 = "AAA"; //This will first search the String pool to see if such a value already exists there. As there is no such string value yet, it adds the value to Java's String pool for the first time and assigns its memory location to the object handle s1.
String s2 = "AAA"; //This will also first search the String pool for an entry with the value "AAA". As it finds one, it just reuses that memory location and assigns it to s2 as well.

Which means that, even though we have declared two variables s1 and s2, Java has done some internal optimization to ensure that it doesn't waste memory unnecessarily, and thus s1 and s2 point to the same memory location where the value "AAA" is actually stored. That is the reason why we got TRUE in the first if statement earlier. And that is why it matters that String is immutable: since the value can never change, sharing one object between s1 and s2 is perfectly safe.
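The pool behaviour described above can be checked with a small sketch. The class name StringPoolDemo and the helper sameObject are made up for this illustration. It compares a pooled literal with a string built through new String(...), which is always a separate object, and with the result of intern(), which returns the pooled copy.

```java
// Hypothetical demo class; the names are illustrative only.
class StringPoolDemo {

    // True when both references point to the same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "AAA";             // placed in (or taken from) the String pool
        String sameLiteral = "AAA";         // reuses the pooled object
        String viaNew = new String("AAA");  // explicitly creates a separate object
        String interned = viaNew.intern();  // asks for the pooled copy

        System.out.println("literal == sameLiteral : " + sameObject(literal, sameLiteral)); // true
        System.out.println("literal == viaNew      : " + sameObject(literal, viaNew));      // false
        System.out.println("literal.equals(viaNew) : " + literal.equals(viaNew));           // true
        System.out.println("literal == interned    : " + sameObject(literal, interned));    // true

        // Immutability: concat() returns a new String and leaves the shared one untouched.
        String longer = literal.concat("B");
        System.out.println("longer      : " + longer);      // AAAB
        System.out.println("sameLiteral : " + sameLiteral); // AAA
    }
}
```

The practical takeaway is to use .equals() whenever you care about the text, because whether two references share a pooled object depends on how each string was created.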

Please feel free to comment with your feedback on this post; your valuable suggestions are always welcome and help educate others.
