Comparing Files In Java

By Peter Hill, 15 November, 2018

Java is one of the most widely used programming languages, and many programmers now prefer it over C++. I am not saying this only as a passionate Java developer: over the last 20 years, Java has proved its efficiency against other programming languages. Today, Android app development gives even more reason to learn Java.

When you are learning Java, network programming is a great way to practice. You can use a raw socket connection to copy a specific file from a client to a server, which requires a solid understanding of Java NIO. I have seen many Java programmers struggle to compare files in Java.

Before fetching a file, you may want to check whether a duplicate version already exists on the client or the server. This can happen when your client and server run on the same machine. In that case, you need to compare the files so that you can copy them under a different name. Comparing files in Java also helps you tell local and remote files apart. You can quickly identify duplicate content, which lets you remove the redundant file entirely.

In this tutorial, I am going to show you how to compare files in Java. You can use any IDE to write the code. I am using a deliberately simple approach to compare two files. With this program, you will not only learn whether the files have the same content, but also see where the first difference is.

Steps

Step 1: First, define two BufferedReader objects, reader1 and reader2. These objects read the files you choose.

BufferedReader reader1 = new BufferedReader(new FileReader("Pass the path of file1 here"));

BufferedReader reader2 = new BufferedReader(new FileReader("Pass the path of file2 here"));

Step 2: Now initialize the boolean variable areEqual to true and the integer variable lineNum to 1. areEqual flags whether a difference has been found between the two files, whereas lineNum holds the number of the line currently being compared.

boolean areEqual = true;

int lineNum = 1;

Step 3: Read the first line of file1 and file2 into line1 and line2 respectively. The loop in the next step keeps reading until both files are exhausted.

String line1 = reader1.readLine();

String line2 = reader2.readLine();

Step 4: If either line1 or line2 is null, one file has ended before the other, so assign false to areEqual and break out of the loop. If neither is null, compare them using the equalsIgnoreCase() method, which ignores differences in letter case (use equals() if you need a case-sensitive comparison). Continue the loop if equalsIgnoreCase() returns true; otherwise assign false to areEqual and break out of the loop.

while (line1 != null || line2 != null)
{
        if(line1 == null || line2 == null)
        {
                areEqual = false;
                break;
        }
        else if(! line1.equalsIgnoreCase(line2))
        {
                areEqual = false;
                break;
        }

        line1 = reader1.readLine();
        line2 = reader2.readLine();
        lineNum++;
}

Step 5: If areEqual is still true after the loop, the two files have the same content; otherwise they differ.

Final Java Code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
 
public class CompareTextFiles
{	
    public static void main(String[] args) throws IOException
    {	
        BufferedReader reader1 = new BufferedReader(new FileReader("C:\\file1.txt"));
         
        BufferedReader reader2 = new BufferedReader(new FileReader("C:\\file2.txt"));
         
        String line1 = reader1.readLine();
         
        String line2 = reader2.readLine();
         
        boolean areEqual = true;
         
        int lineNum = 1;
         
        while (line1 != null || line2 != null)
        {
            if(line1 == null || line2 == null)
            {
                areEqual = false;
                 
                break;
            }
            else if(! line1.equalsIgnoreCase(line2))
            {
                areEqual = false;
                 
                break;
            }
             
            line1 = reader1.readLine();
             
            line2 = reader2.readLine();
             
            lineNum++;
        }
         
        if(areEqual)
        {
            System.out.println("Two files have same content.");
        }
        else
        {
            System.out.println("Two files have different content. They differ at line "+lineNum);
             
            System.out.println("File1 has "+line1+" and File2 has "+line2+" at line "+lineNum);
        }
         
        reader1.close();
         
        reader2.close();
    }
}
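
The same loop can also be packaged as a reusable method. The following is only a minimal sketch, not part of the original program: the class name LineCompare, the method firstDifferingLine, the ignoreCase flag, and the choice of UTF-8 are all assumptions made for illustration. It uses try-with-resources so that both readers are closed even if reading fails, and it returns the line number instead of printing it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative only
class LineCompare {

    /**
     * Returns the 1-based number of the first line where the two files differ,
     * or -1 if every line matches and both files end together.
     */
    static long firstDifferingLine(Path file1, Path file2, boolean ignoreCase) throws IOException {
        // try-with-resources closes both readers, even when an exception is thrown
        try (BufferedReader reader1 = Files.newBufferedReader(file1, StandardCharsets.UTF_8);
             BufferedReader reader2 = Files.newBufferedReader(file2, StandardCharsets.UTF_8)) {
            long lineNum = 1;
            String line1 = reader1.readLine();
            String line2 = reader2.readLine();
            while (line1 != null || line2 != null) {
                // One file ended before the other
                if (line1 == null || line2 == null) {
                    return lineNum;
                }
                boolean same = ignoreCase ? line1.equalsIgnoreCase(line2) : line1.equals(line2);
                if (!same) {
                    return lineNum;
                }
                line1 = reader1.readLine();
                line2 = reader2.readLine();
                lineNum++;
            }
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for real input so the sketch runs anywhere
        Path a = Files.createTempFile("compare-a", ".txt");
        Path b = Files.createTempFile("compare-b", ".txt");
        try {
            Files.write(a, "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8));
            Files.write(b, "alpha\nBETA\ndelta\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(firstDifferingLine(a, b, true));   // prints 3
            System.out.println(firstDifferingLine(a, b, false));  // prints 2
        } finally {
            Files.deleteIfExists(a);
            Files.deleteIfExists(b);
        }
    }
}
```

Returning -1 for "no difference" keeps the method usable from other code, and the ignoreCase flag makes the case-insensitive behavior of the original loop an explicit choice.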

This program is simple, but it can be slow on large files, because every line is decoded into a String on the heap before it is compared. If you're looking for a faster and more efficient solution, you can use a more advanced technique called memory mapping. Memory-mapped files are read directly by your operating system. The OS allocates the memory itself, so the file contents don't consume the Java heap.

To memory-map the two files, open them with the RandomAccessFile class and ask each object for its channel. From that channel you can create a MappedByteBuffer, which represents the memory area holding your file's contents.

The program below outlines what you need to do.

package packt.java9.network.niodemo;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class MapCompare {
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        FileChannel ch1 = new RandomAccessFile("sample.txt", "r").getChannel();
        FileChannel ch2 = new RandomAccessFile("sample-copy.txt", "r").getChannel();
        if (ch1.size() != ch2.size()) {
            System.out.println("Files have different length");
            return;
        }

        long size = ch1.size();
        ByteBuffer m1 = ch1.map(FileChannel.MapMode.READ_ONLY, 0L, size);
        ByteBuffer m2 = ch2.map(FileChannel.MapMode.READ_ONLY, 0L, size);
        for (int pos = 0; pos < size; pos++) {
            if (m1.get(pos) != m2.get(pos)) {
                System.out.println("Files differ at position " + pos);
                return;
            }
        }

        System.out.println("Files are identical, you can delete one of them.");
        long end = System.nanoTime();
        System.out.print("Execution time: " + (end - start) / 1000000 + "ms");
    }
}
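
One limitation of MapCompare is that a single call to FileChannel.map can cover at most Integer.MAX_VALUE bytes, so it only works for files up to about 2 GB, and its channels are never closed. The sketch below is one possible way around both points, under stated assumptions: the class name ChunkedMapCompare, the method firstDifference, and the 64 MB chunk size are hypothetical choices, not part of the original program. It maps the files one chunk at a time and returns the byte offset of the first difference.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; the names and the chunk size are illustrative only
class ChunkedMapCompare {

    // Assumed chunk size; each mapping must not exceed Integer.MAX_VALUE bytes
    static final long CHUNK = 64L * 1024 * 1024;

    /** Returns the byte offset of the first difference, or -1 if the files are identical. */
    static long firstDifference(Path p1, Path p2) throws IOException {
        // try-with-resources closes both channels when the comparison is done
        try (FileChannel ch1 = FileChannel.open(p1, StandardOpenOption.READ);
             FileChannel ch2 = FileChannel.open(p2, StandardOpenOption.READ)) {
            long size1 = ch1.size();
            long size2 = ch2.size();
            long common = Math.min(size1, size2);
            for (long offset = 0; offset < common; offset += CHUNK) {
                int length = (int) Math.min(CHUNK, common - offset);
                MappedByteBuffer m1 = ch1.map(FileChannel.MapMode.READ_ONLY, offset, length);
                MappedByteBuffer m2 = ch2.map(FileChannel.MapMode.READ_ONLY, offset, length);
                for (int i = 0; i < length; i++) {
                    if (m1.get(i) != m2.get(i)) {
                        return offset + i;
                    }
                }
            }
            // Same bytes so far: the files differ only if one is longer than the other
            return size1 == size2 ? -1 : common;
        }
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("map-a", ".bin");
        Path b = Files.createTempFile("map-b", ".bin");
        // deleteOnExit avoids deleting a file while it may still be mapped
        a.toFile().deleteOnExit();
        b.toFile().deleteOnExit();
        Files.write(a, "hello world".getBytes(StandardCharsets.US_ASCII));
        Files.write(b, "hello wOrld".getBytes(StandardCharsets.US_ASCII));
        System.out.println(firstDifference(a, b)); // prints 7
        System.out.println(firstDifference(a, a)); // prints -1
    }
}
```

Unlike MapCompare, this sketch does not stop early when the lengths differ. If one file is a prefix of the other, it reports the length of the shorter file as the position of the first difference.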

You can also use the java-diff-utils library to do this job without writing the comparison logic yourself. Follow this post carefully if you prefer to do it efficiently on your own.

Java programs often run on both the server and the client at the same time. This opens up the possibility of duplicate files that waste memory and processing power. Eliminating such duplicates is essential for keeping workstations efficient and productive.

You can use the programs above to find such duplicate files easily. The first program is more than enough for small files. However, for very large files, the memory-mapped approach is the better choice.
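
If all you need is a yes/no duplicate check, the standard library may already be enough. The sketch below assumes Java 12 or newer, where Files.mismatch is available. The class name MismatchDemo and the method sameContent are hypothetical names used only for illustration. Note that this compares raw bytes, so unlike the first program it is case-sensitive.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; assumes Java 12+ for Files.mismatch
class MismatchDemo {

    /** True when both paths hold exactly the same bytes. */
    static boolean sameContent(Path p1, Path p2) throws IOException {
        // Files.mismatch returns -1 when there is no mismatch,
        // otherwise the position of the first differing byte
        return Files.mismatch(p1, p2) == -1L;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for real input so the sketch runs anywhere
        Path a = Files.createTempFile("dup-a", ".txt");
        Path b = Files.createTempFile("dup-b", ".txt");
        try {
            Files.write(a, "same bytes".getBytes(StandardCharsets.UTF_8));
            Files.write(b, "same bytes".getBytes(StandardCharsets.UTF_8));
            System.out.println(sameContent(a, b)); // prints true
            Files.write(b, "Same bytes".getBytes(StandardCharsets.UTF_8));
            System.out.println(sameContent(a, b)); // prints false
        } finally {
            Files.deleteIfExists(a);
            Files.deleteIfExists(b);
        }
    }
}
```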
