Java program to find max repeated words from a file

By Dhiraj Ray 01 January, 2018

Description

In Java interviews, this problem can be asked in several ways: write a program to find the most repeated word, to find the duplicate words, or to count the occurrences of each duplicate word. Whatever the wording, the core idea is the same: count the occurrences of each word in a .txt file. To solve this programmatically, we can use a Map implementation in Java, which does not allow duplicate keys. Each word is a key and its count is the value, so at the end of the iteration the map holds the count of every word. The complete program follows. Here, we use a Java 8 lambda expression for sorting.

MaxRepeatedWord.java
package com.devglan;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.StringTokenizer;

public class MaxRepeatedWord {

    // Counts how many times each word occurs in the file (case-insensitive).
    public Map<String, Integer> getWordsCount(String fileName){

        Map<String, Integer> wordMap = new HashMap<>();
        // try-with-resources closes the reader automatically
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName)))) {
            String line;
            while((line = br.readLine()) != null){
                StringTokenizer st = new StringTokenizer(line, " ");
                while(st.hasMoreTokens()){
                    String tmp = st.nextToken().toLowerCase();
                    if(wordMap.containsKey(tmp)){
                        wordMap.put(tmp, wordMap.get(tmp)+1);
                    } else {
                        wordMap.put(tmp, 1);
                    }
                }
            }
        } catch (IOException e) {
            // also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
        return wordMap;
    }

    // Returns the map entries sorted by count, highest first.
    public List<Entry<String, Integer>> sortByValue(Map<String, Integer> wordMap){

        Set<Entry<String, Integer>> set = wordMap.entrySet();
        List<Entry<String, Integer>> list = new ArrayList<>(set);
        Collections.sort(list, (o1, o2) -> o2.getValue().compareTo(o1.getValue()));
        return list;
    }

    public static void main(String[] args){
        MaxRepeatedWord maxRepeatedWord = new MaxRepeatedWord();
        Map<String, Integer> wordMap = maxRepeatedWord.getWordsCount("C:/test.txt");
        List<Entry<String, Integer>> list = maxRepeatedWord.sortByValue(wordMap);
        // the list is empty if the file is missing or contains no words
        if (list.isEmpty()) {
            System.out.println("No words found");
        } else {
            System.out.println("Max repeated word is " + list.get(0).getKey() + " with count - " + list.get(0).getValue());
        }
    }
}

Explanation

Tokenize each line into words and put each word into a Map. If the map already contains the word as a key, increase its count; otherwise, set the count to 1. At the end, sort the entries by value in descending order so that the most repeated word comes first.
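
The counting step described above can also be written more compactly with Java 8's `Map.merge`, and the top word can be found with a single pass instead of a full sort. The following is only a minimal sketch of that alternative, not a replacement for the program above. The class name `WordCountSketch` and the methods `countWords` and `mostFrequent` are hypothetical names chosen for illustration, and the input is an in-memory list of lines rather than a file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class, used here only to illustrate the idea.
class WordCountSketch {

    // Counts words across the given lines, ignoring case; words are split on whitespace.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    // merge() stores 1 for a new word, otherwise adds 1 to the existing count
                    counts.merge(word.toLowerCase(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Returns an entry with the highest count, or an empty Optional if no words were seen.
    static Optional<Map.Entry<String, Integer>> mostFrequent(Map<String, Integer> counts) {
        return counts.entrySet().stream().max(Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat and the dog", "The end");
        Map<String, Integer> counts = countWords(lines);
        // prints "Max repeated word is the with count - 3"
        mostFrequent(counts).ifPresent(e ->
            System.out.println("Max repeated word is " + e.getKey() + " with count - " + e.getValue()));
    }
}
```

To run this against a file, the lines could be loaded with `Files.readAllLines(Paths.get(fileName))` and passed to `countWords`. Returning an `Optional` makes the empty-file case explicit. If several words share the highest count, this sketch returns only one of them.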

