Pattern matching using java.util.regex

The java.util.regex package has been available since Java 1.4. It is used to match character sequences against a pattern specified by a regular expression. Its two main classes are Pattern and Matcher. An instance of the Pattern class represents a regular expression that is specified in string form, in a syntax similar to that used by Perl.

Instances of the Matcher class are used to match character sequences against a given pattern. Input is provided to matchers via the CharSequence interface in order to support matching against characters from a wide variety of input sources.
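
To illustrate the CharSequence point, here is a minimal sketch that runs one compiled Pattern against a String, a StringBuilder, and a CharBuffer. The class name CharSequenceDemo, the helper firstNumber, and the sample inputs are made up for this illustration and are not part of the original post.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceDemo {
    // Compiled once and reused; Pattern instances are immutable.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the first run of digits, or null if none.
    static String firstNumber(CharSequence input) {
        Matcher matcher = DIGITS.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // The same method accepts any CharSequence implementation.
        System.out.println(firstNumber("room 101"));                      // 101
        System.out.println(firstNumber(new StringBuilder("order 42")));   // 42
        System.out.println(firstNumber(CharBuffer.wrap("year 2024".toCharArray()))); // 2024
    }
}
```

Because the matcher only needs a CharSequence, the input does not have to be copied into a String first.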

Here is a simple example. Suppose you have to extract parts of a string that follows a fixed pattern, for example this simple greeting string:

“Hi Yogesh, Welcome to Hyderabad.”

In this sentence we have to extract two pieces of information: the name and the location (Yogesh and Hyderabad). First of all, we need to define a pattern that captures them. Java supports regular expressions for pattern matching; go through the Java Pattern documentation to learn more about the syntax.

[Figure: building the pattern for the sample text]

The figure above shows how to build a pattern for the given text in order to get specific parts out of it. Here we use grouping with ‘(’ and ‘)’ to capture exactly what we want. Sample code is below.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public class TestPattern {
    public static void main(String[] args) {
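        // Each (.*) is a capturing group. The trailing '.' is a regex
        // metacharacter here, so it matches any single character.
        String patternString = "Hi (.*), Welcome to (.*).";
        String sampleText =    "Hi Yogesh, Welcome to Hyderabad(India).";
        // Create a Pattern from the pattern string.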
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(sampleText);
        if(matcher.find()){
            System.out.println("Name = " + matcher.group(1));
            System.out.println("Location = " + matcher.group(2));
        }else{
            System.out.println("Didn't find anything");
        }
    }
}

Output of the above program:

Name = Yogesh
Location = Hyderabad(India)

 

As per the pattern string, the sample text is divided into three groups (group 0, group 1, and group 2), as shown in the figure below.

[Figure: the sample text divided into groups 0, 1, and 2]

By default, group 0 is the entire match, and the other groups are numbered from left to right by their opening parenthesis. As a result, the program prints “Yogesh” for group 1 and “Hyderabad(India)” for group 2.
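
The group numbering can be checked with a short sketch that lists every group of a match, using Matcher.groupCount() to find out how many capturing groups the pattern has. The class name GroupNumberingDemo and the helper groups are assumptions made for this illustration; the pattern and sample text are the ones from the program above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns group 0 (the whole match) followed by each
    // capturing group, or an empty array when the pattern is not found.
    static String[] groups(String regex, CharSequence text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            return new String[0];
        }
        // groupCount() does not include group 0, hence the + 1.
        String[] result = new String[matcher.groupCount() + 1];
        for (int i = 0; i <= matcher.groupCount(); i++) {
            result[i] = matcher.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] found = groups("Hi (.*), Welcome to (.*).",
                                "Hi Yogesh, Welcome to Hyderabad(India).");
        // Prints:
        // group 0 = Hi Yogesh, Welcome to Hyderabad(India).
        // group 1 = Yogesh
        // group 2 = Hyderabad(India)
        for (int i = 0; i < found.length; i++) {
            System.out.println("group " + i + " = " + found[i]);
        }
    }
}
```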


Escaping literals: Some text contains characters that have a special meaning in regular expressions, such as ‘.’, ‘(’, ‘)’, and ‘+’. The Pattern class provides a utility method, Pattern.quote(), to escape such characters so that they are matched literally.
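
As a rough sketch of how this could look, the example below quotes the fixed parts of a greeting that itself contains parentheses and a dot, and leaves only the two capturing groups as real regular expression syntax. The class name QuoteDemo, the helper extract, and the “Welcome(s)” sample text are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: returns {name, location}, or null if the text
    // does not have the expected shape.
    static String[] extract(String text) {
        // Pattern.quote() turns each fixed fragment into a literal, so the
        // "(s)" and the final "." are not treated as regex syntax.
        String regex = Pattern.quote("Hi ") + "(.*)"
                + Pattern.quote(", Welcome(s) to ") + "(.*)"
                + Pattern.quote(".");
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = extract("Hi Yogesh, Welcome(s) to Hyderabad(India).");
        System.out.println("Name = " + parts[0]);      // Name = Yogesh
        System.out.println("Location = " + parts[1]);  // Location = Hyderabad(India)
    }
}
```

Without the quoting, “(s)” would be read as a third capturing group and the literal parentheses in the text would never match.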

Comments

  1. import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class TestPattern {
        public static void main(String[] args) {
            //String patternString = "Hi (.*), Welcome(s) to (.*).";
            //String sampleTest = "Hi Yogesh, Welcome(s) to Hyderabad(India).";
            // Pattern.quote() makes "(Hello)" match literally, not as a group.
            String patternString = Pattern.quote("(Hello)") + " (.*)";
            Pattern pattern = Pattern.compile(patternString);
            Matcher matcher = pattern.matcher("(Hello) Yogesh");
            if (matcher.find()) {
                System.out.println("Name = " + matcher.group(1));
                //System.out.println("Location = " + matcher.group(2));
            } else {
                System.out.println("Didn't find anything");
            }
        }
    }
