HackerRank Java: Tag Content Extractor
Given a string of text in a tag-based language, parse the text and retrieve the contents enclosed within sequences of well-organized tags that meet the following criteria:
The names of the start and end tags must be the same. The HTML code <h1>Hello World</h2> is not valid, because the text starts with an h1 tag and ends with a non-matching h2 tag.
Tags can be nested, but content between nested tags is considered not valid. For example, in <h1><a>contents</a>invalid</h1>, contents is valid but invalid is not valid.
Tags can consist of any printable characters.
    import java.util.*;
    import java.util.regex.*;

    public class Solution {
        public static void main(String[] args) {
            // Group 1: the tag name (lazy, so it is as short as a match allows).
            // Group 2: the content; it may not contain '<' or '>', so it cannot hold nested tags.
            // \1: the closing tag must repeat the opening tag's name exactly.
            Pattern r = Pattern.compile("<(.+?)>([^<>]+)</\\1>");
            Scanner in = new Scanner(System.in);
            int testCases = Integer.parseInt(in.nextLine().trim());

            while (testCases-- > 0) {
                String line = in.nextLine();
                Matcher m = r.matcher(line);

                // Print each valid content on its own line, or "None" if the line has none.
                boolean invalid = true;
                while (m.find()) {
                    System.out.println(m.group(2));
                    invalid = false;
                }
                if (invalid) {
                    System.out.println("None");
                }
            }
        }
    }
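
To check the pattern against the examples from the problem statement without typing them on standard input, the matching loop can be pulled into a helper. This is only a sketch: the class name `TagExtractorDemo` and the method `extract` are made up for illustration and are not part of the HackerRank template. The regular expression is the same one used in the solution above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption, not part of the original solution.
class TagExtractorDemo {
    // Same pattern as the solution: <name>content</name>, where content has no '<' or '>'.
    private static final Pattern TAGGED = Pattern.compile("<(.+?)>([^<>]+)</\\1>");

    // Returns every valid content found in the line, in order of appearance.
    // An empty list corresponds to the "None" case in the solution.
    static List<String> extract(String line) {
        List<String> contents = new ArrayList<>();
        Matcher m = TAGGED.matcher(line);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        // Mismatched start and end tags: nothing is extracted.
        System.out.println(extract("<h1>Hello World</h2>"));            // prints []
        // Nested tags: only the innermost content counts.
        System.out.println(extract("<h1><a>contents</a>invalid</h1>")); // prints [contents]
        // Two well-formed tags on one line: both contents are returned.
        System.out.println(extract("<h1>one</h1><p>two</p>"));          // prints [one, two]
    }
}
```

Returning a list instead of printing inside the loop keeps the matching logic separate from the input and output handling, so each rule from the statement can be checked with a single call.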