HackerRank Java - Tag Content Extractor

Given a string of text in a tag-based language, parse the text and retrieve the contents enclosed within sequences of well-organized tags that meet the following criteria:

The name of the start and end tags must be the same. The HTML code

<h1>Hello World</h2>

is not valid, because the text starts with an h1 tag and ends with a non-matching h2 tag.

Tags can be nested, but content between nested tags is considered not valid. For example, in

<h1><a>contents</a>invalid</h1>

the text contents is valid but the text invalid is not.

Tags can consist of any printable characters.

    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Solution {
        public static void main(String[] args) {
            // <(.+?)>  : opening tag; group 1 lazily captures the tag name
            // ([^<>]+) : group 2 captures non-empty content with no nested tags
            // </\\1>   : closing tag whose name equals group 1 (backreference)
            Pattern r = Pattern.compile("<(.+?)>([^<>]+)</\\1>");
            Scanner in = new Scanner(System.in);
            int testCases = Integer.parseInt(in.nextLine().trim());
            while (testCases-- > 0) {
                String line = in.nextLine();
                Matcher m = r.matcher(line);
                boolean invalid = true; // stays true until a valid tag pair is found
                while (m.find()) {
                    System.out.println(m.group(2));
                    invalid = false;
                }
                if (invalid) {
                    System.out.println("None");
                }
            }
        }
    }
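
To see how the regular expression applies each rule, here is a minimal sketch that wraps the same pattern in a helper method and runs it on the examples from the problem statement. The class name TagExtractorSketch, the method name extract, and the third input (a tag name containing a space) are assumptions made for illustration; they are not part of the original solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
public class TagExtractorSketch {
    // Same pattern as in the solution above.
    private static final Pattern TAG_PAIR =
            Pattern.compile("<(.+?)>([^<>]+)</\\1>");

    // Returns the content of every matching tag pair, in order of appearance.
    public static List<String> extract(String line) {
        List<String> contents = new ArrayList<>();
        Matcher m = TAG_PAIR.matcher(line);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        // Mismatched tag names: the backreference \1 fails, so nothing matches.
        System.out.println(extract("<h1>Hello World</h2>"));            // prints []
        // Nested tags: only the innermost tag-free content is returned.
        System.out.println(extract("<h1><a>contents</a>invalid</h1>")); // prints [contents]
        // Made-up input: tag names may contain any printable character.
        System.out.println(extract("<my tag>text</my tag>"));           // prints [text]
    }
}
```

An empty list here corresponds to the "None" output in the solution. Returning a list rather than printing directly is only a convenience that makes the behaviour easier to check.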

