HackerRank Java - Tag Content Extractor




Given a string of text in a tag-based language, parse the text and retrieve the contents enclosed within sequences of well-organized tags meeting the following criteria:

The names of the start and end tags must be the same. The HTML code <h1>Hello World</h2> is not valid, because the text starts with an h1 tag and ends with a non-matching h2 tag.

Tags can be nested, but content between nested tags is considered not valid. For example, in <h1><a>contents</a>invalid</h1>, contents is valid but invalid is not valid.

Tags can consist of any printable characters.
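The first rule (matching start and end tag names) can be tried out in isolation with a regex backreference. The sketch below is only an illustration: the class name TagPairCheck and the helper hasMatchingPair are hypothetical names, not part of the problem statement. It checks whether a string starts with an opening tag and ends with a closing tag of the same name.

```java
import java.util.regex.Pattern;

public class TagPairCheck {
    // Group 1 captures the opening tag name; \1 requires the closing tag
    // to repeat that name exactly (the comparison is case-sensitive).
    private static final Pattern PAIR = Pattern.compile("<(.+)>.*</\\1>");

    // Hypothetical helper: true when the text starts with an opening tag
    // and ends with a closing tag carrying the same name.
    public static boolean hasMatchingPair(String text) {
        return PAIR.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasMatchingPair("<h1>Hello World</h1>")); // true
        System.out.println(hasMatchingPair("<h1>Hello World</h2>")); // false
    }
}
```

This check says nothing about nesting, so it is not a full solution on its own.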



    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Solution {
        public static void main(String[] args) {
            // <(.+?)>   opening tag; group 1 captures the tag name
            // ([^<>]+)  group 2: non-empty content with no '<' or '>' inside
            // </\1>     closing tag whose name repeats group 1
            Pattern r = Pattern.compile("<(.+?)>([^<>]+)</\\1>");
            Scanner in = new Scanner(System.in);
            int testCases = Integer.parseInt(in.nextLine());
            while (testCases-- > 0) {
                String line = in.nextLine();
                Matcher m = r.matcher(line);
                boolean invalid = true;
                // Print the content of every valid tag pair on its own line.
                while (m.find()) {
                    System.out.println(m.group(2));
                    invalid = false;
                }
                // No valid tag pair was found on this line.
                if (invalid) {
                    System.out.println("None");
                }
            }
        }
    }
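To see how the solution's pattern behaves on the two examples from the problem statement, the matching loop can be pulled into a method that returns the contents instead of printing them. This is a minimal sketch, assuming the same regex as above. The class name TagExtractorDemo and the method extract are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagExtractorDemo {
    // Same pattern as the solution: the tag name is captured in group 1,
    // the content (no '<' or '>') in group 2, and \1 forces the closing
    // tag to repeat the opening name.
    private static final Pattern TAGGED =
            Pattern.compile("<(.+?)>([^<>]+)</\\1>");

    // Hypothetical helper: collects the content of every valid tag pair.
    public static List<String> extract(String line) {
        List<String> contents = new ArrayList<>();
        Matcher m = TAGGED.matcher(line);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        // Only the innermost pair has content free of other tags.
        System.out.println(extract("<h1><a>contents</a>invalid</h1>")); // [contents]
        // Mismatched names never satisfy the backreference.
        System.out.println(extract("<h1>Hello World</h2>"));            // []
    }
}
```

An empty list here corresponds to the case where the solution prints None.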