Java - Excluding markup on lowercase letters in parentheses
A string can contain one or many lowercase letters in parentheses, for example:

    String content = "this (a) nightmare";

I want to transform the string into:

    "<centamp>this </centamp>(a) <centamp>nightmare</centamp>"

That is, add centamp markup around the string, but any lowercase letter in parentheses should be excluded from the markup.

This is what I have tried so far, but it doesn't achieve the desired result. There can be none or many parentheses in a string, and the exclusion from the markup should happen for every pair of parentheses.
    Pattern pattern = Pattern.compile("^(.*)?(\\([a-z]*\\))?(.*)?$", Pattern.MULTILINE);
    String content = "this (a) nightmare";
    System.out.println(content.matches("^(.*)?(\\([a-z]*\\))?(.*)?$"));
    System.out.println(pattern.matcher(content).replaceAll("<centamp>$1$3</centamp>$2"));
This can be done in one replaceAll call:

    String outputString = inputString.replaceAll("(?s)\\G((?:\\([a-z]+\\))*+)((?:(?!\\([a-z]+\\)).)+)", "$1<centamp>$2</centamp>");

It only allows a non-empty sequence of lowercase English letters inside the parentheses: \\([a-z]+\\).
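To check the one-liner against the input from the question, it can be wrapped in a small program. This is only a minimal sketch: the class name CentampDemo and the method name tag are made up for illustration and do not come from the answer.

```java
final class CentampDemo {
    // Hypothetical helper: wraps the replaceAll call from the answer.
    static String tag(String input) {
        return input.replaceAll(
                "(?s)\\G((?:\\([a-z]+\\))*+)((?:(?!\\([a-z]+\\)).)+)",
                "$1<centamp>$2</centamp>");
    }

    public static void main(String[] args) {
        // Input taken from the question.
        System.out.println(tag("this (a) nightmare"));
        // prints: <centamp>this </centamp>(a)<centamp> nightmare</centamp>
    }
}
```

Note that the space after (a) ends up inside the second tag, so the result differs slightly from the exact output requested in the question.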
Features:
- Whitespace-only sequences are tagged.
- There will be no tag surrounding an empty string.
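Both features can be observed with a few edge-case inputs. The following is a sketch under the same assumptions as the answer's one-liner; CentampFeatures and tag are hypothetical names used only for this illustration.

```java
final class CentampFeatures {
    // Hypothetical helper: same replaceAll call as in the answer.
    static String tag(String input) {
        return input.replaceAll(
                "(?s)\\G((?:\\([a-z]+\\))*+)((?:(?!\\([a-z]+\\)).)+)",
                "$1<centamp>$2</centamp>");
    }

    public static void main(String[] args) {
        // The single space between the two groups is tagged.
        System.out.println(tag("(a) (b)"));     // (a)<centamp> </centamp>(b)
        // Nothing lies between or around the groups, so no tag is added.
        System.out.println(tag("(a)(b)"));      // (a)(b)
        // An empty input stays empty; no empty tag is produced.
        System.out.println(tag("").isEmpty());  // true
    }
}
```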
Explanation:
- \G asserts the match boundary, i.e. the next match can only start from the end of the last match. It can also match the beginning of the string (when we have yet to find any match).
- Each match of the regex will contain a sequence of: 0 or more consecutive \\([a-z]+\\) (no space between them allowed), followed by at least 1 character that does not form a \\([a-z]+\\) sequence.
- Allowing 0 or more consecutive \\([a-z]+\\) covers the case where the string does not start with \\([a-z]+\\), and the case where the string does not contain \\([a-z]+\\) at all.
- In the pattern for this portion, (?:\\([a-z]+\\))*+, note that the + after * makes the quantifier possessive; in other words, it disallows backtracking. Simply put, it is an optimization.
- The restriction to at least one character is necessary to prevent adding a tag that encloses an empty string.
- In the pattern for this portion, (?:(?!\\([a-z]+\\)).)+, note that for every character, I check whether it is the start of the pattern \\([a-z]+\\) before matching it: (?!\\([a-z]+\\)).
- The (?s) flag causes . to match any character, including a newline. This allows a tag to enclose text that spans multiple lines.
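The walk-through above can also be spelled out with an explicit Matcher loop, which makes the two captured groups of each match visible. This is a sketch, not the answer's code: the class name CentampStepByStep, the method tag, and the use of Pattern.DOTALL in place of the inline (?s) flag are choices made here for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CentampStepByStep {
    // Same pattern as the one-liner; Pattern.DOTALL stands in for (?s).
    private static final Pattern SEGMENT = Pattern.compile(
            "\\G((?:\\([a-z]+\\))*+)((?:(?!\\([a-z]+\\)).)+)",
            Pattern.DOTALL);

    static String tag(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = SEGMENT.matcher(input);
        int end = 0;
        // Because of \G, each match starts exactly where the previous one ended.
        while (m.find()) {
            out.append(m.group(1));              // parenthesized groups, left untagged
            out.append("<centamp>")
               .append(m.group(2))               // at least one other character
               .append("</centamp>");
            end = m.end();
        }
        // Whatever is left (e.g. trailing parenthesized groups) is copied as-is.
        out.append(input, end, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The first tag spans the line break because . also matches '\n'.
        System.out.println(tag("line one\n(b) line two"));
    }
}
```

For the input above, the first tag encloses "line one" together with the newline, (b) is left untagged, and the second tag encloses " line two".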