Requirement: replace each of the following characters with a space:
!#$%&()[]*+-@?{|}~¢£¤¥¦§©ª«¬®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_\\^þÞ¡¨!<>\'*˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚“”„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√ﬄﬃﬂﬁﬀ◊≥≤≠≈∫∞ѲҐΏГПѝѢ
The natural choice is String's replaceAll. The JDK defines the method as follows:
/**
 * Replaces each substring of this string that matches the given <a
 * href="../util/regex/Pattern.html#sum">regular expression</a> with the
 * given replacement.
 *
 * <p> An invocation of this method of the form
 * <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
 * yields exactly the same result as the expression
 *
 * <blockquote>
 * <code>
 * {@link java.util.regex.Pattern}.{@link
 * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
 * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
 * java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
 * </code>
 * </blockquote>
 *
 *<p>
 * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
 * replacement string may cause the results to be different than if it were
 * being treated as a literal replacement string; see
 * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
 * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
 * meaning of these characters, if desired.
 *
 * @param   regex
 *          the regular expression to which this string is to be matched
 * @param   replacement
 *          the string to be substituted for each match
 *
 * @return  The resulting {@code String}
 *
 * @throws  PatternSyntaxException
 *          if the regular expression's syntax is invalid
 *
 * @see java.util.regex.Pattern
 *
 * @since 1.4
 * @spec JSR-51
 */
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
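To make the Javadoc concrete, here is a minimal sketch of the equivalence it describes, plus the `Matcher.quoteReplacement` note about `$` and `\` in the replacement string. The class name `ReplaceAllEquivalence` and its helper methods are made up for illustration; only `String.replaceAll`, `Pattern.compile`, `Matcher.replaceAll`, and `Matcher.quoteReplacement` come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class ReplaceAllEquivalence {
    // Short form: String.replaceAll compiles the regex on every call.
    static String viaString(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    // Long form, exactly as spelled out in the Javadoc.
    static String viaPattern(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Treats the replacement as literal text, so '$' and '\' lose their special meaning.
    static String literalReplace(String input, String regex, String replacement) {
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(viaString("a.b.c", "\\.", " "));          // a b c
        System.out.println(viaPattern("a.b.c", "\\.", " "));         // a b c
        System.out.println(literalReplace("price", "price", "$5"));  // $5
    }
}
```

Because the replacement here is always a single space, the `$` and `\` caveat does not affect the task in this post; it only matters when the replacement text itself contains those characters.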
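The first parameter is a regular expression, so put the characters to be replaced inside [] and pass that as the first argument. That is not the whole job, though: any of those characters that are regex special characters must also be escaped.
The special characters are listed at the following link: link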
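As a small illustration of what escaping inside [] looks like, here is a sketch that blanks out just four characters: `[`, `]`, `\`, and `^`. The class `EscapeInClass` and the method `blankOut` are hypothetical names. Note the two layers of escaping: the Java string literal `"\\["` becomes the regex `\[`, which matches a literal `[`.

```java
// Hypothetical demo class, not part of the original post.
class EscapeInClass {
    // Java literal "[\\[\\]\\\\^]" is the regex [\[\]\\^],
    // a character class containing '[', ']', '\' and '^'.
    static String blankOut(String input) {
        return input.replaceAll("[\\[\\]\\\\^]", " ");
    }

    public static void main(String[] args) {
        // The input text is a[b]c\d^e
        System.out.println(blankOut("a[b]c\\d^e")); // a b c d e
    }
}
```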
Pull the special characters out and replace them separately. The code:
// Regex special characters, each escaped with a backslash
result = result.replaceAll("[\\$\\(\\)\\*\\+\\.\\[\\]\\?\\\\^\\{\\}\\|]", " ");
// All remaining characters
result = result.replaceAll("[!#%&-@~¢£¤¥¦§©ª«¬\u00AD®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_þÞ¡¨!<>'˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚“”„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√ﬄﬃﬂﬁﬀ◊≥≤≠≈∫∞ѲҐΏГПѝѢ]", " ");
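After writing this, testing showed that digits were being replaced as well, which was odd. Bisecting the character list to find the faulty part eventually pinned it down to &-@. Inside a character class the hyphen is a special character too: it denotes a range, so every character whose ASCII code lies between & (38) and @ (64), including the digits, parentheses, asterisk, and plus sign, matches the regex. Pulling the hyphen out and escaping it fixes the problem: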
// Regex special characters, now including the escaped hyphen
result = result.replaceAll("[\\$\\(\\)\\*\\+\\.\\[\\]\\?\\\\^\\{\\}\\|\\-]", " ");
// All remaining characters, with the hyphen removed so that &-@ is no longer a range
result = result.replaceAll("[!#%&@~¢£¤¥¦§©ª«¬\u00AD®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_þÞ¡¨!<>'˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚“”„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√ﬄﬃﬂﬁﬀ◊≥≤≠≈∫∞ѲҐΏГПѝѢ]", " ");
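The range behavior of the hyphen is easy to check directly. The sketch below lists which printable ASCII characters a character class matches, and builds a class from a plain list of characters by escaping the few that are special inside []. This is an alternative to escaping by hand, not the approach used in the code above. `HyphenRangeDemo`, `matchedAscii`, and `toCharClass` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class HyphenRangeDemo {
    // Returns every printable ASCII character (32..126) that the given class matches.
    static String matchedAscii(String charClass) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c = 32; c < 127; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps the characters in [...], escaping those that are special inside
    // a character class: backslash, brackets, caret, hyphen, ampersand.
    static String toCharClass(String chars) {
        StringBuilder sb = new StringBuilder("[");
        chars.codePoints().forEach(cp -> {
            if (cp == '\\' || cp == '[' || cp == ']' || cp == '^' || cp == '-' || cp == '&') {
                sb.append('\\');
            }
            sb.appendCodePoint(cp);
        });
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        // Unescaped hyphen: a range from '&' (38) to '@' (64), digits included.
        System.out.println(matchedAscii("[&-@]"));            // &'()*+,-./0123456789:;<=>?@
        // Escaped hyphen: only the three listed characters.
        System.out.println(matchedAscii("[&\\-@]"));          // &-@
        System.out.println(matchedAscii(toCharClass("&-@"))); // &-@
        System.out.println("a1+b2".replaceAll(toCharClass("+-@&"), " ")); // a1 b2
    }
}
```

Building the class programmatically means the character list can be kept in its raw form, in any order, without a hyphen accidentally landing between two other characters.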