Java RegEx Fun - Playing with Sentences
  • Time: 2011-09-07 21:43:39
  • Tags:
  • java
  • regex

Input string:

Lorem ipsum tip. Lorem ipsum loprem ipsum septum #match this#, lorem ipsum #match this too#. #Do not match this because it is already after a period#.

Expected output:

Lorem ipsum tip. #match this# #match this too# Lorem ipsum loprem ipsum septum, lorem ipsum. #Do not match this because it is already after a period#.

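Note: #match this# and #match this too# were both moved to just after the nearest preceding period (.). Basically, everything wrapped in # signs should be moved to the closest period on its left.

Can RegEx and Java String processing accomplish this?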

The most basic RegEx to match #anything# is this:

#(.*?)#
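As a minimal sketch of how that pattern could be used from Java: the class name HashBlockFinder and the method findBlocks below are illustrative names, not anything from the question. It collects every #...# block in a piece of text with Pattern and Matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class HashBlockFinder {

    // Reluctant quantifier: each match stops at the first closing '#'.
    private static final Pattern BLOCK = Pattern.compile("#(.*?)#");

    // Returns every #...# block in the text, '#' delimiters included.
    static List<String> findBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            // group() is the whole match; group(1) would be the text between the '#' signs.
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sentence = " Lorem ipsum loprem ipsum septum #match this#, lorem ipsum #match this too#";
        System.out.println(findBlocks(sentence));
    }
}
```

The reluctant `.*?` matters here: a greedy `.*` would run from the first # in the sentence to the last one and swallow both blocks as a single match.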

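Beyond that, I'm stuck.

Edit: You don't have to tell me how to write the complete solution. I only need enough of a starting point, and then I'll try the manipulation on my own.

Here is my solution, derived from glowcoder's answer: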

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String computeForSlashline(String input) {

   // split() takes a regex, so the period must be escaped as \\.
   String[] sentences = input.split("\\.");

   StringBuilder paragraph = new StringBuilder();
   StringBuilder blocks = new StringBuilder();

   Matcher m;

   try {

      // Loop through sentences, split by periods. 
      for (int i = 0; i < sentences.length; i++) {

         // Find all the #____# blocks in this sentence
         m = Pattern.compile("#(.*?)#").matcher(sentences[i]);

         // Store all the #____# blocks in a single StringBuilder
         while (m.find()) {

            blocks.append(m.group(0));

         }

         // Place all the #____# blocks at the beginning of the sentence. 
         // Strip the old (redundant) #____# blocks from the sentence.
         paragraph.append(blocks.toString() + " " + m.replaceAll("").trim() + ". ");

         // Clear the #____# collection to make room for the next sentence.
         blocks.setLength(0);

      }

   } catch(Exception e) { System.out.println(e); return null; } 

   // Make the paragraph look neat by adding line breaks after
   // periods, question marks and closing square brackets.
   m = Pattern.compile("(\\. |\\.&nbsp;|\\?|\\])").matcher(paragraph.toString());

   return m.replaceAll("$1<br /><br />");

}

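This gives me the expected output. But there is one problem: if there is a period inside a #____# block (e.g. #Mrs. Smith kicks Ms. Smith in the relevantsite#), the input.split("\\."); line will break up the #____# block. So I will replace input.split() with a RegEx.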
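One possible way around that problem, sketched under the assumption that the # signs in the input are always balanced: keep using split(), but give it a regex that only accepts a period when an even number of # characters follows it, i.e. when the period is not inside a block. The class name BlockAwareSplitter and the method splitSentences are made up for this sketch.

```java
import java.util.Arrays;

// Hypothetical helper class, named here only for illustration.
class BlockAwareSplitter {

    // A period is a sentence boundary only if the rest of the input contains
    // an even number of '#' characters (zero or more complete #...# pairs).
    // Assumption: '#' signs are balanced and never nested.
    private static final String PERIOD_OUTSIDE_BLOCKS =
            "\\.(?=(?:[^#]*#[^#]*#)*[^#]*$)";

    static String[] splitSentences(String input) {
        return input.split(PERIOD_OUTSIDE_BLOCKS);
    }

    public static void main(String[] args) {
        String input = "Intro. Text #Mrs. Smith kicks Ms. Smith# more text. End.";
        // The periods after "Mrs" and "Ms" are inside a block and are not split on.
        System.out.println(Arrays.toString(splitSentences(input)));
    }
}
```

The lookahead rescans the rest of the input for every period, so this is probably fine for paragraph-sized text but may get slow on very large inputs. A single pass with Matcher.find() over alternating "block" and "non-block" pieces would be an alternative.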

Best answer

A skeleton for this would be as follows:

String computeForSlashline(String input) {

    String[] sentences = input.split("\\."); // split() takes a regex; an unescaped "." matches every character
    for(int i = 0; i < sentences.length; i++) {
        // perform a search on each sentence, moving the #__# to the front
    }
    StringBuilder sb = new StringBuilder();
    for(String sentence : sentences) {
        sb.append(sentence).append(". ");
    }
    return sb.toString().trim();

}
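
The loop body in the skeleton is left as a comment. A rough sketch of what it might do for one sentence is below. The class BlockMover and the method moveBlocksToFront are hypothetical names, and the choice to also swallow the whitespace in front of each block is an assumption made so that "septum #match this#, lorem" collapses to "septum, lorem" as in the expected output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class BlockMover {

    // Optional leading whitespace followed by one #...# block (captured in group 1).
    private static final Pattern BLOCK = Pattern.compile("\\s*(#.*?#)");

    // Moves every #...# block to the front of a single sentence.
    static String moveBlocksToFront(String sentence) {
        StringBuilder blocks = new StringBuilder();
        Matcher m = BLOCK.matcher(sentence);
        while (m.find()) {
            blocks.append(m.group(1)).append(' ');
        }
        // replaceAll() resets the matcher first, then strips the blocks
        // (and the whitespace before them) from the sentence.
        String rest = m.replaceAll("").trim();
        return (blocks + rest).trim();
    }

    public static void main(String[] args) {
        String sentence = " Lorem ipsum loprem ipsum septum #match this#, lorem ipsum #match this too#";
        System.out.println(moveBlocksToFront(sentence));
    }
}
```

Inside the skeleton's first loop this could be called as sentences[i] = BlockMover.moveBlocksToFront(sentences[i]); the second loop then re-adds the ". " separators.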