Remove composed words
  • Date: 2009-09-29 06:23:21

I have a list of words, some of which are composed words, for example:

  • palanca
  • plato
  • platopalanca

I need to remove "plato" and "palanca" and keep only "platopalanca". I used array_unique to remove duplicates, but those composed words are tricky...

Should I sort the list by word length and compare one by one? Is a regular expression the answer?

Update: The list of words is much bigger and mixed, not just related words.

Update 2: I can safely implode the array into a string.

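Update 3: I am trying to avoid doing it this way, comparing every word against every other. There must be a more efficient way to do this.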

Well, I think that a bubble-sort-like approach is the only possible one :-( I don't like it, but it's what I have... Any better approach?

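// shortest words first, so any word containing $words[$i] comes after it
function sortByLengthAsc($a, $b) {
    return strlen($a) - strlen($b);
}

usort($words, 'sortByLengthAsc');
$count = count($words);
$delete = array();
for ($i = 0; $i < $count; $i++) {
    for ($j = $i + 1; $j < $count; $j++) {
        // $words[$j] is at least as long as $words[$i]
        if (strpos($words[$j], $words[$i]) !== false) {
            $delete[] = $i;
            break; // one match is enough to mark $words[$i]
        }
    }
}
foreach ($delete as $i) {
    unset($words[$i]);
}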

Update 5: Sorry all. I'm a moron. Jonathan Swift made me realize I was asking the wrong question. Given x words which START the same, I need to remove the shortest ones.

  • "hot, dog, stand, hotdogstand" should become "dog, stand, hotdogstand"
  • "car, pet, carpet" should become "pet, carpet"
  • "palanca, plato, platopalanca" should become "palanca, platopalanca"
  • "platoother, other" should be left untouched, since they start differently
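Best answer

I think you need to define the problem a bit more, so that we can give a solid answer. Here are some pathological lists. Which items should be removed?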

  • hot, dog, hotdogstand.
  • hot, dog, stand, hotdogstand
  • hot, dogs, stand, hotdogstand

SOME CODE

This code should be more efficient than the one you have:

$words = array('hatstand', 'hat', 'stand', 'hot', 'dog', 'cat', 'hotdogstand', 'catbasket');

$count = count($words);

for ($i = 0; $i < $count; $i++) {
    if (isset($words[$i])) {
        $len_i = strlen($words[$i]);
        for ($j = $i + 1; $j < $count; $j++) {
            if (isset($words[$j])) {
                $len_j = strlen($words[$j]);

                if ($len_i <= $len_j) {
                    // $words[$i] is the shorter one: is it the start of $words[$j]?
                    if (substr($words[$j], 0, $len_i) === $words[$i]) {
                        unset($words[$i]);
                        break; // $words[$i] is gone, move on to the next $i
                    }
                } else {
                    // $words[$j] is the shorter one: is it the start of $words[$i]?
                    if (substr($words[$i], 0, $len_j) === $words[$j]) {
                        unset($words[$j]);
                    }
                }
            }
        }
    }
}

foreach ($words as $word) {
    echo "$word<br>";
}

Optionally, you could store the word lengths in an array before the loops.
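As a rough sketch of that idea, the lengths can be computed once with array_map() before the nested loops run. The function name removePrefixWords and the $remove bookkeeping array are hypothetical, not part of the answer above, and duplicates are dropped first on the assumption that the list has already been through array_unique.

```php
<?php
// Hypothetical helper: drops every word that is the start of another word.
function removePrefixWords(array $words): array
{
    $words = array_values(array_unique($words)); // assumes duplicates are unwanted
    $lengths = array_map('strlen', $words);      // computed once, before the loops
    $count = count($words);
    $remove = array();

    for ($i = 0; $i < $count; $i++) {
        for ($j = 0; $j < $count; $j++) {
            if ($i !== $j
                && $lengths[$i] < $lengths[$j]
                && strncmp($words[$j], $words[$i], $lengths[$i]) === 0) {
                $remove[$i] = true; // $words[$j] starts with $words[$i]
                break;
            }
        }
    }

    // keep only the entries whose index was not marked
    return array_values(array_diff_key($words, $remove));
}

$result = removePrefixWords(array('hatstand', 'hat', 'stand', 'hot', 'dog', 'cat', 'hotdogstand', 'catbasket'));
print_r($result);
```

This is still quadratic in the number of words; it only avoids calling strlen() repeatedly inside the inner loop.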

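Answers

You can take each word and check whether any word in the array starts with it or ends with it. If so, that word should be removed (unset()).

You could put the words into an array, sort the array alphabetically, and then loop through it, checking whether the next word starts with the word at the current index. If it does, you can remove the word at the current index and the latter part of the next word.

Something like this: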

$array = array('palanca', 'plato', 'platopalanca');
// ok, the example array is already sorted alphabetically, but anyway...
sort($array);

// another array for words to be removed
$removearray = array();

// loop through the array, the last index won't have to be checked
for ($i = 0; $i < count($array) - 1; $i++) {

  $current = $array[$i];

  // use another loop in case there is more than one combined word
  // if the words are case sensitive, use strpos() instead to compare
  while ($i + 1 < count($array) && stripos($array[$i + 1], $current) === 0) {
    $next = $array[$i + 1];
    // the next word starts with the current one, so remove current
    $removearray[] = $current;
    // get the other word to remove
    $removearray[] = substr($next, strlen($current));
    $i++;
  }

}

// now just get rid of the words to be removed,
// for example by taking the difference of the two arrays
$result = array_diff($array, $removearray);

A regex could work. You can define within the regex where the start and the end of the string apply.

^ defines the start; $ defines the end.

Something like

foreach ($array as $value)
{
    // $term is the value that you want to remove
    // preg_quote() escapes any regex metacharacters in $term
    if (preg_match('/^' . preg_quote($term, '/') . '$/', $value))
    {
        // Here you can be confident that $term is $value, and then either remove it from
        // $array, or you can add all not-matched values to a new result array
    }
}

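should get around your problem.

However, if you are just checking that two values are equal, == will work just as well (and probably faster).

If the lists of terms and values are huge, this won't be the most efficient strategy, but it is a simple solution.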
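For the "starts the same" case from update 5, only the ^ anchor would be needed. A minimal sketch, assuming a hypothetical helper named startsWithTerm; it uses preg_quote() so that characters in the term are matched literally.

```php
<?php
// Hypothetical helper: true when $value begins with $term.
function startsWithTerm(string $term, string $value): bool
{
    // '^' anchors the match at the start; there is no '$', so $value may continue
    $pattern = '/^' . preg_quote($term, '/') . '/';
    return preg_match($pattern, $value) === 1;
}

var_dump(startsWithTerm('plato', 'platopalanca')); // bool(true)
var_dump(startsWithTerm('other', 'platoother'));   // bool(false)
```

A plain strncmp() comparison does the same job without the regex engine, so this is mainly useful if the pattern needs to grow later.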

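If performance is an issue, it might be more useful to sort the lists (note the provided sort function) and then walk down the lists side by side. I'm going to actually test that idea before I post the code.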
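One possible reading of the sorting idea, sketched under assumptions: once the distinct words are sorted byte-wise, a word that is the start of some other word is always immediately followed by a word it starts, so a single pass over neighbours is enough. The function name removePrefixesSorted is hypothetical, and the result comes back in sorted order rather than the original order.

```php
<?php
// Hypothetical helper: removes words that are the start of another word,
// using one sort and one pass over neighbouring entries.
function removePrefixesSorted(array $words): array
{
    $sorted = array_values(array_unique($words));
    sort($sorted, SORT_STRING); // plain string order, no numeric comparison

    $keep = array();
    $count = count($sorted);
    for ($i = 0; $i < $count; $i++) {
        // does the next word begin with the current one?
        $isPrefix = $i + 1 < $count
            && strncmp($sorted[$i + 1], $sorted[$i], strlen($sorted[$i])) === 0;
        if (!$isPrefix) {
            $keep[] = $sorted[$i];
        }
    }
    return $keep;
}

print_r(removePrefixesSorted(array('hot', 'dog', 'stand', 'hotdogstand')));
// keeps dog, hotdogstand, stand
```

The sort dominates the cost, so this runs in roughly O(n log n) comparisons instead of comparing every pair of words.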




