English 中文(简体)
haskell grouping problem
原标题:
  • 时间:2009-11-11 09:17:33
  •  标签:
  • haskell
group :: Ord a => [(a, [b])] -> [(a, [b])]

I want to look up all pairs that have the same fst, and merge them, by appending all the list of bs together where they have the same a and discarding the unnessecary pair and so on...

I got as far as:

group ((s, ls):(s , ls ):ps) = 
    if s == s  
    then group ((s, ls++ls ):ps) 
    else (s, ls) : group ((s , ls ):ps)
group p = p

but obviously this ain t going to cut it, because it doesn t group everything.

Edit: example

[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]

would output

[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]
最佳回答

Two alternative solutions to barkmadley s answer:

  • As Tirpen notes in a comment, the best way to attack this problem depends on the number m of distinct first elements in the tuples of the input list. For small values of m barkmadley s use of Data.List.partition is the way to go. For large values however, the algorithm s complexity of O(n * m) is not so nice. In that case an O(n log n) sort of the input may turn out to be faster. Thus,

    import Data.List (groupBy, sortBy)
    combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
    combine = map mergeGroup . myGroup . mySort
      where
        mySort = sortBy (a b -> compare (fst a) (fst b))
        myGroup = groupBy (a b -> fst a == fst b)
        mergeGroup ((a, b):xs) = (a, b ++ concatMap snd xs)
    

    This yields [("Dup",["2","3","1","5"]),("Non",["4"])] on barkmadley s input.

  • Alternatively, we can call in the help of Data.Map:

    import Data.Map (assocs, fromListWith)
    combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
    combine = assocs . fromListWith (++)
    

    This will yield [("Dup",["5","1","2","3"]),("Non",["4"])], which may or may not be an issue. If it is, then there are again two solutions:

    • Reverse the input first using Data.List.reverse:

      import Data.List (reverse)
      import Data.Map (assocs, fromListWith)
      combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
      combine = assocs . fromListWith (++) . reverse
      
    • Prepend (flip (++)) instead of append ((++)) (Thanks to barkmadley; I like this solution better):

      import Data.Map (assocs, fromListWith)
      combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
      combine = assocs . fromListWith (flip (++))
      

    Both of these definitions will cause combine to output [("Dup",["2","3","1","5"]),("Non",["4"])].

As a last remark, note that all these definitions of combine require the first element of the tuples in the input list to be instances of class Ord. barkmadley s implementation only requires these elements to be instances of Eq. Thus there exist inputs which can be handled by his code, but not by mine.

问题回答
import Data.List hiding (group)

group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
    where
        (matches, nonmatches) = partition (x-> fst x == s) rest
group x = x

this function produces the result:

group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
    = [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]

it works by filtering the remaining bits into two camps, the bits that match and the bits that dont. it then combines the ones that match and recurses on the ones that don t. This effectly means you will have one tuple in the output list per key in the input list.

Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a is an instance of Ord (BTW your original definition requires that a is an instance of Eq, which barkmadley has incorporated in his solution).

import qualified Data.Map as M

group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
  where
    insert (s, l) m = M.insertWith (++) s l m

If you re a big fan of obscurity, replace the last line with:

    insert = uncurry $ M.insertWith (++)

This omits the unnecessary m and uncurry breaks the (s, l) pair out into two arguments s and l.





相关问题
Euler Problem in Haskell -- Can Someone Spot My Error

I m trying my hand at Euler Problem 4 in Haskell. It asks for that largest palindrome formed by multiplying two three-digit numbers. The problem was simple enough, and I thought my Haskell-fu was up ...

How does foldr work?

Can anybody explain how does foldr work? Take these examples: Prelude> foldr (-) 54 [10, 11] 53 Prelude> foldr (x y -> (x+y)/2) 54 [12, 4, 10, 6] 12.0 I am confused about these executions....

Efficient queue in Haskell

How can I efficiently implement a list data structure where I can have 2 views to the head and end of the list, that always point to a head a tail of a list without expensive calls to reverse. i.e: ...

Problem detecting cyclic numbers in Haskell

I am doing problem 61 at project Euler and came up with the following code (to test the case they give): p3 n = n*(n+1) `div` 2 p4 n = n*n p5 n = n*(3*n -1) `div` 2 p6 n = n*(2*n -1) p7 n = n*(5*n -3)...

Ways to get the middle of a list in Haskell?

I ve just started learning about Functional Programming, using Haskel. I m slowly getting through Erik Meijer s lectures on Channel 9 (I ve watched the first 4 so far) and in the 4th video Erik ...

haskell grouping problem

group :: Ord a => [(a, [b])] -> [(a, [b])] I want to look up all pairs that have the same fst, and merge them, by appending all the list of bs together where they have the same a and discarding ...

Closest equivalent to subprocess.communicate in Haskell

I want to do a popen() / python s subprocess.communicate from Haskell - start a program, give it stdin, and get its stdout/stderr. What s the most direct / Haskellish way to do this?

热门标签