Question

group :: Ord a => [(a, [b])] -> [(a, [b])]

I want to look up all pairs that have the same fst, and merge them, by appending all the list of bs together where they have the same a and discarding the unnessecary pair and so on...

I got as far as:

group ((s, ls):(s , ls ):ps) = 
    if s == s  
    then group ((s, ls++ls ):ps) 
    else (s, ls) : group ((s , ls ):ps)
group p = p

but obviously this ain t going to cut it, because it doesn t group everything.

Edit: example

[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]

would output

[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]

Answer 1

Two alternative solutions to barkmadley s answer:

As Tirpen notes in a comment, the best way to attack this problem depends on the number m of distinct first elements in the tuples of the input list. For small values of m barkmadley s use of Data.List.partition is the way to go. For large values however, the algorithm s complexity of O(n * m) is not so nice. In that case an O(n log n) sort of the input may turn out to be faster. Thus,
```
import Data.List (groupBy, sortBy)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = map mergeGroup . myGroup . mySort
  where
    mySort = sortBy (a b -> compare (fst a) (fst b))
    myGroup = groupBy (a b -> fst a == fst b)
    mergeGroup ((a, b):xs) = (a, b ++ concatMap snd xs)
```
This yields [("Dup",["2","3","1","5"]),("Non",["4"])] on barkmadley s input.
Alternatively, we can call in the help of Data.Map:
```
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++)
```
This will yield [("Dup",["5","1","2","3"]),("Non",["4"])], which may or may not be an issue. If it is, then there are again two solutions:
- Reverse the input first using Data.List.reverse:
```
import Data.List (reverse)
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++) . reverse
```
- Prepend (flip (++)) instead of append ((++)) ^{(Thanks to barkmadley; I like this solution better)}:
```
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (flip (++))
```
Both of these definitions will cause combine to output [("Dup",["2","3","1","5"]),("Non",["4"])].

As a last remark, note that all these definitions of combine require the first element of the tuples in the input list to be instances of class Ord. barkmadley s implementation only requires these elements to be instances of Eq. Thus there exist inputs which can be handled by his code, but not by mine.

Answer 2

import Data.List hiding (group)

group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
    where
        (matches, nonmatches) = partition (x-> fst x == s) rest
group x = x

this function produces the result:

group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
    = [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]

it works by filtering the remaining bits into two camps, the bits that match and the bits that dont. it then combines the ones that match and recurses on the ones that don t. This effectly means you will have one tuple in the output list per key in the input list.

Answer 3

Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a is an instance of Ord (BTW your original definition requires that a is an instance of Eq, which barkmadley has incorporated in his solution).

import qualified Data.Map as M

group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
  where
    insert (s, l) m = M.insertWith (++) s l m

If you re a big fan of obscurity, replace the last line with:

    insert = uncurry $ M.insertWith (++)

This omits the unnecessary m and uncurry breaks the (s, l) pair out into two arguments s and l.

友情链接