Original title: Using Parsec with Data.Text

Parsec provides Stream instances for:

  • [Char] with Text.Parsec.String
  • Data.ByteString with Text.Parsec.ByteString
  • Data.ByteString.Lazy with Text.Parsec.ByteString.Lazy

It has none for Data.Text, so I wrote the following module:

{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}

module Text.Parsec.Text
    ( Parser, GenParser
    ) where

import Text.Parsec.Prim

import qualified Data.Text as T

instance (Monad m) => Stream T.Text m Char where
    uncons = return . T.uncons

type Parser = Parsec T.Text ()
type GenParser t st = Parsec T.Text st

  1. Does it make sense to do so?
  2. Is this compatible with the rest of the Parsec API?


module TestText where

import qualified Data.Text as T

import Text.Parsec
import Text.Parsec.Prim
import Text.Parsec.Text

input = T.pack "xxxxxxxxxxxxxxyyyyxxxxxxxxxp"

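-- Parse a run of 'x', then a run of 'y', then another run of 'x',
-- and return the three runs as Text values.
parser = do
  x1 <- many1 (char 'x')
  y <- many1 (char 'y')
  x2 <- many1 (char 'x')
  return (T.pack x1, T.pack y, T.pack x2)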

test = runParser parser () "test" input


It should be compatible with the rest of Parsec, including the Char parsers.
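
To illustrate the compatibility claim, here is a minimal sketch of a parser built only from the Char parsers and core combinators, written against the Stream class rather than a concrete input type and then run on Text. The names keyValue and parseKeyValue are hypothetical, and the sketch assumes a Stream instance for Text is in scope, either from the module in the question or from the Text.Parsec.Text module shipped with later Parsec versions.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Text ()  -- imported only for the Stream instance on Text

-- Hypothetical example: parse "key = 123" using nothing but Char parsers.
-- The signature works for any stream whose tokens are Char.
keyValue :: Stream s m Char => ParsecT s u m (String, Int)
keyValue = do
  key <- many1 letter
  spaces
  _ <- char '='
  spaces
  digits <- many1 digit
  eof
  return (key, read digits)

-- Run the polymorphic parser on strict Text input.
parseKeyValue :: T.Text -> Either ParseError (String, Int)
parseKeyValue = parse keyValue "example"
```

Because keyValue never mentions Text, the same definition could also be run on a String or a ByteString without modification.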



Since Parsec 3.1.2, support for Data.Text is built in! See http://hackage.haskell.org/package/parsec-3.1.2

If you are stuck with an older version, the code in the other answers is helpful too.
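
As a rough sketch of what this looks like with a recent Parsec, the test parser from the question can be written against the library's own Text.Parsec.Text module, with no orphan instance needed. The names runs and parseRuns are hypothetical, and the example assumes parsec 3.1.2 or later is installed.

```haskell
import qualified Data.Text as T
import Text.Parsec (ParseError, char, many1, parse)
import Text.Parsec.Text (Parser)  -- shipped with parsec >= 3.1.2

-- The question's parser: a run of 'x', a run of 'y', another run of 'x'.
runs :: Parser (T.Text, T.Text, T.Text)
runs = do
  x1 <- many1 (char 'x')
  y  <- many1 (char 'y')
  x2 <- many1 (char 'x')
  return (T.pack x1, T.pack y, T.pack x2)

-- Run the parser on strict Text; trailing input is ignored because
-- the parser does not end with eof.
parseRuns :: T.Text -> Either ParseError (T.Text, T.Text, T.Text)
parseRuns = parse runs "test"
```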

I added a function parseFromUtf8File to help read UTF-8 encoded files efficiently. It works without problems. Its type matches parseFromFile from Text.Parsec.ByteString. This version uses strict ByteStrings.

-- A derivative work from
-- http://stackoverflow.com/questions/4064532/using-parsec-with-data-text

{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}

module Text.Parsec.Text
    ( Parser, GenParser, parseFromUtf8File
    ) where

import Text.Parsec.Prim
import qualified Data.Text as T
import qualified Data.ByteString as B
import Data.Text.Encoding
import Text.Parsec.Error

instance (Monad m) => Stream T.Text m Char where
    uncons = return . T.uncons

type Parser = Parsec T.Text ()
type GenParser t st = Parsec T.Text st

-- | @parseFromUtf8File p filePath@ runs a 'T.Text' parser @p@ on the
-- UTF-8 decoded contents of @filePath@, read as a strict bytestring
-- with 'B.readFile'. Returns either a 'ParseError' ('Left') or a
-- value of type @a@ ('Right').
--
-- >  main    = do{ result <- parseFromUtf8File numbers "digits.txt"
-- >              ; case result of
-- >                  Left err  -> print err
-- >                  Right xs  -> print (sum xs)
-- >              }
parseFromUtf8File :: Parser a -> String -> IO (Either ParseError a)
parseFromUtf8File p fname = do 
  raw <- B.readFile fname
  let input = decodeUtf8 raw
  return (runP p () fname input)
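
One caveat: decodeUtf8 throws an exception when the file is not valid UTF-8. The following is a minimal sketch of a variant that reports the decoding failure as a value instead, using decodeUtf8' from Data.Text.Encoding. The names Utf8ParseFailure, parseUtf8Bytes and parseFromUtf8FileSafe are hypothetical and not part of Parsec, and the Parser type is assumed to come from the Text.Parsec.Text module in parsec 3.1.2 or later.

```haskell
import qualified Data.ByteString as B
import Data.Text.Encoding (decodeUtf8')
import Data.Text.Encoding.Error (UnicodeException)
import Text.Parsec (ParseError, parse)
import Text.Parsec.Text (Parser)

-- Hypothetical error type: either the bytes were not valid UTF-8,
-- or they decoded fine but the parser rejected the text.
data Utf8ParseFailure
  = InvalidUtf8 UnicodeException
  | SyntaxError ParseError
  deriving Show

-- Pure core: decode the bytes, then run the parser on the decoded Text.
parseUtf8Bytes :: Parser a -> FilePath -> B.ByteString -> Either Utf8ParseFailure a
parseUtf8Bytes p name raw =
  case decodeUtf8' raw of
    Left err    -> Left (InvalidUtf8 err)
    Right input -> either (Left . SyntaxError) Right (parse p name input)

-- IO wrapper with the same argument order as parseFromUtf8File.
parseFromUtf8FileSafe :: Parser a -> FilePath -> IO (Either Utf8ParseFailure a)
parseFromUtf8FileSafe p fname = parseUtf8Bytes p fname <$> B.readFile fname
```

Keeping the decoding and parsing in a pure function makes the behaviour easy to check without touching the file system; the IO wrapper only adds the strict file read.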

