-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strict left folds of Stream (ByteStream m) m r
?
#41
Comments
Following up on the initial notes, I think the below comes closer to the mark.
{-# LANGUAGE BangPatterns, BlockArguments, RankNTypes, LambdaCase #-}
module Main where
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as B
import qualified Streaming.ByteString as Q
import qualified Streaming.ByteString.Char8 as Q8
import Control.Monad.Trans.Resource (runResourceT)
import Data.Maybe (fromMaybe, listToMaybe)
import Streaming (Of(..))
import Streaming.Internal (Stream(..))
import Data.Word (Word8)
import System.Environment (getArgs)
countStarts :: Monad m => Word8 -> Q.ByteStream m r -> m (Of Int r)
countStarts w = Q.chunkFold step Nothing \case { Just 1 -> 1; _ -> 0 }
where
step Nothing bs | B.length bs == 0 = Nothing
| B.unsafeHead bs == w = Just 1
| otherwise = Just 0
step x _ = x
nestedFold :: Monad m
=> (forall k. f k -> m (Of a k))
-> (b -> a -> b)
-> b
-> Stream f m r
-> m (Of b r)
nestedFold inner merge = loop
where
loop !b = \case
Return r -> return $ b :> r
Effect m -> m >>= loop b
Step f -> inner f >>= \(a :> rest) -> loop (merge b a) rest
main :: IO ()
main = do
fname <- listToMaybe <$> getArgs
(n :> _) <- runResourceT
$ nestedFold (countStarts 0x69) (+) 0
$ Q8.lines
$ fromMaybe Q.stdin (Q.readFile <$> fname)
print n |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
While working on the expanded WIP README.md, I ended up thinking about what a general-purpose strict left fold would look like for a sub-divided bytestream (i.e.
Stream (ByteStream m) m r
, a stream of monadic bytestreams over the same monad).The
streaming
library provides an excellent lazy right-fold for this type:streamFold
, and also provides multiple strict left-folds, but only forStream (Of a) m r
and the like, because fitting into the composable parallelFold
paradigm fromControl.Foldl
requires pure stream values (the folds themselves can be monadic viaFoldM
,impurely
, ... but the stream elements need to be pure to match that interface).So types like
Stream (ByteStream m) m r
don't appear to be supported for strict left folds, which are often useful.I came up with the below, which perhaps belongs in
Streaming
rather thanStreaming.ByteString
(if it is generally useful to add to the ecosystem). It has the disadvantage that it does match theFold
orFoldM
signatures, so there is not yet a way to get parallel fold support, though I suspect that too could be added (by extendingControl.Foldl
with a new mechanism for stream-of-stream folds, where the pure values are one level down, and each sub-stream returns the next sub-stream head along with the monadic accumulator).Anyway, this is the proof-of-concept function:
It is able to fold
Stream (ByteString m) m r
as follows (rewrite of README.md example that counts lines in an input file that start with the letteri
as a fold, rather than a series of stream transformations):So my questions (issues) are: Is something like nestedFoldM a sensible interface to add to either
Streaming
orStreaming.ByteString
? Can it be improved, or is it about right? And, finally, would it make sense to pursue composable parallel folds for this type of fold?cc: @chessai , @cartazio, @archaephyrryx, @Bodigrim (feel free to remain silent, or say you don't care if this is of no interest...)
[ EDIT: It occurs to me that another approach might to expose the isomorphism
f (Stream f m r) -> Stream (Of a) m r
and then use some of the existing left fold machinery on that, with helper functions to combine the results of the inner folds into the outer accumulator, ... This might then make it possible to use the parallel fold machinery without changes toControl.Foldl
, I'll explore this a bit further. Suggestions welcome... ]The text was updated successfully, but these errors were encountered: