Best Practice on design and usage of data type in Haskell

Question

My question is related to a more general question on Haskell program design. But I would like to focus on a specific use case.

I defined a data type (e.g. Foo) and used it in a function (e.g. f) through pattern matching. Later, I realized that Foo needs an additional field to support new functionality. However, adding the field changes how the type can be used, so existing functions that depend on it may break. Adding new functionality to existing code, however unappealing, is hard to avoid. What are the best practices, at the Haskell language level, for minimizing the impact of this kind of modification?

For example, the existing code is:

data Foo = Foo {
  vv :: [Int]
}

f :: Foo -> Int
f (Foo v) = sum v

The function f will no longer compile if I add another field to Foo, because the pattern Foo v gives the constructor only one argument:

data Foo = Foo {
  vv :: [Int],
  uu :: [Int]
}

However, if I had defined the function f as follows in the first place:

f :: Foo -> Int
f foo = sum $ vv foo

then f would still be correct even after the modification to Foo.
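To make the contrast concrete, here is a minimal sketch using the two-field Foo. The name fPositional is hypothetical and only serves to show both styles side by side.

```haskell
-- The two-field Foo from the question.
data Foo = Foo
  { vv :: [Int]
  , uu :: [Int]
  }

-- Accessor-based version: identical to the second definition of f above.
-- It names only the field it needs, so new fields do not affect it.
f :: Foo -> Int
f foo = sum $ vv foo

-- Positional version (hypothetical name): the pattern must list every field,
-- so it has to be edited again whenever Foo gains another one.
fPositional :: Foo -> Int
fPositional (Foo v _) = sum v
```

Both functions return the same result here; the difference is only in how much code has to change when the type grows.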

This is known as the expression problem in computer science. — Robin Green, Dec 02 '18 at 18:48

score 6 · Answer 1 · answered Jan 03 '14 at 08:16


Lenses solve this problem well. Just define a lens that points to the field of interest:

import Control.Lens

newtype Foo = Foo [Int]

v :: Lens' Foo [Int]
v k (Foo x) = fmap Foo (k x)

You can use this lens as a getter:

view v :: Foo -> [Int]

... a setter:

set v :: [Int] -> Foo -> Foo

... and a mapper:

over v :: ([Int] -> [Int]) -> Foo -> Foo

The best part is that if you later change your data type's internal representation, all you have to do is change the implementation of v to point to the new location of the field of interest. If your downstream users only used the lens to interact with your Foo then you won't break backwards compatibility.
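As a rough illustration of that last point, suppose Foo later gains a second list. The sketch below is an assumption-laden stand-in, not the lens library itself: Lens', view and over are redefined locally with the same shapes so that the example needs only base, and the two-field Foo is hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Same shape as Lens' from the lens package, redefined locally (assumption:
-- we avoid the dependency so the sketch compiles with base alone).
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical new representation: Foo gained a second field.
data Foo = Foo [Int] [Int]

-- Only this definition changes; it still focuses on the first list.
v :: Lens' Foo [Int]
v k (Foo x y) = fmap (\x' -> Foo x' y) (k x)

-- Minimal local stand-ins for view and over from Control.Lens.
view :: Lens' s a -> s -> a
view l = getConst . l Const

over :: Lens' s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- Downstream code written against the lens stays exactly as it was.
f :: Foo -> Int
f = sum . view v
```

With the real library you would import Control.Lens instead of defining the three helpers; the definition of f would not change either way.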


Gabriella Gonzalez


Do you have a sense of what amount of overhead (if any) is introduced if you use `lens` to access all record fields, instead of using the field accessors directly? – Chris Taylor Jan 03 '14 at 08:48

@ChrisTaylor I believe lens is fast and light-weight, but I haven't benchmarked it. I know that Edward does all kinds of tricks to ensure they compile to ridiculously fast code and when I asked him how fast the library was he said it was really fast and I took his word for it. – Gabriella Gonzalez Jan 03 '14 at 08:49

Although lenses are a great idea for complicated datastructures, I think they're overkill for simple use cases, introducing extra boilerplate and conceptual complexity. – Ganesh Sittampalam Jan 03 '14 at 11:25

score 2 · Answer 2 · answered Jan 03 '14 at 06:50


When a type might gain new fields that existing code should ignore, the best practice is indeed to use record selectors, as you've done.

I would say that you should always define any type that might change using record notation, and you should never pattern match on a type defined with record notation using the first style with positional arguments.

Another way of expressing the above code is:

f :: Foo -> Int
f (Foo { vv = v }) = sum v

This is arguably more elegant, and it also works better in the case where Foo has multiple data constructors.
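A minimal sketch of the multiple-constructor case, assuming a hypothetical second constructor Bar that has no vv field:

```haskell
-- Foo with a second, hypothetical constructor that lacks the vv field.
data Foo
  = Foo { vv :: [Int], uu :: [Int] }
  | Bar { name :: String }

-- Record patterns name only the fields they need and let each
-- constructor be handled explicitly.
f :: Foo -> Int
f Foo { vv = v } = sum v -- ignores uu and any fields added later
f Bar {}         = 0     -- Bar has no vv, so we choose a default here
```

By contrast, the selector-based `sum . vv` would still compile for this type but would fail at run time when applied to a Bar value, because vv is a partial function once some constructor lacks the field.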


Ganesh Sittampalam


Also, with the `NamedFieldPuns` extension you could write `f (Foo { vv }) = sum vv` and with the `RecordWildCards` extension you could write `f (Foo { .. }) = sum vv`. – danidiaz Jan 03 '14 at 15:20

Don't do the latter or your code will break if someone adds `sum` to the record :-) There are good use cases for `RecordWildCards` (and I work for the company that originally implemented them to make a DSL work better) but a short function like this is not one of them. – Ganesh Sittampalam Jan 03 '14 at 15:26
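For reference, a minimal sketch of the `NamedFieldPuns` form from the comments above, assuming the two-field Foo. The pun brings into scope only the fields it names, so the set of local names introduced by the pattern does not grow when Foo gains fields, whereas `Foo { .. }` binds every field of the record.

```haskell
{-# LANGUAGE NamedFieldPuns #-}

-- The two-field Foo from the question.
data Foo = Foo
  { vv :: [Int]
  , uu :: [Int]
  }

-- `Foo { vv }` is shorthand for `Foo { vv = vv }`: it binds a local
-- variable vv to the field of the same name and nothing else.
f :: Foo -> Int
f Foo { vv } = sum vv
```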

danidiaz · Answer 3 · 2014-01-03T15:38:12.203

Your f function is so simple that perhaps the easiest answer would be to write it in point-free style using composition:

f' :: Foo -> Int
f' = sum . vv

If your function needs more than one field from the Foo value, the above wouldn't work. But we could use the Applicative instance for functions, ((->) r), and do the following trick:

import Control.Applicative

data Foo2 = Foo2 {
    vv' :: [Int]
  , uu' :: [Int]
  }

f2 :: Foo2 -> Int
f2 = sum . liftA2 (++) vv' uu'

For functions, liftA2 feeds the same input argument to two functions and then combines their results with a third function, (++) in this case. But perhaps this borders on the obscure.
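To see what the trick does, it may help to put it next to a hand-expanded version. In this sketch the name f2Explicit is hypothetical; for functions, `liftA2 g h k` behaves like `\x -> g (h x) (k x)`.

```haskell
import Control.Applicative (liftA2)

data Foo2 = Foo2
  { vv' :: [Int]
  , uu' :: [Int]
  }

-- Point-free version from the answer.
f2 :: Foo2 -> Int
f2 = sum . liftA2 (++) vv' uu'

-- Hand-expanded equivalent (hypothetical name): both selectors receive
-- the same argument and their results are joined with (++).
f2Explicit :: Foo2 -> Int
f2Explicit foo = sum (vv' foo ++ uu' foo)
```

The explicit form is longer but may be easier to read for anyone not used to the function Applicative.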
