Why do so many T-shirts have λ’s on them

August 31, 2016

Last week I was wrestling with some code when a light went off and I came up with a short and sweet solution that uses foldl.

Looking down a long pool that is inside a set of massive stone arches class=
© 2016 Tyler Hendy for Unsplash

I found this functional programming approach to be much simpler than object oriented. It allowed us to approach problems from a new direction in the sense that you only have to worry about transforming data through functions, you don’t have to worry about inheritance and these kinds of things. This makes it a lot easier to reason about.

I want to group events by artist.

I’m working on a side project that let’s people know about local music events in our area. It’s working, but has a couple rough edges that I want to polish before releasing it to a wider audience.

One of those rough edges is that if one artist has multiple shows in a week, the shows are listed individually, instead of being grouped by the artist.

So, given a list of events, the task is to:

  • return a list that contains one element for each unique artist name, where
  • each element has a list of one or more dates as well as the rest of the information about that event, and
  • the list is ordered by first show date.

First attempt: sort, group, then return a list ordered by first show date.

I’m not using SQL, so while it would be pretty easy to do this with a SQL query, that option is not available.

And it didn’t seem that hard. Just

  1. sort the event list by artist and then date
  2. scan the list, grouping (with a map) by artist name
  3. convert that map back to a list that looks like this: {artist, first_date, [all_dates], ticket_url, desc, …}
  4. sort that list by the first_date field and then artist.

And while that certainly would have worked, when I started implementing this approach it was taking much longer than I expected and the code was not coming easily. So I stopped to think if there was a better way.

Enter foldl, a higher order function.

A higher-order function is a function that takes other functions as arguments or returns a function as result.

The function foldl takes a function as an argument. Most of the examples that are on the web are pretty simple; for example, sum the elements in a list. Here’s an example that does just that:


    1> L=[1,2,3,5].
    [1,2,3,5]
    2> F = fun(X, Sum) -> X + Sum end.
    #Fun<erl_eval.12.52032458>
    3> lists:foldl(F, 0, L).
    11
    4>
    

You apply F to the first element in the list and get the result, and then call F with the second element and that result, and so on. For example,


    1> F = fun(X, Sum) -> X + Sum end.
    #Fun<erl_eval.12.52032458>
    2> F(1,0).
    1
    3> F(2, F(1, 0)).
    3
    > F(3, F(2, F(1, 0))).
    6
    >
    

But you can do way more interesting things with foldl.

You can group with foldl.

foldl is a new function to me, I haven’t used it much in Java. Looks like Java 8 supports this with Stream.reduce(). :)

So, I started building up from simple experiments in the Erlang interpreter. For my task, the result of the function passed to foldl should be a list. After some playing around with ideas in the interpreter, I came up with a function that groups:


    group({Artist, Val}, [{Artist, Vals}|Rest] ) ->
    	H1 = {Artist, [Val|Vals]},
    	[H1|Rest];
    group({Artist, Val}, Acc) ->
    	H = {Artist, [Val]},
    	[H|Acc].
    

Here,

  • the event data is the artist and another field I want to group
  • the function assumes data comes in sorted by artist and then date
  • pattern matching distinguishes when we hit a new artist in the list
  • if same artist (first function clause), we replace the current list head with an updated value
  • if new artist (second function clause), we add a new entry to the return list

Testing this,


    1> c(t)
    2> t:group({a,2}, [{a,[1]}]).
    [{a,[2,1]}]
    3> L = [{a,1}, {a,2}, {b,1}].
    [{a,1},{a,2},{b,1}]
    4> lists:foldl(fun t:group/2, [], L).
    [{b,[1]},{a,[2,1]}]
    

This foldl approach works with Erlang records too.


    group5(#event{act = Act} = E, [{Act, FirstVal, Dates}|Rest] ) ->
            H1 = {Act, FirstVal, [E#event.date|Dates]},
            [H1|Rest];
    group5(E, Acc) when is_record(E, event) ->
            H = {E#event.act, E, [E#event.date]},
            [H|Acc].
    

The interpreter test (my test code and #event record was in the module t.erl):


    1> c(t).
    {ok,t}
    2> rr(t).
    [event]
    3> Events = [#event{act="Big Apple", date={2018, 8, 20}, price=50.00},
    3> #event{act="Big Apple", date={2018, 8, 21}, price=50.00},
    3> #event{act="Red Sox", date={2018, 8, 20}, price=80.00}].
    [#event{act = "Big Apple",date = {2018,8,20},price = 50.0},
     #event{act = "Big Apple",date = {2018,8,21},price = 50.0},
     #event{act = "Red Sox",date = {2018,8,20},price = 80.0}]
    4> lists:foldl(fun t:group5/2, [], Events).
    [{"Red Sox",
      #event{act = "Red Sox",date = {2018,8,20},price = 80.0},
      [{2018,8,20}]},
     {"Big Apple",
      #event{act = "Big Apple",date = {2018,8,20},price = 50.0},
      [{2018,8,21},{2018,8,20}]}]
    5>
    

Getting back to the original blog question …

I think this kind of experience is what prompts people to buy t-shirts with λ’s on them. The difference between my traditional approach and the functional approach was like day and night. The functional approach required far less code and was conceptually simpler (once I got my head around how foldl works).

I may have to get myself one.

Change Log

Sep. 17, 2016

  • Display id that comes after #Fun in interpreter output.

Sep. 1, 2016

  • Fixed bug in record example and updated output.

Tags: functional