Category Theory for Java Programmers

October 27, 2024

Summary

The simple rules of set theory were helpful when I was working on a new Java application.

the top-level design was as simple as defining the sets (and no simpler!)
once written, the top-level design didn’t change
Java 21 makes it easy to represent unions types with
- sealed interfaces (Java 17),
- records (Java 17) and/or enums, and
- switch with pattern matching (Java 21).

Understanding how sets and their functions work is a useful tool to have in your programming tool kit.

Start with defining the sets of things that define your domain; for example, HttpRequest → DomainView → HttpResponse. These are pretty easy to define and allow you to reason about the problem you are solving at a more general level.

Then, define the functions that map from one set to another. Note that in set theory, a valid function must map

from a source set to a target set such that
every element in source set maps to
one and only one element in the target set.

This rules out one-to-many and many-to-may relationships.

The rest of this blog gives a summary of set theory, as ⊕ To get the PDF, you can either click the “Get PDF” link in the on-line book or build the PDF yourself from category-theory-illustrated repository on github. described in Category Theory Illustrated by Jencel Panic (previously known as Boris Marinov), with a brief hat tip to Bartosz Milewski’s book in the final section.

I hope to write up a follow-up post that gives more technical detail of how this approach worked with Java when tackling a specific problem.

Sets

Start on page 21 of the PDF. ⊕ Category Theory Illustrated does not have a table of contents, so I’ll note the PDF page number where the topic starts.

When these notes are direct quotes, I’ll put them in quotes.

“A set is a collection of things where the “things” can be anything you want”
“A set has no structure (there is no order … no members which are “special” with respect to their membership of the set.)”
The main purpose of a set is to help you “reason about several things as if they were one.”
You can have subsets (all elements of the set belong to another set), singleton sets, and the empty set.

If Y is a subset of G, the notation is $Y \subseteq <!– ⊆ symbol → G$
Every set is a subset of itself.
The empty set is special. Since “a set is defined only by the items it contains” the empty set is unique—“there is no difference between the set that contains zero balls and the set that contains zero numbers”. Another way it is special is that is the empty set is a subset of all sets, which in set notation is written as $\forall <!– ∀ symbol → A, \emptyset <!– ∅ symbol → \subseteq <!– ⊆ symbol → A$
Singleton sets are for things that are one-of-a-kind. For example, your father is a singleton subset of all people.

Functions

Start on page 27 of the PDF.

A function “is a relationship between two sets.”

A function “matches each element in one set … to exactly one element in another set.”

There are lots of names for these two sets:

source -> target

domain -> codomain

input -> output

argument type -> return type    programming

premise -> conclusion           logic

as well as for functions themselves; a function can

go from one set to another,

connect one set to another, or

convert a value from one set to another.

Types of functions:

one-to-one: each source element maps to exactly one target element, and the reverse is true.
many-to-one: maps many source elements to one target element, and the reverse is not true. These types of functions categorize the elements in the source set.

A couple rules about functions:

“You cannot have a source element that is not mapped to anything.”
You cannot have a source element “that is mapped to more than one target element”.

Rule #2 means that a function cannot represent a many-to-many relationship.

Real world examples of functions:

How far are we from New York?

    This function connects

        all places in the world (source set)

    to

        all positive numbers (target set)


Who is my father?

    This function connects

        all people in the world

    to

        a singleton set.

There are a number of special functions:

identity function: maps every element in the to itself.
image of a subset: maps every element in the subset to the same element in the set. In the case where the subset is the set itself, the image of the subset is the identity function.
empty list: maps the empty set to any set. A function can be thought of as a list of from/to pairs, with one pair for each element in the source set. A function must map all elements in the source set, but when the source set is the empty set, the from/to list is empty. So this “empty list” function satisifies the definition of a function. It is also required to keep consistent with the rule that every subset has an image function.
singletons: every singleton set has a unique function that maps from any set to the given singleton set.

There is an interesting quote in this section that suggests that a set itself can be represented by a function: “You can think of the ID_G as a function which represents the set G in the realm of functions.” (I don’t understand here what he means by the “realm of functions.”)

A note on Bartosz Milewski’s book

Bartosz Milewski’s book Category Theory For Programmers ⊕ You can build a PDF of this book from the github project milewski-ctfp-pdf . is another Google hit when searching for category theory for programmers.

The first chapter of Category Theory for Programmers gets right to business, defining a category as a thing with objects and arrows, whose fundamental property is composition. His presentation is both a clear and concise introduction to a category, and is copied below.

Given:

    a function f that takes an "A" and returns a "B"
    that is,

           f :: A -> B

    and a function g that takes a "B" and returns an "C"

           g :: B -> C


Then if A, B and C belong to the same category you can define
a new function h that composes these two:

          h :: A -> C

          h a = g(f(a))

or, more simply,

          h = g . f

The notation here is the Haskell programming language. You should read the period in that last line as “after”, as in “apply g after f”.

As a beginner, I found the graphical, set-based introduction of Category Theory Illustrated easier to understand.

Tags: functional