*[My notes on StrangeLoop 2013: Table of Contents]*

- We have a function `f` that takes two values from a set and produces another member of the same set.
- The order of `f`'s arguments doesn't matter.
- The grouping of `f`'s arguments doesn't matter.
- There is some identity value, a conceptual "zero", that doesn't matter, in the sense that `f(i, zero)` for any *i* is simply *i*.
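To make the four properties above concrete (together they define what is called a commutative monoid), here is a minimal Python sketch that spot-checks them on a handful of sample values. The helper name `looks_like_commutative_monoid`, the `member` predicate, and the sample lists are all made up for illustration, and passing the check on a finite sample is evidence, not a proof.

```python
from itertools import product

def looks_like_commutative_monoid(f, zero, samples, member):
    """Spot-check the four properties on a finite sample of values."""
    pairs = list(product(samples, repeat=2))
    # f produces another member of the same set
    closed = all(member(f(x, y)) for x, y in pairs)
    # the order of the arguments doesn't matter
    commutative = all(f(x, y) == f(y, x) for x, y in pairs)
    # the grouping of the arguments doesn't matter
    associative = all(f(x, f(y, z)) == f(f(x, y), z)
                      for x, y, z in product(samples, repeat=3))
    # f(i, zero) is simply i
    identity = all(f(x, zero) == x for x in samples)
    return closed and commutative and associative and identity

def is_int(v):
    return isinstance(v, int)

def is_natural(v):
    return isinstance(v, int) and v >= 0

ints = [-3, 0, 1, 4, 10]
naturals = [0, 1, 4, 10]

# Addition over the integers, with 0 as the identity.
print(looks_like_commutative_monoid(lambda x, y: x + y, 0, ints, is_int))
# Maximum over the non-negative integers, again with 0 as the identity.
print(looks_like_commutative_monoid(max, 0, naturals, is_natural))
# Subtraction fails: the order of its arguments matters.
print(looks_like_commutative_monoid(lambda x, y: x - y, 0, ints, is_int))
```

Sum and max pass the check on these samples, while subtraction fails as soon as the arguments are swapped.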

    S ⊕ S → S
    x ⊕ y = y ⊕ x
    x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
    x ⊕ 0 = x

I learned about this and other such patterns in grad school when I took an abstract algebra course for kicks. No one told me at the time that I'd be seeing them again as soon as someone created the Internet and it unleashed a torrent of data on everyone.

Just why we are seeing the idea of a commutative monoid again was the heart of Bryant's talk. When we have data coming into our company from multiple network sources, at varying rates of usage and data flow, and we want to extract meaning from the data, it can be incredibly handy if the meaning we hope to extract -- the sum of all the values, or the largest -- can be computed using a commutative monoid. You can run multiple copies of your function at the entry point of each source, and combine the partial results later, in any order.

Bryant showed this much more attractively than that, using cute little pictures with boxes. But then, there should be an advantage to going to the actual talk... With pictures and fairly straightforward examples, he was able to demystify the abstract math and deliver on his talk's abstract:

> A mathematician friend of mine tweeted that anyone who doesn't understand abelian groups shouldn't build analytics systems. I'd turn that around and say that anyone who builds analytics systems ends up understanding abelian groups, whether they know it or not.

That's an important point. Just because you haven't studied group theory or abstract algebra doesn't mean you shouldn't do analytics. You just need to be prepared to learn some new math when it's helpful.

As programmers, we are all looking for opportunities to capitalize on patterns and to generalize code for use in a wider set of circumstances. When we do, we may re-invent the wheel a bit. That's okay. But also look for opportunities to capitalize on patterns recognized and codified by others already.

Unfortunately, not all data analysis is as simple as summing or maximizing. What if I need to find an average? The average operator doesn't form a commutative monoid with numbers. It falls short in almost every way. But if you switch from the set of numbers to a set of pairs, you can define an operator that does form a commutative monoid, and the work breaks into three steps:

- prepare the data by transforming it into the space of a commutative monoid
- reduce the data to a single value in that space, using the appropriate operator
- present the result by transforming it back into its original space
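
As an illustration of these three steps, here is a minimal Python sketch that computes an average by working with pairs. The choice of `(sum, count)` pairs is an assumption (it is a common way to do this, not necessarily the one shown in the talk), and the names `prepare`, `combine`, `present`, `summarize`, and the sample `sources` are made up for the example.

```python
import random
from functools import reduce

ZERO = (0, 0)  # identity pair: a sum of 0 over 0 items

def prepare(x):
    """Step 1: move a number into the space of (sum, count) pairs."""
    return (x, 1)

def combine(a, b):
    """Step 2: the operator on pairs; order and grouping don't matter."""
    return (a[0] + b[0], a[1] + b[1])

def present(pair):
    """Step 3: move the reduced pair back to a plain number."""
    total, count = pair
    return total / count if count else None

def summarize(values):
    """Reduce one source's values to a single partial (sum, count) pair."""
    return reduce(combine, map(prepare, values), ZERO)

# Hypothetical data arriving from four sources; one of them is empty.
sources = [[3, 5, 8], [13], [], [21, 34]]

# Each source computes its own partial result independently...
partials = [summarize(s) for s in sources]

# ...and the partials can be merged later, in any order.
random.shuffle(partials)
overall = reduce(combine, partials, ZERO)

print(overall)           # (84, 6)
print(present(overall))  # 14.0
```

Because `combine` is commutative and associative and `ZERO` is its identity, shuffling the partial results or leaving a source empty does not change the answer.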