Table of Contents for PIM

I am down to the home stretch for publishing my upcoming book, “A Programmer’s Introduction to Mathematics.” I don’t have an exact publication date—I’m self publishing—but after months of editing, I’ve only got two chapters left in which to apply edits that I’ve already marked up in my physical copy. That and some notes from external reviewers, and adding jokes and anecdotes and fun exercises as time allows.

I’m committing to publishing by the end of the year. When that happens I’ll post here and also on the book’s mailing list. Here’s a sneak preview of the table of contents. And a shot of the cover design (still a work in progress)

img_20180914_184845-1-e1540048234788.jpg

Hanabi: a card game for logicians

Mathematics students often hear about the classic “blue-eyed islanders” puzzle early in their career. If you haven’t seen it, read Terry Tao’s excellent writeup linked above. The solution uses induction and the idea of common knowledge—I know X, and you know that I know X, and I know that you know that I know X, and so on—to make a striking inference from a seemingly useless piece of information.

Recreational mathematics is also full of puzzles involving prisoners wearing colored hats, where they can see others hats but not their own, and their goal is to each determine (often with high probability) the color of their own hat. Sometimes they are given the opportunity to convey a limited amount of information.

Over the last eight years I’ve been delighted by the renaissance of independent tabletop board and card games. Games leaning mathematical, like Set, have had a special place in my heart. Sadly, many games of incomplete information often fall to an onslaught of logic. One example is the popular game The Resistance (a.k.a. Avalon), in which players with unknown allegiances must either deduce which players are spies, or remain hidden as a spy while foiling a joint goal. With enough mathematicians playing, it can be easy to dictate a foolproof strategy. If we follow these steps, we can be 100% sure of victory against the spies, so anyone who disagrees with this plan or deviates from it is guaranteed to be a spy. Though we’re clearly digging a grave for our fun, it’s hard to close pandora’s logic box after it’s open. So I’m always on the lookout for games that resist being trivialized.

Enter Hanabi.

hanabi

A friend recently introduced me to the game, which channels the soul and complexion of the blue-eyed islanders and hat-donning prisoners into a delightful card game.

The game has simple rules: each player gets a hand that they may not see, but they reveal to all other players. The hands come from the following set of cards (with more 1’s than 2’s, and the fewest 5’s), and players work together, aiming to place cards from 1-5 in order in each color. It’s like solitaire, where stacks of different colors may progress independently, but a 2 must be placed before a 3.

Screen Shot 2018-08-09 at 8.34.18 PM

Then the players take turns, and on each turn a player may do one of the following:

  1. Choose a card from your hand to play. If the chosen card cannot be played (e.g, it’s a red 3 but only a red 1 is on the table), everyone gets a strike. Three strikes ends the game in a loss.
  2. Use an information token (limited in supply) to give one piece of information to one other player; the allowed types of information are explained below.
  3. Choose a card from your hand to discard, and regain an information token for future use.

The information you can give to a player to choosing a single feature (a specific rank or color), and pointing to all cards in that player’s hand that have that feature. Example: “these two cards are green”, or “this card is a 4”. House rules dictate whether “no cards are blue” is a valid piece of information. Officially—I like to think it’s in the spirit of the blue-eyed islander’s puzzle where “someone has blue eyes”—you must be able to point at something to reveal information about it.

So the game involves some randomness (the draw), and some resource management (the information tokens), but the heart of the game is figuring out how to convey as much information as possible in a single clue.

Just like the blue-eyed islander’s puzzle, giving a public piece of information to one player can indicate much more. Imagine their are 4 players. I can see your hand, but if, after looking at your hand, I decide instead to give Blair a clue, that gives you information that what’s in Blair’s hand is more valuable for me to reveal to her than what’s in your hand would be to reveal to you.

Another trick: say I know I have a 4, and say it’s the beginning of the game where 4’s are not playable, and the board has a blue 1 on it. If you play before me and you tell me that that same 4 and a second card are both blue, what does that tell me? It was certainly somewhat redundant: you told me more information about a card I knew was not playable, and seemingly not super-helpful information about a second card. After some reflection you can often infer that not only is the second card a blue 2, but also that you have at least one more 2 elsewhere in your hand that’s not immediately playable. That’s a lot of information!

The idea of common knowledge takes it down a rabbit hole that I haven’t quite gotten my head around, but which makes the game continually fun. If I know that you know that I can infer the above scenario with the blue 2, then you not giving me that clue tells me that either that situation isn’t present in my hand, or else that whatever information you’re instead giving to Matthieu is a higher priority. The more the group can understand to be commonly inferable (say, discussing strategies before starting the game), the more one can take advantage of common knowledge. The game starts to feel like a logical olympiad, where your worst enemy is your fallible memory, and if people aren’t playing at the same level, relying too much on an inference your teammate didn’t intend can cause grave mistakes!

It’s a guaranteed hit at your next gathering of logic-loving mathemalites!

For mathematicians, = does not mean equality

Every now and then I hear some ridiculous things about the equals symbol. Some large subset of programmers—perhaps related to functional programmers, perhaps not—seem to think that = should only and ever mean “equality in the mathematical sense.” The argument usually goes,

Functional programming gives us back that inalienable right to analyze things by using mathematics. Never again need we bear the burden of that foul mutant x = x+1! No novice programmer—nay, not even a mathematician!—could comprehend such flabbergastery. Tis a pinnacle of confusion!

It’s ironic that so much of the merits or detriment of the use of = is based on a veiled appeal to the purity of mathematics. Just as often software engineers turn the tables, and any similarity to mathematics is decried as elitist jibber jabber (Such an archaic and abstruse use of symbols! Oh no, big-O!).

In fact, equality is more rigorously defined in a programming language than it will ever be in mathematics. Even in the hottest pits of software hell, where there’s = and == and ===, throwing in ==== just to rub salt in the wound, each operator gets its own coherent definition and documentation. Learn it once and you’ll never go astray.

Not so in mathematics—oh yes, hide your children from the terrors that lurk. In mathematics equality is little more than a stand-in for the word “is,” oftentimes entirely dependent on context. Now gather round and listen to the tale of the true identities of the masquerader known as =.

Let’s start with some low-hanging fruit, the superficial concerns.

\displaystyle \sum_{i=1}^n i^2 + 3

If = were interpreted literally, i would be “equal” to 1, and “equal” to 2, and I’d facetiously demand 1 = 2. Aha! Where is your Gauss now?! But seriously, this bit of notation shows that mathematics has both expressions with scope and variables that change their value over time. And the \sum use for notation was established by Euler, long before algorithms jumped from logic to computers to billionaire Senate testimonies.

Likewise, set-builder notation often uses the same kind of equals-as-iterate.

\displaystyle A = \{ n^2 : n = 1, 2, \dots, 100 \}

In Python, or interpreting the expression literally, the value of n would be a tuple, producing a type error. (In Javascript, it produces 2. How could it be Javascript if it didn’t?)

Next up we have the sloppiness of functions. Let f(x) = 2x + 3. This is a function, and x is a variable. Rather than precisely say, f(2) = 7, we say that for x=2, f(x) = 7. So x is simultaneously an indeterminate input and a concrete value. The same scoping for programming functions bypass the naive expectation that equality means “now and forever.” Couple that with the question-as-equation f(x) = 7, in which one asks what values of x produce this result, if any, and you begin to see how deep the rabbit hole goes. To understand what someone means when they say f(x) = 7, you need to know the context.

But this is just the tip of the iceberg, and we’re drilling deep. The point is that = carries with it all kinds of baggage, not just the scope of a particular binding of a variable.

Continuing with functions, we have rational expressions like f(x) = \frac{(x+1)x}{x}. One often starts by saying “let’s let f be this function.” Then we want to analyze it, and in-so-doing we simplify to f(x) = x+1. To keep ourselves safe, we modify the domain of f to exclude x=0 post-hoc. But the flow of the argument is the same: we defer the exclusion of x=0 until we need it, meaning the equality at the beginning is a different equality than at the end. In effect, we have an infinitude of different kinds of equality for functions, one for each choice of what to exclude from the domain. And a mathematical proof might switch between them as needed.

“Why not just define a new function g with a different domain,” you ask? You can, but mathematicians don’t. And if you’re arguing in favor or against a particular notation, and using “mathematics” as your impenetrable shield, you’ve got to remember the famous definition of Reuben Hersh, that “mathematics is what mathematicians do.” For us, that means you can’t claim superiority based on an idea of mathematics that disagrees with mathematical practice. And mathematics, dear reader, is messier than programmers and philosophers would have one believe.

And now we turn to the Great Equality Contextualizer, the isomorphism. 

You see, all over mathematics there are objects which are not equal, but we want them to be. When you study symmetry, say, you learn that there is an algebraic structure to symmetry called a group. And the same structure—that is, the same true underlying relationships between the symmetries of a thing—can show up in many different guises. As a set, as a picture, as a class of functions, in polynomials and compass constructions and wallpapers, oh my! In each of these things we want to say that two symmetry structures are the same even if they look different. We want to overload equality when four-fold rotational symmetry applies to my table as well as a four-pointed star.

The tool we use for that is called an isomorphism. In brief terms, it’s a function between two objects, with an inverse, that preserves the structure you care about both ways. In fact, there is a special symbol for when two things are isomorphic, and it’s often \cong. But \cong is annoying to write, and it really just means “is the same as” the same way equality does. So mathematicians often drop the squiggle and use =.

Plus, there are a million kinds of isomorphism. Groups, graphs, vector spaces, rings, fields, modules, algebras, rational functions, varieties, Lie groups, *breathe* topological spaces, manifolds of all stripes, sheaves, schemes, lattices, knots, the list just keeps going on and on and on! No way are we making up a symbol for each one of these and the hundreds of variations we might come up with. And moreover, when you say two things are isomorphic, that gives you absolutely no indication of how they are isomorphic. It fact, it can be extremely tedious to compute isomorphisms between things, and it’s even known to be uncomputable in extreme cases! What good is equality if you can’t even check it?

But wait! You might ask, having read this blog for a while and knowing better than to not question a claim. All of these uses of equality are still equivalence relations, and x = x + 1 is not an equivalence relation!

Well, you got me there. Mathematicians love to keep equality as an equivalence relation. When mathematicians need to define an algorithm where the value of x changes in a nontrivial way, it’s usually done by setting x_0 equal to some starting value and letting x_{n} be defined as some function of x_{n-1} and smaller terms, like the good ol’ Fibonacci sequence x_0 = x_1 = 1 and x_n = x_{n-1} + x_{n-2}.

If mutation is so great, why do mathematicians use recursion so much? Huh? Huh?

Well, I’ve got two counterpoints. The first is that the goal here is to reason about the sequence, not to describe it in a way that can be efficiently carried out by a computer. When you say x = x + 1, you’re telling the computer that the old value of x need not linger, and you can do away with the space occupied by the previous value of x. To achieve the same result with recursion requires a whole other can of worms: memoization and tail recursive style and compiler optimizations to shed stack frames. It’s a lot more work to understand all that (to get to an equivalent solution) than it is to understand mutation! Simply stated, the goals of mathematics and programming are quite differently aligned. The former is about understanding a thing, and the latter is more often about describing a concrete process under threat of limited resources.

My second point is that mathematical notation is so flexible and adaptable that it doesn’t need mutation the same way programming languages need it. In mathematics we have no stack overflows, no register limits or page swaps, no limitations on variable names or memory allocation, our brains do the continuation passing for us, and we can rewrite history ad hoc and pile on abstractions as needed to achieve a particular goal. Even when you’re describing an algorithm in mathematics, you get the benefits of mathematical abstractions. A mathematician could easily introduce = as mutation in their work. Nothing stops them from doing so! It’s just that they rarely have a genuine need for it.

But of course, none of this changes that languages could use := or “let” instead of = for assignment. If a strict adherence to asymmetry for asymmetric operations helps you sleep at night, so be it. My point is that the case when = means assignment is an extremely simple bit of context. Much simpler than the albatrossian mental burden required to understand what mathematicians really mean when they write A = B.

Postscript: I hope everyone reading this realizes I’m embellishing a bit for the sake of entertainment. If you want to fight me, tell me the best tree isn’t aspen. I dare you.

Postpostscript: embarrassingly, I completely forgot about Big-O notation and friends (despite mentioning it in the article!) as a case where = does not mean equality! f(n) = O(log n) is a statement about upper bounds, not equality! Thanks to @lreyzin for keeping me honest.

Linear Programming and Healthy Diets — Part 2

Previously in this series:

Foods of the Father

My dad’s an interesting guy.

Every so often he picks up a health trend and/or weight loss goal that would make many people’s jaw drop. For example, we once went on a 5-day, 50-mile backpacking trip in the Grand Tetons, and my dad brought one of these per day for dinner, and had vitamin tablets for the rest of his sustenance. The rest of us planned for around 3,000 calories per day. He’s tried the “high fat” and “no fat” diets, and quite a few others. He’s concerned with losing weight, but also living longer, so he’s into caloric restriction among other things.

Recently he asked me to help him optimize his diet. He described a scheme he was performing by hand to minimize the number of calories he consumed per day while maintaining the minimum nutrients required by the FDA’s recommendations. He had a spreadsheet with the nutrients for each food, and a spreadsheet with the constraints on each nutrient. He wanted to come up with a collection of meals (or just throw all the food into a blender) that taste within reason but meet these criteria.

He was essentially solving a linear program by hand, roughly as best as one can, with a few hundred variables! After asking me whether there was “any kind of math” that could help him automate his laborious efforts, I decided to lend a hand. After all, it’s been over three years since I promised my readers I’d apply linear programming to optimize a diet (though it was optimizing for cost rather than calories).

Though it never went beyond what linear programming can handle, pretty quickly my dad’s requests specialized beyond what would interest a general reader. Perhaps this is the nature of math consulting, but it seems when you give someone what they want, they realize that’s not what they wanted.

But the basic ideas are still relevant enough. My solution is a hundred-ish lines of python to set up the input, using Google’s open source operations research tools as the core solver. Disclaimer: I work for Google but I don’t work on the team that wrote this tool. Also nothing in this post represents the views of my employer. It’s just me, and the scale of this problem is laughable for Google to care about anyway.

So this post is half tutorial showing how to use the or-tools python wrapper (it’s only somewhat documented), and half showing a realistic use case for linear programming.

However, don’t let this post dissuade you from the belief that linear programming is useful beyond dieting. People use linear programming to solve all kinds of interesting problems. Here are a few I came across in just the last few weeks:

And that’s not even to mention the ubiquitous applications in operations research (network flow, production optimization, economics) that every large company relies on. The applications seem endless!

As usual, all of the code and data we use in the making of this post is available at this blog’s Github page.

Update 2018-01-01: With this code my dad had tried a few inadvisable cooking techniques: take all the ingredients and throw them in an omelet, or blend them all together in a smoothie. Something about cooking the food alters the nutritional content, so he claims he needed to eat them more or less raw. The resulting “meals” were so unpalatable that he appears to have given up on the optimization techniques in this post. It seems the extreme end of the taste/health tradeoff is not where he wants to be. This suggests an open problem: find a good way to model (or lean from data) what foods taste good together, and in what quantities. One might be able to learn from a corpus of recipes, though I imagine that can only go so far for lightly-cooked ingredients. But with hypothetical constraints like, “penalize/prefer these foods being in the same meal”, one might be able to quantify the taste/health tradeoff in a way that makes my dad happy. Having an easy way to slide along the scale (rather than just naively optimize) would also potentially be useful.

Refresher

If you remember how linear programs work, you can safely skip this section.

As a refresher, let’s outline how to model the nutrition problem as a linear program and recall the basic notation. The variables are food in 100 gram increments. So x_1 might be the amount of canned peas consumed, x_2 lobster meat, etc. All variables would have to be nonnegative, of course. The objective is to minimize the total number of calories consumed. If c_1 \geq 0 is the amount of calories in 100g of canned peas, then one would pay c_1x_1 in calories contributed by peas. If we have n different variables, then the objective function is the linear combination

\textup{cost}(\mathbf{x}) = \sum_{j=1}^n c_j x_j

We’re using boldface \mathbf{x} to represent the vector (x_1, \dots, x_n). Likewise, \mathbf{c} will represent the cost vector of foods (c_1, \dots, c_n). As we’ve seen many times, we can compactly write the sum above as an inner product \langle \mathbf{c}, \mathbf{x} \rangle.

Finally, we require that the amount of each nutrient combined in the stuff we buy meets some threshold. So for each nutrient we have a constraint. The easiest one is calories; we require the total number of calories consumed is at least (say) 2,000. So if a_j represents the number of calories in food j, we require \sum_{j=1}^n a_j x_j \geq 2000. We might also want to restrict a maximum number of calories, but in general having a diet with more calories implies higher cost, and so when the linear program minimizes cost we should expect it not to produce a diet with significantly more than 2,000 calories.

Since we have one set of nutrient information for each pair of (nutrient, food), we need to get fancier with the indexing. I’ll call a_{i,j} the amount of nutrient i in food j. Note that A = (a_{i,j}) will be a big matrix, so I’m saying that nutrients i represent the rows of the matrix and foods j represent the columns. That is, each row of the matrix represents the amount of one specific nutrient in all the foods, and each column represents the nutritional content of a single food. We’ll always use n to denote the number of foods, and m to denote the number of nutrients.

Finally, we have a lower and upper bound for each nutrient, which behind the scenes are converted into lower bounds (possibly negating the variables). This isn’t required to write the program, as we’ll see. In notation, we require that for every 1 \leq i \leq m, the nutrient constraint \sum_{j=1}^n a_{i,j} x_j \geq b_i is satisfied. If we again use vector notation for the constraints \mathbf{b}, we can write the entire set of constraints as a “matrix equation”

A \mathbf{x} \geq \mathbf{b}

And this means each entry of the vector you get from multiplying the left-hand-side is greater than the corresponding entry on the right-hand-side. So the entire linear program is summarized as follows

\displaystyle \begin{aligned} \textup{min } & \langle \mathbf{c} , \mathbf{x} \rangle  \\ \textup{such that } & A \mathbf{x}  \geq \mathbf{b} \\ & \mathbf{x}  \geq \mathbf{0} \end{aligned}

That’s the syntactical form of our linear program. Now all (!) we need to do is pick a set of foods and nutrients, and fill in the constants for A, \mathbf{c}, \mathbf{b}.

Nutrients and Foods

The easier of the two pieces of data is the nutrient constraints. The system used in the United States is called the Dietary Reference Intake system. It consists of five parts, which I’ve paraphrased from Wikipedia.

  • Estimated Average Requirements (EAR), expected to satisfy the needs of 50% of the people in an age group.
  • Recommended Dietary Allowances (RDA), the daily intake level considered sufficient to meet the requirements of 97.5% of healthy individuals (two standard deviations).
  • Adequate Intake (AI), where no RDA has been established. Meant to complement the RDA, but has less solid evidence.
  • Tolerable upper intake levels (UL), the highest level of daily consumption that have not shown harmful side effects.

While my dad come up with his own custom set of constraints, the ones I’ve posted on the github repository are essentially copy/paste from the current RDA/AI as a lower bound, with the UL as an upper bound. The values I selected are in a csv. Missing values in the upper bound column mean there is no upper bound. And sorry ladies, since it’s for my dad I chose the male values. Women have slightly different values due to different average size/weight.

Nutrient values for food are a little bit harder, because nutrient data isn’t easy to come by. There are a few databases out there, all of which are incomplete, and some of which charge for use. My dad spent a long time hunting down the nutrients (he wanted some additional special nutrients) for his top 200-odd foods.

Instead, in this post we’ll use the USDA’s free-to-use database of 8,000+ foods. It comes in a single, abbreviated, oddly-formatted text file which I’ve parsed into a csv and chosen an arbitrary subset of 800-ish foods to play around with.

Python OR Tools

Google’s ortools docs ask you to download a tarball to install their package, but I found that was unnecessary (perhaps it’s required to attach a third-party solver to their interface?). Instead, one can just pip install it.

pip3 install py3-ortools

Then in a python script, you can import the ortools library and create a simple linear program:

from ortools.linear_solver import pywraplp

solver = pywraplp.Solver('my_LP', pywraplp.Solver.GLOP_LINEAR_PROGRAMMING)

x = solver.NumVar(0, 10, 'my first variable')
y = solver.NumVar(0, 10, 'my second variable')

solver.Add(x + y <= 7) solver.Add(x - 2 * y >= -2)

objective = solver.Objective()
objective.SetCoefficient(x, 3)
objective.SetCoefficient(y, 1)
objective.SetMaximization()

status = solver.Solve()

if status not in [solver.OPTIMAL, solver.FEASIBLE]:
    raise Exception('Unable to find feasible solution')

print(x.solution_value())
print(y.solution_value())

This provides the basic idea of the library. You can use python’s operator overloading (to an extent) to make the constraints look nice in the source code.

Setting up the food LP

The main file diet_optimizer.py contains a definition for a class, which, in addition to loading the data, encapsulates all the variables and constraints.

class DietOptimizer(object):
    def __init__(self, nutrient_data_filename='nutrients.csv',
                 nutrient_constraints_filename='constraints.csv'):

        self.food_table = # load data into a list of dicts
        self.constraints_data = # load data into a list of dicts

        self.solver = pywraplp.Solver('diet_optimizer', pywraplp.Solver.GLOP_LINEAR_PROGRAMMING)
        self.create_variable_dict()
        self.create_constraints()

        self.objective = self.solver.Objective()
        for row in self.food_table:
            name = row['description']
            var = self.variable_dict[name]
            calories_in_food = row[calories_name]
            self.objective.SetCoefficient(var, calories_in_food)
        self.objective.SetMinimization()

We’ll get into the variables and constraints momentarily, but before that we can see the solve method

    def solve(self):
        '''
            Return a dictionary with 'foods' and 'nutrients' keys representing
            the solution and the nutrient amounts for the chosen diet
        '''
        status = self.solver.Solve()
        if status not in [self.solver.OPTIMAL, self.solver.FEASIBLE]:
            raise Exception('Unable to find feasible solution')

        chosen_foods = {
            food_name: var.solution_value()
            for food_name, var in self.variable_dict.items() if var.solution_value() > 1e-10
        }

        self.chosen_foods = chosen_foods

        nutrients = {
            row['nutrient']: self.nutrients_in_diet(chosen_foods, row['nutrient'])
            for row in self.constraints_table
        }

        return {
            'foods': chosen_foods,
            'nutrients': nutrients,
        }

Here nutrients_in_diet is a helper function which, given a dictionary of foods and a nutrient, outputs the nutrient contents for that food. This can be used independently of the solver to evaluate the nutrient contents of a proposed diet.

Next we have the method to create the variables.

    def create_variable_dict(self):
        '''
            The variables are the amount of each food to include, denominated in units of 100g
        '''
        self.variable_dict = dict(
            (row['description'], self.solver.NumVar(0, 10, row['description']))
            for row in self.food_table
        )

Each food must be present in a nonnegative amount, and I’ve imposed a cap of 10 (1kg) for any individual food. The reason for this is that I originally had a “water” constraint, and the linear program decided to optimize for that by asking one to drink 2L of red wine per day. I neglected to put in an alcohol nutrient (because it was not already there and I’m lazy), so instead I limited the amount of any individual food. It still seems like a reasonable constraint to impose that nobody would want to eat more than 1kg of any single food on one day.

Finally, we can construct the constraints. The core method takes a nutrient and a lower and upper bound:

    def create_constraint(self, nutrient_name, lower, upper):
        '''
            Each constraint is a lower and upper bound on the
            sum of all food variables, scaled by how much of the
            relevant nutrient is in that food.
        '''
        if not lower:
            print('Warning! Nutrient %s has no lower bound!'.format(nutrient_name))
            return

        sum_of_foods = self.foods_for_nutrient(nutrient_name)
        constraint_lb = lower <= sum_of_foods
        self.solver.Add(constraint_lb)
        self.constraint_dict[nutrient_name + ' (lower bound)'] = constraint_lb

        if not upper:
            return  # no upper bound in the data

        constraint_ub = sum_of_foods <= upper
        self.solver.Add(constraint_ub)
        self.constraint_dict[nutrient_name + ' (upper bound)'] = constraint_ub

This method is mostly bookkeeping, while foods_for_nutrient does the individual nutrient lookup. Note that one is not allowed to do a double-ended inequality like self.solver.Add(lower <= sum_of_foods <= upper). If you try, ortools will ignore one end of the bound.

    def foods_for_nutrient(self, nutrient_name, scale_by=1.0):
        # a helper function that computes the scaled sum of all food variables
        # for a given nutrient
        relevant_foods = []
        for row in self.food_table:
            var = self.variable_dict[row['description']]
            nutrient_amount = row[nutrient_name]
            if nutrient_amount > 0:
                relevant_foods.append(scale_by * nutrient_amount * var)

        if len(relevant_foods) == 0:
            print('Warning! Nutrient %s has no relevant foods!'.format(nutrient_name))
            return

        return SumArray(relevant_foods)

Here we are a bit inefficient by iterating through the entire table, instead of just those foods containing the nutrient in question. But there are only a few hundred foods in our sample database (8,000 if you use the entire SR28 database), and so the optimization isn’t necessary.

Also note that while ortools allows one to use the sum method, it does so in a naive way, because sum([a, b, c]) becomes ((a + b) + c), which is a problem because if the list is too long their library exceeds Python’s default recursion limit. Instead we construct a SumArray by hand.

Finally, though we omitted it here for simplicity, throughout the code in the Github repository you’ll see references to percent_constraints. This exists because some nutrients, like fat, are recommended to be restricted to a percentage of calories, not an absolute amount. So we define a mechanism to specify a nutrient should be handled with percents, and a mapping from grams to calories. This ends up using the scale_by parameter above, both to scale fat by 9 calories per gram, and to scale calories to be a percentage. Cf. the special function for creating percent constraints.

Finally, we have methods just for pretty-printing the optimization problem and the solution, called summarize_optimization_problem and summarize_solution, respectively.

Running the solver

Invoking the solver is trivial.

if __name__ == "__main__":
    solver = DietOptimizer()
    # solver.summarize_optimization_problem()
    solution = solver.solve()
    solver.summarize_solution(solution)

With the example foods and constraints in the github repo, the result is:

Diet:
--------------------------------------------------

  298.9g: ALCOHOLIC BEV,WINE,TABLE,WHITE,MUSCAT
 1000.0g: ALFALFA SEEDS,SPROUTED,RAW
   38.5g: CURRY POWDER
    2.1g: CUTTLEFISH,MXD SP,CKD,MOIST HEAT
   31.3g: EGG,WHL,CKD,HARD-BOILED
   24.0g: LOTUS ROOT,CKD,BLD,DRND,WO/SALT
  296.5g: MACKEREL,JACK,CND,DRND SOL
  161.0g: POMPANO,FLORIDA,CKD,DRY HEAT
   87.5g: ROSEMARY,FRESH
  239.1g: SWEET POTATO,CKD,BKD IN SKN,FLESH,WO/ SALT

Nutrient totals
--------------------------------------------------

    1700.0 mg   calcium                   [1700.0, 2100.0]
     130.0 g    carbohydrate              [130.0, ]
     550.0 mg   choline                   [550.0, 3500.0]
       3.3 mg   copper                    [0.9, 10.0]
      60.5 g    dietary fiber             [38.0, ]
     549.7 μg   dietary folate            [400.0, 1000.0]
    1800.0 kcal energy                    [1800.0, 2100.0]
      32.4 mg   iron                      [18.0, 45.0]
     681.7 mg   magnesium                 [420.0, ]
       7.3 mg   manganese                 [2.3, 11.0]
      35.0 mg   niacin                    [16.0, 35.0]
      11.7 mg   pantothenic acid          [5.0, ]
    2554.3 mg   phosphorus                [1250.0, 4000.0]
      14.0 g    polyunsaturated fatty acids  [1.6, 16.0]
    4700.0 mg   potassium                 [4700.0, ]
     165.2 g    protein                   [56.0, ]
       2.8 mg   riboflavin                [1.3, ]
     220.8 μg   selenium                  [55.0, 400.0]
    1500.0 mg   sodium                    [1500.0, 2300.0]
       2.4 mg   thiamin                   [1.2, ]
      59.4 g    total fat                 [20.0, 35.0]        (29.7% of calories)
    3000.0 μg   vitamin a                 [900.0, 3000.0]
      23.0 μg   vitamin b12               [2.4, ]
       2.4 mg   vitamin b6                [1.7, 100.0]
     157.6 mg   vitamin c                 [90.0, 2000.0]
     893.0 iu   vitamin d                 [400.0, 4000.0]
      15.0 mg   vitamin e                 [15.0, 1000.0]
     349.4 μg   vitamin k                 [120.0, ]
      17.2 mg   zinc                      [11.0, 40.0]

Unfortunately, this asks for a kilo of raw alfalfa sprouts, which I definitely would not eat. The problem is that alfalfa is ridiculously nutritious. Summarizing the diet with the print_details flag set, we see they contribute a significant amount of nearly every important nutrient.

1000.0g: ALFALFA SEEDS,SPROUTED,RAW
	18.8% of calcium (mg)
	16.2% of carbohydrate (g)
	26.2% of choline (mg)
	47.3% of copper (mg)
	31.4% of dietary fiber (g)
	65.5% of dietary folate (μg)
	12.8% of energy (kcal)
	29.7% of iron (mg)
	39.6% of magnesium (mg)
	25.6% of manganese (mg)
	13.7% of niacin (mg)
	48.2% of pantothenic acid (mg)
	27.4% of phosphorus (mg)
	29.3% of polyunsaturated fatty acids (g)
	16.8% of potassium (mg)
	24.2% of protein (g)
	45.1% of riboflavin (mg)
	2.7% of selenium (μg)
	4.0% of sodium (mg)
	31.9% of thiamin (mg)
	11.6% of total fat (g)
	2.7% of vitamin a (μg)
	13.9% of vitamin b6 (mg)
	52.0% of vitamin c (mg)
	1.3% of vitamin e (mg)
	87.3% of vitamin k (μg)
	53.5% of zinc (mg)

But ignoring that, we have some reasonable sounding foods: fish, sweet potato, rosemary (okay that’s a ton of rosemary), egg and wine. I bet someone could make a tasty meal from those rough ingredients.

Extensions and Exercises

No tutorial would be complete without exercises. All of these are related to the actual linear program modeling problem.

Food groups: Suppose you had an additional column for each food called “food group.” You want to create a balanced meal, so you add additional constraint for each food group requiring some food, but not too much, from each group. Furthermore, for certain foods like spices, one could add a special constraint for each spice requiring not more than, say, 20g of any given spice. Or else, as one can see, the linear program can produce diets involving obscenely large amounts of spices.

Starting from a given set of foods: Supposing you have an idea for a recipe (or couple of recipes for a day’s meals), but you want to add whatever else is needed to make it meet the nutritional standards. Modify the LP to allow for this.

Integer variations: The ortools package supports integer programming as well. All you need to do to enable this is change the solver type to CBC_MIXED_INTEGER_PROGRAMMING. The solver will run as normal, and now you can create integer-valued variables using solver.IntVar instead of NumVar. Using binary variables, one can define logical OR constraints (figure out how this must work). Define a new binary variable for each food, and define a constraint that makes this variable 0 when the food is not used, and 1 when the food is used. Then add a term to the optimization problem that penalizes having too many different foods in a daily diet.

(Harder) Sampling: Part of the motivation for this project is to come up with a number of different dishes that are all “good” with respect to this optimization problem. Perhaps there is more than one optimal solution, or perhaps there are qualitatively different diets that are close enough to optimal. However, this implementation produces a deterministic output. Find a way to introduce randomness into the program, so that you can get more than one solution.

Feel free to suggest other ideas, and extend or rewrite the model to do something completely different. The sky’s the limit!