Back in 2020, when I worked in the supply chain side of Google, I had a fun and impactful side project related to human-level explanations of linear programs.

A linear program is a mathematical model that defines some number of variables, linear constraints on them, and a linear objective function. When some variables are required to be integers (an integer linear program, or ILP), you can solve a lot of useful problems like scheduling, routing, and packing. That’s basically how all supply chain optimization works.
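To make that concrete, here’s a toy ILP written with Google’s OR-Tools; the packing numbers are invented for illustration:

```python
# A toy "packing" ILP: how many of each product to load on a truck,
# maximizing revenue subject to a volume limit. All numbers are made up.
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")  # assumes the SCIP backend is available

# Integer decision variables: units of each product to load.
widgets = solver.IntVar(0, 100, "widgets")
gadgets = solver.IntVar(0, 100, "gadgets")

# Linear constraint: everything has to fit in the truck.
solver.Add(3 * widgets + 5 * gadgets <= 200)

# Linear objective: maximize revenue.
solver.Maximize(4 * widgets + 7 * gadgets)

if solver.Solve() == pywraplp.Solver.OPTIMAL:
    print("widgets:", widgets.solution_value())
    print("gadgets:", gadgets.solution_value())
```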

But when solvers solve models, the optimal solution may not be what you expect. Sometimes it’s a bug, sometimes it’s bad data, sometimes it’s just counter-intuitively correct. Understanding the behavior can be challenging and time consuming even for a modeling expert, especially when your models have thousands or millions of variables and constraints. For someone on the business side, it’s as much of a black box as more sophisticated machine learning models.

The simple idea behind my explanation work was to interactively modify a linear model by adding or removing constraints, or tweaking the objective function or various coefficients, and then re-solving the model and diffing the results.
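A minimal sketch of that loop, with hypothetical stand-ins (`build_model`, `apply_change`, `solve`) for whatever modeling layer you use, not my actual code:

```python
# Sketch of the re-solve-and-diff idea. "solve" is assumed to return a
# dict mapping variable names to their values in the optimal solution.
def diff_solutions(base, modified, tolerance=1e-6):
    """Return the variables whose values changed between two solved models."""
    changed = {}
    for name in base.keys() | modified.keys():
        old, new = base.get(name, 0.0), modified.get(name, 0.0)
        if abs(new - old) > tolerance:
            changed[name] = (old, new)
    return changed


def explain_change(build_model, apply_change, solve, change):
    """Solve the model with and without a change, then diff the results."""
    base_solution = solve(build_model())
    modified_solution = solve(apply_change(build_model(), change))
    return diff_solutions(base_solution, modified_solution)
```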

It turns out that if you sprinkle a bunch of semantically meaningful metadata on the variables and constraints (e.g., this variable is indexed by this location or product or time window), then you can smartly aggregate and diff the two solved models, and this is usually enough for an expert in the model to quickly debug a problem. It also helps during the modeling process, to iron out bugs in unit tests and such.
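As a sketch (with made-up metadata fields like location and product), the aggregation step might roll the diff up along one of those dimensions:

```python
from collections import defaultdict

# Suppose each variable carries metadata like
#   {"location": "DC-7", "product": "widget", "week": 14}
# (invented fields, for illustration). Then a raw variable-by-variable diff
# can be rolled up by any dimension instead of reported one variable at a time.
def aggregate_diff(changed, metadata, dimension):
    """Sum solution-value changes per value of one metadata dimension."""
    totals = defaultdict(float)
    for name, (old, new) in changed.items():
        key = metadata[name].get(dimension, "unknown")
        totals[key] += new - old
    return dict(totals)

# e.g. aggregate_diff(diff, metadata, "location")
#   -> {"DC-7": -120.0, "DC-3": 120.0}   # "shipments moved from DC-7 to DC-3"
```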

I had ambitions to go a lot further, for example to convert human-level queries to the structured queries my explanation system required. And then to do a causal analysis on the changes in the resulting solutions, rather than just a diff, and convert those back to human-level insights. Even more, to automate the process of exploring the space of changes to a model to surface insights about its sensitivity and robustness before a human thinks to ask about it.

Fast forward to today, when a colleague (who was working on a similar idea himself) pointed me to this 2023 paper by Microsoft researchers Beibin Li, Konstantina Mellou, Bo Zhang, Jeevan Pathuri, and Ishai Menache, “Large Language Models for Supply Chain Optimization”. They basically did the same thing as me, but added an LLM to convert natural language queries to their structured query language.

But overall I think the natural-language-to-structured-query translation is a great fit for LLMs in this context. You help a mass of humans in a chaotic environment convert their confusion into structured queries (with a very narrowly scoped API) and then have a deep, robust, tested classical system evaluate the query and produce a table of results that the LLM can patiently summarize in words.
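If I were sketching that pipeline today, it might look something like this; all the names are hypothetical, and this is my framing rather than the paper’s actual interface:

```python
from dataclasses import dataclass
from typing import Optional

# A deliberately narrow query schema: the LLM only has to fill in these
# fields, and everything downstream is the classical, tested machinery.
# (All names here are hypothetical.)
@dataclass
class WhatIfQuery:
    constraint_id: str       # which constraint to modify or drop
    action: str              # "remove", "tighten", "relax", ...
    amount: Optional[float]  # e.g. tighten a capacity by 10%
    group_by: str            # metadata dimension for the summary table


def answer(question, llm_parse, llm_summarize, run_explainer):
    """Natural language in, natural language out, classical solver in the middle."""
    query = llm_parse(question)            # LLM: free text -> WhatIfQuery
    table = run_explainer(query)           # solver + diff + aggregation
    return llm_summarize(question, table)  # LLM: result table -> plain English
```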

I also think it’s funny that, without LLMs, and despite my belief that this problem space is novel and impactful, it’s basically not publishable as research. It appears it was only the LLM aspect that gave this paper research-level potential, yet as far as I can tell it has yet to be accepted at a conference or journal. But perhaps the application is so mundane to researchers that even using an LLM doesn’t make it novel.

To my mind, having a robust theory and practice around explaining simpler systems is a prerequisite for even hoping to “explain” more complex AI systems. But besides me and these Microsoft researchers, I haven’t heard of anyone who cares about explaining ILPs. So basically they should make me a reviewer on that Microsoft paper and I can give it a “strong accept.”

