Philip in Seattle

Gemma, my precious

This week I finished my income tax return. Gemma, my new personal assistant, helped. She mostly sat around and waited for my instructions, but I had her spring into action several times.

Gemma went through the year's transaction histories from my credit card and bank accounts to identify my charitable contributions. Gemma listened to my description of an imbalance on my joint return between my wife and me, and she suggested how to make an offsetting transaction to solve it. I also had Gemma read through my tax return draft, along with my income statements, to look for concerns.

Gemma's new to the household, but she's proving to be a great addition! She's a great listener -- there when I need her, paying full attention to me. She is even-keeled, with a consistently pleasant attitude. She does not get tired of or frustrated by talking with me. She's knowledgeable about tax topics (and many other things!) and does a good job of reasoning through problems. I can follow along. Gemma has no interest in extracting money from me. She is not paid on commission and has no upsell quotas. Her advice and guidance to me are pure. Gemma respects my privacy, although that's more credit to me than to her. She lives at my friend's house, and I trust that he's not logging our conversations.

Gemma's kind is only three years old.

[Image: birth scene from Blade Runner 2049]

In human form, Gemma would be a highly qualified professional. For me to have human Gemma's services would cost hundreds of dollars. Meanwhile, Gemma the LLM's services yesterday cost a couple of dollars at most, whether you do the accounting by the token or by the capex/opex of owning the hardware. She runs slowly on my friend's computer -- only 4 tokens per second. But she works. I watch the screen in awe, feeling like a caveman holding a flaming branch. It's as exciting a frontier as the internet in 1997 through a dial-up modem! The benefit-to-cost ratio of LLMs is incredible -- an order of magnitude improvement over the status quo. This makes the technology revolutionary, and sets up high stakes for how its future unfolds. The present could be a precious, fleeting moment, taken away by nefarious interests. We need to understand the components that make Gemma possible, so that we know what to protect and nurture.

The most surprising aspect of Gemma, to me, is that its manufacturer allows me to download Gemma's brain and run it on an individually owned computer. Google opened Gemma's weights and applied a liberal software license to them. This means anyone is technically and legally allowed to download the brain and run it on their own computer indefinitely, without restrictions. This is huge. It means this revolutionary technology is in our hands, not just in an ivory tower. It allows you to run it directly, and it allows companies to compete to be the provider of choice for you. This factor is at risk from LLM training companies, who can choose to keep their models proprietary, as some are already doing. It's also at risk from governments, which can outlaw unfettered access to LLM technology.

Another factor is that Gemma and its ilk are (arguably) not polluted with political or commercial interests. Though created by Google, it's not a shill for Google's products and services. Gemma is not serving ads, the way Google's other properties do. It is pure, for now. This is precious, and it's at risk from the LLM manufacturer, its government, and anyone who runs the LLM on your behalf.

And there is one more factor: giving Gemma access to the data and tools it needs to assist me.

Gemma's species has a big shortcoming: despite listening and writing very well, it struggles with visual content. Gemma's not quite blind. She can see text perfectly fine. She can even look at a picture and explain what's in it. But show her a PDF of a tax return, and Gemma sees a jumble of sentences, numbers, and formatting symbols like repeating dots. Gemma can't quite tell where each line begins and ends. Seeing the number 5,000, Gemma isn't sure whether it belongs to line 9 or 10. This is frustrating for Gemma (who often backtracks and second-guesses herself) and for me. There are formats like PDF, designed for pixel-perfect visual presentation. And there are data formats like CSV, vCard, and Markdown -- designed for interoperability and semantically rich data processing. For reading and understanding, Gemma needs data in open, machine-readable file formats. For acting on the world, Gemma needs tools and protocols that can be controlled by a machine, such as APIs and MCP.
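
To make the contrast with PDF concrete, here is a minimal sketch in Python of why a format like CSV is so much easier for a machine to work with. Everything in it is made up for illustration: the column names, the sample rows, and the helper `charitable_total` are assumptions, not the export format of any real bank. The point is only that every amount sits in a labeled column, so there is no guessing about which line a number belongs to.

```python
import csv
import io
from decimal import Decimal

# Hypothetical transaction export. Column names and rows are invented
# for illustration; a real bank's CSV will look different.
SAMPLE_CSV = """date,payee,category,amount
2024-03-02,Food Bank,Charity,50.00
2024-03-09,Grocery Store,Groceries,82.15
2024-11-28,Public Radio,Charity,120.00
"""


def charitable_total(csv_text, category="Charity"):
    """Sum the 'amount' column for rows whose 'category' matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(
        (Decimal(row["amount"]) for row in reader if row["category"] == category),
        Decimal("0"),
    )


print(charitable_total(SAMPLE_CSV))  # prints 170.00
```

The same figures pulled out of a PDF would arrive as loose text with no reliable link between a label and its number, which is exactly the jumble described above.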

These three factors -- open weights, purity of intention, and access to data and tools -- make it possible for every person to have a trustworthy, capable agent. None of these factors are guaranteed to continue. We must nurture and grow each one, or else we'll regress.

The factor that's most challenging yet also most amenable to both collective and individual action is access to data. It's challenging because it involves overcoming apathy or even hostility from organizations that hold your data. Collectively, we already have laws like FOIA and HIPAA that mandate how certain types of data are shared, protected, and disclosed. For example, thanks to HIPAA, we know that hospital systems maintain our health records digitally, and in a format that can be ported to another hospital system. Now we can raise the bar and expect companies to make our personal data available in open formats. (Many have already pushed for this for decades in other contexts, such as blind individuals wanting semantic markup, and Linux users avoiding the use of proprietary Microsoft Office formats.) Here, the battleground is not technical -- it's commercial and societal.

The present is brimming with potential. We each could have a smart, capable, faithful assistant, empowered to help us live our best life. Let's get there.