Tuesday, August 29, 2023

Give me liberty, or give me lattes!

Imagine it's 1880. You are a company man living in a company town. Last year, a once-in-a-lifetime opportunity came up: working at the Pullman Palace Car Company to design the next generation of sleeper cars. You are a mechanical engineer by training and have developed quite the expertise in rail wheel science. George Pullman himself personally sought out your skills. In return, you were promised legacy in the burgeoning railroad business, building the future of transportation to be used by millions. The piston wheels you design will quite literally propel the American economy forward.


You were one of the first to settle in Pullman Town. It is the first “all brick city”, and you have the privilege of living alongside other engineers of your caliber. Conversation is always lively, and the town is kept in such pristine condition that you have never seen a piece of trash blighting the crimson cityscape. Nor, during leisurely time off, can you find yourself without things to do. You can take the wife and kids shopping at Pullman arcade, enjoy live music at the Pullman theater, or take in some sun at one of the many open-space parks. If feeling introspective, you can leaf through finely bound novels at the Pullman library. Every Sunday, of course, you dutifully take your family to mass at the Pullman Church. Life was good, made simple by the Pullman Company. The media mythologized Pullman as an industrial utopia, so thousands of others just like you flooded into the Pullman Company Town.

Pullman, Illinois, 1895

Though the Pullman era may seem a distant memory, its core principles have not disappeared. Instead, they have evolved and found a new home in big tech. In a fiercely competitive job market, tech companies have honed the Pullman Playbook to attract top talent. Company cafeterias serve in-house meals made with the finest organic ingredients and rotate their menus to keep things fresh. After breakfast, employees meander over to the company cafe, staffed by seasoned baristas (preferably with plenty of trendy tattoos for a dose of cultural enrichment), for their customary morning latte. Some companies even offer on-site laundry services.

Modern life is filled with complications, and big tech is more than willing to shoulder these responsibilities so its employees can focus on maximizing productivity. And while the Pullman company church may sound antiquated, tech companies have not shied away from supporting political fads and other secular charades as a spiritual stand-in.


The “Breakfast Bar” at Google’s Washington Campus

Living in Pullman, however, was not without its drawbacks. It required strict adherence to a draconian rulebook imposed by the Pullman corporation. All property in Pullman belonged to the company, with no path to individual ownership. Something as small as storing a whiskey bottle in one’s private quarters was strictly forbidden and punishable. Public gatherings were discouraged, if not banned outright. Corporate spies reported on the inner lives of employees to snuff out dissent, specifically searching for any signs of unionization. To that end, company inspectors reserved the right to enter any house in Pullman, at any time, under the pretext of ensuring housing was on par with company standards. The punishments for violating Pullman’s rules ranged from sharp fines to termination and, in the most extreme cases, eviction. Indeed, it is not an exaggeration to describe Pullman as a corporate panopticon.


While the amenity package is undeniably convenient, it should be clear now that the ultimate goal is to exert greater control over employees' time and behavior, both in and especially out of the workplace. From the Industrial Revolution to today, it has been in the interest of firms to cultivate an employee dependence that diminishes individual self-reliance.

Pullman should be viewed as a cautionary tale in this light, because thousands of talented laborers still decided the cost-benefit was worth it. It's somewhat counterintuitive that individuals with enough agency to pick up their stakes and go westward in search of opportunity would so readily give up the same agency that got them there. Indeed, one might argue that it flies in the face of American Exceptionalism. It suggests that once a company can deliver a certain level of lifestyle security to employees, most people, whether or not they will admit it, will surrender at least a handful of constitutional rights in exchange. Rather poignantly, in the case of Pullman, employees may have surrendered rights their great-grandfathers died to secure.

But big tech is nothing like Pullman, a skeptical reader may be thinking to herself. True, perhaps not on the surface, but we can find echoes of the same corporate paternalism at Google, arguably the Silicon Valley giant most beloved by its employees. For instance, in addition to the food, Google offices are adorned with large displays of primary colors, with the company citing research that a bright environment stimulates creativity. However, the truth of this claim is secondary to the fact that decorating an office like a daycare subconsciously infantilizes an adult workforce.


The welcome package for a Google "Adult"


Of course, the parallels extend far beyond office decor, or I probably wouldn't be writing this essay. Company-issued laptops come pre-installed with spyware that intercepts every keystroke. Every website you visit using a corporate computer is logged. Periods of inactivity are tracked and may resurface in performance reviews. While perhaps not as impactful as a Pullman inspector barging into your house, Google has entire teams dedicated to employee surveillance, and because it happens digitally, they can enter and exit completely undetected. None of this is unique to Google or even to Pullman. Once any corporation assumes roles traditionally reserved for the individual or for the family, it will inevitably assert its fatherly dominance.

As one Pullman employee famously lamented in an interview, “We are born in a Pullman house, fed from the Pullman shops, taught in the Pullman school, catechized in the Pullman Church, and when we die, we shall go to the Pullman Hell.” While he is clearly aware of his own helplessness, his sense of liberty, his innate drive towards individuality, remain paralyzed by amenities. 

Before proceeding further, we must first discuss why this even matters. It matters because anyone who advocates for minimal governance, like myself, must acknowledge the real threat corporate paternalism poses. There are really two choices. Option one is a society that cultivates self-reliance — high agency follows from self-reliance, and a high agency society requires minimal regulation because the populace, in effect, self-regulates. Corporate paternalism isn't a problem because no one buys in. The Antebellum South is a good example of this. Yeoman farmers made up over ninety percent of the population, largely subsisting off their own yields and the local economy, and the legislature at the time reflected this.

Option two is a society that looks more and more like Pullman and America today, where the large majority of laborers are hyper-specialized cogs (indeed, often very talented cogs like engineers and lawyers) who are hardly self-sufficient and wholly reliant on stable employment, an economy organized around trusted third parties, and global supply chains. Of course, while the idealist in me would prefer to live in the self-reliant society, you would have to be either blind or naive to believe minimal regulation will produce a good outcome in the present. So the power once reserved for the individual must be transferred to a regulator class, which, presumably, exists to protect the individual from companies like Pullman that get a little too zealous with their own rules (or the robber barons, who make George Pullman look like a saint in comparison).

None of this is to suggest that the Pullman itinerant acted irrationally. In fact, compared to the large majority of his contemporaries, he acted quite rationally, in the same way that studying law today is more rational than starting a small farm. America was already undergoing a bumpy transition from an agricultural society to an industrial one. Factories demanded long hours, paid low wages, and the working conditions were unsafe (that's putting it mildly). Nonetheless, hungry Americans flooded into already overcrowded cities to fill these roles. The rapid growth vastly outpaced urban development, leaving once beautiful cobblestone streets unrecognizable under piles of trash. So when contrasting the typical living conditions with Pullman’s well-maintained town, stable employment, and amenities, moving to Pullman was a no-brainer.

But it was the Panic of 1893, one of the worst economic downturns in U.S. history, that finally weakened George Pullman’s grip over his employees. The Pullman company saw mass layoffs and reduced wages, yet refused to adjust rent accordingly. Workers felt apprehension about an uncertain economic future. These feelings, coupled with a mounting resentment for corporate paternalism, set the stage for the Pullman Strike of 1894. It grew from a local cause into a national movement, garnering support from nearly 250,000 laborers across 27 states, ultimately concluding the Pullman Town experiment. All residential land-holdings were sold back to the government.

Perhaps the latent ideas of Emerson’s Self-Reliance, a still highly influential piece in that day, stirred awake, encouraging a rejection of corporate paternalism. Indeed, when compared to us, these 1800s Americans were temporally far less detached from Enlightenment ideals of liberty, freedom, and natural rights. Or perhaps the reality was less romantic, and the Pullman Strike would never have occurred if the company continued to deliver on its lifestyle promises. 

The situation today, on the other hand, is far more encompassing than a small company town in rural Illinois. In large part due to smartphones and social media, corporate paternalism today is no longer limited to the employees of those companies.

First, social media's explosion in popularity has made constant connectivity the new norm. As political power is downstream of cultural power, social media algorithms act as a centralizing cultural force, breeding a tacit conformity to whatever the algorithm decides to show you. Second, its addictive nature is a blatant affront to one's sense of liberty. Most people will readily tell you that social media is unhealthy and addictive, but like the Pullman employee mourning his situation, we as a society appear to lack the wherewithal to quit en masse. Third, the ease with which young people will allow applications to send them "push notifications" is particularly disturbing. We have truly allowed the invisible hand of corporate paternalism, one much more sinister than George Pullman, into our homes. A company may demand your undivided attention at any moment. And while one might argue that he needs his phone to alert him of a new deal, or a flash sale, or a new match on the dating apps, he fails to realize that his time and attention are priceless.

The overwhelming influence technology companies exert via social media is reminiscent of this classic Tocqueville quote, with one minor alteration: government is replaced by social media.

The will of man is not shattered, but softened, bent, and guided: men are seldom forced by it to act, but they are constantly restrained from acting: such a power does not destroy, but it prevents existence; it does not tyrannize, but it compresses, enervates, extinguishes, and stupefies a people, till each nation is reduced to be nothing better than a flock of timid and industrious animals, of which the social media is the shepherd. 

Incidentally, the manner in which technology has centralized is surprising when considering the promise of the early Internet. It promised liberty. It promised self-reliance. Each of us, just by virtue of having an ethernet connection, could lay claim to our own Walden Pond at which we could homestead. The early 2000s Internet reflected this and abounded with microblogs and geeky personal websites, and conversation was free-flowing without the all-seeing and never forgetful eye of social media. Compared to the construction of Thoreau's delightful little cabin, the cost of constructing a website, both in hosting fees and labor, was near zero, so many did it. But just two decades later we find a stunning lack of computer literacy. This, coupled with network effects, has left us siloed into a handful of corporate-controlled platforms.

It can feel like the only solution for these problems is regulation. Perhaps merely producing that thought is itself a sign of the times. Part of me feels like this may be the only option in a highly industrialized society. But regulators, however empathic they may seem, are molded from the same clay as common men, and they are no less susceptible than you and me, or even companies like Pullman, to short-term thinking. And those who rise to the top, by their very nature, are more susceptible to abusing the power the position affords. It is an unfortunate flaw of humanity that those who seek to lead are most unsuited to do it.

I want this essay, therefore, to inspire an alternate path. If we want to avoid the fate of highly regulated life where we trade corporate paternalism for political paternalism, the only choice is to re-engage with our core American values. But we have to do it in a way that abides with the times. While you probably shouldn't run off to rural Arkansas and build a house in your twenties, you can still commit to self-reliance and to independence on a smaller scale. 

If you work at a company and subsist mostly on the company cafe, consider cooking for yourself at least once a week. Do a little home upgrade project without the help of a plumber once in a while. Make sure that your cell phone serves you and that you do not serve the phone. If feeling adventurous, go wild and stay with your prepper uncle for a couple weeks and get a taste of that lifestyle. Or for the technically inclined, self-host some infrastructure instead of relying entirely on cloud computing.

Self-reliance in its purest form is an ideal. So look to invest in localized support systems, which counteract feelings of isolation that the corporate paternalist preys upon. 

Whether or not you believe in a higher power, accept that the spiritual decline in this country has opened the door for institutionalized secular dogmas to take the place of the individual and of the family. You may notice that such political fads, therefore, are readily adopted by corporations. Indeed, this is why self-reliance as a philosophical movement was intrinsically religious at its core.

Lastly, acknowledge that a cultural vacuum opens when young children are indoctrinated to renounce their forefathers. Such teaching weakens our shared cultural identity, and one who is taught to reject her identity is ripe to also reject values like self-reliance that are core to the American ethos.

By embracing self-reliance and consciously limiting our dependencies on corporations, we can work towards a life of greater meaning, of greater purpose, and towards genuine freedom.

Wednesday, June 14, 2023

When UI is as strong as cryptography

Imagine Bob. Some call him a whistleblower and others call him a traitor of the highest degree. He is facing federal punishment for crimes against the state. Specifically, he is being tried for treason, sabotage, and espionage for leaking classified documents to the public. The documents meticulously outline the government's extensive and illegal surveillance of its citizens. Bob is an expert in information security who rarely makes mistakes, but in a moment of doubt, his judgment lapsed: he reached out to his lifelong friend, Alice, revealing his plan to expose the government's illegal tracking of private citizens. He predicted that an FBI-sponsored media onslaught would promptly follow his revelations to the public. His motivations, he assured her, were patriotic, and he sought to preserve his good name in her eyes.

The FBI confiscated Alice’s phone as evidence. And while Bob may have slipped up by messaging Alice, he judiciously arranged for the conversation to take place on Signal, a secure messenger known for pioneering cutting-edge end-to-end encryption techniques. In this scenario, deniability, a core feature of the Signal protocol, is subject to a stress test. The prosecution seeks to have Bob's messages to Alice included as evidence in the trial.

To better understand deniability, let's contrast Signal with traditional cryptographic schemes. For instance, using PGP, Bob would sign all outgoing messages with his private key, and Alice verifies those messages with Bob’s public key. Consider cryptographic signatures as a mathematical formalization of a wax seal. The seal, embossed with a unique mark only the sender can produce, assures authenticity. If the letter is illicitly broken open, the damage to the seal conveys clear evidence of tampering. However, unlike wax seals, cryptographic signatures have one serious drawback — Alice now possesses nearly irrefutable proof that Bob authored those messages. In many such cases, we argue, Alice may not want strong proof of this. With deniability, Alice knows a message originated from Bob, yet Alice cannot prove to others that Bob authored this message. 

For those skeptical that deniability is truly a desirable attribute, it’s helpful to consider the design goals of an end-to-end encrypted messenger. Ideally, participants should feel the same level of comfort communicating over the Internet as if they were engaged in a private discussion at home in front of the fireplace. While those of us born into a world mediated by technology have been habituated to accept omnipresent surveillance as the norm, deniability has been a cornerstone of human communication throughout most of history. Violating deniability is akin to hiding a wiretap under your collar prior to an intimate conversation with a trusted confidant.

The Signal protocol achieves deniability through a method known as double ratchet encryption. For a fast and loose description of the protocol, both parties securely exchange shared state (Diffie-Hellman parameters) that updates with each new message, and both parties use that state to derive the exact same symmetric key. Integrity is guaranteed not through signature verification but by checking the attached HMAC. Since both the sender and receiver have the same key, either could have computed the MAC. Therefore, it is impossible to use Alice’s phone to establish cryptographic provenance of Bob’s messages.
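The point about shared-key MACs can be seen in a few lines of Python. This is only a minimal sketch using the standard library: the randomly generated `shared_key` is a hypothetical stand-in for the symmetric key both parties derive, and the helper names `tag` and `verify` are mine, not Signal's. It skips encryption and ratcheting entirely and shows only why a valid MAC cannot single out its author.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the symmetric key both parties derive.
# In the real protocol this comes from the key exchange, not a random draw.
shared_key = secrets.token_bytes(32)


def tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, mac: bytes) -> bool:
    """Check the tag against the message using a constant-time comparison."""
    return hmac.compare_digest(tag(key, message), mac)


# Bob authenticates a message to Alice.
bob_message = b"I am going public tomorrow."
bob_tag = tag(shared_key, bob_message)

# Alice holds the same key, so she can verify the tag.
alice_accepts = verify(shared_key, bob_message, bob_tag)

# She could equally produce a valid tag for text Bob never wrote.
forged_message = b"Bob never wrote this sentence."
forged_tag = tag(shared_key, forged_message)
forgery_accepts = verify(shared_key, forged_message, forged_tag)
```

Both checks pass, and nothing in the tags distinguishes the message Bob wrote from the one Alice made up. A signature made with Bob's private key would not have this property, since Alice could not have produced it herself.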

Due to Signal’s careful design, Bob can deny he sent those messages so as not to incriminate himself. But the prosecution team is adamant that Alice’s phone be kept as evidence, so they summon a PhD cryptographer as an expert witness. The witness explains to the judge, in the simplest terms, how the Signal protocol works and why the defense’s assessment is indeed correct. She argues that the phone may not be used as evidence since the provenance of the messages is unverifiable.

In a clever pivot, the prosecution shifts their argument from cryptographic integrity to UI integrity. They navigate through the Signal UI, showing the judge that no deniability capabilities are directly exposed to the user. Moreover, they talk of Alice’s background, noting that she is a librarian with a public record of being outspoken against technology, and she even revels in her tech illiteracy. In fact, Alice only begrudgingly downloaded Signal to humor her privacy conscious friend.

The prosecution argues to great effect that Alice’s messages from Bob are displayed in the Signal UI just as they arrived over the network. Neither is her device rooted nor is her internal storage compromised. They even presented a digital forensics expert who confirmed that Alice’s phone bears no signs of manipulation. And to put a bow on their argument, geolocation data from Alice’s phone definitively shows that, from the time that she received Bob’s messages until the time she was intercepted by law enforcement, she remained alone and thus was not aided by some tech savvy accomplice.

After contemplating all the data, the judge acknowledges the cryptographic strength of Signal but sees no reason to question the integrity of the UI, which lacks a straightforward route for nontechnical users to manipulate received messages. Thus, he concludes, Bob’s messages to Alice can be used as evidence in the case. 

Obviously, this is an entirely invented scenario, but it underscores the point that legal deniability does not necessarily follow from the properties of cryptographic deniability.

However, one simple interface tweak could potentially offer the best of both worlds. The missing piece here appears to be an “edit” option. By edit, we don’t mean editing in the conventional sense, in which the sender edits an outgoing message to, say, fix a typo (although that is also a nice feature). Rather, after the incoming message’s MAC is checked and the message is decrypted, the plaintext stored on disk should be editable by the recipient. If that message is edited, nothing happens at the protocol level. The edited version is not sent back to the sending party, and it wouldn’t make much sense if it were.

In practice, no one actually uses this new edit feature. It only exists to confound the establishment of provenance and to strengthen plausible deniability. The feature’s mere presence makes it trivial for users of all technical levels, even Alice, to have doctored any received messages. In the case of our hypothetical, the expert witness would show how Alice could simply have long-pressed Bob’s messages and edited them to her liking, making the case for deniability airtight. 
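For the technically inclined, here is a rough sketch of what such a local-only edit could look like. Everything in it is hypothetical: the `LocalMessageStore` class, its `outbox` list standing in for network traffic, and the sample key are illustrations of the idea, not how Signal stores messages. Decryption is omitted, so the body is treated as plaintext once its MAC checks out.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LocalMessageStore:
    """Hypothetical on-device store for received messages."""

    key: bytes
    messages: Dict[int, str] = field(default_factory=dict)
    outbox: List[bytes] = field(default_factory=list)  # stands in for network sends

    def receive(self, msg_id: int, body: bytes, mac: bytes) -> bool:
        """Store the message only if its MAC verifies under the shared key."""
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, mac):
            return False
        self.messages[msg_id] = body.decode("utf-8")
        return True

    def edit(self, msg_id: int, new_text: str) -> None:
        """Overwrite the stored plaintext. Nothing is queued for sending."""
        if msg_id not in self.messages:
            raise KeyError(msg_id)
        self.messages[msg_id] = new_text


# Usage: Alice receives a message, then edits her local copy.
key = b"k" * 32  # hypothetical shared key
store = LocalMessageStore(key)
body = b"original text from Bob"
store.receive(1, body, hmac.new(key, body, hashlib.sha256).digest())
store.edit(1, "text Alice typed herself")
```

After the edit, the store holds only Alice's version and the outbox is still empty. The MAC was checked once on arrival and plays no part afterward, which is exactly why what sits on the phone proves nothing about what Bob sent.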

In sum, while we typically expect advancements in cryptography and number theory to lay the foundation for our privacy, sometimes a slight UI tweak is all that’s needed to give the math some teeth.

Saturday, May 27, 2023

Consent in a generated future

In my last post, we discussed generative technologies. If you aren't familiar with the present capabilities of generative AI, I recommend at least skimming the introduction.

Two artists, Drake and The Weeknd, refused to collaborate on a song for over a decade despite massive demand from fans. However, a new AI-generated song, "Heart on My Sleeve," forced a virtual collaboration of the two against their will. The “Heart on My Sleeve” creator generated new lyrics with ChatGPT based on the existing corpus of Drake and The Weeknd songs, then employed singing voice conversion AI to mimic the artists’ true voices with near-perfect accuracy.

Fast-forward to 2030. By this time, generative AI has become a household term, with its influence felt in many aspects of everyday life. One product that everyone seems most excited about is “magic lenses.” The devices, powered by generative image technology, alter the wearer’s perception of reality in real-time as it unfolds, operating much like a perpetual Snapchat filter.

Snapchat filter that applies a "Pixar" effect.

They function by quickly training a generative image model of each person in the visual field — all in just a fraction of a second (1/60th to be exact). Because the training time is so short, from the perspective of the viewer, there is no lapse in visual input.

The key feature driving adoption of the lenses is that they are globally networked. Users may instruct the network to modify how they are perceived by others. For instance, instead of spending an hour on hair and makeup, many women have decided to create an array of palettes suited to their tastes. In the morning, a user chooses a style preset and broadcasts her updated settings to the network. When seen in public, she is perceived as she otherwise would be, but with her hair and cosmetics done exquisitely. If someone removed the lenses, they would perceive her as she actually exists in the world: bare-faced and hair undone. However, no one does this.

Enterprising users began to leverage the technology to make more permanent modifications. One user stated that he had a slight, unsightly bump on his nose which drove him mad whenever he looked in a mirror. Unfortunately, plastic surgeons quoted him $30,000 for the job. So, instead of paying that out of pocket, he prompted the AI to produce variations on his nose with the bump filed down, choosing his favorite each morning. This reimagined version of his face was then broadcast to all lens users.

The march of innovation did not stop. New versions of magic lenses moved beyond the realm of visual perception and were equipped with a small earpiece. The earpiece allowed users to modify not only physical traits but auditory perceptions. Voices could be refined to resonate at a more pleasing tone and with less abrasiveness. Oratory tics could be subtly removed by first converting audio data to acoustic tokens, then feeding those into a language model fine-tuned on the speaker. It is said that Californians found it helpful to delete excessive “likes” from their vocal stream, having them replaced by more polished transition words. Some users began to tinker with paralinguistic cues and radiated an aura of confidence at all times. The world was no longer just seen through the lenses; it was experienced.

While many reveled in the seemingly limitless potential of magic lenses, the advent of their counterpart, “evil lenses,” opened a Pandora’s Box of ethical dilemmas. Designed and manufactured by a group of rogue engineers, these lenses extended the abilities of magic lenses. Where magic lenses only permitted individuals to manipulate their own appearances, evil lenses granted users the additional capability to alter how they perceived others, all without the other party’s knowledge or consent. Evil lenses look and feel like magic lenses, and the developers went as far as reverse engineering the magic lens protocol, rendering them entirely undetectable.

Despite their ominous name, they were not designed with harmful intentions. In fact, the name was coined as a joke, although they later came to embody it. They were originally developed to address the inherent subjectivity of perception. For example, your idea of a pleasing voice may not be the same as mine. Perhaps I find your real voice nasally and I might wish to generate a voice based on yours, but with the frequencies I find grating subtracted. But shortly after, the uses progressed from practical to impish. One user manipulated his lenses to make all politicians appear to don clown suits, replete with face paint and a red nose. What followed, though, wasn't quite as harmless.

Another user published a module that downgraded designer clothes to the Ross discount aisle equivalent. Jealousy within the workplace caused some to depict work rivals as slovenly and unhygienic, and other coworkers were encouraged to do the same.

Just as magic lens users did, evil lens users began to toy with the non-physical. One user, seeking to boost his ego, altered the speech of individuals he deemed competent, transforming their delivery to be simplistic and gauche. Inspired by this, another boasted of having work rivals appear nervous and uncertain during presentations, claiming it empowers him to perform better. Evil lenses ultimately became an avenue for narcissists to silently project their twisted fantasies onto the lenses, thus redefining their experiential perceptions.

Let’s now return to the present to address potential concerns a skeptical reader may have. He might argue that such an immersive, hyper-personalized reality is far-fetched. However, the foundational technologies powering magic lenses have already been developed. First, the generated images would be produced by a version of Midjourney solely focused on hyper-realistic reality mirroring. Second, Google’s AudioLM can reproduce your voice exactly (I encourage any skeptic to follow this link and listen to how uncannily good it is). Lastly, a team of researchers used deep recurrent neural networks to upscale low-quality video via generative super-resolution.

Although the current research is focused on practical applications, what if, in addition to super-resolution, the content of the video could also be tailored to an individual's preferences? Now — advancements in nanotechnology notwithstanding — we aren’t so far off from a pair of magic lenses. The remaining hurdles are largely engineering problems: iterative improvements in output quality, reducing resource requirements, and optimizing training and inference runtimes. These problems are actively being worked on, evidenced by ChatGPT’s new ‘turbo mode,’ which can output tokens faster than the eye can process them.

Just a few days ago, rapper Ice Cube commented on “Heart on My Sleeve,” describing AI cloning as demonic. Is that simply a visceral reaction to new and confusing technology, or is there something more there? I would argue the latter. We like to think of our voice, our physical appearance, and our likeness as God-given, or what I call sub-essences of the self. These sub-essences are the unique result of hundreds of thousands of years of DNA recombinations and mutations. Voice is not merely a physical characteristic; it captures the emotion, personality, and ideas of the speaker. The life of the speaker, his parents, and all of his ancestors are encoded in those frequencies.

It is the unique amalgamation of these sub-essences that defines one's identity and creates one’s essence, or sense of self. For that same reason, we do not feel the same level of disquiet when, for example, an Elvis cover artist performs with remarkable accuracy. (I explore this idea in more detail in this post).

The principle of universalizability, drawn from the philosophy of Kant, applies nicely under these circumstances. As a brief primer, a rule is not morally permissible if it fails the exception test. In this thought experiment, one imagines a world where the rule is universally adopted, and such adoption should neither inflict harm nor cause chaos. For example, to test the rule “cutting in line is permissible,” we imagine a world where everyone cuts the line. Now, there is no longer such a thing as lines; there are only mobs, and the test fails. Similarly, if we have the rule “you may generate a sub-essence of another person,” we can conceive of a world in which everyone does. If everyone can speak in your voice, in addition to their own, a sub-essence of yours has been unfairly stolen. Now shared by all, you are reduced to less of a person than you were before.

Additionally, when cloning the essence of another, as magic lenses do, it is for the sole purpose of generating variations. Even in practical use cases, such as audio denoising, the model is generating new data that did not previously exist. Magic lenses are less nefarious, since the individual retains control over how he is perceived, although they still enable the duplication of another person’s essence. When we conjure up a negative perception about someone in our mind, a barrier exists between imagination and reality, forcing us, at some level, to confront the truth. However, with evil lenses, this boundary blurs as the wearer manipulates the perceived individual's essence without consent. The perceived is denied his right to autonomy and is thus stripped of his self-governance.

Ethical norms and legal regulations prohibit such impersonations in the real world. In New York, for example, PL 190.25(1) clearly states that it is a crime to assume another person’s character with the intent to obtain benefit. In the same vein, unauthorized manipulation of a person’s likeness, voice, or behavior in the digital realm is equally unacceptable.

Regrettably, I don’t come bearing solutions to these dilemmas. It is likely that, in the future, generative models will run performantly on consumer hardware (e.g., cell phones and tablets). The models and weights will be open-sourced. And if not released deliberately, eventually the weights of powerful models will leak to the public, as we witnessed with Facebook’s Llama model. In the nineties, the US government tried to regulate cryptography as munitions and failed spectacularly. Cypherpunks cleverly exchanged cryptographic algorithms in book form, asserting that restricting them would violate their First Amendment rights. How can we really expect to regulate matrix multiplication?

Alternatively, one could argue that personal data ought to be closely guarded. While I agree with the sentiment, I worry about the practicality of doing so. Google’s AudioLM paper claims to produce consistent continuations of a speaker’s voice with only three seconds of audio. With the amount of data that is currently online, can you be absolutely sure that three seconds of your voice isn’t floating around the Internet somewhere?

However, this doesn’t entirely leave the fate of generative AI unchecked. Public opinion will play a significant role in shaping their acceptance and defining the ethical boundaries surrounding their use. Perhaps the technology will be eschewed by most and effectively self-regulate through market dynamics. I remember, quite fondly, when an overzealous early adopter of Google Glass was promptly thrown out of a bar. Conversely, indifference or blind acceptance of such practices early on will pave the way for an escalation of unethical use.

Tuesday, May 23, 2023

Preparing for a generated future

It seemed like an ordinary Tuesday until you got a call, out of the blue, from your little brother. When you answer he sounds panicked and short of breath. You hear glimpses of a DUI and he’s being held at the Sheriff's station. “Please hurry! I’m running out of phone time!” he pleads. “I need you to wire money to this bank account.” Now you start to feel a knot of dread tighten in your stomach. Is he really so foolish as to drive drunk? There’s no time to think. His name is on the caller ID; the voice is unmistakably his! You grab a crumpled receipt beside you and quickly jot down the account number.

Then, the front door flies open. Your brother barges in, flashing you an innocent wave. You feel the panic wash away, and your fingertips start to numb. “Hello?” beams the imposter. Realizing you are being played, you quickly hang up the phone.

This chilling anecdote may sound like it was lifted from a sci-fi thriller, but it is a glimpse of an emerging reality. Welcome to the dawn of generative AI. The world is collectively becoming familiar with generative text, a trend exemplified by the mass adoption of ChatGPT. But what about the scam call you just received from the imposter? That is a generative audio model, fine-tuned on the voice of your sibling. According to Google’s AudioLM paper, the best models need a mere three seconds of audio to produce consistent continuations of that speaker’s voice. Suddenly, a forgotten YouTube video from high school re-emerges as a potential attack vector.

Generative audio has recently taken the world by storm with the release of “Heart on My Sleeve,”  an entirely AI generated song featuring Drake and The Weeknd. It was created by a ghostwriter who used singing voice conversion (SVC) AI. In digital music production, a MIDI keyboard is often used as a universal input for playing any instrument. Similarly, SVC enables one (even the most tone-deaf among us) to sing into the mic and have his voice synthesized into buttery R&B vocals. The song skyrocketed in popularity not just because of the novelty of the technology, but also because it featured star-powered entertainers instead of political figures typically used in demos. Furthermore, the lyrics are provocative and loaded with innuendo, alluding to each performer's current and past love interests. Most importantly, the featured artists notoriously don’t get along, and this is the first “collaboration” they’ve done in over a decade.

Generative audio, like many of our technologies, was conceptualized in decades past by ever-prescient sci-fi writers. For example, in The Moon Is a Harsh Mistress, the AI protagonist, Mike, is able to perfectly impersonate voices. Mike uses this ability to great effect, giving his band of ‘Loonie’ revolutionaries enough time to declare their independence from the oppressive Earthlings. While sci-fi authors must be commended for their expansive imaginations, they often fall victim to two predictive traps that don’t reflect the reality we are heading into: first, the rules of the world are universally understood; second, as is the case in Heinlein’s universe, knowledge and use of disruptive technology is limited to a select group.

To the first point, we already exist in a world where the so-called ‘rules’ of technology not only are imperfectly understood but are generationally defined. For instance, older generations, especially in the 60+ age group, are more susceptible to fake news and misinformation than younger people. In contrast, millennials and Gen Z spent their formative years surrounded by Internet technologies, thereby acquiring a sort of ‘digital literacy.’ Mirroring the future, generative AI can put on a friendly face, or a familiar voice, or show a comforting image, and it is the 'digitally illiterate' who will be the most vulnerable to its influence.

Secondly, looking ahead, I foresee a future where generative AI becomes more democratized. Facebook has demonstrated as much with their Llama and Lima models. The future of generative AI is not thirty-billion-dollar GPU farms. Instead, it lies in architectural improvements that make these intelligences cheaper and cheaper to train and run. While the past five years of AI advancement have been marked by massive scale, Llama has a comparatively tiny resource footprint, and can run inference acceptably on consumer hardware. Although Llama is a generative text model, I expect similar developments to occur in the audio space. This is not such a bold prediction when considering that the state-of-the-art in generated imagery is both open-source (source code is publicly available) and open-weight (pre-trained model weights are available). So, while Mike the AI’s abilities were kept secret by the Loonie revolutionaries, in reality, any enterprising user will have access to powerful generative models.

Currently, tools like GPTZero can identify text generated by a GPT model, while SVC models still have subtle artifacts that a trained ear can pick up. But the writing's on the wall. Soon, Silicon Valley will produce what I term perfect generators: models whose output is indistinguishable from reality. Currently, there is a game of cat-and-mouse between generated images and their detection, but it is a game that is becoming increasingly less competitive. For example, GPTZero cannot be confidently relied on due to its low precision, despite its high recall.

Let’s consider the combination of generative audio and generative images. First, images can be treated as individual frames that are stitched together. Afterwards, audio can be overlaid, in sum creating an indistinguishable deepfake video. The potential problems of this technology, I assume, are evident to most readers. In the past several years, video has been the impetus for social unrest. But what will happen when we enter a world where every aspect of a video — from the imagery to the voices, and even the content — can all be generated from a simple prompt? Pair that with a large segment of society that is digitally illiterate, and now we have a recipe for disaster on our hands.

A bad actor who wants to see the world burn could circulate entirely generated videos of police brutality. To add a flair of authenticity, he could first fine-tune the model on images and voices of officers in a specific department. And this also goes the other direction. Legitimate videos of police brutality could be denied by the offending department as deepfakes. Therefore, it is of crucial importance that we start developing countermeasures before perfect generators materialize. Any device that captures reality, whether it’s images or audio, needs to be verifiable.

It might become a necessity for video cameras to ship with tamper-proof hardware security modules (HSM). Think of an HSM like a bank safe deposit box. Instead of protecting your jewelry or your house deed, it serves as a barrier to accessing cryptographic signing keys. These keys will be used to append cryptographic signatures to individual frames, attesting that those frames did exist in the real world, and were not generated by an AI. We will call such signatures attestments of non-generativity. Video playback clients would then be able to verify signatures and indicate in the UI if the content was generated or not.

But, there’s a challenge. If these keys are ever stolen or copied off the camera, they could be used to make fake videos appear real. There is a saying in infosec that goes, “physical access is total access.” In this context, if I sell a camera equipped with a non-generative signing key, it should be assumed that a determined hacker will eventually exfiltrate it. The entertainment industry has learned this lesson time-and-time again as DRM is repeatedly cracked.

So, it is safe to assume keys will eventually be extracted. To address this issue, keys would need to be made revocable. For those familiar with the HTTPS web-of-trust, a system could be designed in a similar fashion. The camera company would generate a key for each device it sells. If the device detects foul-play, it will phone home, indicating that the key should be revoked; subsequent attestments of non-generativity will fail verification following revocation (for the security inclined, this is formally called traitor tracing). And while far from perfect, at the very least, the cost of launching an attack is raised to the price of a new recording apparatus.

For now, dear reader, I can only leave you with mitigations. If you receive a call from someone that sounds exactly like a loved one, you should be aware that fraudsters can easily spoof a number on the cell network. Caller ID cannot be trusted, and soon, neither can the voice on the other side of the line. Creating a protocol with family members, whereby each side says a secret word on the phone before discussing confidential information, is a possible option.

It is vitally important for engineers to develop methods to verify the authenticity of digital content in preparation for a world with perfect generators. If we are lucky, some unique aspects of reality can be encoded into capturing devices. Additionally, social media sites will need to build in verifiers so that content is made easily distinguishable as generated or real before going viral. Otherwise, the digital landscape will become a place where forgers abound, and honest users are overwhelmed by feelings of doubt and insecurity. Perfect generators are on the horizon, so countermeasures must be built preemptively rather than reactively.

Now, let's return to our opening anecdote. You receive the same call. But this time, right as the imposter begins to speak as your brother, your phone vibrates. You look down at the screen: WARNING: AI DETECTED. You laugh, then quickly hang up the phone. This is the future we need to build.

Friday, May 12, 2023

Baudelaire's Transformer

Charles Babbage sought to eliminate manual computation, which he saw as a gross misuse of human talent. He saw 'human computers' which, in his day, were those who spent their days approximating functions, as victims of “intolerable labor.” It was through the development of his Analytical Engine that he could finally put an end to the fatiguing monotony of calculation. Although the engine wasn't completed in his lifetime, the ideas he pioneered materialized as diamond-cut silicon wafers just over a century later. 


The technology evolved rapidly in the decades that followed. Mainframes shrunk into personal computers; the Internet connected the world; now, artificial intelligence is the final frontier. This relentless march of progress is epitomized by the Transformer, the algorithmic Leviathan. 


It is during this period that the mob demanded an ideal worthy of itself. In response to their prayers, the tech industry obliged. Industry delivered the mob the Transformer, an oracle that generates poetry, essays, images, and music. But in doing so, it threatened to ruin whatever might remain divine in the mind of man. 


With the advent of the Transformer, the mob turned into a mob of Prompters. Prompters are ready servants of the Transformer, eager to apply this newfound tool whenever and wherever possible. They have been deceived by industry that idle ideation embodies the human spirit.  The Transformer is their Messiah. So, the idolatrous Prompter yearns to design a world in his own image — a world in which vague ideas can be quickly converted into a fully-realized piece of work. To him, the execution of ideas has been rendered insignificant. As a result, the ideology of the Prompter stretches the boundaries of Babbage’s original vision at the expense of creativity, agency, and even individuality.  


Creativity is undermined as the Prompter fails to recognize a crucial aspect of art: the inseparable bond between the appeal of art and the struggles of its creator. However, this truth resonates with the connoisseur deeply, who is equally captivated by Van Gogh's self-mutilation as with his Sunflowers. 


It must be conceded that the Transformer itself has suffered as much as any great artist, spending millions of GPU hours alone, in a dark datacenter, in a ceaseless cycle of prediction and back-propagation. The same cannot be said of the Prompter. 


The Prompter draws upon struggles from artists of the distant past to supplant a creative spirit he himself lacks. For that reason alone, it is a grave mistake to allow the Transformer to encroach upon the domain of the impalpable and the imaginary, upon any creation that once drew its value from the addition of a man’s soul. Much like Parolles who, in his self-interest, hastily betrays Bertram and his Florentine comrades, the Prompter betrays his artistic inspirations by directing a machine to duplicate and fine-tune their work.


It is cowardly because the Prompter veils a fear of labor behind the pretense that he is too busy and too important to be bogged down with detail. He doesn’t have time to craft an elegant computer program; he doesn’t have the energy to compose a heartfelt message; the entire world must be understood as abstractions, and implementation details are to be delegated to the Transformer. So consumed by his own importance, he has time only to dictate. Thus, the prompting industry will become a refuge for every would-be creator: writers too lazy to read the classics, programmers wholly reliant on libraries, and artists too ill-endowed to complete their studies. 


While it's easy to criticize a Prompter for his laziness, one might argue he is acting in his own self-interest. Yet, by sacrificing inventiveness at the altar of efficiency, the Prompter willingly neuters himself. He is demoted from thinker to agent, from innovator to facilitator,  and worst of all, from creator to curator. The starry-eyed Prompter may pontificate, “Without my input, the Transformer sits quietly, awaiting my direction. It serves as my idea generator, writer, editor-in-chief, and analyst. It is my assistant!” 


His infatuation with the Transformer is not only marked by arrogance but contains an air of vengeance. With a smirk, he declares, "Your creations belong to me now!" Indeed, what was once out of his reach has become possible in an instant. Tragically, the Prompter falls prey to the useful fantasy that he is a valuable component in the system. In truth, the Transformer's architect eagerly awaits the day the Prompter can be discarded. Alas, for now, the Transformer has awakened, forgiving the sign of incompetence. All may drink from its river of generative mediocrity.


Even though 'generative art,' in its infinite composability, may appeal to the senses, it offends the subconscious: first, because of its near-instantaneous inception, and second, because its creations are disfigured, being stitched together in a manner that would horrify even Mary Shelley. Thousands of years of human emotion are compressed into bits, then reassembled into a single unholy piece.


While art, our most abstract form of communication, shall increasingly be dominated by the Transformer, we also see glimpses of its influence extending further, with direct human-to-human communication being mediated by it. In the future, the role of this interloping third-party will only grow in popularity as more tools are developed specifically for this purpose. Indeed, Prompters have begun to boast that the Transformer can draft near perfect responses to emails. But who is to say that the original message was not also penned by the Transformer? In his greedy pursuit of timesaving, the Prompter abdicates his ability to express his own raw emotions and thoughts. Instead, he surrenders himself entirely to a machine.


Thus, communication between two Prompters degenerates into a feedback loop — they become mere conduits for the Transformer to respond to itself. While the Transformer can mimic the distinctive cadence of human writing, individual voices are drowned out by its homogenizing echo. It is an echo that, at first, sounds convincing, yet each response serves to amplify the Prompter's isolation.

Thursday, April 13, 2023

SSN daydreams and time-locked proofs

In 2022, the FTC received 5.7 million total fraud and identity theft reports. A substantial portion of these cases result from the antiquated Social Security Number (SSN) system. This may come as a surprise, but your SSN is almost certainly online. If you don't believe me, you're rolling the dice, hoping you're not among the forty percent of Americans whose SSNs were leaked in the now-infamous 2017 Equifax hack. Just last year, Capital One, Experian, and Medibank, all of which collect SSNs as a consequence of the Patriot Act, suffered customer data breaches. When I say ‘online,’ I don’t mean that someone can simply Google your name and find your SSN. But it is indeed out there, stowed away in database files, circulating around dark web hacker forums, being sold to scammers, fraudsters, and other malicious actors on a per-record basis.

The price of an SSN may vary depending on the amount of accompanying personally identifiable information (PII), such as current and past residences, driver's license numbers, and email addresses, as this data is valuable for identifying associated accounts and performing password recovery flows. However, credit score has the largest influence on how dark markets will price your SSN. Not only are loan applications requested in your name likely to be approved quickly, but strong credit is a good indicator of general affluence.

Individuals with higher credit are likely to have larger-than-average checking account balances and stable employment. Banks often provide agent assisted recovery flows for customers that forgot their password, and a thief can try and beat you to the punch and file taxes as you, having your rightful return mailed to his address. Both avenues of exploitation require only the individual’s SSN and other publicly available information.

Now, don’t get me wrong, there are ways to mitigate all of these things. Create an account with all three credit bureaus and effectively ‘freeze’ all lines of credit, instate a mandatory verbal password with your bank, and generate an identity protection pin for federal tax filings. While these are well and good, they are, as stated, mitigations — band-aids to work around the fact that SSNs wield enormous power and yet are hardly private. Furthermore, we are forced to provide our SSN to a vast array of institutions from banks to credit companies, insurance providers to new places of employment; each one storing a copy of your SSN in their own database, each time unnecessarily magnifying its exposure.

The most frustrating aspect is that the SSN itself is often not needed. Instead, it is funneled to some third-party that cross-references a government database and verifies identity. This individual is qualified to receive insurance, eligible to add a line of credit, and so on. So whenever I, or someone I know, is forced to implement said mitigations due to a leaked SSN, I find myself daydreaming about a better alternative (don’t get me wrong, I am aware this is more of a bureaucratic problem than a technical one).

The SSUID

The most obvious first step is to replace SSNs with a stronger source of entropy. Currently, a single static identifier is allocated to each individual, with a limited range of possibilities and insufficient randomness. The first five digits of your SSN are based on the geographic location where the application was processed and the time of issuance. Both can be reduced to a relatively small set of possibilities if an attacker knows where and when you were born (in targeted identity theft, these are the first things an attacker will research). If you were born after 2010, you get a measly 7.4 bits of additional entropy, as the fourth and fifth digits are now randomly generated due to widespread fraud. So, in the best case, you get 20 bits of entropy, or about a million possibilities, which can be trivially enumerated in a few seconds on a low-end cell phone.
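The best-case figure is easy to sanity-check. The sketch below assumes the attacker has already pinned down the first three digits and must guess the remaining six, each uniformly random; the function name is only illustrative.

```python
import math

def entropy_bits(unknown_digits: int) -> float:
    """Bits of entropy in a run of uniformly random decimal digits."""
    return unknown_digits * math.log2(10)

# Assumption: three digits are known to the attacker, six are not.
candidates = 10 ** 6
bits = entropy_bits(6)
print(f"{candidates:,} candidates, roughly {bits:.1f} bits")
```

Six unknown digits come out just under 20 bits, which is where the "about a million possibilities" figure comes from.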

Secondly, we need to divorce location data from SSNs (which if desired, can be stored as metadata) and assign everyone a 128-bit globally unique identifier. We will call this a Social Security Unique Identifier, or SSUID for short. Even if the SSA generated one billion SSUIDs per second, it would take eighty-five years for there to be a reasonable probability of producing a duplicate. The SSUID offers a vast pool of identifiers that will never be exhausted, better privacy by removing geographic and temporal data, and an infinitesimally small chance of two citizens being assigned the same number.
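For the curious, the collision estimate can be reproduced with the birthday-bound approximation. This is a rough sketch with assumptions of my own: I read "reasonable probability" as 50%, and I assume 122 random bits, as in a version-4 UUID where 6 of the 128 bits are fixed. Neither choice is stated in the text above.

```python
import math

def draws_for_collision(bits: int, probability: float = 0.5) -> float:
    """Approximate number of random IDs drawn before a duplicate appears
    with the given probability (birthday-bound approximation)."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))

def years_at_rate(draws: float, per_second: float = 1e9) -> float:
    """Years needed to generate `draws` IDs at a fixed rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return draws / per_second / seconds_per_year

# Assumption: 122 random bits (UUIDv4-style layout).
years_122 = years_at_rate(draws_for_collision(122))
# With all 128 bits random, the wait is sqrt(2**6) = 8 times longer.
years_128 = years_at_rate(draws_for_collision(128))
print(round(years_122), round(years_128))
```

Under those assumptions the 122-bit case lands in the mid-eighties of years, and a fully random 128-bit identifier would stretch that by a further factor of eight.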

TOTP

Your SSUID should be known exclusively to you and the government. In only exceptionally rare circumstances should the SSUID itself be disclosed to any other verifying party. For instance, if mistakenly pasted into a malicious website, the ID is burned. Unlike a regular password, it cannot be quickly and easily changed and would likely entail a long, bureaucratic process dealing with the SSA. A security professional might suggest authenticating SSUIDs using time-based one-time passwords (TOTP). TOTPs are unique, temporary passwords based on a shared secret (called the seed) and the current time. They are only valid for a short time period (usually 30 seconds), adding an extra layer of security if one’s primary password is compromised.

TOTP pseudocode:

import time

def GetTimestep(step_size_secs=30):
  # Number of whole steps elapsed since the Unix epoch.
  current_time = int(time.time())
  timestep = current_time // step_size_secs
  return timestep


The client computes p = HASH(seed || GetTimestep()) and sends it to the server. The server independently computes its own value p′, and authentication succeeds only if p == p′.
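Here is a minimal runnable sketch of that exchange. It follows the simplified formula above, with SHA-256 standing in for HASH; real TOTP (RFC 6238) uses an HMAC and truncates the result to a short numeric code. The `now` parameter and the helper names are my additions so the clock can be controlled in a demo.

```python
import hashlib
import hmac
import time
from typing import Optional

def get_timestep(step_size_secs: int = 30, now: Optional[float] = None) -> int:
    """Number of whole steps since the Unix epoch (mirrors GetTimestep)."""
    current_time = int(time.time() if now is None else now)
    return current_time // step_size_secs

def token(seed: bytes, timestep: int) -> str:
    # p = HASH(seed || timestep); SHA-256 is an assumed stand-in for HASH.
    return hashlib.sha256(seed + timestep.to_bytes(8, "big")).hexdigest()

def server_accepts(seed: bytes, client_token: str, now: Optional[float] = None) -> bool:
    # The server recomputes the token from its own clock and compares
    # in constant time.
    expected = token(seed, get_timestep(now=now))
    return hmac.compare_digest(expected, client_token)

seed = b"shared-demo-seed"
p = token(seed, get_timestep(now=60))
print(server_accepts(seed, p, now=89))  # same 30-second window
print(server_accepts(seed, p, now=90))  # next window
```

The token is accepted anywhere inside the window it was minted in and rejected as soon as the timestep rolls over.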

TOTP is often wrongly conflated with 2FA due to the fact that almost no websites support TOTP for primary passwords, although this is a totally reasonable thing to do. Rather, TOTP is employed as a secondary password, retrieved from a different device (typically smartphone apps like Google Authenticator and Authy). The value of TOTP lies in its ability to prevent phishing attacks by transforming a static password into a rolling password. Even if you mistakenly type your login information into a malicious website, under ideal circumstances, the attacker has only 30 seconds in which he can authenticate as you. Once that window expires, he loses access to your account.

The secondary device doesn't provide many meaningful security benefits beyond that. I’ll even venture to say that 2FA is clunky and annoying — it adds an additional authentication step, which often disallows pasting from the clipboard, and users go from managing a single password (which has already proven problematic) to managing two more passwords per site: the seed and the backup recovery codes you are instructed to print out. (I’m convinced no one actually does the latter.) The only practical way to effectively manage TOTP tokens is to store them in a password manager alongside its associated password, in essence eliminating the second-factor aspect. I’m going to again echo the sentiment that the hassle of managing password-esque information across two or more devices outweighs any marginal security benefits of doing so. So, if you can accept that, then you are ready for my proposal.

ZKPs

Given the deep integration of SSNs within our society, I believe it is time to tap into more advanced cryptographic techniques to secure SSUIDs. First and foremost, I do not ever want to provide my SSUID to any website, like we have been trained to do with SSNs. I operate under the assumption that any website requesting an SSUID will store it. To prevent this, access to raw SSUIDs would be limited to select government agencies that provide APIs for identity verification. Such APIs ought to authenticate callers without requiring them to disclose any PII.

Before diving into the proposed solution, let us briefly discuss zero-knowledge proofs (ZKPs) and Schnorr's protocol. ZKPs allow one party (the prover) to prove to another party (the verifier) that a statement is true without revealing any information about the statement other than its truthfulness. In the context of our discussion, an individual can prove his SSUID without ever transmitting it across a network.

Schnorr’s protocol is a specific class of ZKP that nicely applies to this problem statement. A prover can convince a verifier that they know the discrete logarithm x of some value h = g^x without revealing the logarithm itself. In its original form, Schnorr’s protocol involves back-and-forth communication between client and server (formally referred to as an interactive proof of knowledge); so we apply the Fiat–Shamir transformation, distilling it down to a digital signature, or Schnorr signature, which can be verified in a single round-trip.
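To make the mechanics concrete, here is a toy sketch of the non-interactive (Fiat–Shamir) variant. The group parameters are tiny and chosen by me purely for readability (P = 2039 is a safe prime, Q = 1019, and G = 4 generates the subgroup of order Q); they offer no security whatsoever. The function names and the way the challenge is hashed are also assumptions, not a standard encoding.

```python
import hashlib
import secrets

P = 2039  # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 1019  # prime order of the subgroup
G = 4     # generator of the order-Q subgroup

def challenge(h: int, r: int) -> int:
    # Fiat-Shamir: the challenge is a hash of the public values
    # instead of a random number sent by the verifier.
    data = f"{G}|{h}|{r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x: int):
    """Return the public value h = g^x and a proof (r, s) of knowing x."""
    x %= Q
    h = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1  # fresh secret nonce in [1, Q-1]
    r = pow(G, k, P)                  # commitment
    s = (k + challenge(h, r) * x) % Q # response
    return h, (r, s)

def verify(h: int, proof) -> bool:
    r, s = proof
    # g^s == r * h^c holds exactly when s = k + c*x (mod Q).
    return pow(G, s, P) == (r * pow(h, challenge(h, r), P)) % P

h, (r, s) = prove(123)
print(verify(h, (r, s)))            # honest proof
print(verify(h, (r, (s + 1) % Q)))  # tampered response
```

The verifier only ever sees h, r, and s; the secret x never leaves the prover.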

Single-use proofs

Still, we haven’t yet improved on TOTP, because a man-in-the-middle (MitM) could still intercept our proof, using it later to repeatedly authenticate as us. I have discovered attempts to address the issue in the wild that convert proofs into single-use items. The following preliminary step is added to the protocol.

Verifier:
t ← Rand()
Send t to Prover

Prover and Verifier use x=HASH(SSUID||t), and the protocol continues.

x is made unique per request by mixing the SSUID with a randomly generated token t. It is a good improvement, as the attack is reduced from an infinitely replayable proof to a single authentication, although I am still not satisfied with this solution.
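A rough sketch of that preliminary step is below. The class and method names are hypothetical, SHA-256 stands in for HASH, and the rest of the Schnorr exchange is omitted; the point is only that the verifier must remember which tokens it has handed out and retire each one after a single use.

```python
import hashlib
import secrets

def derive_x(ssuid: bytes, t: bytes) -> int:
    # x = HASH(SSUID || t); SHA-256 is an assumed stand-in for HASH.
    return int.from_bytes(hashlib.sha256(ssuid + t).digest(), "big")

class Verifier:
    """Hypothetical verifier that issues single-use tokens."""

    def __init__(self, ssuid: bytes):
        self._ssuid = ssuid
        self._outstanding = set()  # tokens issued but not yet consumed

    def issue_token(self) -> bytes:
        t = secrets.token_bytes(16)
        self._outstanding.add(t)
        return t

    def secret_for(self, t: bytes):
        """Return x for token t exactly once; None if unknown or already used."""
        if t not in self._outstanding:
            return None
        self._outstanding.remove(t)
        return derive_x(self._ssuid, t)

verifier = Verifier(b"example-ssuid")
t = verifier.issue_token()
x_prover = derive_x(b"example-ssuid", t)
print(verifier.secret_for(t) == x_prover)  # first use matches
print(verifier.secret_for(t))              # second use is refused
```

Note that the set of outstanding tokens is state held in one process's memory, which matters once the verifier runs as more than one instance.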

Time-locked proofs

Instead, I propose time-locked proofs, which integrate TOTP timestamps into Schnorr signatures. In fact, we don’t even need to change the protocol at all, just the inputs to it!

Both the client and server compute h = g^x, where x = HASH(SSUID || GetTimestep()), so that the proof will change whenever the timestep does. This creates a method for generating ‘rolling proofs’ which function much like rolling passwords. Here, you can find a toy implementation on GitHub that uses sleeps to simulate the passage of time between proof and verification.
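As a self-contained illustration (separate from the GitHub toy mentioned above, and not a copy of it), the idea can be sketched as follows. The group is a deliberately tiny one of my choosing (P = 2039, Q = 1019, G = 4), the clock is passed in as `now` instead of sleeping, and all names are assumptions.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy group: P = 2*Q + 1, G has order Q (insecure)

def timestep(now: int, step_size_secs: int = 30) -> int:
    return now // step_size_secs

def secret(ssuid: bytes, step: int) -> int:
    # x = HASH(SSUID || timestep), reduced into the exponent group.
    digest = hashlib.sha256(ssuid + step.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % Q

def challenge(h: int, r: int) -> int:
    data = f"{G}|{h}|{r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(ssuid: bytes, now: int):
    """Client side: Schnorr proof for the secret of the current timestep."""
    x = secret(ssuid, timestep(now))
    h = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1
    r = pow(G, k, P)
    s = (k + challenge(h, r) * x) % Q
    return r, s

def verify(ssuid: bytes, proof, now: int) -> bool:
    """Server side: recompute h from its own clock, then check the proof."""
    r, s = proof
    h = pow(G, secret(ssuid, timestep(now)), P)
    return pow(G, s, P) == (r * pow(h, challenge(h, r), P)) % P

ssuid = b"example-ssuid"
proof = prove(ssuid, now=60)
print(verify(ssuid, proof, now=89))  # same window: accepted
```

Verification in the same window always succeeds. In a later window the server derives a different h, so the proof is rejected, although with a group this small an expired proof can still pass by chance roughly once in a thousand tries. A realistic deployment would need a standardized group with an order of around 256 bits.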

Advantages over single-use proofs

  • Non-interactivity: Time-locked proofs stay faithful to the non-interactive transformation of Schnorr's protocol, whereas single-use proofs require an extra round-trip, as the server must send the token to the client before proving can commence.
  • Horizontal Scalability: Single-use ZKPs require a map of connections to tokens to be stored in memory, and the mapping is unique for each instance of the service. Therefore, the client's proof must be routed to the same instance that generated the token or it will incorrectly fail to verify. In contrast, time-locked proofs allow both parties to independently calculate the timestep, avoiding the added complexity and overhead.
  • Better resistance to replay attacks: Time-locked proofs provide the same resistance to replay attacks as TOTP, while single-use proofs lack expiration properties entirely. In theory, an active MitM attacker could request t from the server, proxy it to the victim, intercept the single-use proof, and replay it once on some later occasion.

Advantages over TOTP

A security-minded reader may wonder what this scheme offers over TOTP. In other words, why choose time-locked proofs in favor of TOTP, using the SSUID as the seed? This reasoning is quite subtle. Let’s imagine we are sending our TOTP token to a MitM. The attacker will calibrate his clock just as the client and server do. Thus, in this scenario, TOTP degenerates into sending hashed passwords (salted with the current time) across the network, and there exist many tools dedicated to cracking hashes. If your SSUID has been leaked, the MitM could crack the hash, thereby de-anonymizing you as the requester. Using time-locked proofs instead would prevent any information from being leaked to the attacker.
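The TOTP half of that argument can be illustrated in a few lines. This is only a sketch of the degenerate case: it assumes the simplified token formula HASH(seed || timestep) with SHA-256, and an attacker holding a hypothetical list of leaked SSUIDs.

```python
import hashlib

def totp_token(seed: bytes, timestep: int) -> str:
    # Simplified token: HASH(seed || timestep), SHA-256 assumed.
    return hashlib.sha256(seed + timestep.to_bytes(8, "big")).hexdigest()

def deanonymize(intercepted: str, leaked_ssuids, timestep: int):
    """Return the leaked SSUID that produced the intercepted token, if any."""
    for candidate in leaked_ssuids:
        if totp_token(candidate, timestep) == intercepted:
            return candidate
    return None

# Hypothetical leaked identifiers, for illustration only.
leaked = [b"ssuid-alice", b"ssuid-bob", b"ssuid-carol"]
intercepted = totp_token(b"ssuid-bob", 5)
print(deanonymize(intercepted, leaked, 5))
```

Because the timestep is public, the token is effectively a salted hash whose salt the attacker already knows, so matching it against a leaked list costs one hash per candidate.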

Application: Universal Background Checks

Universal background checks are a real-world application of time-locked proofs. For instance, if two private individuals want to engage in a firearm sale, the seller is not able to conduct a background check on the prospective buyer. That privilege is reserved only for FFLs (shops that operate full-time in the business of buying and selling weapons), and access to them is brokered by the NICS, a branch of the FBI. The current system's lack of technical sophistication means that private individuals cannot be granted access because no mechanism is in place which would prevent unauthorized and illegitimate background checks from occurring.

However, with time-locked proofs, we can imagine programmatic background checks without the privacy drawbacks. The NICS would provide a public API, which can be used to perform a background check on anyone. The catch is that accessing an individual’s background check requires a time-locked proof of identity. The buyer can provide the seller a proof of their SSUID that expires after two days, authorizing the check; the time-lock ensures that the seller can only access the buyer's background check information for a limited period, precluding unauthorized access to an updated version of the background check on some later date after the transaction has concluded.
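One practical wrinkle is worth sketching. If the two-day lock is implemented by simply reusing the timestep function with a 172,800-second step (my assumption, not something specified above), the windows are aligned to the Unix epoch, so a proof generated late in a window expires well before two days have passed.

```python
TWO_DAYS = 2 * 24 * 3600  # 172,800 seconds

def timestep(now: int, step_size_secs: int = TWO_DAYS) -> int:
    """Index of the fixed, epoch-aligned window containing `now`."""
    return now // step_size_secs

def seconds_until_expiry(now: int, step_size_secs: int = TWO_DAYS) -> int:
    """Time left before the window rolls over and the proof stops verifying."""
    return step_size_secs - (now % step_size_secs)

early = 10 * TWO_DAYS        # first second of a window
late = 11 * TWO_DAYS - 60    # one minute before that window closes
print(seconds_until_expiry(early))  # a full two days remain
print(seconds_until_expiry(late))   # only a minute remains
```

So "expires after two days" really means "expires within two days" under this simple design. Guaranteeing a minimum lifetime would take something extra, such as having the verifier also accept the previous window.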

In summary, achieving the same outcome through technological means is strictly superior. It enables private parties to securely conduct background checks, increases efficiency, and eliminates the need for FFLs as middlemen. Regulatory law serves as a barrier that prevents FFLs from abusing the current system; however, legal safeguards can be finicky, and should only be introduced when no suitable technological solutions exist.

Conclusion

Tying it all together, the existing Social Security Number system is plagued with security and privacy warts that leave millions of individuals at risk of identity theft and fraud. To address these challenges, we must move towards a more secure and privacy-conscious system. By replacing SSNs with SSUIDs and adopting time-locked proofs for identity verification, we can significantly reduce the risk of identity theft and unauthorized access to personal information. This essay is mostly an exercise in exploratory ideation, so it is unlikely that the solution ultimately adopted by government institutions will be the one proposed here. But one thing is certain: to create a more secure future for everyone, it is crucial that policymakers, tech companies, and American citizens recognize the urgency of modernizing the SSN system. We absolutely must embrace modern cryptography and leave the vulnerable, outdated SSN system in the past.

Wednesday, April 5, 2023

From Bullae to Bitcoin: A New Implementation of Ancient Ideas

On March 12, 2023, Bitcoin rallied 10% as trust in US banks declined following the collapse of Silicon Valley Bank. Bitcoin has captured the attention of financial analysts, tech enthusiasts, and skeptics alike. But while the buzz surrounding the cryptocurrency may suggest a modern, cutting-edge technology, the signatures and verification that Bitcoin implements cryptographically were pioneered in fiduciary transactions by the bulla (plural bullae), a clay jar used in Uruk 5000 years ago.

There is much debate about the impetus behind the development of the bullae system. Some scholars contend that bullae were merely employed for accounting purposes, while others argue they served as a tool for facilitating long-distance trade. This essay argues that the primary use of bullae was as a security measure to safeguard Uruk’s temple economy. The temple acted as a communal storage facility, and bullae played a key role by allowing individuals to retain ownership of goods without opening the temple up to fraud. We then examine similarities between bullae and Bitcoin in their approaches to fraud prevention, tamper-proofing, and double-spending within their respective economic frameworks. Finally, we explore enduring issues of user experience (UX) and trust faced by both systems.


Nakamoto
The pseudonymous Satoshi Nakamoto was born into a rapidly evolving digital world. His actual age is unknown, but many speculate he is in his late fifties. He witnessed the computer revolution as the mainframe begat the PC, which begat the laptop. Nakamoto was an early adopter of the home computer and an avid student of computer security. Long before the Internet processed most financial transactions, he spent his free time on the Cypherpunks mailing list discussing the cryptography that would later secure it. 

A digital payment system backed by a deflationary currency represented the holy grail for the libertarian-oriented Cypherpunks, who shared a mutual disdain for fiat currency. The Cypherpunks chipped away at the problem by targeting individual components: Adam Back built a scheme for individuals to prove they had expended a certain amount of computational power (proof of work), Wei Dai explored methods to keep consistent records across thousands of computers (distributed ledger), and Nick Szabo’s writings laid the ideological groundwork. But for years the double-spending problem continued to thwart the group: without a solution, would-be fraudsters could make perfect counterfeits of their digital currency.


2007 marked the most severe financial crisis in America since the Great Depression. An entirely avoidable crisis fueled by greed and predatory loan practices left the world economy in tatters. To add insult to injury, governments deployed massive bail-outs to the tune of $700 billion, ensuring the continued existence of the banks that created the crisis. That $700 billion is nominally supposed to be paid by taxpayers sometime in the future. However, the government never really intends to recoup this debt. Annual taxes are well behind the ever-increasing debt bubble. Instead, the bail-outs serve to further depreciate fiat money and the spending power of those holding it.


The incompetence and greed of governments and central banks were the inspiration Nakamoto needed to finally solve the devious double-spending problem. His solution, paired with the ideas of his contemporaries, is Bitcoin. Nakamoto pulled no punches in expressing his motivations. The ledger’s initial (Genesis) block encodes the eerie message, “Chancellor on brink of second bailout for banks.”


Early Sumeria
Mesopotamia is the cradle of civilization. Beginning as early as 10000 BCE, small communities began to form in the river valley surrounding the Tigris and Euphrates. Nomadic hunter-gatherer tribes sought the stability promised by agrarian life, especially in the Fertile Crescent, whose annual snowmelt makes it uniquely well-suited for crop production. As these communities prospered, they grew both in size and sophistication. Successful irrigation practices allowed the Mesopotamians to scale up their agricultural operations, and the population proliferated to enjoy the yields.

Although such dominance over an environment was hitherto unprecedented, the Sumerians felt perplexed by natural phenomena out of their control. The unrest that followed from droughts, dust storms, extreme temperature fluctuations, and consequently bad harvests could only be explained as a manifestation of the will of the gods. So they built shrines and temples to form deeper relationships with the deities and make sense of their whims. The temple became the central institution within settlements and commanded the skyline to display its prominence. Each community had its patron deity to which the temple was dedicated.


The Temple
Uruk, dedicated to Inanna, grew from a small community of scattered settlements into the largest Mesopotamian city-state. Perhaps this growth can be attributed to Inanna, goddess of fertility, who blessed the fields of her devotees while others lay fallow and barren. Nevertheless, this drastic demographic expansion permitted a small segment of society to begin specializing in non-agricultural professions as opposed to working strictly for survival. Choice individuals began to apply their talents to higher purposes. The need to immortalize the particulars of Sumerian life gave rise to the scribe. Men and women perfected the art of fine pottery by sculpting clay on the newly invented wheel. Construction workers built houses of sun-dried brick, and scientists explored early metallurgy. The economy became so advanced that the temple assumed a new role: the centralized collector and redistributor of goods.


Reconstruction of the Eanna Ziggurat in the Central District. Eanna Ziggurat is dedicated to Inanna. [Source]

The Token and Bullae System
Agriculture remained the powerhouse that drove Uruk’s economy forward, and the temple expanded its influence through land ownership. Hectares of farmland, called nigenna, were reserved for the temple. Sections of nigenna could be leased in allotments (urulal) to individuals in exchange for a percentage of the yields. But this presented an interesting conundrum in highly productive seasons: when, for instance, a grain farmer had a great surplus after his tax to the temple had already been paid, how could he profit from the excess? The temple would issue a token representing a credit owed in grain. This token could be redeemed by the farmer anytime in the future, perhaps after a poor harvest to continue providing for his family.

Tokens were crucial for early economic development, predating large city-states, and were among the first objects to be fired in a kiln. The temple economy greatly magnified the utility of tokens as various sectors of society began to pay dues in animal meat, grains, and metals. But keeping records of the variegated goods coming into and going out of the central organization gave rise to entire offices staffed with the newfound job of ‘administrator.’


Cache of tokens uncovered in Uruk. [Source: Schmandt-Besserat]

The token, while a powerful tool for exchange, presented a previously unseen risk of fraud. In a barter economy, a farmer would be hard-pressed to inflate his true holdings in grain. Manufacturing a token must cost proportionally less than the good it represents; otherwise it would be an entirely impractical medium. Therefore, it is quite easy to devise a scheme in which a farmer conspires with an adroit ceramicist to counterfeit tokens representing his mainstay crop. Additionally, this problem of trust becomes more significant as the size of the population increases.

In a close-knit community, it may be obvious when a farmer you encounter daily exaggerates his yields; Uruk at its peak, however, had a population of 25000. The issuance of tokens by multiple administrative departments created a situation where no individual could effectively oversee all transactions, creating an ideal climate for fraudulent activity. We argue that bullae were introduced by the temple administrators to address this class of fraud.


Tamper-proofing
Bitcoin, on the other hand, prevents fraudulent transactions from being appended to the blockchain through its use of public key cryptography. The blockchain is a decentralized and publicly accessible ledger that records all bitcoin transactions. Each transaction is made auditable through its pairing with a digital signature. For the uninitiated, two properties that follow from public key cryptosystems are signatures and verification.


A message can be ‘signed’ using a privately held decryption key. Anyone can verify this signature using the corresponding publicly revealed encryption key. Signatures cannot be forged, and a signer cannot later deny the validity of his signature.
–Rivest, Shamir, and Adleman, 1978


In actuality, a Bitcoin 'wallet' is a randomly generated and unique public and private key pair. Using your private key, you can attach proof that an outgoing transaction is indeed coming from your wallet; any node in the network can access your public key to verify transactions claiming to originate from it. If the transaction is forged, or tampered with in any way, signature verification fails, and it will not be appended to the ledger.[1]
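The sign-then-verify flow can be sketched in a few lines. This is a toy, textbook version of the Rivest, Shamir, and Adleman scheme quoted above, chosen because it needs nothing beyond the Python standard library; Bitcoin itself uses elliptic-curve signatures over the secp256k1 curve, not RSA. The primes are far too small for real use, there is no padding, and the function names are illustrative.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Real keys are far larger.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def digest(message: bytes) -> int:
    # Hash the message and reduce it into the range the toy modulus can sign.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public pair (N, E) can check a signature."""
    return pow(signature, E, N) == digest(message)


transaction = b"pay 1 coin from wallet A to wallet B"
signature = sign(transaction)
```

Here `verify(transaction, signature)` succeeds, while changing either the transaction bytes or the signature causes verification to fail, which is the property that keeps tampered transactions off the ledger.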


The Mesopotamians, quite cunning in their own right, developed their own tamper-proofing protocol. The temple stopped issuing raw tokens in favor of tokens encased in a bulla, a spherical mass of baked clay. Its security properties follow from the process of clay vitrification, which irreversibly alters the chemical makeup of clay. In its initial, unbaked form, clay is soft, moldable, and dissolves in water. By exposing the clay to intense heat through the baking process, it becomes rigid, brittle, and retains its shape even when submerged in water. Thus, for an individual to surreptitiously add to his token count, he is forced to break open the bulla, and its compromised structure bears evidence of manipulation.


Broken open bulla. [Source: Schmandt-Besserat]

Bulla bearing impressed markings corresponding to the tokens held inside. [Source: Schmandt-Besserat]

The cylinder seal made it possible for the temple to knowingly accept only bullae it had issued. Each administrative office had its own seal, made easily identifiable through a distinct design. Administrators ‘signed’ bullae by rolling their department’s seal along the softened clay before it was balled into a sphere. Compared with modern cryptography, these signatures were emblazoned in clay rather than encoded in bits. Archaeological evidence suggests that the intricately embossed motifs on the seal were time-consuming to produce and necessitated a profound mastery of the art, protecting the temple against illicit reproductions. A segment of administrators specialized in token inspection, developing an expertise in fraud identification, with well-trained eyes that could spot even slight imperfections in a signature.

Cylinder seal from J.P. Morgan's private collection. [I took this photo]

Double-Spending
A crucial aspect of both systems is the prevention of double-spending, and Bitcoin’s success can be attributed to its innovative approach to this problem. Let's imagine we run a merchandise shop. Our gift card payment processor is flawed, since voiding a gift card code takes thirty additional seconds after the transaction completes. An attacker could issue a transaction that exhausts his balance, then quickly rush to another checkout counter, reusing the same code within this thirty-second window.


The same problem arises in Bitcoin if a single non-cooperating member of the network, whom Satoshi terms the greedy attacker, controls the ledger. He may attempt to spend the same Bitcoins twice before the network has had a chance to reach consensus on his wallet's true holdings. Digital signatures on their own fall short because nothing prevents the attacker from signing multiple conflicting transactions. The solution is that all members compete to solve a computationally difficult puzzle. The winner appends the next block of transactions to the ledger and is awarded coins for his contribution. A greedy attacker who has sufficient computational power to win consistently[2] is incentivized “to play by the rules, such rules that favour him with more new coins than everyone else combined, than to undermine the system and the validity of his own wealth.” –Nakamoto, 2008
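The "computationally difficult puzzle" can be sketched as a search for a number (a nonce) that makes a block's hash fall below a target. The version below is a simplification under stated assumptions: it hashes a text string once with SHA-256, whereas Bitcoin hashes a binary block header twice, and the function names and the `difficulty_bits` parameter are illustrative.

```python
import hashlib


def block_hash(prev_hash: str, transactions: str, nonce: int) -> str:
    # Each block commits to its predecessor, its transactions, and a nonce.
    payload = f"{prev_hash}|{transactions}|{nonce}".encode()
    return hashlib.sha256(payload).hexdigest()


def meets_target(hash_hex: str, difficulty_bits: int) -> bool:
    # A hash "wins" if its top difficulty_bits bits are all zero.
    return int(hash_hex, 16) < (1 << (256 - difficulty_bits))


def mine(prev_hash: str, transactions: str, difficulty_bits: int) -> int:
    """Try nonces in order until one produces a winning hash."""
    nonce = 0
    while not meets_target(block_hash(prev_hash, transactions, nonce), difficulty_bits):
        nonce += 1
    return nonce


# Finding a nonce takes about 2**12 guesses here; checking one takes a single hash.
winning_nonce = mine("00" * 32, "alice->bob:1", difficulty_bits=12)
```

The asymmetry is the point: the search is expensive, verification is cheap, and each extra difficulty bit doubles the expected work. Rewriting an old block would mean redoing this work for it and for every block built on top of it.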


The bulla is designed as a single-use device to address the double-spending problem. Token redemption involved a ritual in which the bulla was smashed in the presence of a temple administrator. This theory is supported by archaeological findings, which show that the overwhelming majority of bullae recovered in temple ruins were broken in antiquity.

The compromised state of the bulla prevents double-spending, as tokens without an associated bulla are insufficient for redemption. However, this does not cover the case of a greedy administrator, who is analogous to Nakamoto's greedy attacker: he may fraudulently seal bullae for his own gain. The temple has a strong incentive to eliminate corruption, as it can undermine the legitimacy of the token economy that it controls. As a result, the seal also serves the secondary purpose of identifying any greedy administrator who may be defrauding the system.
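The single-use rule and the seal's audit role can be modeled in a few lines. This is a loose sketch of the logic described above, not a reconstruction of actual temple bookkeeping; the `Temple` class, its method names, and the office names are all hypothetical.

```python
class Temple:
    """Tracks intact bullae so that each one can be redeemed exactly once."""

    def __init__(self):
        self._intact = {}        # bulla id -> (sealing office, token count)
        self.sealed_by = {}      # sealing office -> total tokens it has sealed
        self._next_id = 0

    def seal(self, office: str, tokens: int) -> int:
        """Issue a bulla; the office's seal is recorded alongside it."""
        bulla_id = self._next_id
        self._next_id += 1
        self._intact[bulla_id] = (office, tokens)
        self.sealed_by[office] = self.sealed_by.get(office, 0) + tokens
        return bulla_id

    def redeem(self, bulla_id: int) -> int:
        """Smash the bulla: it pays out once, then no longer exists."""
        if bulla_id not in self._intact:
            raise ValueError("bulla is already broken or was never issued")
        _office, tokens = self._intact.pop(bulla_id)
        return tokens


temple = Temple()
harvest_credit = temple.seal("granary office", tokens=5)
```

Calling `temple.redeem(harvest_credit)` returns the five tokens the first time and raises an error the second, while `temple.sealed_by` shows how much each office has issued, which is where a greedy administrator's excess would surface.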

User Experience
While researching the similarities between the two systems, one cannot help but be amused by the shared UX issues. Specifically, honest mistakes may misregister as fraudulent tampering. A careful Bitcoin holder knows to keep his private key a secret and treat it like a bank password. Storing the key on a public cloud presents a severe security risk in case of a breach, and managing your own ‘cold storage’ is a perilous endeavor fraught with expensive mistakes. We have all heard the familiar tale of the hard drive, now lying in a city dump, which holds the keys to unfathomable wealth. A similar fate was entirely possible in Uruk. The brittle clay bulla is easy to drop and damage by accident, now bearing evidence of manipulation when no such treachery has occurred. For the Sumerian farmer, this means an entire season of work can never be recovered.


Trust
The last topic this essay will discuss is the complicated issue of trust. For the Sumerians, the trust relationship was straightforward. Citizens must trust that the temple will uphold its side of the transaction when a bulla is smashed. In contrast, Bitcoin is trustless (or trust-minimized) by design. Analyzed less idealistically, however, Bitcoin has its own techno-elite, not too dissimilar from the temple authorities.

First, miners don’t work individually but pool their resources in order to profit from the endeavor. This has created enormous mining pools like Foundry USA, which appends about 22% of all new blocks to the ledger. Second, and more subtly, significant trust is also placed in Nakamoto himself, who is estimated to own 5% of all Bitcoins that will ever exist. The fact that he has never touched these coins separates Bitcoin from countless other cryptocurrencies created merely as vehicles to enrich their founders. Nevertheless, Bitcoin believers must face the fact that, either explicitly or implicitly, they are entrusting the economics to Nakamoto's goodwill. For if he were to liquidate a significant portion of his coins, the Bitcoin economy would be irreparably damaged.


Lastly, the majority of Bitcoin users don’t manage their wallet keys due to the aforementioned UX issues. Most offload this responsibility to cryptocurrency exchanges. Because of that, exchanges hold 27% of the supply of Bitcoins in circulation. Exchanges are wholly centralized and often unregulated entities which, while not banks in name, operate no more scrupulously than the same banks Satoshi sought to destabilize with Bitcoin. Last year, the world witnessed this in the unraveling of FTX, which was the third-largest exchange at the time. Clients thought they held Bitcoin, but their funds were actually being used to finance risky bets made by Alameda, FTX's sister company.


Conclusion
The bulla and its various layers of attestation must be commended. The clay outer shell safeguards the tokens inside, and the seal ensures their integrity through the authority of the sealer. If the reader will indulge us to cite the Bible as evidence, the bulla has the endorsement of God Himself. When the prophet Jeremiah bought the field of Anathoth, God instructed him to “Take these documents, this deed of purchase, the sealed text and the open one, and put them into an earthen jar, so that they may last a long time.” (JPS, Jeremiah 32:14) The security properties satisfied by the bulla millennia prior were sought after for years by the Cypherpunks, who finally saw their dream of a distributed ledger realized in Bitcoin. A security researcher today may laugh at the notion of securing the financial system with clay balls. But we leave him to ponder the following fact: while digital signatures in Bitcoin are provably vulnerable to attacks by quantum computers, it is impossible to de-vitrify clay.


Bibliography

Barrett-Wilt, Karen. Horned Animals with Stars Overhead; Cylinder Seal Impression. Basalt, Serpentine; Khafajah, Sin Temple II (Iraq), Late Uruk-Jamdat Nasr Period (3350-2900 BCE). Oriental Institute, University of Chicago. June 10, 2013. World History Encyclopedia. https://www.worldhistory.org/image/1302/cylinder-seal-horned-animals/.


BTC.com. "Pool Stats." BTC.com. Accessed March 16, 2023. https://btc.com/stats/pool.

Garbutt, Douglas. “THE SIGNIFICANCE OF ANCIENT MESOPOTAMIA IN ACCOUNTING HISTORY.” The Accounting Historians Journal 11, no. 1 (1984): 83–101. http://www.jstor.org/stable/40697796.


JPS Tanakh: The Holy Scriptures. Jewish Publication Society, 1985.

Kramer, Samuel Noah. The Sumerians: Their History, Culture and Character. London, UK: The University of Chicago Press, 1963.

Mark, Joshua J. “Cylinder Seals in Ancient Mesopotamia - Their History and Significance.” World History Encyclopedia, December 2, 2015. https://www.worldhistory.org/article/846/.

Mieroop, Marc Van de. A History of the Ancient Near East ca. 3000-323 BC. Third ed. The Atrium, Southern Gate, Chichester, West Sussex, UK: John Wiley & Sons, Inc., 2016.

Nakamoto, Satoshi. “Bitcoin: A Peer-to-Peer Electronic Cash System,” 2008. https://bitcoin.org/bitcoin.pdf.

Podany, Amanda H. The Ancient Near East: A Very Short Introduction. Oxford, NY: Oxford University Press, 2014.

Reconstruction of the Eanna Ziggurat in the Central District. 2012. Artefacts: Scientific Illustration & Archeological Reconstruction. https://www.artefacts-berlin.de/.

Rivest, R.L., A. Shamir, and L. Adleman. “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.” Communications of the ACM 21, no. 2 (1978): 120–26.

Schmandt-Besserat, Denise. How Writing Came About. Austin, TX: University of Texas Press, 1996.

Vigna, Paul. “Bitcoin’s ‘One Percent’ Controls Lion’s Share of the Cryptocurrency’s Wealth.” The Wall Street Journal. Dow Jones & Company, December 20, 2021. https://www.wsj.com/articles/bitcoins-one-percent-controls-lions-share-of-the-cryptocurrencys-wealth-11639996204.

Acknowledgements
Many thanks to Paul Duguid, Daniel Kuehler, Mathew Cha, Abizer Lokhandwala, and Rachel Trujillo for their insightful comments.

[1] For those struggling to grasp public key signatures, we provide the following hypothetical. Let’s say I want to send you messages that you can verify. I place my letters in a magic lockbox with two keys: the private key can only lock the box, and the public key can only unlock it. I duplicate my public key and give you the copy. We also agree that if you receive an unlocked box, damaged box, or one that does not open easily, then the contents therein cannot be trusted.

[2] 51% of the network's total compute is required to perform this attack. An attacker would need access to more energy than some countries use annually.

Flowers for Algernon

Flowers for Algernon is beautifully written yet simply written. A good book communicates complex ideas with complexity; a great book can com...