Thursday, April 13, 2023

SSN daydreams and time-locked proofs

In 2022, the FTC received 5.7 million total fraud and identity theft reports. A substantial portion of these cases result from the antiquated Social Security Number (SSN) system. This may come as a surprise, but your SSN is almost certainly online. If you don't believe me, you're rolling the dice, hoping you're not among the forty percent of Americans whose SSNs were leaked in the now-infamous 2017 Equifax hack. Just last year, Capital One, Experian, and Medibank, all of which collect SSNs as a consequence of the Patriot Act, suffered customer data breaches. When I say ‘online,’ I don’t mean that someone can simply Google your name and find your SSN. But it is indeed out there, stowed away in database files, circulating around dark web hacker forums, being sold to scammers, fraudsters, and other malicious actors on a per-record basis.

The price of an SSN may vary depending on the amount of accompanying personally identifiable information (PII), such as current and past residences, driver's license numbers, and email addresses, as this data is valuable for identifying associated accounts and performing password recovery flows. However, credit score has the largest influence on how dark markets will price your SSN. Not only are loan applications requested in your name likely to be approved quickly, but strong credit is a good indicator of general affluence.

Individuals with higher credit scores are likely to have larger-than-average checking account balances and stable employment, and both can be exploited. Banks often provide agent-assisted recovery flows for customers who forgot their password, which a thief can use to take over your account. A thief can also try to beat you to the punch and file taxes as you, having your rightful refund mailed to his address. Both avenues of exploitation require only the individual’s SSN and other publicly available information.

Now, don’t get me wrong, there are ways to mitigate all of these things. You can create an account with each of the three credit bureaus and effectively ‘freeze’ all lines of credit, set up a mandatory verbal password with your bank, and generate an Identity Protection PIN for federal tax filings. While these are well and good, they are, as stated, mitigations: band-aids to work around the fact that SSNs wield enormous power and yet are hardly private. Furthermore, we are forced to provide our SSN to a vast array of institutions, from banks to credit companies, insurance providers to new places of employment, each one storing a copy of your SSN in its own database, each time unnecessarily magnifying its exposure.

The most frustrating aspect is that the SSN itself is often not needed. Instead, it is funneled to some third party that cross-references a government database and verifies identity: this individual is qualified to receive insurance, eligible to add a line of credit, and so on. So whenever I, or someone I know, is forced to implement said mitigations due to a leaked SSN, I find myself daydreaming about a better alternative (don’t get me wrong, I am aware this is more of a bureaucratic problem than a technical one).

The SSUID

The most obvious first step is to replace SSNs with a stronger source of entropy. Currently, a single static identifier is allocated to each individual, with a limited range of possibilities and insufficient randomness. The first five digits of your SSN are based on the geographic location where the application was processed and the time of issuance. Both can be reduced to a relatively small set of possibilities if an attacker knows where and when you were born (in targeted identity theft, these are the first things an attacker will research). If you were born after 2010, you get a measly 7.4 bits of additional entropy, as the fourth and fifth digits are now randomly generated due to widespread fraud. So, in the best case, you get 20 bits of entropy, or about a million possibilities, which can be trivially enumerated in a few seconds on a low-end cell phone.
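To make the "about a million possibilities" figure concrete, here is a small back-of-the-envelope calculation. It assumes, as in the best case above, that the attacker already knows the first three digits and must guess the remaining six uniformly random decimal digits; the helper name is purely illustrative.

```python
import math

def entropy_bits(num_possibilities: int) -> float:
    """Bits of entropy in a uniformly random choice among num_possibilities."""
    return math.log2(num_possibilities)

# Assumption: the first three digits are already known to the attacker,
# leaving six decimal digits to guess.
unknown_digits = 6
candidates = 10 ** unknown_digits
bits = entropy_bits(candidates)

print(f"{candidates:,} candidates, about {bits:.1f} bits of entropy")
```

A search space of this size is small enough to enumerate exhaustively, which is why the SSN behaves more like a weak PIN than a secret.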

Secondly, we need to divorce location data from SSNs (which if desired, can be stored as metadata) and assign everyone a 128-bit globally unique identifier. We will call this a Social Security Unique Identifier, or SSUID for short. Even if the SSA generated one billion SSUIDs per second, it would take eighty-five years for there to be a reasonable probability of producing a duplicate. The SSUID offers a vast pool of identifiers that will never be exhausted, better privacy by removing geographic and temporal data, and an infinitesimally small chance of two citizens being assigned the same number.
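The collision estimate can be sanity-checked with the standard birthday approximation. The sketch below is only a rough calculation: it assumes identifiers are drawn uniformly at random and treats "reasonable probability" as 50%. Under those assumptions, a figure of roughly eighty-five years corresponds to 122 random bits (as in a version-4 UUID), and a fully random 128-bit identifier would last considerably longer.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_until_collision(random_bits: int,
                          ids_per_second: float = 1e9,
                          probability: float = 0.5) -> float:
    """Years of generation before a duplicate appears with the given probability.

    Uses the birthday approximation n ~ sqrt(2 * N * ln(1 / (1 - p))),
    where N = 2**random_bits is the size of the identifier space.
    """
    space = 2.0 ** random_bits
    ids_needed = math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))
    return ids_needed / ids_per_second / SECONDS_PER_YEAR

print(f"122 random bits: {years_until_collision(122):.0f} years")
print(f"128 random bits: {years_until_collision(128):.0f} years")
```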

TOTP

Your SSUID should be known exclusively to you and the government. In only exceptionally rare circumstances should the SSUID itself be disclosed to any other verifying party. For instance, if mistakenly pasted into a malicious website, the ID is burned. Unlike a regular password, it cannot be quickly and easily changed and would likely entail a long, bureaucratic process dealing with the SSA. A security professional might suggest authenticating SSUIDs using time-based one-time passwords (TOTP). TOTPs are unique, temporary passwords based on a shared secret (called the seed) and the current time. They are only valid for a short time period (usually 30 seconds), adding an extra layer of security if one’s primary password is compromised.

TOTP pseudocode:

import time

def GetTimestep(step_size_secs=30):
  # Number of whole step_size_secs intervals elapsed since the Unix epoch
  current_time = int(time.time())
  timestep = current_time // step_size_secs
  return timestep


The client computes $p=HASH(seed || GetTimestep())$ and sends it to the server. The server computes $p'$ the same way from its own copy of the seed, and authentication only succeeds if $p'=p$.
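The exchange above can be sketched end to end as follows. This is a minimal illustration of the simplified formula $p=HASH(seed || timestep)$ used in this post, with SHA-256 chosen as the hash and an optional `now` parameter added so the example is deterministic; both are assumptions. Standardized TOTP (RFC 6238) is built on HMAC and truncates the result to a short numeric code, so this is not a drop-in replacement for it.

```python
import hashlib
import hmac
import time
from typing import Optional

def get_timestep(step_size_secs: int = 30, now: Optional[float] = None) -> int:
    """Index of the current time window; `now` can be injected for testing."""
    current_time = int(time.time() if now is None else now)
    return current_time // step_size_secs

def totp(seed: bytes, timestep: int) -> str:
    """p = HASH(seed || timestep), following the post's simplified formula."""
    return hashlib.sha256(seed + timestep.to_bytes(8, "big")).hexdigest()

def verify(seed: bytes, candidate: str, now: Optional[float] = None) -> bool:
    """Server side: recompute p' and compare it in constant time."""
    expected = totp(seed, get_timestep(now=now))
    return hmac.compare_digest(expected, candidate)

seed = b"example shared seed"                  # hypothetical shared secret
p = totp(seed, get_timestep(now=1000))         # client computes p at t=1000
print(verify(seed, p, now=1010))               # same 30-second window
print(verify(seed, p, now=1030))               # next window, p has expired
```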

TOTP is often wrongly conflated with 2FA due to the fact that almost no websites support TOTP for primary passwords, although this is a totally reasonable thing to do. Rather, TOTP is employed as a secondary password, retrieved from a different device (typically smartphone apps like Google Authenticator and Authy). The value of TOTP lies in its ability to prevent phishing attacks by transforming a static password into a rolling password. Even if you mistakenly type your login information into a malicious website, under ideal circumstances, the attacker has only 30 seconds in which he can authenticate as you. Once that window expires, he loses access to your account.

The secondary device doesn't provide many meaningful security benefits beyond that. I’ll even venture to say that 2FA is clunky and annoying: it adds an additional authentication step, which often disallows pasting from the clipboard, and users go from managing a single password (which has already proven problematic) to managing two more passwords per site: the seed and the backup recovery codes you are instructed to print out. (I’m convinced no one actually does the latter.) The only practical way to effectively manage TOTP tokens is to store them in a password manager alongside their associated passwords, in essence eliminating the second-factor aspect. I’m going to again echo the sentiment that the hassle of managing password-esque information across two or more devices outweighs any marginal security benefits of doing so. So, if you can accept that, then you are ready for my proposal.

ZKPs

Given the deep integration of SSNs within our society, I believe it is time to tap into more advanced cryptographic techniques to secure SSUIDs. First and foremost, I do not ever want to provide my SSUID to any website, like we have been trained to do with SSNs. I operate under the assumption that any website requesting an SSUID will store it. To prevent this, access to raw SSUIDs would be limited to select government agencies that provide APIs for identity verification. Such APIs ought to authenticate callers without requiring them to disclose any PII.

Before diving into the proposed solution, let us briefly discuss zero-knowledge proofs (ZKPs) and Schnorr's protocol. ZKPs allow one party (the prover) to prove to another party (the verifier) that a statement is true without revealing any information about the statement other than its truthfulness. In the context of our discussion, an individual can prove knowledge of his SSUID without ever transmitting it across a network.

Schnorr’s protocol is a specific ZKP that applies nicely to this problem statement. A prover can convince a verifier that they know the discrete logarithm $x$ of some value $h=g^x$ without revealing the logarithm itself. In its original form, Schnorr’s protocol involves back-and-forth communication between client and server (formally referred to as an interactive proof of knowledge), so we apply the Fiat–Shamir transformation, distilling it down to a digital signature, or Schnorr signature, which can be verified in a single round-trip.
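A minimal sketch of the non-interactive Schnorr proof described above follows. The group parameters are toy values chosen only so the example runs quickly (the Mersenne prime $2^{127}-1$ as modulus, generator 3, exponents reduced modulo $P-1$); they are assumptions for illustration and are not secure. A real deployment would use a standardized prime-order group or elliptic curve, and the function names here are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple

# Toy parameters (assumptions, NOT secure). P is prime, so by Fermat's
# little theorem exponents can be reduced modulo Q = P - 1.
P = 2**127 - 1
Q = P - 1
G = 3

def challenge(h: int, r: int) -> int:
    """Fiat-Shamir: derive the challenge c by hashing the public values."""
    data = f"{G}|{h}|{r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x: int) -> Tuple[int, int]:
    """Prove knowledge of x such that h = g^x, without revealing x."""
    k = secrets.randbelow(Q - 1) + 1      # fresh one-time nonce in 1..Q-1
    r = pow(G, k, P)                      # commitment r = g^k
    c = challenge(pow(G, x, P), r)        # challenge replaces the verifier's message
    s = (k + c * x) % Q                   # response
    return r, s

def verify(h: int, proof: Tuple[int, int]) -> bool:
    """Accept if g^s == r * h^c, which holds when s = k + c*x."""
    r, s = proof
    c = challenge(h, r)
    return pow(G, s, P) == (r * pow(h, c, P)) % P

x = secrets.randbelow(Q - 1) + 1          # the prover's secret
h = pow(G, x, P)                          # public value h = g^x
proof = prove(x)
print(verify(h, proof))
```

The verifier only ever sees $h$ and the pair $(r, s)$; the secret $x$ never leaves the prover.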

Single-use proofs

Still, we haven’t yet improved on TOTP, because a man-in-the-middle (MitM) could still intercept our proof, using it later to repeatedly authenticate as us. I have discovered attempts to address the issue in the wild that convert proofs into single-use items. The following preliminary step is added to the protocol.

Verifier:
$t ← Rand()$
Send $t$ to Prover

Prover and Verifier use $x=HASH(SSUID || t)$, and the protocol continues.

$x$ is made unique per request by mixing it with a randomly generated token $t$. It is a good improvement, as the attack is reduced from an infinitely replayable proof to a single authentication, although I am still not satisfied with this solution.
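The token step might look like the sketch below, which covers only token issuance, one-time redemption, and the derivation $x=HASH(SSUID || t)$, not the proof itself. The `TokenRegistry` class, the connection identifiers, and the use of SHA-256 are assumptions made for illustration and are not taken from any particular implementation.

```python
import hashlib
import secrets
from typing import Dict, Optional

class TokenRegistry:
    """In-memory map of outstanding tokens, as kept by one verifier instance."""

    def __init__(self) -> None:
        self._pending: Dict[str, bytes] = {}

    def issue(self, connection_id: str) -> bytes:
        """t <- Rand(): create a fresh token and remember who it was sent to."""
        token = secrets.token_bytes(16)
        self._pending[connection_id] = token
        return token

    def redeem(self, connection_id: str) -> Optional[bytes]:
        """Return the token once; pop() ensures it backs at most one proof."""
        return self._pending.pop(connection_id, None)

def derive_secret(ssuid: bytes, token: bytes) -> int:
    """x = HASH(SSUID || t), computed identically by prover and verifier."""
    return int.from_bytes(hashlib.sha256(ssuid + token).digest(), "big")

registry = TokenRegistry()
t = registry.issue("conn-1")                     # hypothetical connection id
x = derive_secret(b"example-ssuid", t)           # placeholder SSUID bytes
print(registry.redeem("conn-1") == t)            # first use succeeds
print(registry.redeem("conn-1") is None)         # second use finds nothing
```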

Time-locked proofs

Instead, I propose time-locked proofs, which integrate TOTP timestamps into Schnorr signatures. In fact, we don’t even need to change the protocol at all, just the inputs to it!

Both the client and server compute $h=g^x$ where $x=HASH(SSUID || GetTimestep())$, so that the proof will change whenever the timestep does. This creates a method for generating ‘rolling proofs’, which function much like rolling passwords. Here, you can find a toy implementation on GitHub that uses sleeps to simulate the passage of time between proof and verification.
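For readers who want to see the idea in one place, here is a self-contained sketch of a time-locked proof. It is not the GitHub implementation mentioned above. It assumes SHA-256 as the hash, an injectable `now` parameter in place of sleeps, and the same toy group (modulus $2^{127}-1$, generator 3) used purely for illustration, which is not secure.

```python
import hashlib
import secrets
import time
from typing import Optional, Tuple

# Toy parameters (assumptions, NOT secure); exponents are reduced mod P - 1.
P = 2**127 - 1
Q = P - 1
G = 3

def get_timestep(step_size_secs: int = 30, now: Optional[float] = None) -> int:
    current_time = int(time.time() if now is None else now)
    return current_time // step_size_secs

def derive_secret(ssuid: bytes, timestep: int) -> int:
    """x = HASH(SSUID || timestep), so x rolls over with every time window."""
    digest = hashlib.sha256(ssuid + timestep.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % Q

def challenge(h: int, r: int) -> int:
    data = f"{G}|{h}|{r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(ssuid: bytes, now: Optional[float] = None) -> Tuple[int, int]:
    """Client: an ordinary Schnorr proof over the time-dependent secret x."""
    x = derive_secret(ssuid, get_timestep(now=now))
    k = secrets.randbelow(Q - 1) + 1
    r = pow(G, k, P)
    c = challenge(pow(G, x, P), r)
    return r, (k + c * x) % Q

def verify(ssuid: bytes, proof: Tuple[int, int], now: Optional[float] = None) -> bool:
    """Server: recompute h = g^x for its own current timestep, then check."""
    h = pow(G, derive_secret(ssuid, get_timestep(now=now)), P)
    r, s = proof
    return pow(G, s, P) == (r * pow(h, challenge(h, r), P)) % P

ssuid = secrets.token_bytes(16)          # stand-in for a 128-bit SSUID
proof = prove(ssuid, now=1000)
print(verify(ssuid, proof, now=1010))    # same window
print(verify(ssuid, proof, now=1030))    # next window
```

Only the input $x$ changes between windows; the proving and verifying steps are the unmodified Schnorr routine.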

Advantages over single-use proofs

  • Non-interactivity: Time-locked proofs stay faithful to the non-interactive transformation of Schnorr's protocol, whereas single-use proofs require an extra round-trip, as the server must send the token to the client before proving can commence.
  • Horizontal Scalability: Single-use ZKPs require a map of connections to tokens to be stored in memory, and the mapping is unique for each instance of the service. Therefore, the client's proof must be routed to the same instance that generated the token or it will incorrectly fail to verify. In contrast, time-locked proofs allow both parties to independently calculate the timestep, avoiding the added complexity and overhead.
  • Better resistance to replay attacks: Time-locked proofs provide the same resistance to replay attacks as TOTP, while single-use proofs lack expiration properties entirely. In theory, an active MitM attacker could request $t$ from the server, proxy it to the victim, intercept the single-use proof, and replay it once on some later occasion.

Advantages over TOTP

A security-minded reader may wonder what this scheme offers over TOTP. In other words, why choose time-locked proofs over TOTP with the SSUID as the seed? The reasoning is quite subtle. Let’s imagine we are sending our TOTP token to a MitM. The attacker will calibrate his clock just as the client and server do. Thus, in this scenario, TOTP degenerates into sending hashed passwords (salted with the current time) across the network, and there exist many tools dedicated to cracking hashes. If your SSUID has been leaked, the MitM could crack the hash by testing leaked SSUIDs against it, thereby de-anonymizing you as the requester. Using time-locked proofs instead would prevent any information from being leaked to the attacker.
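The TOTP weakness described above amounts to an offline dictionary check, which can be sketched as follows. The list of leaked identifiers and all values are made up for illustration, and the hash follows the simplified $HASH(seed || timestep)$ formula from earlier in the post, with SHA-256 assumed.

```python
import hashlib
from typing import Iterable, Optional

def totp(seed: bytes, timestep: int) -> str:
    """p = HASH(seed || timestep), with the SSUID acting as the seed."""
    return hashlib.sha256(seed + timestep.to_bytes(8, "big")).hexdigest()

def identify_requester(intercepted: str, timestep: int,
                       leaked_ssuids: Iterable[bytes]) -> Optional[bytes]:
    """Hash each leaked SSUID with the known timestep and look for a match."""
    for candidate in leaked_ssuids:
        if totp(candidate, timestep) == intercepted:
            return candidate
    return None

# Hypothetical data: three leaked identifiers, one of which is the victim's.
leaked = [b"ssuid-alice", b"ssuid-bob", b"ssuid-carol"]
timestep = 1000 // 30                        # the attacker's clock agrees
intercepted = totp(b"ssuid-bob", timestep)   # value observed on the wire

print(identify_requester(intercepted, timestep, leaked))
```

Because the timestep is public, it adds nothing the attacker does not already know; the only unknown in the hash is the SSUID itself.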

Application: Universal Background Checks

Universal background checks are a real-world application of time-locked proofs. For instance, if two private individuals want to engage in a firearm sale, the seller is not able to conduct a background check on the prospective buyer. That privilege is reserved for FFLs (federally licensed shops that operate full-time in the business of buying and selling weapons), and access to these checks is brokered by the NICS, a system operated by the FBI. The current system's lack of technical sophistication means that private individuals cannot be granted access because no mechanism is in place which would prevent unauthorized and illegitimate background checks from occurring.

However, with time-locked proofs, we can imagine programmatic background checks without the privacy drawbacks. The NICS would provide a public API, which can be used to perform a background check on anyone. The catch is that accessing an individual’s background check requires a time-locked proof of identity. The buyer can provide the seller a proof of their SSUID that expires after two days, authorizing the check; the time-lock ensures that the seller can only access the buyer's background check information for a limited period, precluding unauthorized access to an updated version of the background check on some later date after the transaction has concluded.

In summary, achieving the same outcome through technological means is strictly superior. It enables private parties to securely conduct background checks, increases efficiency, and eliminates the need for FFLs as middlemen. Regulatory law serves as a barrier that prevents FFLs from abusing the current system; however, legal safeguards can be finicky, and should only be introduced when no suitable technological solutions exist.

Conclusion

Tying it all together, the existing Social Security Number system is plagued with security and privacy warts that leave millions of individuals at risk of identity theft and fraud. To address these challenges, we must move towards a more secure and privacy-conscious system. By replacing SSNs with SSUIDs and adopting time-locked proofs for identity verification, we can significantly reduce the risk of identity theft and unauthorized access to personal information. This essay is mostly an exercise in exploratory ideation, so it is unlikely that the solution ultimately adopted by government institutions will be the one proposed here. But one thing is certain: to create a more secure future for everyone, it is crucial that policymakers, tech companies, and American citizens recognize the urgency of modernizing the SSN system. We absolutely must embrace modern cryptography and leave the vulnerable, outdated SSN system in the past.
