TOTP-HOTP explained

public: true
Titles
- How does HOTP/TOTP work?
HOTP
- https://tools.ietf.org/html/rfc4226
- being counter/sequence-based allows for usage in embedded devices like dongles, SIM cards, and smart cards
- designed to be relatively simple computationally
- designed to work for devices without any input method
- calculates the HMAC-SHA-1 with the secret as the key and the counter value as the message, then truncates the result to a 6-8 digit code
- validation
  - Server checks all codes starting from the one after the last accepted one, and ending N codes after that
    - Larger N means less security
    - Smaller N means that there are issues if the user is too out of sync
  - If the user code isn't the expected one, but within the look-ahead window, then the user skipped a code, not providing it to the server
    - Server might want to ask for multiple codes if the provided code was in the look-ahead window, to prevent guessing
  - edge cases
    - ```
      1         2                5         6
      345678 -> 123456 -> ... -> 234567 -> 123456
      ```
  - If the server last saw counter 1 and the client last generated 5, then when the client sends the code for 6, the server matches it against counter 2 (which happens to have the same code) and expects 3 next, while the client expects 7. The server won't figure out that a future code was used until the client sends the code for 7. Not really an issue unless the server decides to change the look-ahead window.
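- Sketch of the generation and look-ahead validation described above, using only the Python standard library. The function names, the `last_counter` convention (-1 meaning no code has been accepted yet), and the default `look_ahead` of 5 are assumptions made for illustration; the RFC does not define them.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Generate an HOTP code (RFC 4226) for one counter value."""
    # The counter is the HMAC message, encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    # Mask the top bit so the result is a non-negative 31-bit number.
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def validate_hotp(secret: bytes, code: str, last_counter: int,
                  look_ahead: int = 5, digits: int = 6):
    """Return the counter that matched, or None if nothing in the window did.

    Hypothetical helper: checks the `look_ahead` counters that follow the
    last accepted one. The caller would store the returned counter.
    """
    for counter in range(last_counter + 1, last_counter + 1 + look_ahead):
        expected = hotp(secret, counter, digits)
        # Constant-time comparison to avoid leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), code.encode()):
            return counter
    return None
```

With the RFC 4226 test secret `b"12345678901234567890"`, `hotp(secret, 0)` gives `755224`, which matches the test values in the RFC's appendix.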
TOTP
- https://tools.ietf.org/html/rfc6238
- Generated the same way as HOTP, except the counter value is calculated differently
  - instead of going up by one each invocation, the counter value is floor(unixtime / 30)
    - 30 seconds is the standard time step, but the spec allows different values if you decide that's a good idea
- server and client need to have around the same time value
- validation
  - Slightly simpler than HOTP to validate, since no need to deal with a counter value that can change without the server's knowledge
  - Server checks provided code against all codes that could be generated within N of now
    - N is usually 1-2 minutes
  - Server stores timestamp of last provided code, then subsequent codes must be after the last timestamp
    - Prevents reusing the same code twice
    - Increases security slightly: if the user just authenticated with C, then we know that the user won't see C-1 (since that is in the past), so C-1 shouldn't be allowed
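- Sketch of TOTP generation and validation as outlined above, using only the Python standard library. The `validate_totp` helper, its `last_step` argument (the time step of the last accepted code, -1 if none), and the `drift_steps` default of 2 (about one minute either side with a 30-second step) are assumptions for illustration, not something the RFC prescribes.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA-1 of the counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with counter = floor(unixtime / step)."""
    return hotp(secret, int(unix_time) // step, digits)


def validate_totp(secret: bytes, code: str, last_step: int, now=None,
                  step: int = 30, drift_steps: int = 2, digits: int = 6):
    """Return the time step that matched, or None.

    Hypothetical helper: accepts codes within `drift_steps` steps of now,
    but never a step at or before the last accepted one (prevents reuse).
    """
    if now is None:
        now = time.time()
    current = int(now) // step
    for candidate in range(max(0, current - drift_steps), current + drift_steps + 1):
        if candidate <= last_step:
            continue
        expected = hotp(secret, candidate, digits)
        if hmac.compare_digest(expected.encode(), code.encode()):
            return candidate
    return None
```

The caller would store the returned step as the new `last_step`, so submitting the same code a second time fails.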
Can be used bidirectionally
- Client -> Server: code n
- Server -> Client: code n+1
SHA-1 usage
- Collisions don't matter in this case, since an attacker would want to try to guess what the generated hash would be, not find another sequence value that results in the same code
Provisioning (not defined by RFC)
- Google Authenticator has a URI format for provisioning the client
  - https://github.com/google/google-authenticator/wiki/Key-Uri-Format
  - like otpauth://totp/Example:alice@google.com?secret=JBSWY3DPEHPK3PXP&issuer=Example
  - secret: the HMAC secret, Base32 encoded
  - issuer: the issuer
  - algorithm: ignored, but might allow different HMAC hash functions in the future
  - digits: 6/8, number of output digits, ignored on Android/BlackBerry (defaults to 6 on those platforms)
  - counter: initial counter for HOTP
  - period: for TOTP, the time period (default 30, currently unsupported)
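- Sketch of building and parsing a provisioning URI in the format shown above, using only the Python standard library. The function names and the shape of the returned dictionary are made up for illustration; only the URI layout and parameter names come from the Key URI Format page.

```python
import base64
from urllib.parse import parse_qs, quote, unquote, urlencode, urlparse


def build_otpauth_uri(secret: bytes, account: str, issuer: str) -> str:
    """Build a TOTP provisioning URI (hypothetical helper)."""
    # The label is "issuer:account"; keep ':' and '@' readable.
    label = quote(f"{issuer}:{account}", safe=":@")
    # The secret is Base32 encoded; padding is stripped here.
    b32 = base64.b32encode(secret).decode("ascii").rstrip("=")
    return f"otpauth://totp/{label}?{urlencode({'secret': b32, 'issuer': issuer})}"


def parse_otpauth_uri(uri: str) -> dict:
    """Parse a provisioning URI into its parameters (hypothetical helper)."""
    parsed = urlparse(uri)
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    b32 = params["secret"].upper()
    b32 += "=" * (-len(b32) % 8)  # restore the padding b32decode expects
    return {
        "type": parsed.netloc,  # "totp" or "hotp"
        "label": unquote(parsed.path.lstrip("/")),
        "secret": base64.b32decode(b32),
        "issuer": params.get("issuer"),
        "digits": int(params.get("digits", 6)),
        "period": int(params.get("period", 30)),
    }
```

Building a URI from the decoded example secret, account `alice@google.com`, and issuer `Example` reproduces the example URI above, and parsing it back returns the same secret bytes.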
- TOTP and HOTP are algorithms for generating 2-factor authentication codes. Let's take a look at how these work, starting with the simpler HOTP. First, you enable 2FA by scanning a QR code or tapping a link, which sets it up in your 2FA code generator, such as the Google Authenticator app. Then, when you want to log in, you generate a new code and enter it. Neat! Here's what's going on throughout the process. The QR code you scan with your phone encodes a random secret and some metadata. Your phone and the server both keep track of this secret and a counter value, which starts at 0. Whenever you get a new code, your counter value goes up. When you send the code to the server, the server works out what the next code should be. If your code matches, then it logs you in and updates its counter value.
- If you generate a code, but don't send it to the server, then your counter value will be higher than the server's. In this case, the server will check your code against its next counter value, and find it invalid. However, the server then checks the provided code against the next few counter values, to account for situations like this. If there's a match, then the server jumps its counter value forward to match yours, and logs you in. The look-ahead window used here is typically around 4-5 codes. If you advance your counter too far, then your codes will be outside of the server's look-ahead window and you'll either have to manually change your counter value, or set up 2FA again. Servers won't accept codes with a sequence value lower than one they have already accepted.
- But how exactly are codes generated? The HMAC-SHA-1 function is key here. HMAC functions are typically used for verifying the authenticity of a message. They have two parameters: a secret, and a message. HOTP just uses this HMAC function with the secret from the server as the secret, and the counter value as the message. Since the output of an HMAC function has the same length as the output of its underlying hash function, SHA-1 in this case, it is 160 bits, or 20 bytes, long. Since HOTP codes are typically 6 or 8 digits long, this long output needs to be shortened into a smaller number of digits. The HMAC output is truncated to a 31-bit number, that number is divided by 10 to the power of the number of digits, and the remainder is the current code value. Note that the output is truncated in a needlessly complicated way, in a process the spec calls "dynamic truncation", where the last 4 bits of the output choose which 4 bytes are kept. There is no need for the dynamic part here, and it adds to the complexity of HOTP for no reason. It might have made sense at the time the spec was written, to defend against theoretical SHA-1 attacks, but no such attacks have surfaced.
- Next, let's take a look at TOTP, which is very similar to HOTP, except time-based. It is set up in authentication apps the same way. The only difference is that the code is based on the current time, not a sequence value. The sequence value used for TOTP is derived from the current Unix time (which is the number of seconds since January 1st, 1970) and the gap between codes, which is typically 30 seconds. Divide the Unix time by the gap, round it down, and you get the current sequence value. Pass that sequence value into the same code-generation function used by HOTP, and you've got your code. Of course, authentication apps have to repeat this process to generate a new TOTP code every 30 seconds.
- TOTP codes are validated a little bit differently than HOTP codes. The authentication server checks whether the provided code matches any code from within a certain window around the current time, typically a few minutes, and if it does, it's allowed. To prevent code reuse, the server stores the time of the last valid code, and requires that any subsequent login attempts provide a code from after that time. TOTP is better than HOTP for almost all use cases. Really the only reasons to use HOTP nowadays are older systems that don't support TOTP, and embedded devices that don't have a clock and so can't keep track of the current time.
- Finally, let's take a look at how the secrets are sent from the server to the authentication app. The simple, widely supported way is to just tell the user to enter the secret and parameters directly. However, Google Authenticator supports another method, where all the parameters and the secret can be combined into a single QR code or link, which looks like this. The QR code just encodes the link data. This link contains all of the needed OTP parameters. That's it, thanks for watching!