Snowflake IDs video
- public: true
- Rendered with Marend
const EPOCH = 1420070400000; // discord epoch
const TIME_OFFSET = 1596691517291;

const TIME_COLOR = "#008f26";
const WORKER_COLOR = "#3f008c";
const PROC_COLOR = "#fc00b9";
const INCR_COLOR = "#ed3300";

function bitColor(idx) {
  if (idx < 42) return TIME_COLOR;
  if (idx < 47) return WORKER_COLOR;
  if (idx < 52) return PROC_COLOR;
  return INCR_COLOR;
}

//                                                    work proc
//          time                                      ID   ID   increment
//                                                    YYYYYXXXXXZZZZZZZZZZZZ
let bits = "0000000000000000000000000000000000000000001001101001000000000000".split("").map(x => parseInt(x));
let incr = 0;

let bitsSprite = sprite();
bitsSprite.x = 0.07;
bitsSprite.y = 0.57;

let base10Part = sprite();
base10Part.x = 0.24;
base10Part.y = 0.765;

// arrows
sprite(latex(String.raw`\color{${TIME_COLOR}}\downarrow`, 0.1), 0.2, 0.215);
sprite(latex(String.raw`\color{${TIME_COLOR}}\downarrow`, 0.1), 0.2, 0.435);
sprite(latex(String.raw`\color{${WORKER_COLOR}}\Bigg\downarrow`, 0.26), 0.66, 0.28);
sprite(latex(String.raw`\color{${INCR_COLOR}}\Bigg\downarrow`, 0.26), 0.878, 0.28);
sprite(latex(String.raw`\color{${PROC_COLOR}}\downarrow`, 0.1), 0.73, 0.435);
sprite(latex(String.raw`\downarrow`, 0.1), 0.5, 0.633);

// labels
sprite(latex(String.raw`\color{${WORKER_COLOR}}\text{Worker ID}`, 0.05), 0.54, 0.195);
sprite(latex(String.raw`\color{${PROC_COLOR}}\text{Process ID}`, 0.045), 0.7, 0.365);
sprite(latex(String.raw`\color{${INCR_COLOR}}\text{Sequence}`, 0.055), 0.82, 0.22);

function updateId() {
  bitsSprite.shape = latex(String.raw`\tt{${"" + bits.map((bit, i) => String.raw`{\color{${bitColor(i)}}${bit}}`).join("")}}`, 0.03);
  base10Part.shape = latex(BigInt("0b" + bits.join("").padStart(64, "0")).toString(), 0.07);
}
updateId();

let timeSprite1 = sprite(latex(String.raw`\color{${TIME_COLOR}}\text{August 6, 2020}`, 0.06), 0.1, 0.07);
let timeSprite2 = sprite();
timeSprite2.x = 0.1;
timeSprite2.y = 0.15;
let timeSprite3 = sprite();
timeSprite3.x = 0.1;
timeSprite3.y = 0.34;

function updateTime(t, adjT) {
  const dateObj = new Date(t);
  timeSprite2.shape = latex(String.raw`\color{${TIME_COLOR}}\text{${dateObj.getUTCHours()}:${dateObj.getUTCMinutes()}:${dateObj.getUTCSeconds()}.${dateObj.getUTCMilliseconds().toString().padStart(3, "0")}}`, 0.05);
  timeSprite3.shape = latex(String.raw`\color{${TIME_COLOR}}${adjT.toString()}`, 0.065);
}

let moveTime = true;

eachFrame(frame => {
  incr = (frameNum - 1) % 4096;
  if (moveTime || (frameNum === 1)) {
    let time = TIME_OFFSET + (16.66666 * frameNum);
    let adjTime = Math.floor(time - EPOCH); // account for epoch
    //time = Math.floor(time);
    bits = adjTime.toString(2).padStart(42, "0").split("").map(n => parseInt(n)).concat(bits.slice(42));
    updateTime(time, adjTime);
  }
  bits = bits.slice(0, -12).concat(incr.toString(2).padStart(12, "0").split("").map(x => parseInt(x)));
  updateId();
});

async function scene1() {
  // just generate IDs as normal
}

async function scene2() {
  moveTime = false;
}

// Scenes:
// 1 - Normal, generates IDs at 1/frame
// 2 - Generates 4096 frames in a second
scene1();
- Changes to make in editing
- Add lines over each ID section so colorblind people can see the sections
- Overlays for each section
- Render
- 21600 frames of s1
- 10000 frames of s2
- Notes
- mention IDs being generated on a server
- Twitter epoch: 1288834974657
word-count
- Discord uses Snowflake IDs to uniquely identify things such as accounts, messages, and channels without the need for a central ID generation system. Twitter also uses a slightly different form of Snowflakes to generate IDs for things such as Tweets. I'll explain what's different about Twitter Snowflakes after I explain the Discord variant. Let's take a look at what Snowflakes are, and how they're generated. While Snowflakes are usually written as base-10 numbers, it's easier to understand what's going on if we look at the raw 64 bits behind that base-10 number.
- The first 42 bits represent the timestamp. It's the number of milliseconds since the Discord epoch, which is midnight UTC on January 1, 2015. The next 5 bits are the worker ID, a value assigned to each worker thread by its parent process and unique among the threads spawned by that process. After that come 5 bits for the process ID, which is a unique value assigned to each server process. With 5 bits each, this allows a maximum of 32 threads per process, and a maximum of 32 processes. Since that's a low number of processes in comparison to the number of Discord users, I'm pretty sure Discord uses servers specifically dedicated to generating Snowflakes (and the same goes for Twitter). Finally, the remaining 12 bits are the sequence number. It's incremented for each ID generated, and loops around every 4096 IDs. Discord's implementation appears to reset the sequence number to zero every millisecond, unlike the implementation I am showing you now, although this is really just an implementation detail, and doesn't have any real impact on uniqueness, since the timestamp also changes every millisecond.
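- Sketch for reference: the bit layout described above can be unpacked with a few shifts and masks. This is a minimal illustration, not Discord's code; `decodeDiscordSnowflake` is a hypothetical helper name, and `BigInt` is used because a 64-bit ID does not fit safely in a regular JavaScript number.

```javascript
// Discord epoch: 2015-01-01T00:00:00.000Z, in Unix milliseconds.
const DISCORD_EPOCH = 1420070400000n;

// Hypothetical helper: split a Discord-style Snowflake into its four fields.
// Accepts the ID as a string or BigInt to avoid losing precision.
function decodeDiscordSnowflake(id) {
  const n = BigInt(id);
  return {
    // Top 42 bits, shifted back from the Discord epoch to Unix time.
    timestampMs: Number((n >> 22n) + DISCORD_EPOCH),
    workerId: Number((n >> 17n) & 0x1fn),  // next 5 bits
    processId: Number((n >> 12n) & 0x1fn), // next 5 bits
    sequence: Number(n & 0xfffn),          // low 12 bits
  };
}

// An example ID, chosen only for illustration.
const parts = decodeDiscordSnowflake("175928847299117063");
console.log(new Date(parts.timestampMs).toISOString(), parts);
```

- The ID is passed as a string on purpose: writing it as a plain number literal would silently round it.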
- So, how does this ensure uniqueness among workers? The sequence number is the key here. For the same ID to be generated twice, the two IDs would need to share all four components: they would need to be generated in the same millisecond, by the same worker thread in the same process, and with the same sequence number. Since there are 4096 possible sequence values, a duplicate ID could only be generated if a single worker thread generated more than 4096 IDs in the same millisecond. If such a case were to happen, Discord's Snowflake generating code would likely notice, and stop generating IDs until the next millisecond.
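- Sketch for reference: one way a generator could behave as described above, resetting the sequence each millisecond and waiting for the next millisecond once all 4096 values are used. The `SnowflakeGenerator` class, its fields, and the injectable `now` clock are assumptions made for illustration; this is not Discord's or Twitter's actual code, and it does not handle the system clock moving backwards.

```javascript
const DISCORD_EPOCH = 1420070400000n;

// Hypothetical generator; names and structure are illustrative only.
class SnowflakeGenerator {
  // `now` is injectable so the clock can be faked in tests.
  constructor(workerId, processId, now = () => Date.now()) {
    this.workerId = BigInt(workerId) & 0x1fn;   // keep 5 bits
    this.processId = BigInt(processId) & 0x1fn; // keep 5 bits
    this.now = now;
    this.lastMs = -1;
    this.sequence = 0;
  }

  next() {
    let ms = this.now();
    if (ms === this.lastMs) {
      this.sequence = (this.sequence + 1) % 4096;
      if (this.sequence === 0) {
        // All 4096 sequence values are used up for this millisecond:
        // busy-wait until the clock reports a later one.
        while ((ms = this.now()) <= this.lastMs) {}
      }
    } else {
      // New millisecond: restart the sequence at zero.
      this.sequence = 0;
    }
    this.lastMs = ms;
    return ((BigInt(ms) - DISCORD_EPOCH) << 22n)
      | (this.workerId << 17n)
      | (this.processId << 12n)
      | BigInt(this.sequence);
  }
}

const gen = new SnowflakeGenerator(19, 9);
console.log(gen.next().toString(), gen.next().toString());
```

- Two generators with different worker or process IDs can never collide, since those bits differ; the sequence and the wait only protect a single worker against itself.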
- Now, let's look at how Twitter implements this. Twitter's implementation differs in a few ways. Firstly, the leftmost bit is reserved, instead of being a part of the timestamp, which leaves 41 bits for the timestamp. Secondly, the process ID and the worker ID are combined into a single 10-bit value. And thirdly, Twitter uses a different epoch: November 4, 2010 at 1:42 AM UTC, 54.657 seconds into the minute. My guess is that this oddly specific date is the time at which someone at Twitter was developing their Snowflake implementation. That's it, thanks for watching.
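- Sketch for reference: the Twitter layout described above (1 reserved bit, 41-bit timestamp, 10-bit combined ID, 12-bit sequence) can be unpacked the same way as the Discord one. `decodeTwitterSnowflake` and the field name `machineId` are hypothetical names chosen for this example, not Twitter's API. The last line converts the epoch from the notes into a readable date.

```javascript
// Twitter epoch from the notes above, in Unix milliseconds.
const TWITTER_EPOCH = 1288834974657n;

// Hypothetical helper: split a Twitter-style Snowflake into its fields.
function decodeTwitterSnowflake(id) {
  const n = BigInt(id);
  return {
    // 41 timestamp bits; the mask drops the reserved leftmost bit.
    timestampMs: Number(((n >> 22n) & ((1n << 41n) - 1n)) + TWITTER_EPOCH),
    machineId: Number((n >> 12n) & 0x3ffn), // combined 10-bit worker/process value
    sequence: Number(n & 0xfffn),           // low 12 bits
  };
}

console.log(new Date(Number(TWITTER_EPOCH)).toISOString());
// prints "2010-11-04T01:42:54.657Z"
```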
- Snowflake mentions:
mentions [[Snowflake IDs]]