A5/1 Soft Version Explained ISome useful hints and preventions that should be takenBy Juan Chamero, juan.chamero@intag.org, as of April 2006
Source: GSM-Security Net, A Pedagogical Implementation of A5/1 (1999), from Marc Briceno, Ian Goldberg, and David Wagner, based on a A5/1 (chip) Reverse Engineering performed by Briceno. This algorithm has been emulated and optimized by other authors but the whole logic remained pretty much the same until now. However some cares should be taken when applied to GSM and related cryptanalysis work. This should be considered what’s announced, a pedagogical implementation where differences with working algorithms could be significant in its mechanic and in its overall efficiency.
Steps S(0) to S(64) ó Kc is bit per bit XORed to all Registers (Linear) Steps S(65) to S(86) ó Fn is bit per bit XORed to all Registers (Linear) Steps S(87) to S(186) ó 100 steps to randomize well before mask generation starts (Non Linear) Steps S(187) to S(414) ó 228 steps to generate the 228 bits Cipher Mask (Non Linear) S(86): The Initial State, the interface that separates Linear from Non Linear behavior. If somehow inferred Kc could be retrieved by GF2 Matrix Algebra Emerald ó BB only Lineal Gray ó Non Lineal added Black ó Output bit generation Red ó Test of algorithm. You need to activate everything ! Black ó Small main, activated here for test after its compilation.
#include <stdio.h>/* Masks for the three shift registers */#define R1MASK 0x07FFFF /* 19 bits, numbered 0..18 */#define R2MASK 0x3FFFFF /* 22 bits, numbered 0..21 */#define R3MASK 0x7FFFFF /* 23 bits, numbered 0..22 */ /* Middle bit of each of the three shift registers, for clock control */#define R1MID 0x000100 /* bit 8 */#define R2MID 0x000400 /* bit 10 */#define R3MID 0x000400 /* bit 10 */ /* Feedback taps, for clocking the shift registers. * These correspond to the primitive polynomials * x^19 + x^5 + x^2 + x + 1, x^22 + x + 1, * and x^23 + x^15 + x^2 + x + 1. */#define R1TAPS 0x072000 /* bits 18,17,16,13 */#define R2TAPS 0x300000 /* bits 21,20 */#define R3TAPS 0x700080 /* bits 22,21,20,7 */ /* Output taps, for output generation */#define R1OUT 0x040000 /* bit 18 (the high bit) */#define R2OUT 0x200000 /* bit 21 (the high bit) */#define R3OUT 0x400000 /* bit 22 (the high bit) */ typedef unsigned char byte; typedef unsigned long word; typedef word bit; /* Calculate the parity of a 32-bit word, i.e. the sum of its bits modulo 2 */bit parity(word x) { x ^= x>>16; x ^= x>>8; x ^= x>>4; x ^= x>>2; x ^= x>>1;return x&1; }/* Clock one shift register */word clockone(word reg, word mask, word taps) { word t = reg & taps; reg = (reg << 1) & mask; reg |= parity(t);return reg; }/*THIS IS THE NON LINEAL PART – IGNORE IF LINEAL TILL END*//* The three shift registers. They're in global variables to make the code * easier to understand. * A better implementation would not use global variables. */word R1, R2, R3; /* Look at the middle bits of R1,R2,R3, take a vote, and * return the majority value of those 3 bits. */bit majority() { int sum; sum = parity(R1&R1MID) + parity(R2&R2MID) + parity(R3&R3MID); if (sum >= 2) return 1; else return 0;}/* Clock two or three of R1,R2,R3, with clock control * according to their middle bits. * Specifically, we clock Ri whenever Ri's middle bit * agrees with the majority value of the three middle bits.*/void clock() { bit maj = majority(); if (((R1&R1MID)!=0) == maj) R1 = clockone(R1, R1MASK, R1TAPS); if (((R2&R2MID)!=0) == maj) R2 = clockone(R2, R2MASK, R2TAPS); if (((R3&R3MID)!=0) == maj) R3 = clockone(R3, R3MASK, R3TAPS);} /* Clock all three of R1,R2,R3, ignoring their middle bits. * This is only used for key setup. */void clockallthree() { R1 = clockone(R1, R1MASK, R1TAPS); R2 = clockone(R2, R2MASK, R2TAPS); R3 = clockone(R3, R3MASK, R3TAPS);} /*IF NOT OUTPUT GENERATION IGNORE THIS PART ALSO!*//* Generate an output bit from the current state. * You grab a bit from each register via the output generation taps; * then you XOR the resulting three bits. */bit getbit() { return parity(R1&R1OUT)^parity(R2&R2OUT)^parity(R3&R3OUT);}/* Do the BB key setup. This routine accepts a 64-bit key and * a 22-bit frame number. */void keysetup(byte key[8], word frame) { int i;bit keybit, framebit; /* Zero out the shift registers. */ R1 = R2 = R3 = 0; /* Load the key into the shift registers, * LSB of first byte of key array first, * clocking each register once for every * key bit loaded. (The usual clock * control rule is temporarily disabled.) */ for (i=0; i<64; i++) { clockallthree(); /* always clock */ keybit = (key[i/8] >> (i&7)) & 1; /* The i-th bit of the key */ R1 ^= keybit; R2 ^= keybit; R3 ^= keybit; } /* Load the frame number into the shift * registers, LSB first, * clocking each register once for every * key bit loaded. (The usual clock * control rule is still disabled.) */ for (i=0; i<22; i++) { clockallthree(); /* always clock */ framebit = (frame >> i) & 1; /* The i-th bit of the frame # */ R1 ^= framebit; R2 ^= framebit; R3 ^= framebit;} /*IF NOT ENTER INTO NON LINEARITY IGNORE THIS PART ALSO!*/ /* Run the shift registers for 100 clocks * to mix the keying material and frame number * together with output generation disabled, * so that there is sufficient avalanche. * We re-enable the majority-based clock control * rule from now on. */ for (i=0; i<100; i++) { clock(); } /* Now the key is properly set up. */} /* Generate output. We generate 228 bits of * keystream output. The first 114 bits is for * the A->B frame; the next 114 bits is for the * B->A frame. You allocate a 15-byte buffer * for each direction, and this function fills * it in. */void run(byte AtoBkeystream[], byte BtoAkeystream[]) { int i; /* Zero out the output buffers. */ for (i=0; i<=113/8; i++) AtoBkeystream[i] = BtoAkeystream[i] = 0; /* Generate 114 bits of keystream for the * A->B direction. Store it, MSB first. */ for (i=0; i<114; i++) { clock(); AtoBkeystream[i/8] |= getbit() << (7-(i&7)); } /* Generate 114 bits of keystream for the * B->A direction. Store it, MSB first. */ for (i=0; i<114; i++) { clock(); BtoAkeystream[i/8] |= getbit() << (7-(i&7)); }}/* Test the code by comparing it against * a known-good test vector. */void test() { byte key[8] = {0x12, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF}; word frame = 0x134; byte goodAtoB[15] = { 0x53, 0x4E, 0xAA, 0x58, 0x2F, 0xE8, 0x15, 0x1A, 0xB6, 0xE1, 0x85, 0x5A, 0x72, 0x8C, 0x00 }; byte goodBtoA[15] = { 0x24, 0xFD, 0x35, 0xA3, 0x5D, 0x5F, 0xB6, 0x52, 0x6D, 0x32, 0xF9, 0x06, 0xDF, 0x1A, 0xC0 }; byte AtoB[15], BtoA[15]; int i, failed=0; keysetup(key, frame); run(AtoB, BtoA); /* Compare against the test vector. */ for (i=0; i<15; i++) if (AtoB[i] != goodAtoB[i]) failed = 1; for (i=0; i<15; i++) if (BtoA[i] != goodBtoA[i]) failed = 1; /* Print some debugging output. */ printf("key: 0x"); for (i=0; i<8; i++) printf("%02X", key[i]); printf("\n"); printf("frame number: 0x%06X\n", (unsigned int)frame); printf("known good output:\n"); printf(" A->B: 0x"); for (i=0; i<15; i++) printf("%02X", goodAtoB[i]); printf(" B->A: 0x"); for (i=0; i<15; i++) printf("%02X", goodBtoA[i]); printf("\n"); printf("observed output:\n"); printf(" A->B: 0x"); for (i=0; i<15; i++) printf("%02X", AtoB[i]); printf(" B->A: 0x"); for (i=0; i<15; i++) printf("%02X", BtoA[i]); printf("\n"); if (!failed) { printf("Self-check succeeded: everything looks ok.\n"); return; } else { /* Problems! The test vectors didn't compare*/ printf("\nI don't know why this broke; contact the authors.\n"); exit(1); }}int main(void) { test(); return 0;} /*END*/ |