A Cryptographic Tour and Todo List of Sage
Yesterday someone showed up on [sage-devel] and wrote: “I have been developing software and doing research in the areas of: mathematics, cryptography algorithms, encryption, and would like to contribute my time and effort to the Sage project. I would like any of you to get me started in the right direction, any info would be appreciated.”
This is the edited/polished version of my reply. I am posting it here in case anyone else wonders how to contribute to Sage for cryptographic research.
- David Kohel wrote an introductionary book to cryptography. He uses Sage in the book and wrote a fair amount of code to make that happen. The relevant module is sage.crypto. For example, it implements a linear feedback shift register:
It seems the code needs more documentation and also some areas are not implemented yet, e.g. block ciphers.
sage: FF = FiniteField(2) sage: P.<x> = PolynomialRing(FF) sage: E = LFSRCryptosystem(FF); E LFSR cryptosystem over Finite Field of size 2 sage: IS = [ FF(a) for a in [0,1,1,1,0,1,1] ] sage: g = x^7 + x + 1 sage: e = E((g,IS)) sage: B = BinaryStrings() sage: m = B.encoding("THECATINTHEHAT") sage: e(m) 0010001101111010111010101010001100000000110100010101011100001011110010010000011111100100100011001101101000001111
- Sage ships PyCrypto which implements many standard cryptographic algorithms. The docstring level documentation is horrible:
It is not really meant for research/education/playing around but for production code but maybe something could be done to have easier access to it from within Sage. Here is an example how to use it:
sage: import Crypto.Cipher.IDEA sage: Crypto.Cipher.IDEA? x.__init__(...) initializes x; see x.__class__.__doc__ for signature
sage: from Crypto.Hash import MD5 sage: m = MD5.new() sage: m.update('abc') sage: m.digest() '\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr' sage: m.hexdigest() '900150983cd24fb0d6963f7d28e17f72'
- Finite fields are basic building blocks in cryptography. Sage uses several finite field implementations for prime fields and extension fields of various sizes. The FiniteField_ext_pari implementation for finite extension fields of order $\ge 2^{16}$ should be replaced by two implementations using NTL’s ZZ_pE and lzz_pE depending on the size of the characteristic. This should be relatively straight-forward because there is an implementation for characteristic 2 using NTL’s GF2E already. To get a feeling about the possible speed improvements:
sage: k.<a> = GF(next_prime(2^65)^27) sage: e = a^30 sage: f = a^40 sage: %timeit e*f 1000 loops, best of 3: 557μs per loop sage: c = ntl.ZZ_pEContext(ntl.ZZ_pX(list(k.polynomial()),k.characteristic())) sage: e = c.ZZ_pE(list(e.polynomial())) sage: f = c.ZZ_pE(list(f.polynomial())) sage: %timeit e*f 10000 loops, best of 3: 154μs per loop
- Sage isn’t exactly kicking ass when it comes to elliptic and hyperelliptic curves over finite fields. As these are quite important in asymmetric cryptography it might be worth looking into this. John Cremona added to this: “That is fair. Apart from an SEA point-counting implementation — only over prime fields — the rest (for elliptic curves) is definitely only designed to work at sub-crypto field sizes.”
- algebraic techniques received some attention for the cryptanalysis of symmetric cryptographic primitives recently. In these attacks the cryptanalyst expresses the cipher as a large set of multivariate polynomial equations and attempts to solve the system. The most common case over $\mathbb{F}_2$ is handled by PolyBoRi. This library is the backbone of BooleanPolynomialRing and friends. This class needs testing, documentation, extension and bugfixes. Basically someone should sit down and add all the methods of MPolynomial[Ring]_libsingular to BooleanPolynomial[Ring] which make sense, add a ton of doctests and test the hell out of the library to make sure no SIGSEGVs surprise the user.
sage: F,s = sr.polynomial_system() sage: R = F.ring() sage: B = BooleanPolynomialRing(R.ngens(),R.variable_names(),R.term_order()) sage: F = [B(f) for f in F if B(f) != 0] sage: F = mq.MPolynomialSystem(B,F); F Polynomial System with 68 Polynomials in 36 Variables sage: gb = F.groebner_basis() sage: gb[-1] k003 sage: s[R("k003")] 0
- The module sage.crypto.mq is also relevant for algebraic cryptanalysis in symmetric cryptography. It implements
- small scale AES equation system generators over $\mathbb{F}_2$ and $\mathbb{F}_{2^n}$
- a class to represent multivariate polynomial systems
- an S-box class to analyse … well … S-boxes.
sage: S = mq.SBox(7, 6, 0, 4, 2, 5, 1, 3); S (7, 6, 0, 4, 2, 5, 1, 3) sage: S.polynomials() [x0*x2 + x1 + y1 + 1, x0*x1 + x1 + x2 + y0 + y1 + y2 + 1, x0*y1 + x0 + x2 + y0 + y2, x0*y0 + x0*y2 + x1 + x2 + y0 + y1 + y2 + 1, x1*x2 + x0 + x1 + x2 + y2 + 1, x0*y0 + x1*y0 + x0 + x2 + y1 + y2, x0*y0 + x1*y1 + x1 + y1 + 1, x1*y2 + x1 + x2 + y0 + y1 + y2 + 1, x0*y0 + x2*y0 + x1 + x2 + y1 + 1, x2*y1 + x0 + y1 + y2, x2*y2 + x1 + y1 + 1, y0*y1 + x0 + x2 + y0 + y1 + y2, y0*y2 + x1 + x2 + y0 + y1 + 1, y1*y2 + x2 + y0] sage: S.difference_distribution_matrix() [8 0 0 0 0 0 0 0] [0 2 2 0 2 0 0 2] [0 0 2 2 0 0 2 2] [0 2 0 2 2 0 2 0] [0 2 0 2 0 2 0 2] [0 0 2 2 2 2 0 0] [0 2 2 0 0 2 2 0] [0 0 0 0 2 2 2 2]
- Univariate polynomials over $\mathbb{F}_2$ are still implemented via NTL’s ZZ_pX rather than GF2X.
Furthermore there is gf2x a drop-in replacement library for NTL’s GF2X which is expected to be five times faster than NTL. Though, a formal vote is needed to get it into Sage.
sage: k = GF(2) sage: P.<x> = PolynomialRing(k) sage: f = P([k.random_element() for _ in range(10000)]) sage: %timeit f**2 100 loops, best of 3: 11.1 ms per loop sage: f = ntl.GF2X(f.list()) sage: %timeit f**2 100000 loops, best of 3: 2.02μs per loop
- At the end of the day everything boils down to linear algebra. So if one improves that, everybody wins. Sparse linear algebra over $\mathbb{F}_p$ is still too slow (Ralf-Phillip Weinmann did some work here wrapping code from eclib), there is
no special implementation for sparse linear algebra over $\mathbb{F}_2$ (both blackbox and reduced echelon forms), dense linear algebra over $\mathbb{F}_2$ lacks Strassen multiplication/reduction and dense linear algebra over $\mathbb{F}_{2^n}$ should probably get a specialised implementation. Note that up until a few thousand rows/columns that Sage’s dense linear algebra over $\mathbb{F}_2$ is actually pretty fast.
sage: A = random_matrix(GF(2),7000,7000) sage: time B = A.echelon_form() CPU times: user 2.91 s, sys: 0.04 s, total: 2.96 s Wall time: 3.01 sage: AM = A._magma_() sage: t = magma.cputime() sage: BM = AM.EchelonForm() sage: magma.cputime(t) 2.6800000000000002
I hope this list isn’t totally useless.

