header

Categories::

Projects::

SAGE
M4RI
Code Snippets
ECrypt II
iliketotallyloveit

Stuff::

Junge Linke (de)
Battrock (de)

MiniMe::

BitBucket
Flickr
Mon, 19. Oct 2009

Sage Development Visualisation by Alex Ghitza

In the first few seconds were Gonzalo does all the work, he imports all of Sage into the revision control system DARCS. Before then we were not using revision control at all.

Sage code swarm from Alex Ghitza on Vimeo PS: Vimeo for XHTML.

posted at: 14:03 :: permanent link

Sat, 27. Jun 2009

LQUP vs. PLUQ

At SD16 Clément Pernet and myself have been working on improving the asymptotically fast PLUQ factorisation over GF(2) in M4RI. As mentioned earlier, one of the main problems is that column swaps are pretty expensive compared to many other operations we do. Eventually, we settled for LQUP over PLUQ since it has fewer column swaps overall since it does not compress U. We also improved the base case both w.r.t. to sparse matrices and in general (more Gray code tables are used now) and the column swap performance overall (cf. SD 16 Wiki). The result is noticeable, but we are not quite there yet:

M4RI r284 vs. r292

There are still some places which could be improved so this should get better eventually. Also, we might have another strategy to deal with these sparse-ish/structured matrices. Anyway, the new PLUQ code is at least as fast as M4RI for the structured HFE examples on the M4RI website on my Core2Duo 2.33Ghz notebook (and of course much faster on random examples and on other platforms) The new code is available on BitBucket.

posted at: 15:22 :: permanent link

Thu, 18. Jun 2009

More OpenMP Experiments with M4RI

Motivated by a thread on [mpir-dev] I played around with OpenMP again today. The performance does not scale linearly … but hey it scales at all. I guess eventually I’ll have to get serious about this and sit down to make this proper. Anyway, here are the timings (on geom.math.washington.edu)

Computing the reduced row echelon form of an $n \times n$ random dense matrix over $\mathbf{F}_2$.
n M4RI
1 thread
PLUQ
1 thread
M4RI
4 threads
PLUQ
4 threads
M4RI
16 threads
PLUQ
16 threads
PLUQ 16 threads
cutoff=2048
10,000 1.72 0.85 1.03 0.86 0.58 0.80 0.77
16,384 13.75 5.76 4.78 4.23
20,000 27.02 5.45 7.35 5.48 3.27 3.68
32,000 112.74 21.96 30.51 22.02 13.78 13.91 12.95
64,000 227.80 157.03 104.94 75.95 66.54
100,000 1078.72 429.32 869.43 596.51 428.08 260.99 231.01

For some reason which I don’t understand yet is PLUQ slower for 16,384 than 20,000 on this machine. The code is on bitbucket.

posted at: 22:44 :: permanent link

Wed, 20. May 2009

F4-Style F5 — Next Attempt

Since the F4-style F5 I mentioned a while ago wasn’t really that “F4-style” I’ve pushed my next attempt into the public “algebraic attack” bitbucket repository. This version swaps the two outer loops, i.e. it proceeds by degree in the outer loop instead of the index of the polynomials. Of course, we are talking about a toy implementation here to understand the algorithm and not an attempt to implement F5 efficiently.

posted at: 20:47 :: permanent link

Tue, 07. Apr 2009

Funny

It might very well be just a conincidence, but superficially it looks like Mathematica is targeting Sage users with ads on Google.

Mathematica ad

Source: [sage-devel] … scroll down to see Harald Schilly debunk the whole thing. Still, it’s funny.

posted at: 13:08 :: permanent link

Sun, 22. Mar 2009

Windows Sage 0.3

As you might have heard, there is a Windows port of Sage in the making. While the current version doesn’t look like Sage proper, it already ships quite a few building blocks. Since, incidently the Information Security Group is now part of the “Microsoft Developer Academic Alliance” — which grants students free access to their products — I gave it a quick spin. See below.

screenshot of Windows Sage 3.0

Installation was straight forward once you have Visual Studio 2008 installed, just call build.bat and wait an hour or so.

posted at: 16:24 :: permanent link

Fri, 20. Mar 2009

M4RI API Change and Big Matrices

Finally, I found some time to work on M4RI again: I changed the internal representation of matrices to support more than one malloc() call per matrix, i.e. each matrix is split into blocks of some maximal size. This allows to deal with much bigger matrices under Linux because the kernel often won’t just give you 8GB mapped to consecutive addresses but it will give you 1GB eight times. This limitation bugged me quite some time now, since this also limited the kind of systems I could solve using PolyBoRi with M4RI enabled. The result is available at http://bitbucket.org/malb/m4ri/ but bear in mind that the API changed quite a bit for this (e.g., I renamed the packedmatrix to mzd_t) on the way).

64-bit Ubuntu, Xeon X7400 @2.66GHz
Matrix
Dimension
Memory
(expected)
M4RI/M4RI
(64-bit, 1 core)
M4RI/PLUQ
(64-bit, 1 core)
100,000 x 100,000 > 1.16 GB 1078.72 s 429.32 s
200,000 x 200,000 > 4.65 GB 2298.30 s
256,000 x 256,000 > 7.63 GB 8979.33 s 3709.25 s

The above table gives the time to compute the rank of random matrices of the given dimension using the given algorithms on http://geom.math.washington.edu.

posted at: 12:16 :: permanent link

Mon, 02. Feb 2009

Sage hits Debian/unstable

As jzmer just pointed out on #sage-devel, Sage hit Debian/unstable today-ish. Of course, this also means that the wealth of math software Sage ships (e.g., Singular) is now also available in Debian/unstable. Well done Tim!

Update:Carl — also on IRC — points out that there are issues with the binary shipped with Debian as of now. Hopefully, this will get resolved quickly.

posted at: 00:30 :: permanent link

Sun, 25. Jan 2009

F4-Style F5

When I asked Jean-Charles Faugère a while back why he doesn’t publish his F4-style F5 he answered that there would be nothing to publish since it is straight forward. I don’t know about that but I was actually quite surprised how quickly John Perry and I could come up with an F4-style F5 here at Sage Days 12 (pictures). Btw. Till Stegers calls this F4.5 but I don’t know how Jean-Charles Faugère refers it.

I’ve uploaded the toy implementation to bitbucket. It seems to behave like the polynomial F5 implementation in the same file, so at least it is bug by bug compatible with that. Speaking of behaviour: When computing Cyclic-6 over $\mathbb{F}_{32003}$ w.r.t. degrevlex I noticed that my F4 implementation only goes up to degree 16 while my F5 implementations need to consider degree 18. Although no matrix is constructed for degree 18, the F4-style F5 does construct and eliminate a matrix for degree 17 … puzzling.

posted at: 22:45 :: permanent link

Fri, 16. Jan 2009

Bitslicing and the Method of the Four Russians over Larger Finite Fields

Tom Boothby’s and Robert Bradshaw’s paper on the “Method of the Four Russian” multiplication algorithm over $\mathbb{F}_3$, $\mathbb{F}_5$, $\mathbb{F}_7$, $\mathbb{F}_{2^2}$ and $\mathbb{F}_{2^3}$ is available as pre-print on the arXiv. If you’re into fast exact linear algebra I highly recommend reading it since it has some really nice ideas in it and is well written.

Abstract. “We present a method of computing with matrices over very small finite fields of size larger than 2. Specifically, we show how the Method of Four Russians can be efficiently adapted to these larger fields, and introduce a row-wise matrix compression scheme that both reduces memory requirements and allows one to vectorize element operations. We also present timings which confirm the efficiency of these methods and exceed the speed of the fastest implementations the authors are aware of.”

posted at: 13:14 :: permanent link

Mon, 05. Jan 2009

M4RI-20090105 Released

Sources are available for download. See release notes for details about what changed since the last version and some timings (1, 2, 3, 4) to get an idea of the performance.

posted at: 17:44 :: permanent link

Tue, 23. Dec 2008

M4RI’s and MMPF’s Sensitivity to Density

The last couple of days I’ve been working on improving libM4RI for sparse-ish matrices. The matrices I am talking about here are still represented as dense matrices but have non-uniformly distributed entries. While PLUQ factorisation is still very very (very very) slow for e.g. half rank matrices, things are getting better for M4RI matrix elimination.

r219 vs. r221

The updated sources are available on bitbucket. I will probably cut a new release very early next year.

posted at: 15:07 :: permanent link

Wed, 12. Nov 2008

“Efficient Multiplication of Dense Matrices over GF(2)”

We describe an efficient implementation of a hierarchy of algorithms for multiplication of dense matrices over the field with two elements (GF(2)). In particular we present our implementation - in the M4RI library - of Strassen-Winograd matrix multiplication and the “Method of the Four Russians” multiplication (M4RM) and compare it against other available implementations. Good performance is demonstrated on on AMD’s Opteron and particulary good performance on Intel’s Core 2 Duo. The open-source M4RI library is available stand-alone as well as part of the Sage mathematics software.

In machine terms, addition in GF(2) is logical-XOR, and multiplication is logical-AND, thus a machine word of 64-bits allows one to operate on 64 elements of GF(2) in parallel: at most one CPU cycle for 64 parallel additions or multiplications. As such, element-wise operations over GF(2) are relatively cheap. In fact, in this paper, we conclude that the actual bottlenecks are memory reads and writes and issues of data locality. We present our empirical findings in relation to minimizing these and give an analysis thereof.”

Related News: My shiny new version of Magma 2.14-17 seems to perform better than Magma 2.14-14 for matrix multiplication over $\mathbf{F}_2$ on the Core 2 Duo. So I updated the performance data on the M4RI website. However, the changelog doesn’t mention any improvements in this area. Btw. searching for “Magma 2.14” returns the M4RI website first for me, which feels wrong on so many levels. Finally, M4RI is being packaged for Fedora Core.

posted at: 17:43 :: permanent link

Thu, 06. Nov 2008

Yet Another Talk on Sage

Today, I gave a talk on Sage to the ISG PhD seminar at Royal Holloway. I think it went alright, although people around here just don’t get excited about computation that much. Anyway, I’ve uploaded the slides and the demo (pdf, worksheet).

posted at: 15:40 :: permanent link

Sun, 26. Oct 2008

Matrix F5

I finally fixed my Matrix $F_5$ implementation. I’ll also give a talk about Matrix $F_5$ to the PhD seminar at ISG on November 27th. I’ll the post the slides around then for those interested.

posted at: 19:58 :: permanent link

Mon, 20. Oct 2008

Bits and Pieces

I spent last week at Sage Days 10 (pictures and more pictures) in Nancy, France which turned out to be a very nice event. I (together with Simon King, Michael Brickenstein and Ludovic Perret) spent most of my time working on various toy implementations of F5 in order to understand the algorithm (better). We also conversed with John Perry who’s pseudcode, Singular code and description of F5 was incredibly helpful (and motivated me to work on this project in the first place, btw). My toy implementation of the polynomial version of F5 is available online and so is Simon King’s Cython implementation for Sage. John Perry now also provides a non-homogeneous version of F5 based on my Sage implementation.

I also implemented a toy version of F5/Matrix which indeed avoids a fair number of zero reductions and returns a Groebner basis if $d_{max}$ is big enough. I don’t think it does avoid as many reductions as the polynomial version which indicates a problem in my code. Note, that F5/Matrix is not described in detail in English literature (if you speak French and want to translate some short notes on this algorithms to English, please let me know).

I didn’t work on M4RI during Sage Days 10 but Clement Pernet and Jean-Guillaume Dumas did. We will probably release a new version with tried and tested TRSM code soon. Btw. I also gave a contributed talk about M4RI at Sage Days 10. My project for a train ride right after Sage Days 10 was to improve univariate polynomials over $\mathbb{F}_2$ in Sage to improve modular composition of polynomials.

I’m off to Santander on Wednesday for the 2nd Workshop on Mathematical Cryptology and once I’m back I’ll give a talk at the Graduate Studies Elsewhere Open Afternoon in Cambridge on “Algebraic Attacks against Block Ciphers”.

One last thing: The DES generator is broken due to a bug in Sage, a fix is available on Trac.

Update:fixed a bug in the F5/Matrix code and removed a nonsense statement about the rank of the matrices.

posted at: 19:04 :: permanent link

Fri, 22. Aug 2008

Gröbner Bases over $ZZ$

Vanilla Sage does not compute Gröbner bases over $ZZ$ by any definition. However, this feature has been requested several times. The earliest account I could find quickly is this post by Joe Wetherell. Below is a list of options for Gröbner bases over $ZZ$ in Sage.

  1. Singular’s upcoming release will feature Gröbner bases over rings. In fact, the feature is present in the current Singular release but not enabled by default. An SPKG with that functionality enabled can be found here. ring r = integers,(x,y),lp; declares a ring over the integers where some things work and some things don’t. Note that this ring declaration is not final, i.e. the name integers may change. Also, this SPKG has issues and crashes on me for some operations. We’re working on tracking that issue down.
  2. Macaulay2 has support for Gröbner bases over rings and a decent Sage interface supporting that functionality. Macaulay2 1.1 is available as an experimental SPKG. First, one needs to install boehm_gc-7.1.p0 and gdbm-1.8.3. Then, since the version in experimental didn’t compile for me, try my new SPKG.
  3. If you are lucky enough to have Magma installed, I.groebner_basis(“magma:GroebnerBasis”) does the job. If you don’t have Magma installed try magma_free.
  4. Ginv (also available as a optional package) also supports Gröbner bases over $ZZ$. However, for the example Joe gave in his e-mail it crashes on me and I’ve contacted upstream about it.
  5. Last and least: I have a toy implementation of the $d$-Gröbner basis algorithm from the Becker-Weispfenning. Don’t hold your breath, it is dead slow.

Hopefully, due to the upcoming Singular release the situation will improve soon and we’ll finally have Gröbner bases over $ZZ$ in Sage.

posted at: 14:39 :: permanent link

Mon, 18. Aug 2008

Parallel Matrix Elimination

I released a new version of M4RI today which contains a parallel implementation for matrix elimination. Below I reproduce some timings for this code to give a rough idea of the performance of this code.

64-bit Debian/GNU Linux, 2.6Ghz Opteron (Virtualised)
Matrix
Dimension
Magma 2.14-13
(64-bit, 1 core)
M4RI
(64-bit, 1 core)
M4RI
(64-bit, 4 cores)
10,000 x 10,000 3.283 2.509 1.064
16,384 x 16,384 11.204 10.741 3.918
20,000 x 20,000 16.911 19.776 7.216
32,000 x 32,000 57.761 86.071 32.420
64,000 x 64,000355.477 640.742 307.213

The examples hfe25_5, hfe30_5 and hfe35_5 from the M4RI website take 1.44, 9.29 and 51.56 seconds respectively.

Note that this is work in progress and that the algorithm still has worse complexity than the one implemented in Magma. Also note that the speed-up is far from linear and that the speed-up decreases with the size. This is probably because each thread falls out of L2 more often and the threads clog each other.

posted at: 22:56 :: permanent link

Mon, 11. Aug 2008

GCC 4.3 and -O3

I recently upgraded an Opteron server to Debian/Lenny to get GCC 4.3 for OpenMP reasons. It turns out that my code, namely matrix multiplication as implemented in the M4RI library, ran much slower than when compiled with GCC 4.1. For instance, to multiply two $20,000 \times 20,000$ random matrices took 18.38 seconds with GCC 4.1 but 21.00 seconds with GCC 4.3.1 and to multiply two $32,000 \times 32,000$ random matrices took 70.24 seconds with GCC 4.1 but 80.00 second with GCC 4.3.1. Eventually, I checked the highlevel changelog and found: “The -ftree-vectorize option is now on by default under -O3. In order to generate code for a SIMD extension, it has to be enabled as well: use -maltivec for PowerPC platforms and -msse/-msse2 for i?86 and x86_64.” However, we don’t use SSE2 on the Opteron since it is slower than the standard instruction set for this application. Passing -no-tree-vectorize to the compiler fixed the problem. However, to my surprise -O2 didn’t come with a speed penalty either, so I settled for this. The final timings on my Opteron server are:

64-bit Debian/GNU Linux, 2.6Ghz Opteron (Virtualised)
Matrix
Dimension
M4RI GCC 4.3
(64-bit, 4 cores)
M4RI GCC 4.3
(64-bit, 1 core)
M4RI GCC 4.1
(64-bit, 1 core)
Magma 2.14-13
(64-bit, 1 core)
20000x200006.3617.8118.3818.35
32000x3200026.6568.0170.2468.01

I suppose the moral of the story is: -O3 isn’t necessarily better than -O2 just because 3>2.

posted at: 16:00 :: permanent link

Tue, 08. Jul 2008

Scapy and Sage

Scapy is a powerful interactive packet manipulation program. It is able to forge or decode packets of a wide number of protocols, send them on the wire, capture them, match requests and replies, and much more. It can easily handle most classical tasks like scanning, tracerouting, probing, unit tests, attacks or network discovery (it can replace hping, 85% of nmap, arpspoof, arp-sk, arping, tcpdump, tethereal, p0f, etc.). It also performs very well at a lot of other specific tasks that most other tools can’t handle, like sending invalid frames, injecting your own 802.11 frames, combining technics (VLAN hopping+ARP cache poisoning, VOIP decoding on WEP encrypted channel, …)”

At the end of the day Scapy is one (one!) Python file so it couldn’t be easier to use it from within Sage. As an example let’s assume we have sniffed an SSH connection establishment including a Diffie-Hellmann Group Exchange as described in RFC 4419. Scapy can do live packet capture and injection but that would require root privileges, so I’m working with a pcap file in this example:

from scapy import rdpcap, TCP, IP

SSH2_MSG_KEX_DH_GEX_GROUP = 31

# read packets
packets = [p[IP] for p in rdpcap("/home/malb/example.pcap") \
               if p[TCP] and len(p[TCP]) > 32]

# find correct package & payload
for packet in packets:
    try:
        pl = [ord(e) for e in packet[TCP].payload.load]
        if pl[5] == SSH2_MSG_KEX_DH_GEX_GROUP:
            break
    except AttributeError:
        pass

def get_uint(pl, length):
    # this is not as generic as it should be since it doesn't work
    # with negative numbers
    value = ZZ(0)
    for i in range(length):
        value += pl[i] * 2**(8*(length - i - 1))
    return value, pl[length:]

packet_length, pl = get_uint(pl, 4)
padlen, pl = get_uint(pl, 1)
packet_type, pl = get_uint(pl, 1)
assert(packet_type == SSH2_MSG_KEX_DH_GEX_GROUP)

# p
p_length, pl = get_uint(pl, 4)
p, pl = get_uint(pl, p_length)

# g
g_length, pl = get_uint(pl, 4)
g, pl = get_uint(pl, g_length)

assert(len(pl) == padlen)
assert(p.is_prime())

Zp = GF(p)
g = Zp(g)
e = g**ZZ.random_element(0,p)

e.log(g) # yeah, right ;-)

Happy hacking.

posted at: 17:34 :: permanent link

Fri, 20. Jun 2008

XOR for Fun and Profit

I just gave a talk on linear algebra over GF(2), optimisation techniques and applications to algebraic cryptanalysis. Slides are available online.

posted at: 22:02 :: permanent link

libM4RI in Debian Unstable

malb@XXX:~$ apt-cache search m4ri  
libm4ri-dev - Method of the Four Russians library, development files
libm4ri0 - Method of the Four Russians library, shared library

Big thanks to Tim for making that happen!

posted at: 06:37 :: permanent link

Fri, 13. Jun 2008

Fraction Free Gauss-Jordan Errata

I’m at Sage’s dev1 right now and so I have the pleasure of meeting Arne Storjohann. In his thesis he presented a fraction free asymptotically fast matrix elimination algorithm which unfortunately has some typos in it. Below I replicate the correct algorithm he explained/provided me yesterday:

def GaussJordan(A, k=-1, d0=None):
  if d0 == None:
    d0 = A.base_ring()(1)
  n = A.nrows()
  m = A.ncols()
  I = MatrixSpace(A.base_ring(),A.nrows(),A.nrows())(1)
  for i in xrange(k+1,n):
    if any(A[i,j] for j in xrange(m)):
      break
  else:
      U,P,r,h,d = d0*I, I, 0, n-k, d0
      return (U,P,r,h,d)

  if m == 1:
    i = min([i for i in xrange(k+1,n) if A[i,0] != 0])
    P = copy(I)
    P.swap_rows(i,k+1)
    r,h,d = 1, n-i, (P*A)[k+1,0]
    U = d*I
    for j in range(n):
      U[j,k+1] = -(P*A)[j,0]
    U[k+1,k+1] = d0
  else:
     m1,m2 = m//2, m-m//2
     A1 = A.matrix_from_columns(range(m1))
     B = A.matrix_from_columns(range(m1,m))
     U1, P1, r1, h1, d1 = GaussJordan(A1, k, d0)
     A2 = d0**(-1)*U1*P1*B
     U2, P2, r2, h2, d = GaussJordan(A2, k+r1, d1)
     U = d1**(-1) * U2*(P2*(U1 - d1*I) + d1*I)
     P,r,h,d = P2*P1, r1+r2, min(h1,h2),d
  return U,P,r,h,d

Note that this is not how one would actually implement this algorithm in practice: it is pseudo-code that happens to run in Sage. For a practical implementation check the IML library.

posted at: 17:56 :: permanent link

Tue, 13. May 2008

M4RI Website

I finally put together the website for the M4RI library. For those who don’t know M4RI:

“M4RI is a library for fast arithmetic with dense matrices over $\mathbb{F}_2$. It was started by Gregory Bard and is now maintained by Martin Albrecht and Gregory Bard. The name M4RI comes from the first implemented algorithm: The “Method of the Four Russians” inversion algorithm published by Gregory Bard. This algorithm in turn is named after the “Method of the Four Russians” multiplication algorithm which is probably better referred to as Kronrod’s method. M4RI is used by the Sage mathematics software and the PolyBoRi library. M4RI is available under the General Public License Version 2 or later (GPLv2+).

Features of the M4RI library include:

Performance-wise it is doing okay but not great. On Intel’s Core2Duo it seems to compare favourably to Magma 2.13. Though, I don’t have access to Magma 2.14 yet which improves dense linear algebra over $\mathbb{F}_2$. However, on AMD’s Opteron it is way behind Magma 2.13. This is possibly due to the 1MB L2 cache of the Opteron vs. 4MB L2 cache of the Core2Duo.

posted at: 12:26 :: permanent link

Fri, 21. Mar 2008

A Cryptographic Tour and Todo List of Sage

Yesterday someone showed up on [sage-devel] and wrote: “I have been developing software and doing research in the areas of: mathematics, cryptography algorithms, encryption, and would like to contribute my time and effort to the Sage project. I would like any of you to get me started in the right direction, any info would be appreciated.”

This is the edited/polished version of my reply. I am posting it here in case anyone else wonders how to contribute to Sage for cryptographic research.

I hope this list isn’t totally useless.

posted at: 14:06 :: permanent link

Sun, 16. Mar 2008

Yet Another Talk on Sage

I gave a brief talk yesterday at the Open Knowledge Conference (OKCon) here in London. The slides were also discussed on [sage-devel] last week. I have to admit that I underappreciated David Joyner’s comments about the expected audience. My impression is that the majority of the audience couldn’t care less about the actual mathematics implemented in Sage. I suppose we still made a good impression but I had to skip most of the examples I care about due to time constraints and preceived lack of interest. After the talks I had some neat discussions with other participants, e.g. Gaël Varoquaux from the MayaVi2 project.

posted at: 18:44 :: permanent link

Thu, 28. Feb 2008

Plotting Timing Experiments

Like any other person I regulary need to run experiments to check how fast or slow a particular algorithm/implementation is for a given problem. The natural choice is to plot the data. This way you at least get some more or less pretty picture out of the tendious experience of having to wait for the experiment to finish. I used to write crappy code to generate these pictures myself and I could not convince myself to remember the appropriate commands for matplotlib and R. Today I sat down and learned the five lines of code necessary to have decent plots for my experiments. I’m putting examples here for no good reason except maybe to show off Sage’s new HNF code which I use as a showcase.

First lets compare how long it takes to compute the Hermite Normal Form for a given random $n \times n$ matrix with (possibly negative) integer entries of size bounded absolute by $2^{16}$.

n = 10
b = 16
st =[]
mt = []
x = [20*i for i in range(n)]
for i in range(n):
  A = random_matrix(ZZ,20*i,20*i, x=-2**b, y=2**b)
  t = cputime()
  E = A.echelon_form()
  st.append(cputime(t))

  AM = A._magma_()
  t = magma.cputime()
  EM = AM.EchelonForm()
  mt.append(magma.cputime(t))

import pylab

pylab.clf() # clear the figure first

pylab.figure(1)

# plot some data and add a legend
pylab.plot(x,st,label="Sage") 
pylab.plot(x,mt,label="Magma") 
pylab.legend() # print the legend

pylab.title("HNF for Random Matrices with $%d$-bit Integer Entries: Sage vs. Magma"%b)
pylab.ylabel("execution time $t$") # label the axes
pylab.xlabel("n for n x n matrix")

pylab.savefig('foo.png',dpi=72) # fire!
sage and magma hnf

Now lets use R to see how the runtimes vary for random $160 \times 160$ matrices with (possible negative) integer entries bounded absolute by $2^{10}$.

b = 10
st = []
for i in range(500):
  A = random_matrix(ZZ,160,160, x=-2**b, y=2**b)
  t = cputime()
  E = A.echelon_form()
  st.append(cputime(t))

from rpy import r

r.png('histogram.png',width=640,height=480)
r.hist(st,r.seq(1.2,3.7,0.02),main="SAGE HNF Histogram",col="lightblue", prob=True, xlab="seconds")
r.lines(r.density(st,bw=0.05),col="black")
r.rug(st)
r.dev_off()
hnf histogram

Neat, isn’t it? Btw. Pygments is also neat, thanks rpw.

posted at: 23:08 :: permanent link

Wed, 20. Feb 2008

Impressions from FSE 2008

impression
If you don’t get it, don’t worry, it is not really funny.

posted at: 14:17 :: permanent link

Thu, 29. Nov 2007

Les Trophées du Libre 2007

Sage is among the finalists of this year’s “free software awards” competition in the science category. The other two finalists in that category are Giac/XCas (slides, session) and Getfem++. I am representing Sage in 25 minutes and I uploaded my slides and the demo worksheet (PDF).

posted at: 11:53 :: permanent link

Wed, 07. Nov 2007

$GF(2^n)$ arithmetic speed

Since version 2.8.10 Sage’s finite extension fields of characteristic 2 and degree $\ge 16$ are implemented via NTL’s GF2E rather than Pari. For some more or less random reason I timed how fast multiplying two random elements is now.

timings plot for GF(2^n)

The red line show the time it takes Magma 2.13-5 to multiply two random elements a million times for a given degree $n$. The green line shows the same calculation using Sage 2.8.12 with the default modulus and a Python loop. The blue line uses a Cython loop (== C loop) and the function good_modulus (see below) to generate a “good” modulus. The default modulus used by Sage is either the conway polynomial or - if we don’t know the conway polynomial - a random irreducible polynomial. I took the idea of using a “good” modulus from Michael Scott’s slides for his talk at the SPEED workshop. My attempt is not as sophisticated as his but naively searches for trinomials and pentanomials with low degree terms.

def good_modulus(n):
  P = GF(2)['x']
  x = P.gen()
  for a in xrange(1,n):
    f = x**n + x**a + 1
    if f.is_irreducible():
      return f
  for N in range(0,n,10):
    for a in xrange(1,N+1):
      for b in xrange(a+1,N+1):
        for c in xrange(b+1,N+1):
          f = x**n + x**c + x**b + x**a + 1
          if f.is_irreducible():
            return f
  # fall back to default if nothing was found
  return GF(2**n,'a').polynomial()

Some comments:

  1. Up until $2^{15}$ we use Zech logarithms as they are implemented in Givaro. Magma uses Zech logarithms up to $2^{20}$ and we should do the same. If we use a Cython loop (i.e. remove the overhead of the loop) Sage’s arithmetic is as fast as Magma’s.
  2. I don’t know why there is that peak around $n=2$ for Magma. Bug? My bad?
  3. Magma scales quite nicely wordwise, as you would expect.
  4. Surprisingly enough we beat Magma starting at $2^{100}$ up until at least $2^{128}$ using the “good” moduli.
  5. What is going on with NTL between $2^{16}$ and $2^{64}$?

It seems we should internally - at least for large degrees - represent elements w.r.t. to a “good” modulus even if we know the conway polynomial.

posted at: 18:36 :: permanent link

Thu, 01. Nov 2007

Yet Another Talk on Sage

May I point the reader’s attention to the slides of my most recent talk about Sage for the ISG Student Seminar.

posted at: 19:40 :: permanent link

Fri, 05. Oct 2007

More Pictures/SAGE Days 5

I’ve uploaded my pictures from the “Tools for Cryptanalysis 2007” and “SAGE Days 5” workshops to flickr. At SD5 I

The SAGE inheritance tree in 3D:
class hierarchy 3d

posted at: 15:53 :: permanent link

Sun, 16. Sep 2007

The Sorry State of Sparse Linear Algebra over Finite Fields

By sparse linear algebra I actually only mean computing the (reduced) row echelon form. Surprisingly, there aren’t much implementations out there, not much I am aware of at least.

The other day I kinda liked the idea to apply William Stein’s integer matrix rational-echelonize-via-solve algorithm to this case. The (adapted) algorithm is as follows (most of it is due to William Stein):

  1. Compute $r = rank(A)$. This is cheaper than Gaussian elimination because we can use “Symbolic Reordering”.
  2. Compute the pivot columns of the transpose $A^t$ of $A$. We can convince “Symbolic Reordering” to give us these as well.
  3. Let $B$ be the submatrix of $A$ consisting of the rows corresponding to the pivot columns found in the previous step. Note that, aside from zero rows at the bottom, $B$ and $A$ have the same reduced row echelon form.
  4. Compute the pivot columns of $B$. Again, we may do this using “Symbolic Reordering”.
  5. Let $C$ be the submatrix of $B$ of pivot columns. Let $D$ be the complementary submatrix of $B$ of all all non-pivot columns. Use a solver (such as Wiedemann) to find the matrix $X$ such that $C X=D$ . I.e., solve a bunch of linear systems of the form $ Cx = v$ , where the columns of $ X$ are the solutions.
  6. Return the matrix $I || X$ where $I$ is the identity matrix of rank $r$.

This algorithm has complexity of two “Symbolic Reordering” applications, and $ncols - r$ applications of a matrix-vector solver. If “Symbolic Reordering” significantly outperforms Gaussian elimination (speed and memory-wise) and if $ncols - r$ is small and the solver fast, this might outperform straight-forward Gaussian elimination. The algorithm in SAGE notation is:

def echelon_form_via_solve(A):
    r = A.rank() # Step 1: Compute the rank

    if r == self.nrows():
        B = A
    else:
        # Steps 2 and 3: Extract out a submatrix of full rank.
        P = A.transpose().pivots()
        B = A.matrix_from_rows(P)

    # Step 4: Now we instead worry about computing the reduced row echelon form of B.
    pivots = B.pivots()

    # Step 5: Apply solver
    C = B.matrix_from_columns(pivots)
    pivots_ = set(pivots)
    non_pivots = [i for i in range(B.ncols()) if not i in pivots_]
    D = B.matrix_from_columns(non_pivots)
    X = C.solve_right(D, algorithm="LinBox:Blackbox")

    R = self.parent()()
    for i in range(len(pivots)): R[i,pivots[i]] = 1
    for i in range(X.nrows()):
        for j in range(X.ncols()):
            R[i,non_pivots[j]] = X[i,j]
    return R

However, as we have to call the solver repeatetly (or find a good matrix-matrix solver) I lost interest in implementing this thing. What is missing, is to hack LinBox to return the pivot columns when performing InPlaceLinearPivoting which is “Symbolic Reordering”. As a side product of my attempts some operations on sparse matrices over GF(p) are way faster now in SAGE (See #655). Also, if you - my dear reader - know about any fast implementation for that problem, please let me know.

posted at: 22:00 :: permanent link

Sun, 06. May 2007

Pretty Pictures

A nifty feature which is going to be in SAGE 2.5 is matrix “visualization”. That means SAGE produces an image for your matrix indicating which parts are sparse and which are dense. Though, this is a trival 20 liner it is a pretty useful tool to discover structure.

As an example look at some matrices occuring during F4 if applied on Cyclic6 over $GF(127)$ or $CTC_{3,3,3}$ as shown below.
F4 matrices

I have uploaded a set of images for all matrices occuring during the F4 computations against that CTC instance to flickr. Also, I have uploaded some photos from the ECRYPT PhD Summerschool 2007 from which I am returning when writing this. Finally, I have uploaded the slides of a short talk about SAGE I gave at that Summerschool.

posted at: 22:13 :: permanent link

Wed, 14. Mar 2007

Westcoast Wrap-Up

In case anybody wonders, this is what I have done during my stay at the US westcoast.

posted at: 09:59 :: permanent link

Wed, 14. Feb 2007

Talks, Talks, Talks

posted at: 20:55 :: permanent link

Thu, 08. Feb 2007

Random Bits and Pieces

posted at: 09:12 :: permanent link

Thu, 14. Dec 2006

SAGE 1.5.0.2 releases

This blog would be even more boring if every SAGE release was properly announced. But this release is a really big one. SAGE 1.5 features a total rewrite of much of the basic arithmetic to make it both faster and better to understand. All matrix classes are now written in SageX/Pyrex/C such that they are much faster now. Givaro’s finite extension fields were included as the default implemention which also means a significant speed improvement. Also, SAGE now has some graph theory support.

SAGE may be downloaded here and tried online here.

posted at: 12:20 :: permanent link

Wed, 11. Oct 2006

SAGE Days 2.2-2.4

SAGE Days 2 are over and it was great fun. Also we have made quite a bit of progress: Addition of integers was used to test-bed performance/architecture improvements and as a result for large integers (i.e., greater than word size) SAGE now seems only 50% slower than MAGMA and for small integer SAGE integers are twice or more as slow (but there is room for improvements). Python integers (up to word size) however are way faster than MAGMA integer if we didn’t get all the benchmarks wrong. An alpha interface to Axiom was also written and some solution for an implementation problem with p-adic numbers has been found which I know nothing about, the SAGE notebook will not be a spammers/cross-site-scripters heaven anymore in the near future, and tons of other stuff got wrapped/implemented/discussed. Slides/mp3s should be/appear at sage.math.

posted at: 08:09 :: permanent link

Sun, 08. Oct 2006

SAGE Days 2.1

There have been lots of talks including both of mine. J. Voight btw. told me (and the audience) that he sat down with Allan Steel and asked him about all his secrets about making his F4 implementation such fricking fast. Allan Steel responded that his implementation was very slow at first and that all his speed-ups where due to implementation tricks and not mathematical ideas. So there might be hope for my implementation.

Elsewhere: We are putting together a comprehensive Wiki page on making Pyrex code fast which might be interesting to the general Pyrex audience and not only SAGE developers. We are focusing on SAGE, though.

posted at: 18:39 :: permanent link

Valid XHTML 1.0 Strict Valid CSS! blosxom