M4RI-20091101 Released
I just tagged the new release. Get it at http://m4ri.sagemath.org or from http://bitbucket.org/m4ri/malb. We did not change that much but this release finally has our improvements from Sage Days 16 where we improved matrix elimination quite a bit.
While there is still some work to do (see the bump in the plots above), this release might be a first candidate where it makes sense to switch to LQUP/PLUQ by default for matrix elimination (e.g. in PolyBoRi).
Fri, 09. Jan 2009“What Every Programmer Should Know About Memory”
is the title of an interesting paper by Ulrich Drepper of Rad Hat, Inc. The paper is — as you might have guessed — about the memory architecture in modern CPUs and what you — dear programmer — can do to improve performance of your code based on this knowledge. The paper has a section on tools for performance tuning which also discusses OProfile. OProfile is an interface to the CPU’s event counters (which vary across platforms). These counters can be queried to investigate the behaviour of a given piece of code on a given CPU. Intel’s Intel 64 and IA-32 Architectures Optimization Reference Manual suggests a list of event counters to look at. For instance the Intel engineers write:
“Clocks Per Instruction Retired Ratio (CPI): CPU_CLK_UNHALTED.CORE / INST_RETIRED.ANY. The Intel Core microarchitecture is capable of reaching CPI as low as 0.25 in ideal situations. But most of the code has higher CPI The greater value of CPI for a given workload indicate it has more opportunity for code tuning to improve performance. The CPI is an overall metric, it does not provide specificity of what microarchitectural sub-system may be contributing to a high CPI value.”
To take it for a spin I chose the obvious target: the libM4RI library. Below is the CPI for bench_elimination 20000 20000 pluq.
== UNIFORMLY RANDOM ==
symbol CPI % of overall time
------------------------------------------------------
_mzd_mul_m4rm 0.94 (49.68)
mzd_process_rows 1.76 (13.68)
.plt 0.62 (10.47)
mzd_process_rows2_pluq 1.88 (10.07)
mzd_make_table 0.76 ( 8.31)
mzd_copy 1.85 ( 1.80)
_mzd_mul_naive 1.10 ( 1.38)
mzd_apply_p_right_trans 0.99 ( 1.14)
_mzd_trsm_lower_left_even 0.75 ( 0.94)
mzd_apply_p_right 0.81 ( 0.93)
mzd_combine 3.17 ( 0.61)
Note that mzd_process_rows() and mzd_process_rows2_pluq() (which are both called from the PLUQ MMPF base case) have a rather high CPI compared to the rest of the code, indicating that I should sit down and implement using more than two tables (multiplication uses eight tables). I don’t understand the high CPI for mzd_copy() though. Half rank matrices behave quite differently, but surprisingly the CPI for mzd_apply_p_right_trans() is rather low.
== HALF RANK ==
symbol CPI % of overall time
------------------------------------------------------
_mzd_mul_m4rm 0.93 (40.01)
mzd_apply_p_right_trans 0.79 (10.05)
mzd_apply_p_right 0.78 ( 7.44)
mzd_make_table 0.75 ( 7.26)
mzd_find_pivot 1.44 ( 6.00)
.plt 0.68 ( 5.64)
_mzd_pluq_mmpf 1.58 ( 4.58)
_mzd_pluq_submatrix 0.90 ( 3.88)
mzd_col_swap 1.20 ( 3.58)
mzd_process_rows 1.76 ( 2.85)
mzd_row_add_offset 1.61 ( 2.24)
mzd_process_rows2_pluq 1.88 ( 2.10)
_mzd_mul_naive 1.19 ( 1.49)
mzd_copy 2.17 ( 1.32)
mzd_combine 3.31 ( 0.89)
_mzd_trsm_lower_left_even 0.75 ( 0.25)
_mzd_trsm_upper_left_even 0.67 ( 0.13)
mzd_init_window 1.24 ( 0.07)
My preliminary understanding of those high .plt values is that compiling the benchmarketing software with -static would improve the timings (or that I should inline more?).
Sat, 03. Jan 2009New M4RI Release Coming Up
I will probably cut a new M4RI release within a week or so. The main changes are:
- Asymptotically Fast PLUQ Factorisation [mzd_pluq() and _mzd_pluq_mmpf()] due to Clément Pernet and myself: This enables asymptotically fast rank computation, row echelon forms and rank profiles. However, don’t hold your breath yet because the code is not really optimised yet. While it is faster than M4RI for random dense matrices it is slower for sparse-ish and structured matrices (see image below).
- System Solving [mzd_solve_left()]: Jean-Guillaume Dumas wrote a high-level wrapper around the PLUQ and TRSM routines to provide linear system solving.
- M4RI Performance Improvement [mzd_echelonize_m4ri()]: A bug was fixed in M4RI which resulted in poor performance of M4RI for sparse-ish matrices (see my blog post for details).
- Refactoring: A few functions where added, deleted and renamed. This release will break source compatibility.
On a related note: Marco Bodrato came up with a new Strassen-like sequence for multiplying and squaring matrices which provides a small (linear) speed-up for squaring. He also provides a drop-in strassen.c replacement for M4RI-20080521 which implements his new sequence.
Sat, 13. Dec 2008Bit Bucket
About three weeks ago I discovered Bitbucket: a web service (with free and paid-for plans) for hosting Mercurial repositories. So far it has everything I would want:
- a free plan with 150MB of disk space, one private repository and as many public repositories as you like;
- the offer to host your open-source project even if it is bigger than 150MB;
- optional issue trackers and wikis (which are also under revision control) for each repository;
- convenient online source code browsing, viewing/comparison of changesets, downloads;
- push/pull via SSH (with public keys) and HTTPS;
- straight-forward management of read/write access control for each repository;
- and all kinds of third party service integrations (twittering your changesets and such).
I’m now hosting the main M4RI repository on bitbucket and so far it is a very smooth experience. Speaking of M4RI, it now contains an implementation of PLUQ factorisation inspired by Greg’s M4RI algorithm dubbed “Method of Many People Factorisation” in brilliantrussian.c. While this implementation seems to do its job, we still have iron out a couple of bugs in the recursive PLUQ factorisation codebase.
Finally, some random code snippets I posted on this blog are now also available on bitbucket (e.g., F4, F5, Matrix F5, DES equation system generators, Present equation system generators, an ANF to CNF converter).
Mon, 09. Apr 2007SuperKarambaR
Again, you have to write some code during EasterHegg. As I was way too lazy to do actual work I geeked out and played with Superkaramba - the widget engine for the KDE desktop. So I hacked together two widgets to display flickr images on your desktop. Karambrrr displays one image at a time and PhotoStackr displays - you guess it - a stack of images.
Btw. when I say “hacked together” I mean it, so things may go wrong for you … and, yes, I am aware that these names are ridicilous.
Tue, 21. Nov 2006Memory Profiling
One thing SAGE is lacking is the ability to profile memory usage of a calculation. There are some tools available for Python allowing to inspect the used memory at a given point in time, however nothing seems to exist to draw pretty pictures of mem usage during the calculation. A workaround is to use a shell script sampling the SAGE process’ RAM usage every couple of seconds and write that information to some file. Such a script was posted to this forum and after some slight patching + wrapping. the memory usage of my F4 implementation computing a lexicographical Gröbner basis for a CTC ideal with B=4 and Nr=3 looks like this.
The graph doesn’t say much - because the Python garbage collector might interfear - besides that the memory consumption is 3.5 GB max for this Gröbner basis calculation. Still, I guess 2.x GB after the calculation finished points to a memory leak somewhere.
Elsewhere: Looks like Uni-Bremen is benchmarker’s heaven.
Mon, 23. Oct 2006MAGMA Online Calculator
In case you haven’t noticed there is a free MAGMA Online Calculator (which is based on code by SAGE’s William Stein btw.). Also in case you haven’t noticed Python has pretty neat http support built-in. So some lines like
import urllib cmd = urllib.quote(cmd) url = "http://magma.maths.usyd.edu.au/calc/?input=%s"%cmd fp = urllib.urlopen(url) answer = fp.read()
allow you to use that online calculator in your Python program if you are willing to accept the horrible lag and the 20s timeout. If you put a couple of pre-/post-processing lines around it you end up with something somehow usable. Of course, you could also use it as a fall back to interface (a stripped down) MAGMA from SAGE if there is no local MAGMA installation available. But that would probably only cause bad vibes with the MAGMA people.
Tue, 17. Oct 2006The Need for Speed
First, I am aware that the current version of MAGMA is v2.13-5 and that MAGMA’s F4 implementation was heavily optimized in version 2.11-8 as e.g. shown here. That said, I had the chance to play with MAGMA v2.11-2 a bit and was actually a bit surprised how well the now truly open-source Singular 3-0-2 performed compared to it.
# construct a random CTC ideal with blocksize=10 and number of round = 1
sage: F,s = ctc_random_MQ(Nr=1,B=10,order=”degrevlex”)
sage: time _ = F.ideal().groebner_basis(“magma:GroebnerBasis”)
CPU times: user 2.60 s, sys: 0.17 s, total: 2.77 s
Wall time: 124.8 # MAGMA’s GroebnerBasis
sage: time _ = F.ideal().groebner_basis()
CPU times: user 2.36 s, sys: 0.12 s, total: 2.49 s
Wall time: 42.01 # Singular’s groebner
Again, I know that the most recent version of MAGMA is much faster than the most recent version of Singular when it comes to Gröbner bases and I know that Allan Steel did an amazing job with his F4 implementation. However, the benchmarks shown above indicate that the open-source world is not as far off as I used to believe. Feel free to criticize my benchmarks, if they are wrong, I’ll correct them as soon as possible.
Elsewhere: You might be interested in this document which lists tips and tricks how to make your Pyrex code faster and should be of interest beyond SAGE. Also my quick comparison of malloc replacements could be of interest outside of SAGE - now that omalloc is GPL’d.
Thu, 14. Sep 2006SAGE Days 2
are coming up and I’m going to be there. I’ll give a talk on Gröbner bases in SAGE and two smaller, informal talks: One on Pyrex and one on Algebraic Attacks on Block Ciphers. Spoiler-warning: These links point to my beta or alpha quality slides.
Speaking of slides: I adapted the nice Oxygen Beamer theme to sport some SAGE references:
You may find the respective sources here, here, and here.
Mon, 29. May 2006sage.math.washington.edu
“This is computer for very open collaboration among SAGE developers and testing of intense SAGE calculations. It is a special-purpose 64-bit computer built by Western Scientific that has 64GB of RAM and 16 AMD Opteron cores.” (sage.math.washington.edu) … and I have shell access. You may browse everybody’s home directory which of course includes mine.
Wed, 26. Apr 2006SAGE AJAX Calculator
William Stein released a new AJAX based SAGE calculator for those desiring to try out SAGE without installing any extra software. It is currently a bit slow as every character does a complete round trip between the server and the client before being displayed.
Elsewhere: Check this neat comparison of computer algebra systems.
Fri, 14. Apr 2006Native Compiler Toolchain on OpenZaurus
These are some tips and tricks on how to setup a native compiler toolchain right on your Zaurus (SL-5500) running the OpenZaurus distribution. I found it to be a painful process and will try to ease the pain for you a bit. All this applies to OpenZaurus 3.5.3., the situation might be better with 3.5.4.
Some packages are available via the OpenZaurus feed for your platform and the quickest way to add the missing ones is to use the Debian ARM stable repository. You can install any debs like you would do with ipkgs, however some postinstall scripts might fail: Ignore or remove them.
- My host system is Debian Etch x86.
- I used the “free-trial” VirtualMhz emulator for this which is a very nice but proprietarian product by the way. You will need to provide them with an e-mail address and you’ll need libwx_gtk-2.4.so.0 which you’ll need to compiler yourself if not running Sarge and place in e.g. /usr/local/lib
- I recommend a really big CompactFlash card or NFS mount to hold all the data, let’s call it’s mount point /media/nfs.
- I recommend you move all the /var /tmp and /usr/lib/ipkg to /media/nfs and setup the correct symlinks.
- Packages found in the Debian ARM distribution: perl-base, linux-kernel-headers, and m4.
- Packages found in the OpenZaurus feed: g++, cpp, gcc, binutils, libc6-dev, libstd++-dev, and libstdc++6, so: ipkg install g++ cpp gcc binutils libc6-dev libstd++-dev libstdc++6.
- all the “gcc”, “ld” etc. binaries have names like “arm-linux-gcc” so you need to set some symbolic links so e.g. “gcc” can be found.
- Add /media/nfs/usr/bin to your $PATH.
- Add /media/nfs/usr/lib to /etc/ld.so.conf and call ldconfig afterwards.
- /usr/lib/libc.sois broken, comment out line 4 (the one with “BUG” in it), seriously.
- libstdc++.la is broken as well, move it out of the way.
- You might have to add some symlinks to some libs which I cannot remember, so do so when the compiler complaints.
- ar as provided by BusyBox might not work as expected by some ./configure scripts so make sure to correct the symlink to the ar implementation provided by binutils.
- I surly missed something important and your system will break, good luck!
Sun, 09. Apr 2006
GnuPG 1.4.2.2 and 1.4.3 for OSX
As the MacGPG project does not offer binaries for the critical GnuPG security update (1.4.2.2) but only instructions how to build the new package I offer binaries of GnuPG version 1.4.2.2. (critical bugfix) and the new version 1.4.3. I couldn’t believe it myself but there seem to exist Mac users without a working C compiler installed. So here is what you get:
- GnuPG 1.4.2.2 Tar.gz (sig) archive containing the binary which may be copied over the existing one. It was built by Till on his G4 PowerBook running Mac OSX 10.4.6 and has IDEA included.
- GnuPG 1.4.3 Tar.gz (sig) archive containing the binary which may be copied over the existing one. It was built by me on BigMac which I believe is running Mac OSX 10.4.5.
- GnuPG 1.4.3. Pkg (sig) which may be installed as Mac users are used to. It was built by Till on his G4 PowerBook running Mac OSX 10.4.6
Please get the signature (.asc) as well and verify (as the 1.4.2.2 fixes a critical bug which makes this check useless you actually cannot verify the signature) the signature by typing:
gpg --verify 'TAR.GZ or PKG file'
Please note that these signatures only state that I had possession of these files. So you see the same files as I did when uploading them. Neither do I know what Till did to his binaries nor do I fully control BigMac to rule out any attack.
Finally there is no reason to trust me so build it yourself.
Final words: Backup everything before using these binaries as I do not know if they work at all and please report any bugs you stumble across back to me. Don’t blame me if your computer explodes.
Fri, 07. Apr 2006Howto use your favorite Python debugger with SAGE
Even though SAGE or iPython has a built in command line debugger here is how to use DDD or Eric3 with pieces of SAGE written in Python or scripts written in SAGE:
DDD
Use a shell script like this or type into a shell:
#!/bin/bash # change this to your SAGE_ROOT export SAGE_ROOT="Your SAGE ROOT here" # setup environment source $SAGE_ROOT/local/bin/sage-env # make sure to add pydb the enhanced Pyton Debugger export PYTHONPATH=$PYTHONPATH:/usr/share/python-support/pydb # and run DDD ddd --debugger "python /usr/bin/pydb"
Now you should be able to open a SAGE source file (use “Open Source” for that), press “Run” and “Continue” and your SAGE script should be running. I checked with this a tiny example like this:
from sage.all import * r=MPolynomialRing(GF(127),2,'x') r.gens()
Breakpoints, displays (only string representation) and backtraces worked.
Eric3
You will need working sage-pyton and sage scripts somewhere in your $PATH for this one. Write a shell script like this:
#!/bin/bash
export SAGE_ROOT=/opt/sage/
source $SAGE_ROOT/local/bin/sage-env
sage-python ${*}
Configure Eric3 to use this script as a special Python interpreter. Make sure to choose “without Qt” as the “Debug Client” and it should work. Additionally you may want to generate an .api file for Eric3 by calling the gen_python_api.py script found at the Eric3 website with the appropriate environment variables set up (use sage-env for this). Or use this one for SAGE v1.1.2.
Sat, 01. Apr 2006Current state of the kopete-silc-plugin
There have been some nice bugfixes lately as stesie and I picked up the kopete-silc-plugin again.
- It builds correctly with newer kopete versions.
- WHOIS and WATCH work much better.
- The online status plugin does not kill the plugin anymore.
- You may add contacts via their nicknames.
- Depends more on fingerprints than on nicknames for identification.
- Some WHOIS information is displayed int the user info dialog.
At least my planned features are:
- MIME Support,
- channel configuration if you are the operator,
- more verbatim WHOIS information in user info dialog,
- NAT detection, and
- zero SEGFAULTs.
If something is missing, let me know. Use it and report bugs.
Tue, 24. Jan 2006Software for Algebra and Geometry Experimentation
Speaking of SAGE: It’s a pretty cool open-source computer algebra system.
Instead of being written from scratch it uses many existing special purpose systems (Singular, GAP, PARI, NTL) to
get to the juice. But I guess the best thing is that it is written in Python, i.e. it is using a real object-oriented programming language with all bells and whistles. No for loops starting with 1 but an object-oriented interface to mathematics. Download it, read the reference manual, or try it online.
Python never, ever frees memory
Just found out that Python doesn’t release any memory it has
allocated, but keeps it for later use. It is designed to do so. So if
you - let’s say - use SAGE to construct
a MPolynomialRing with 20000 variables for some reason and you’re happy enough to have 2GB of RAM to support that, you’ll have a sage.bin process hanging around never releasing those 2GB until you kill it. See Evan Jones’ website for some more details.
kopete_silc Development
As reported elsewhere I’ve joined stesie in the kopete_silc team. We are developing a Kopete plugin for the beautiful SILC protocol. SILC is basically a chat protocol with heaps of good crypto poured in so encrypted group chats, file transfers and so on are possible. The kopete_silc plugin so far supports (group) chat, basic file transfer, some channel management, and some public key handling: It is usable even though there are SEGFAULTs. You might want to report some unknown on the mailinglist.
Stesie provides public WebCVS access and some Debian packages for Sid. I’m going to provide packages for Debian Sarge.
Wed, 04. Jan 2006ValueKonverter 0.1.3 released
Now the Widget actually scales when the size of the application is changed. It took me a while to figure out how to do it. It can be found at the usual places.
Wed, 14. Dec 2005ValueKonverter 0.1.2 releases
Again two bugs have be squashed:
- layout of main window should display correctly for more people now
- 0x200 doesn’t get cut off in 160-bit hex field anymore
Grab sources here and Debian packages here (sarge) and here (sarge + KDE 3.4).
If you’re running any other Debian distribution a apt-get build-dep valuekonverter followed by a apt-get -b source valuekonverter should do the trick.
Wed, 19. Oct 2005ValueKonverter 0.1.1 released
Two bugs have been removed:
- integer focus problem is no more
- main window is resizeable now
The libdisasm bug has not yet been solved but the debian package features some accurate build dependencies and conflicts now. Grab the source here and debian packages either in the Sarge repository or the custom one.
Mon, 10. Oct 2005ValueKonverter released
Even though there are binary and source packages for the ValueKonverter available, there are some issues left to be solved
- libdisasm doesn’t set the access bits correct
- integer input has some focus problems


