Not really blogging

A. Makelov

Category Archives: Google Summer of Code 2012

Google Summer of Code 2012: Week 13

Hi all, here’s a brief summary of my 13th (and last) week of GSoC.

  • I continued my work on centralizers and on improving the normal closure, derived series, lower central series, etc. My most recent pull request containing these additions just got merged and can be found here. This week I spent a lot of time writing better tests and developing some new test practices. The group-theoretical algorithms in the combinatorics module are getting more and more complicated, so better, cleverer and more thorough tests are needed. I came up with the following model for verification:
    – since the results of the tests are very hard to compute by hand, some helper functions are needed that find the wanted object in a brute-force manner using only definitions. For example, we often look for a subgroup with certain properties. The most naive and robust approach to this is to:
    – list all group elements, go over the list and check each element for the given property.
    – Then, make a list of all the “good” elements and compare it (as a set) with the list of all elements of the group the function being tested returns.
    Hence, a new file was created, sympy/combinatorics/testutil.py, that will host such functions. (Needless to say, they are exponential in complexity, and for example going over all the elements of SymmetricGroup(n) becomes infeasible for n larger than 10.)
    – The presence of functions being used to test other functions gets us in a bit of a Quis custodiet ipsos custodes? situation, but this is not fatal: the functions in testutil.py are extremely straightforward compared to the functions in perm_groups.py that they test, and it’s really obvious what they’re doing, so it’ll take fewer tests to verify them.
    – In the tests for the new functions from perm_groups.py, I introduced some comments to indicate what (and why) I’m testing. Another practice that seems to be good is to verify the algorithms for small groups (degrees 1, 2, 3) since there are a lot of corner cases there that seem to break them.
  • I started work on improving the disjoint cycle notation, namely excluding singleton cycles from the cyclic form; however, there are other changes to handling permutations that are waiting to be merged in the combinatorics module here, so I guess I’ll first discuss my changes with Christopher. Currently, I see the following two possibilities for handling the singleton cycles:
    – add a _size attribute to the Permutation class, and then, when faced with something like Permutation([[2, 3], [4, 5, 6], [8]]), find the maximum index appearing in the permutation (here it’s 8) and assign the size of the permutation to that + 1. Then it remains to adjust some of the other methods in the class (after I adjusted mul so that it treats permutations of different sizes as if they leave all points outside their domain fixed, all the tests passed) so that they make sense with that new approach to cyclic forms.
    – more ambitious: make a new class, ExtendedArrayForm or something, with a field _array_form that holds the usual array form of a permutation. Then we overload the __getitem__ method so that if the index is outside the bounds of self._array_form we return the index unchanged. Of course, we’ll have to overload other things, like the __len__ and __str__ to make it behave like a list. Then instead of using a list to initialize the array form of a permutation, we use the corresponding ExtendedArrayForm. This will make all permutations behave as if they are acting on a practically infinite domain, and if we do it that way, we won’t have to make any changes to the methods in Permutation – everything is going to work as expected, no casework like if len(a) > len(b),... will be needed. So this sounds like a rather elegant approach. On the other hand, I’m not entirely sure if it is possible to make it completely like a list, and also it doesn’t seem like a very performance-efficient decision since ExtendedArrayForm instances will be created all the time. (see the discussion here).
  • Still nothing on a database of groups. I looked around the web for a while but didn’t find any resources… the search continues. Perhaps I should ask someone more knowledgeable.
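To make the second (more ambitious) option for singleton cycles a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: the helper names from_cycles and compose are made up, nothing here is the Permutation API in sympy, and the composition convention (apply a first, then b) is chosen just for the example.

```python
class ExtendedArrayForm:
    """Array form of a permutation that fixes every point it does not store."""

    def __init__(self, images):
        self._array_form = list(images)

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("points are non-negative integers")
        if index < len(self._array_form):
            return self._array_form[index]
        return index  # outside the stored range: the point is fixed

    def __len__(self):
        return len(self._array_form)

    def __iter__(self):
        # Needed: without it, Python would iterate by calling __getitem__
        # with 0, 1, 2, ... until IndexError, which never comes here.
        return iter(self._array_form)

    def __repr__(self):
        return "ExtendedArrayForm(%r)" % (self._array_form,)


def from_cycles(cycles):
    """Hypothetical helper: cyclic form -> array form, size = largest point + 1."""
    size = max(max(cycle) for cycle in cycles) + 1
    images = list(range(size))
    for cycle in cycles:
        for position, point in enumerate(cycle):
            images[point] = cycle[(position + 1) % len(cycle)]
    return ExtendedArrayForm(images)


def compose(a, b):
    """Apply a first, then b; no len(a) > len(b) casework is needed."""
    size = max(len(a), len(b))
    return ExtendedArrayForm(b[a[i]] for i in range(size))


p = from_cycles([[2, 3], [4, 5, 6], [8]])   # size 9 because of the singleton [8]
q = ExtendedArrayForm([1, 0])               # a permutation of size 2
print(len(p), p[100], list(compose(p, q)))
```

The __iter__ method is one example of the "make it completely like a list" worry: a __getitem__ that never raises IndexError would otherwise make a plain for loop run forever.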

That’s it for now, and that’s the end of my series of blog posts for the GSoC, but I don’t really feel that something has ended since it seems that my contributions to the combinatorics module will continue (albeit not that regularly : ) ). After all, it’s a lot of fun, and there are a lot more things to be implemented/fixed there! So, a big “Thank you” to everyone who helped me get through (and to) GSoC, it’s been a pleasure and I learned a lot. Goodbye!

 


Google Summer of Code 2012: Week 12

Hi all, here’s a brief summary of the 12th week of my GSoC:

  • Centralizers got some more attention since there were several bugs in the implementation from last week; this also exposed a bug in .subgroup_search() as it is on sympy/master right now. Fortunately, I located it and fixed it earlier today, so the fix for .subgroup_search() will be contained in my next pull request. In fact, it is just three more lines that should be added. Namely,
    # line 29: set the next element from the current branch and update
    # accordingly
    c[l] += 1
    element = ~(computed_words[l - 1])
    

    should be replaced with

    # line 29: set the next element from the current branch and update
    # accordingly
    c[l] += 1
    if l == 0:
        element = identity
    else:
        element = ~(computed_words[l - 1])
    

    since we might be at the bottom level with l=0. In this case, Python doesn’t yell at you for looking up computed_words[-1], since negative indices wrap around the list in Python. Yet another silly mistake that’s incredibly hard to track down! I hope that it will work properly from now on, and I’ll have to add some more tests for it.

  • The description of the algorithm for finding the center in polynomial time given in [1] didn’t really make sense to me, so instead a straightforward one,
    def center(self):
        return self.centralizer(self)
    

    was used. This can be updated later when I (or someone else) figures out the polynomial-time algorithm.

  • A new, faster algorithm for finding normal closures: this one uses the incremental version of Schreier-Sims, and some randomization. It’s described in [1].
  • Some applications of normal closure: the derived series, lower central series, the commutator of two subgroups of a group, nilpotency testing. Now we have things like this:
    In [68]: from sympy.combinatorics.named_groups import *
    In [69]: S = SymmetricGroup(4)
    In [70]: ds = S.derived_series()
    In [71]: len(ds)
    Out[71]: 4
    In [72]: ds[1] == AlternatingGroup(4)
    Out[72]: True
    In [73]: ds[2] == DihedralGroup(2)
    Out[73]: True
    In [74]: ds[3] == PermutationGroup([Permutation([0, 1, 2, 3])])
    Out[74]: True
    

    demonstrating the well-known normal series of groups e < K_4 < A_4 < S_4 that shows the symmetric group on 4 letters is solvable. Note that the normal closure algorithm was already there thanks to the work of Mario; I just improved it a bit and added some applications.

  • Moved DirectProduct() to a new file, group_constructs.py, that is planned to hold functions that treat several groups equally (for one other example, the commutator of two groups in the full symmetric group) rather than treating them in some sort of subgroup-supergroup relationship (such as .centralizer()).
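The computed_words[l - 1] pitfall from the first bullet is easy to reproduce in isolation. The sketch below uses strings as stand-ins for the stored permutations and leaves out the inversion (~) from the real code; the function names are made up for the illustration.

```python
computed_words = ["w0", "w1", "w2"]  # stand-ins for the stored permutations
identity = "e"


def previous_word_buggy(level):
    # At level 0 this silently returns computed_words[-1], the LAST entry,
    # instead of raising an error.
    return computed_words[level - 1]


def previous_word_fixed(level):
    # Same guard as in the fix: nothing has been computed below level 0.
    if level == 0:
        return identity
    return computed_words[level - 1]


print(previous_word_buggy(0), previous_word_fixed(0))
```

No exception is raised in the buggy version, which is exactly why this kind of mistake is hard to track down.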

I wrote docstrings for the new stuff, and my current work can be found on my week10 branch. There will be some comprehensive tests following the new additions (and I’ll need GAP to verify the results of some of them, probably). It seems that Todd-Coxeter won’t happen during GSoC since there’s just one more week; instead, I plan to focus on improving disjoint cycle notation and group databases.
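For small groups, results like the derived series of S_4 can also be cross-checked without GAP, straight from the definitions. Here is a rough brute-force sketch in plain Python (permutations are tuples, p[i] is the image of i; all function names are my own and this is not how the sympy code works). It is only feasible for tiny degrees.

```python
from itertools import permutations


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[i] for i in p)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def closure(generators, degree):
    """All elements generated by `generators` (finite, so no inverses needed)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for gen in generators:
            new = compose(current, gen)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return elements


def derived_subgroup(elements, degree):
    """Subgroup generated by all commutators a^-1 b^-1 a b."""
    commutators = {
        compose(compose(inverse(a), inverse(b)), compose(a, b))
        for a in elements
        for b in elements
    }
    return closure(commutators, degree)


def derived_series_orders(elements, degree):
    """Orders of the groups in the derived series, until it stabilizes."""
    orders = [len(elements)]
    while True:
        nxt = derived_subgroup(elements, degree)
        if len(nxt) == len(elements):
            return orders
        orders.append(len(nxt))
        elements = nxt


S4 = set(permutations(range(4)))
print(derived_series_orders(S4, 4))
```

For S_4 the orders come out as 24, 12, 4, 1, in agreement with the series S_4 > A_4 > K_4 > e shown in the session above.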

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

Google Summer of Code 2012: Week 11

Hi all, here’s a brief summary of the 11th week of my GSoC.

  • Yay! Subgroup searching now works with the use of .stabilizer(), as I discussed in my previous blog post. Surprisingly, the running time is similar to that of the flawed version using .baseswap() (whenever the one using .baseswap() works). You can play around with the two versions on my week6 (has a bug, using .baseswap()) and week9 (seems to work, using .stabilizer()) branches.
  • Consequently, I made a new pull request containing the incremental version of Schreier-Sims, the remove_gens utility for getting rid of redundant generators in a strong generating set, and the new (working) subgroup_search algorithm. You’re most welcome to help with the review!
  • I worked on several applications of subgroup_search() and the incremental Schreier-Sims algorithm. Namely, the pointwise stabilizer of a set of points (via the incremental Schreier-Sims algorithm):
In [4]: from sympy.combinatorics.named_groups import *
In [5]: A = AlternatingGroup(9)
In [6]: G = A.pointwise_stabilizer([2, 3, 5])
In [7]: G == A.stabilizer(2).stabilizer(3).stabilizer(5)
Out[7]: True

(this is much faster than the naive implementation using .stabilizer() repeatedly), and the centralizer of a group H inside a group G:

In [11]: from sympy.combinatorics.named_groups import *
In [12]: S = SymmetricGroup(6)
In [13]: A = AlternatingGroup(6)
In [14]: C = CyclicGroup(6)
In [15]: S_els = list(S.generate())
In [16]: G = S.centralizer(A)
In [17]: G.order()
Out[17]: 1
In [18]: temp = [[el*gen for gen in A.generators] == [gen*el for gen in A.generators] for el in S_els]
In [19]: temp.count(False)
Out[19]: 719
In [20]: temp.count(True)
Out[20]: 1
In [21]: G = S.centralizer(C)
In [22]: G == C
Out[22]: True
In [23]: temp = [[el*gen for gen in C.generators] == [gen*el for gen in C.generators] for el in S_els]
In [24]: temp.count(True)
Out[24]: 6

(it takes some effort to see that these calculations indeed prove that .centralizer() returned the needed centralizer). The centralizer algorithm uses a pruning criterion described in [1], and even though it’s exponential in complexity, it’s fast for practical purposes. Both of the above functions are available (albeit not documented yet) on my week10 branch.

  • The next steps are an algorithm for the centre in polynomial time, and an algorithm to find the intersection of two subgroups! And after that, I hope to be able to implement the Todd-Coxeter algorithm…
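The hand-made checks in the centralizer session above (comparing el*gen with gen*el for every element) can be packaged into a small definition-only helper. This is a sketch in plain Python with permutations as tuples; the names are hypothetical and it is not the sympy implementation, just the exponential-time reference one would compare against.

```python
from itertools import permutations


def compose(p, q):
    """Apply p first, then q (p[i] is the image of i)."""
    return tuple(q[i] for i in p)


def from_cycle(cycle, degree):
    """Array form (as a tuple) of a single cycle acting on range(degree)."""
    images = list(range(degree))
    for position, point in enumerate(cycle):
        images[point] = cycle[(position + 1) % len(cycle)]
    return tuple(images)


def centralizer_brute_force(group_elements, generators):
    """Elements commuting with every generator, straight from the definition."""
    return {
        g for g in group_elements
        if all(compose(g, h) == compose(h, g) for h in generators)
    }


degree = 6
S6 = set(permutations(range(degree)))
# A 6-cycle generates a cyclic group of order 6.
six_cycle = from_cycle([0, 1, 2, 3, 4, 5], degree)
# The 3-cycles (0 1 i) generate the alternating group on 6 points.
three_cycles = [from_cycle([0, 1, i], degree) for i in range(2, degree)]

print(len(centralizer_brute_force(S6, [six_cycle])))
print(len(centralizer_brute_force(S6, three_cycles)))
```

It is enough to test commutation with the generators of H, since anything commuting with the generators commutes with all of H.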

That’s it for now!

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

Google Summer of Code 2012: Week 10

Hi all,

here’s a brief summary of what I’ve been doing during the 10th week of my GSoC.

  • Though I fixed a bug in the SUBGROUPSEARCH function during the week, I then ran some more comprehensive tests as I had planned to, and some of them broke the function. If you’re particularly interested, something like this reproduces the failure:
    In [87]: S = SymmetricGroup(5)
    In [88]: prop_fix_3 = lambda x: x(3) == 3
    In [89]: %autoreload
    In [90]: S.subgroup_search(prop_fix_3)
    ---------------------------------------------------------------------------
    StopIteration                             Traceback (most recent call last)
    <ipython-input-90-6b85aa1285b8> in <module>()
    ----> 1 S.subgroup_search(prop_fix_3)
    
    /home/alexander/workspace/sympy/sympy/combinatorics/perm_groups.py in subgroup_search(self, prop, base, strong_gens, tests, init_subgroup)
    2660
    2661                 # this function maintains a partial BSGS structure up to position l
    -> 2662                 _insert_point_in_base(res, res_base, res_strong_gens, l, new_point, distr_gens=res_distr_gens, basic_orbits=res_basic_orbits, transversals=res_transversals)
    2663                 # find the l+1-th basic stabilizer
    2664                 new_stab = PermutationGroup(res_distr_gens[l + 1])
    
    /home/alexander/workspace/sympy/sympy/combinatorics/util.py in _insert_point_in_base(group, base, strong_gens, pos, point, distr_gens, basic_orbits, transversals)
    423     # baseswap with the partial BSGS structures. Notice that we need only
    424     # the orbit and transversal of the new point under the last stabilizer
    --> 425     new_base, new_strong_gens = group.baseswap(partial_base, strong_gens, pos, randomized=False, transversals=partial_transversals, basic_orbits=partial_basic_orbits, distr_gens=partial_distr_gens)
    426     # amend the basic orbits and transversals
    427     stab_pos = PermutationGroup(distr_gens[pos])
    
    /home/alexander/workspace/sympy/sympy/combinatorics/perm_groups.py in baseswap(self, base, strong_gens, pos, randomized, transversals, basic_orbits, distr_gens)
    2472             # ruling out member of the basic orbit of base[pos] along the way
    2473             while len(current_group.orbit(base[pos])) != size:
    -> 2474                 gamma = iter(Gamma).next()
    2475                 x = transversals[pos][gamma]
    2476                 x_inverse = ~x
    
    StopIteration:
    
    

    The reason is certainly the change of base performed on line 11 in the pseudocode (this is also indicated in my code on my local week6 branch here). The use of the function BASESWAP there is what gets us into trouble. It is meant to be applied to a base and a strong generating set relative to it, switch two consecutive base points and change the generating set accordingly. However, in subgroup_search the goal is to change a base (b_1, b_2, \ldots, b_l, \ldots, b_k) to (b_1, b_2, \ldots, b_l', \ldots, b_k) where b_l' is a new point. The book ([1]) mentions that this is done by using BASESWAP but doesn’t provide any details. My strategy is the following: I cut the base so that it becomes (b_1, b_2,\ldots, b_l) and cut the corresponding data structures – the strong generators strong_gens, the basic_orbits, the transversals, and the strong generators distributed according to membership in basic stabilizers distr_gens (I know, I still have to rename this to strong_gens_distr). Then I append the point b_l' so that the base is (b_1, b_2, \ldots, b_l, b_l') and calculate an orbit and transversal for b_l' under the stabilizer of b_1, b_2, \ldots, b_l. Finally I apply BASESWAP to this new base in order to switch the two rightmost points. Then I go back to (b_1, b_2, \ldots, b_l', \ldots, b_k) by appending what I had cut in the start and calculating a transversal/orbit for b_{l+1} under the stabilizer just found, that of b_1, \ldots, b_l'. Obviously, the resulting BSGS structures are valid only up to position l, and that’s all the information we can acquire without another application of baseswap or finding another stabilizer (and in general, finding a stabilizer is a computationally hard task relative to calculating orbits/transversals). The entire purpose of this use of BASESWAP in SUBGROUPSEARCH is to obtain generators for the stabilizer of b_1, b_2, \ldots, b_l' and maintain a base/strong generating set that are valid up to a certain position.
    There are many such base changes performed on the same base throughout the course of the function and something goes wrong along the way. I still have to figure out why and where.

  • The good news: There is a straightforward alternative to using BASESWAP: maintain a list of generators for each of the basic stabilizers in (b_1, b_2, \ldots, b_k) and change it accordingly as the base is changed, using the function stabilizer() in sympy/combinatorics/perm_groups.py. For each base change we have to calculate one more stabilizer, so that’s not terrible. It is also sort of suggested in “Notes on Computational Group Theory” by Alexander Hulpke (page 34). The problem with this approach is that stabilizer() tends to return a group with many generators, and repeated applications keep increasing this number. However, using this removed the bug from SUBGROUPSEARCH. As before, more comprehensive tests are on the way : )
  • Yet another alternative: we can use the incremental Schreier-Sims algorithm with the new base (b_1, \ldots, b_l', \ldots, b_k) and the strong generating set for (b_1, \ldots, b_l, \ldots, b_k). There will likely be redundant generators after that, and it will probably involve more computation than finding a single stabilizer. However, in the long run (since there are many base changes performed) this might perform faster (due to the increasing number of generators that stabilizer() tends to create). I have not tried that approach yet.
  • Other than that, I had my latest major pull request merged! Thanks a lot to Stefan and my mentor David for the review! That was the largest one so far…
  • I started reading about some of the applications of subgroup search; subgroup intersection seems to be the easiest to implement, so I’ll probably go for it first.
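A note on reading the traceback above: the last frame is gamma = iter(Gamma).next(), and that call raises StopIteration exactly when Gamma is empty. So by the time the loop wanted another candidate point, there were none left. The tiny sketch below shows the same behaviour with the Python 3 spelling next(iter(...)); the function names are made up, and the variant with a default is just one possible way to turn the silent condition into something that can be checked explicitly.

```python
def pick_any(points):
    """Python 3 spelling of iter(points).next(); fails on an empty collection."""
    return next(iter(points))


def pick_any_or_none(points):
    """Same, but returns None instead of raising when nothing is left."""
    return next(iter(points), None)


try:
    pick_any(set())
except StopIteration:
    print("no candidate points left")

print(pick_any_or_none(set()))
```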

That’s it for now : )

Google Summer of Code 2012: Week 9

Hi all, here’s a brief summary of what I’ve been doing for the 9th week of my GSoC.

This week saw (and still has to see) some exciting new additions:

I. The incremental Schreier-Sims algorithm.

This is a version of the Schreier-Sims algorithm that takes a sequence of points B and a generating set S for a group G as input, and extends B to a base and S to a strong generating set relative to it. It is described in [1], pp.87-93. The default value of B is [], and that of S is G.generators. Here’s an example:


In [41]: S = SymmetricGroup(5)
In [42]: base = [3, 4]
In [43]: gens = S.generators
In [44]: x = S.schreier_sims_incremental(base, gens)
In [45]: x
Out[45]:
([3, 4, 0, 1],
[Permutation([1, 2, 3, 4, 0]),
Permutation([1, 0, 2, 3, 4]),
Permutation([4, 0, 1, 3, 2]),
Permutation([0, 2, 1, 3, 4])])
In [46]: from sympy.combinatorics.util import _verify_bsgs
In [47]: _verify_bsgs(S, x[0], x[1])
Out[47]: True

The current implementation stores the transversals for the basic orbits explicitly. (The alternative is to use Schreier vectors to describe the orbits; this saves a lot of space, but requires more time to compute transversal elements whenever they are needed. This feature is still to be implemented, and this probably won’t happen in this GSoC.) The current implementation of the Schreier-Sims algorithm on the master branch uses Jerrum’s filter (for more details and comparisons of the incremental version and the one using Jerrum’s filter, go here) as an optimization, and also stores the transversals explicitly. The incremental version seems to be asymptotically faster, though. Here are several comparisons of the current version on the master branch and the incremental one (which can be found on a local branch of mine, somewhat inadequately called week6):

For symmetric groups:


In [50]: groups = []
In [51]: for i in range(20, 30):
....:     groups.append(SymmetricGroup(i))
....:
In [52]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims()
....:
1 loops, best of 1: 590 ms per loop
1 loops, best of 1: 719 ms per loop
1 loops, best of 1: 981 ms per loop
1 loops, best of 1: 1.35 s per loop
1 loops, best of 1: 1.66 s per loop
1 loops, best of 1: 2.19 s per loop
1 loops, best of 1: 2.74 s per loop
1 loops, best of 1: 3.37 s per loop
1 loops, best of 1: 4.28 s per loop
1 loops, best of 1: 5.37 s per loop
In [53]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims_incremental()
....:
1 loops, best of 1: 612 ms per loop
1 loops, best of 1: 737 ms per loop
1 loops, best of 1: 927 ms per loop
1 loops, best of 1: 1.15 s per loop
1 loops, best of 1: 1.41 s per loop
1 loops, best of 1: 1.72 s per loop
1 loops, best of 1: 2.1 s per loop
1 loops, best of 1: 2.52 s per loop
1 loops, best of 1: 3.02 s per loop
1 loops, best of 1: 3.58 s per loop

For alternating groups:


In [54]: groups = []
In [55]: for i in range(20, 40, 2):
....:     groups.append(AlternatingGroup(i))
....:
In [56]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims()
....:
1 loops, best of 1: 613 ms per loop
1 loops, best of 1: 1.03 s per loop
1 loops, best of 1: 1.77 s per loop
1 loops, best of 1: 2.65 s per loop
1 loops, best of 1: 3.51 s per loop
1 loops, best of 1: 5.31 s per loop
1 loops, best of 1: 7.71 s per loop
1 loops, best of 1: 11.1 s per loop
1 loops, best of 1: 15.3 s per loop
1 loops, best of 1: 19.1 s per loop
In [57]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims_incremental()
....:
1 loops, best of 1: 504 ms per loop
1 loops, best of 1: 787 ms per loop
1 loops, best of 1: 1.23 s per loop
1 loops, best of 1: 1.9 s per loop
1 loops, best of 1: 2.8 s per loop
1 loops, best of 1: 3.99 s per loop
1 loops, best of 1: 5.48 s per loop
1 loops, best of 1: 7.45 s per loop
1 loops, best of 1: 10 s per loop
1 loops, best of 1: 13.2 s per loop

And for some dihedral groups of large degree (to illustrate the case of small-base groups of large degrees):


In [58]: groups = []
In [59]: for i in range(100, 2000, 200):
....:     groups.append(DihedralGroup(i))
....:
In [60]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims()
....:
1 loops, best of 1: 29.6 ms per loop
1 loops, best of 1: 108 ms per loop
1 loops, best of 1: 278 ms per loop
1 loops, best of 1: 527 ms per loop
1 loops, best of 1: 861 ms per loop
1 loops, best of 1: 1.29 s per loop
1 loops, best of 1: 1.83 s per loop
1 loops, best of 1: 2.39 s per loop
1 loops, best of 1: 3.06 s per loop
1 loops, best of 1: 3.83 s per loop
In [61]: for group in groups:
....:     %timeit -r1 -n1 group.schreier_sims_incremental()
....:
1 loops, best of 1: 20.8 ms per loop
1 loops, best of 1: 52.8 ms per loop
1 loops, best of 1: 121 ms per loop
1 loops, best of 1: 223 ms per loop
1 loops, best of 1: 365 ms per loop
1 loops, best of 1: 548 ms per loop
1 loops, best of 1: 766 ms per loop
1 loops, best of 1: 1 s per loop
1 loops, best of 1: 1.25 s per loop
1 loops, best of 1: 1.51 s per loop
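The trade-off mentioned earlier, explicit transversals versus Schreier vectors, can be illustrated with a small plain-Python sketch. Permutations are tuples with p[i] the image of i, and all names here are my own; this is not the code on the branch, only a picture of the two data structures. The explicit version stores one full permutation per orbit point, while the Schreier vector stores one generator index per point and rebuilds the permutation on demand.

```python
from collections import deque


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[i] for i in p)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def orbit_transversal(gens, alpha):
    """Explicit transversal: {point: permutation sending alpha to point}."""
    transversal = {alpha: tuple(range(len(gens[0])))}
    queue = deque([alpha])
    while queue:
        beta = queue.popleft()
        for gen in gens:
            gamma = gen[beta]
            if gamma not in transversal:
                transversal[gamma] = compose(transversal[beta], gen)
                queue.append(gamma)
    return transversal


def schreier_vector(gens, alpha):
    """{point: index of the generator that first reached it}; -1 marks alpha."""
    vector = {alpha: -1}
    queue = deque([alpha])
    while queue:
        beta = queue.popleft()
        for j, gen in enumerate(gens):
            gamma = gen[beta]
            if gamma not in vector:
                vector[gamma] = j
                queue.append(gamma)
    return vector


def transversal_element(gens, vector, point):
    """Rebuild a permutation sending alpha to `point` by tracing back."""
    inverses = [inverse(g) for g in gens]
    word = []
    while vector[point] != -1:
        j = vector[point]
        word.append(j)
        point = inverses[j][point]
    u = tuple(range(len(gens[0])))
    for j in reversed(word):  # generator closest to alpha is applied first
        u = compose(u, gens[j])
    return u


gens = [(1, 2, 3, 0), (1, 0, 2, 3)]  # a 4-cycle and a transposition
vector = schreier_vector(gens, 0)
explicit = orbit_transversal(gens, 0)
print(sorted(vector), transversal_element(gens, vector, 3), explicit[3])
```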

In addition to this algorithm I implemented a related function _remove_gens in sympy.combinatorics.util which removes redundant generators from a strong generating set (since there tend to be some redundant ones after schreier_sims_incremental() is run):


In [68]: from sympy.combinatorics.util import _remove_gens
In [69]: S = SymmetricGroup(6)
In [70]: base, strong_gens = S.schreier_sims_incremental()
In [71]: strong_gens
Out[71]:
[Permutation([1, 2, 3, 4, 5, 0]),
Permutation([1, 0, 2, 3, 4, 5]),
Permutation([0, 5, 1, 2, 3, 4]),
Permutation([0, 1, 2, 3, 5, 4]),
Permutation([0, 1, 2, 4, 3, 5]),
Permutation([0, 1, 3, 2, 4, 5]),
Permutation([0, 1, 2, 5, 4, 3]),
Permutation([0, 1, 5, 3, 4, 2])]
In [72]: new_gens = _remove_gens(base, strong_gens)
In [73]: new_gens
Out[73]:
[Permutation([1, 0, 2, 3, 4, 5]),
Permutation([0, 5, 1, 2, 3, 4]),
Permutation([0, 1, 2, 4, 3, 5]),
Permutation([0, 1, 2, 5, 4, 3]),
Permutation([0, 1, 5, 3, 4, 2])]
In [74]: _verify_bsgs(S, base, new_gens)
Out[74]: True

II. Subgroup search.
This is an algorithm used to find the subgroup K of a given group G of all elements of G satisfying a given property P. It is described in [1], pp.114-118 and is quite sophisticated (the book is right when it says “The function SUBGROUPSEARCH is rather complicated and will require careful study by the reader.”). On the other hand, it is one of the most interesting additions to the groups module to date since it can do so much. The idea is to do a depth-first search over all group elements and prune large parts of the search tree based on several different criteria. It’s currently about 150 lines of code and works in many cases but still needs debugging. It can currently do some wonderful stuff like this:


In [77]: S = SymmetricGroup(6)
In [78]: prop = lambda g: g.is_even
In [79]: G = S.subgroup_search(prop)
In [80]: G == AlternatingGroup(6)
Out[80]: True

to find the alternating group as a subgroup of the full symmetric group by the defining property that all its elements are the even permutations, or this:


In [81]: D = DihedralGroup(10)
In [82]: prop_true = lambda g: True
In [83]: G = D.subgroup_search(prop_true)
In [84]: G == D
Out[84]: True

to find the dihedral group D_{10} as a subgroup of itself using the trivial property that always returns True; or this:


In [106]: A = AlternatingGroup(4)
In [107]: G = A.subgroup_search(prop_fix_23)
In [108]: G == A.stabilizer(2).stabilizer(3)
Out[108]: True

to find the pointwise stabilizer of \{2,3\}. And so on and so on. What is more wonderful is that you can specify the base used for G in advance, and the generating set returned for K will be a strong generating set with respect to that base!


In [119]: A = AlternatingGroup(5)
In [120]: base, strong_gens = A.schreier_sims_incremental()
In [121]: G = A.subgroup_search(prop_fix_1, base=base, strong_gens=strong_gens)
In [122]: G == A.stabilizer(1)
Out[122]: True
In [123]: _verify_bsgs(G, base, G.generators)
Out[123]: True

The bad news is that the function breaks somewhere. For example:


In [125]: S = SymmetricGroup(7)
In [126]: prop_true = lambda g: True
In [127]: G = S.subgroup_search(prop_true)
In [128]: G == S
Out[128]: False

This needs some really careful debugging, but overall it looks promising since it works in so many cases – so the bug is hopefully small : ).
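While debugging, a definition-only reference for small groups can help: list all elements, keep the ones satisfying the property, and also check that they really are closed under products (subgroup search only makes sense when the "good" elements form a subgroup). The following is a sketch in plain Python with permutations as tuples; the names are hypothetical and it does no pruning at all, so it is only usable for very small degrees.

```python
from itertools import permutations


def compose(p, q):
    """Apply p first, then q (p[i] is the image of i)."""
    return tuple(q[i] for i in p)


def is_even(p):
    """A permutation is even iff (degree - number of cycles) is even."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            point = start
            while point not in seen:
                seen.add(point)
                point = p[point]
    return (len(p) - cycles) % 2 == 0


def elements_with_property(group_elements, prop):
    """The brute-force answer subgroup search should agree with."""
    return {g for g in group_elements if prop(g)}


def is_closed(elements):
    """A non-empty subset of a finite group closed under products is a subgroup."""
    return bool(elements) and all(
        compose(a, b) in elements for a in elements for b in elements
    )


S4 = set(permutations(range(4)))
even = elements_with_property(S4, is_even)
fix_3 = elements_with_property(S4, lambda g: g[3] == 3)
moves_0 = elements_with_property(S4, lambda g: g[0] != 0)
print(len(even), is_closed(even))
print(len(fix_3), is_closed(fix_3))
print(len(moves_0), is_closed(moves_0))
```

The last property ("moves 0") is an example that does not define a subgroup, so there is no meaningful answer to compare against in that case.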

So, that’s it for now!

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

Google Summer of Code 2012: Week 8

Hi everyone, here’s a brief summary of what I’ve been doing for the 8th week of my GSoC:

  • The issue with the BASESWAP function on page 103 of [1] that I discussed here is now resolved: one of the authors, Professor Derek Holt at Warwick, replied to me that this is indeed a typo and added it to the errata page here.
  • I studied the SUBGROUPSEARCH algorithm described in [1] in more depth. It takes as input a group G with a BSGS, a subgroup K < G with a BSGS having the same base as that of G, a property P such that P(g) for g \in G is either true or false, P(g) is always true for g \in K, and the elements of G satisfying P form a subgroup H, and tests TEST(g, l) used to rule out group elements (i.e., make sure they don’t satisfy P) based on the image of the first l base points of G, the so-called partial base image. It modifies K by adding generators until K = H, and returns a strong generating set for H. It performs a depth-first search over all possible base images (which by the definition of a base uniquely determine every element of G), but uses several conditions to prune the search tree and is said to be fast in practice. This algorithm is the basis for finding normalizers, centralizers and intersections of subgroups, so it’s pretty fundamental. One of its features is the frequent change of base for K: at level l in the search tree we want to make sure that the base for K starts with the current partial base image (i.e., the image of the first l points in the base). In [1] it is said that this requires only one application of BASESWAP (which swaps two neighbouring base points). This was confusing me for a while. However, since we want to only change the l-th base point at any base change, and the base after the l-th point doesn’t matter at level l, it seems that we can do the following. Treat the partial base image, denote it by c_1 c_2 \ldots c_l, as a base, and then run BASESWAP on c_1 c_2 \ldots c_l c, interchanging the last two elements, where c is the new l-th point in the base. Now I’m more confident that I can implement SUBGROUPSEARCH (the other parts of the procedure are easily approachable). But there is one other problem with it:
  • We want K, the group we initialize H with, to have the same base as G. The current deterministic implementation of the Schreier-Sims algorithm (using Jerrum’s filter) always produces a BSGS from scratch, and therefore we can’t tell it to make a BSGS for K with respect to some particular base. Hence we need an implementation of the so-called “incremental” Schreier-Sims algorithm, which takes a sequence of points and a generating set and extends them to a BSGS. This is also described in [1], together with some optimizations, and it won’t be very hard to go through the pseudocode and implement it – so that’s going to be the next step. It would also be a useful addition to the entire group-theoretical module since often in algorithms we want a BSGS with respect to some convenient base.
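The claim that base images determine group elements uniquely, and the notion of a partial base image, can be checked by brute force on a tiny group. This is a sketch in plain Python (tuples as permutations, names of my own choosing), not part of the planned implementation; the sets of partial base images at level l are exactly the nodes at depth l of the search tree before any pruning.

```python
from itertools import permutations


def is_base(group_elements, base):
    """Definition check: only the identity fixes every base point."""
    fixing = [g for g in group_elements if all(g[b] == b for b in base)]
    return len(fixing) == 1


def partial_base_images(group_elements, base, level):
    """Images of the first `level` base points, over all group elements."""
    return {tuple(g[b] for b in base[:level]) for g in group_elements}


S4 = set(permutations(range(4)))
base = [0, 1, 2]

print(is_base(S4, base), is_base(S4, [0, 1]))
# Full base images are pairwise distinct, so there are as many as elements.
print(len(partial_base_images(S4, base, len(base))) == len(S4))
# Number of nodes at depths 1, 2, 3 of the unpruned search tree.
print([len(partial_base_images(S4, base, level)) for level in (1, 2, 3)])
```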

More or less, that’s it for now. In the next few days I’ll try to write some actual code implementing the above two bullets and get some more reviewing for my most recent pull request.

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

 

Google Summer of Code 2012: Week 7

Hi all,

here’s a brief summary of what I’ve been doing during the 7th week of my GSoC, as well as a general overview of what’s going on and where things are going with computational group theory in sympy.

Things I did during the week.

This week I focused on:

  • improving the existing code for the functions I recently added – the randomized Schreier-Sims algorithm, the function BASESWAP that swaps two consecutive points in a base, and the PRINTELEMENTS function (I talk about these here and here). I included some comments in the bodies of the functions since these tend to be quite long. Also, I adopted some new naming conventions for handling all the structures related to a base and a strong generating set. It’d be nice if this naming convention is used throughout the combinatorics module (which for now depends mostly on me, as it seems 🙂 ), and it’d be nice if people provide some feedback on the names I chose. The conventions are listed after these bullets.
  • making possible the interaction with the deterministic Schreier-Sims algorithm. After some insights from Mario on the values returned by his implementation, I extracted from it the data necessary to make the algorithms described in [1] that use a base and strong generating set possible.
  • splitting the code further, with the sympy.combinatorics.util file which now holds the internal functions used to handle permutation groups (this can be later expanded with other internal functions across the combinatorics module).
  • Finally, adding docstrings, tests and making a pull request which is available here. It’s about 1300 lines of code, which is sort of bad, but I can remove some of the stuff and keep it for a future pull request.

So here are the naming conventions for working with a BSGS:

degree – the degree of the permutation group.

base – This is sort of obvious. A base for a permutation group G is an ordered tuple of points (b_1, b_2,\ldots, b_k) such that no non-identity group element g \in G fixes all the points b_1, b_2, \ldots, b_k (the significance of the ordering will become apparent later). This is implemented as a list.

base_len – the number of elements in a base.

strong_gens – the strong generating set (relative to some base). This is implemented as a list of Permutation objects.

basic_stabilizers – For a base (b_1, b_2,\ldots, b_k) , the basic stabilizers are defined as G^{(i)} = G_{b_1, \ldots, b_{i-1}} := \{ g \in G | g(b_1) = b_1, \ldots, g(b_{i-1}) = b_{i-1}\} for i \in \{1, 2, \ldots, k\} so that we have G^{(1)} = G. This is implemented as a list of permutation groups.

distr_gens – the strong generators distributed according to the basic stabilizers. This means: for a base (b_1, b_2,\ldots, b_k) and a strong generating set S= \{ g_1, g_2, \ldots, g_t\}, distribute the g_i in sets S^{(i)} = G^{(i)} \cap S for i \in \{1, 2,\ldots, k\} where the G^{(i)} are defined as above. This is implemented as a list of lists holding the elements of the S^{(i)}.

basic_orbits – these are the orbits of b_i under G^{(i)}. These are implemented as a list of lists, being the list of lists of keys for the basic transversals, see below.

basic_transversals – these are transversals for the basic orbits. Notice that the choice for these may not be (and in most cases won’t be) unique. For one thing, it depends on the set of strong generators present (which is also not uniquely determined for a given base). They are implemented as a list of dictionaries indexed according to the base (b_1, b_2,\ldots, b_k) , with keys – the elements of the basic orbits, and values – transversal elements sending the current b_i to the key.

I wrote functions extracting basic_orbits, basic_transversals, basic_stabilizers, distr_gens from only a base and strong generating set, as well as functions for extracting all of them from a base, strong generating set, and a part of them, so that if any of them is available, it can be supplied in order to avoid recalculations.
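To make the conventions above concrete, here is a minimal pure-Python sketch (not the code from the pull request) that derives distr_gens, basic_orbits and basic_transversals from a base and a strong generating set. Permutations are plain tuples in array form, the helper names compose, distribute_gens and orbit_transversal are made up for this illustration, and the example group (S_4 with base (0, 1, 2)) is an assumption chosen to keep things small.

```python
def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def distribute_gens(base, strong_gens):
    """distr_gens: the i-th list holds the generators fixing base[0..i-1]."""
    return [[g for g in strong_gens if all(g[b] == b for b in base[:i])]
            for i in range(len(base))]


def orbit_transversal(gens, alpha, degree):
    """Map every point in the orbit of alpha to a permutation sending alpha there."""
    transversal = {alpha: tuple(range(degree))}
    queue = [alpha]
    for point in queue:  # the queue grows while we iterate over it
        for g in gens:
            image = g[point]
            if image not in transversal:
                # first move alpha to `point`, then apply g
                transversal[image] = compose(g, transversal[point])
                queue.append(image)
    return transversal


# Assumed example: S_4 acting on {0, 1, 2, 3} with base (0, 1, 2).
degree = 4
base = [0, 1, 2]
strong_gens = [(1, 2, 3, 0), (0, 2, 3, 1), (0, 1, 3, 2)]

distr_gens = distribute_gens(base, strong_gens)
basic_transversals = [orbit_transversal(gens, b, degree)
                      for gens, b in zip(distr_gens, base)]
basic_orbits = [sorted(t) for t in basic_transversals]
print(basic_orbits)
```

The keys of each dictionary are the basic orbit, which matches the remark above that basic_orbits is just the list of keys of the basic transversals.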

Also, there is a straightforward test _verify_bsgs in sympy.combinatorics.util that tests a sequence of points and group elements for being a base and strong generating set. It simply verifies the definition of a base and of a strong generating set relative to it. There will likely be other ways to do that in the future – more efficient, but surely more complicated and thus more error-prone. This one will serve as a robust testing tool.
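A definition-based check of this kind can be sketched in a few lines of plain Python. This is only an illustration of the idea and not the actual _verify_bsgs: it assumes the group in question is the one generated by the candidate strong generators, lists all of its elements (so it is exponential), and uses the hypothetical names generate_group and is_bsgs.

```python
def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def generate_group(gens, degree):
    """List all elements of the group generated by gens (brute force)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for g in gens:
            new = compose(g, current)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return elements


def is_bsgs(base, strong_gens, degree):
    """Check the definitions of a base and of a strong generating set directly."""
    identity = tuple(range(degree))
    group = generate_group(strong_gens, degree)
    # Base: only the identity may fix every base point.
    if any(g != identity and all(g[b] == b for b in base) for g in group):
        return False
    # Strong: the generators fixing b_1..b_{i-1} must generate that stabilizer.
    for i in range(len(base)):
        stabilizer = {g for g in group if all(g[b] == b for b in base[:i])}
        level_gens = [g for g in strong_gens
                      if all(g[b] == b for b in base[:i])]
        if generate_group(level_gens, degree) != stabilizer:
            return False
    return True


# S_4 with base (0, 1, 2): one strong and one non-strong generating set.
print(is_bsgs([0, 1, 2], [(1, 2, 3, 0), (0, 2, 3, 1), (0, 1, 3, 2)], 4))
print(is_bsgs([0, 1, 2], [(1, 2, 3, 0), (1, 0, 2, 3)], 4))
```

The second call uses two generators of S_4 neither of which fixes 0, so they cannot generate the stabilizer of 0 and the set is not strong.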

Where we are.

So, here’s a checklist of what I promised in my proposal on the melange website, and which parts of it have already been implemented. This follows the optimistic timeline. Everything pertains to permutation groups unless specified otherwise:

  • handling different representations – NO
  • excluding singleton cycles from the cycle decomposition – NO
  • powers and orders of elements – YES. This was actually already there for permutations.
  • orbits – YES.
  • stabilizers – YES.
  • schreier vectors – YES.
  • randomized Schreier-Sims algorithm – YES
  • handling bases and strong generating sets – YES
  • membership testing – YES (the function _strip in sympy.combinatorics.util)
  • rewriting algorithm – NO.
  • actions on cosets – NO.
  • quotient groups – NO.
  • order of a group – YES. This was already there.
  • subgroup testing – NO.
  • coset enumeration by the Todd-Coxeter algorithm & consequences – NO.
  • primitivity testing – YES.
  • finding (minimal) block systems – YES.
  • general backtrack search for a certain property – No, however easy to do by modifying PRINTELEMENTS.
  • outputting all group elements – YES. This was already there, however PRINTELEMENTS does it in lexicographical order according to a base.
  • Sylow subgroups – NO.
  • calculating the center – NO.
  • pointwise stabilizers (of more than one point, see above) – NO.
  • change of base – YES.
  • product groups – YES.
  • more on finitely presented groups (…) – NO.
  • the p-core – NO.
  • the solvable radical – NO.
  • database of known groups – NO.
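For the membership-testing entry in the checklist, the idea behind sifting (stripping) an element through a stabilizer chain can be sketched as follows. This is a pure-Python illustration and not the _strip function itself: the names sift and contains are hypothetical, and the hand-written chain is an assumed example for the dihedral group of order 8 acting on {0, 1, 2, 3} with base (0, 1).

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sift(g, base, basic_transversals):
    """Strip g level by level; return (residue, number of levels passed)."""
    for level, (b, transversal) in enumerate(zip(base, basic_transversals)):
        image = g[b]
        if image not in transversal:
            return g, level  # image of the base point is outside the basic orbit
        # cancel the transversal element so that the new g fixes b
        g = compose(inverse(transversal[image]), g)
    return g, len(base)


def contains(g, base, basic_transversals):
    """g is in the group iff it sifts through every level down to the identity."""
    residue, level = sift(g, base, basic_transversals)
    return level == len(base) and residue == tuple(range(len(g)))


# Assumed example data: dihedral group of order 8 on 4 points, base (0, 1).
base = [0, 1]
basic_transversals = [
    {0: (0, 1, 2, 3), 1: (1, 2, 3, 0), 2: (2, 3, 0, 1), 3: (3, 0, 1, 2)},
    {1: (0, 1, 2, 3), 3: (0, 3, 2, 1)},
]

members = [p for p in permutations(range(4))
           if contains(p, base, basic_transversals)]
print(len(members))
```

The number of accepted permutations equals the product of the basic orbit sizes, which is the order of the group.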

Things yet to be done.

Apart from the things that got a “NO” on the list above, the following currently come to mind (I’ll update this list periodically):

  • Work on removing redundant generators from a strong/any generating set, as described in [1].
  • Precompute more properties for the groups in the named groups module (transitivity degrees, bases and strong generating sets, etc.)
  • Add more groups to the named groups module.
  • Fix the issues pointed out in the review of my second pull request.
  • Finally do something for handling representations of finite groups over vector spaces, like working with character tables. It’d be cool to have a function that computes the conjugacy classes for a given group, but I don’t know right now how possible that is.
  • Finally implement the group intersection algorithm… I’m currently starting to work my way through the SUBGROUPSEARCH function which is fundamental for implementing backtracking algorithms for group intersection, centralizers, etc.
  • Upgrade the randomized version of Schreier-Sims to Las Vegas type in the case when the order of the group is known.
  • Currently, transversal elements for the basic orbits for a stabilizer chain are stored explicitly. This requires too much memory for large groups. An alternative solution (which slows down execution) is to use Schreier vectors to describe the orbits. This means supplying some more arguments and adding code to many of the functions already present, and is a significant challenge by itself. The good news is that it can be carried out without modifying what is already there.
  • Come up with a more concise functionality to relate the different structures used to describe a base and strong generating set: the generators for basic stabilizers, the basic orbits, the basic transversals… There are many situations in which some of these are given and we need some of the other ones; sometimes it’s more convenient to get the orbits as sets, and sometimes as lists, and so on… the current approach is to write a new utility function whenever the present ones don’t suffice.
  • Handle the case when the identity element is provided as a generator for a permutation group – this can make some algorithms less efficient.
  • Optimize the behavior of BASESWAP so that only the i-th and i+1-th transversals are calculated.
  • Reduce side effects as much as possible (let’s be pythonic!)
  • Improve the docstring quality: it might be reasonable to lay out the theory/notation/definitions behind the Schreier-Sims algorithm in one place in some of the files and then simply refer to it as necessary. Otherwise the descriptions get unnecessarily long.
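The Schreier-vector alternative to explicit transversals mentioned in the list above can be sketched like this. It is a minimal pure-Python illustration under my own conventions (the names schreier_vector and trace and the encoding with -1 for the root and None for points outside the orbit are assumptions, not the sympy API): only one generator index per point is stored, and a transversal element is rebuilt on demand by walking back to the root.

```python
def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def schreier_vector(gens, alpha, degree):
    """vector[p] is the index of the generator that first reached p."""
    vector = [None] * degree   # None: not in the orbit of alpha
    vector[alpha] = -1         # -1 marks the root of the orbit
    queue = [alpha]
    for point in queue:
        for index, g in enumerate(gens):
            image = g[point]
            if vector[image] is None:
                vector[image] = index
                queue.append(image)
    return vector


def trace(vector, gens, point):
    """Rebuild a permutation sending the root of the orbit to `point`."""
    if vector[point] is None:
        raise ValueError("point is not in the orbit")
    result = tuple(range(len(vector)))
    while vector[point] != -1:
        g = gens[vector[point]]
        result = compose(result, g)   # g is applied before what we have so far
        point = inverse(g)[point]     # step back towards the root
    return result


gens = [(1, 2, 3, 0), (1, 0, 2, 3)]
vector = schreier_vector(gens, 0, 4)
print(vector, [trace(vector, gens, p) for p in range(4)])
```

The trade-off is the one described above: memory drops from one stored permutation per orbit point to one integer per point, at the cost of recomputing each transversal element when it is needed.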

Well, that’s it for now it seems. If anything else pops up soon, I’ll add it here!

Google Summer of Code 2012: Week 6

Hi all,

here’s a brief summary of what I’ve been doing for the sixth week of my GSoC:

  • Submitting, fixing and finally getting merged my second pull request. Thanks a lot to Stefan and my mentor David for reviewing it! Now we have a lot more functionality for handling permutation groups.
  • Some more debugging on PRINTELEMENTS (I was talking about it in the third bullet of my post from last week). It turned out that it was still doing something slightly wrong, but now it’s the way it should be. Apart from that, its speed was improved by changing how the subwords of the current group element are stored while it is being computed as a word in elements of the basic transversals (this assumes some knowledge of the theory of bases and strong generating sets; for a discussion, see [1], pp.87-88, pp.108-110).
  • In the comments on my post from last week, I got a clarification from Mario on the struggles with _coset_repr that I discussed in the third bullet of last week’s post. Now I’ll be able to use the current deterministic implementation of the Schreier-Sims algorithm whenever a BSGS is needed (after some minor modifications to the attributes of a PermutationGroup that are assigned after running Schreier-Sims).
  • Finally, the implementation of the algorithm BASESWAP ([1], pp.102-103). This function is necessary for SUBGROUPSEARCH ([1], p.117) which in turn is necessary for the group intersection algorithm. This deserves some special attention – I have strong reasons to believe that the pseudocode & its discussion in [1], pp. 102-103 contain the same mistake repeated several times. Namely, I think that line 3 of the pseudocode for BASESWAP should read |\beta_i^{\left\langle T\right\rangle}|\neq s instead of |\beta_{i+1}^{\left\langle T\right\rangle}|\neq s. At first I implemented the algorithm the way it was given in pseudocode, and lost many hours (it wasn’t working) until I discovered that this little detail might be wrong. Now, I shall assume the notation used in [1] in order to follow their argument as closely as possible. My reasoning is as follows: as we change the set T during the run of BASESWAP, we finally want to have \left\langle T\right\rangle = H := G^{(i)}_{\beta_{i+1}}=G_{\beta_1, \beta_2, \ldots, \beta_{i-1}, \beta_{i+1}}. The last line |G^{(i)}| = |\Delta^{(i)}||\Delta^{(i+1)}||G^{(i+2)}| = |\beta_{i+1}^{G^{(i)}}||H| on page 102 of [1] is indeed correct by a straightforward application of the orbit-stabilizer theorem; so if we put s = \frac{|\Delta^{(i)}||\Delta^{(i+1)}|}{ |\beta_{i+1}^{G^{(i)}}|} we indeed have |H| = s|G^{(i+2)}|. Up to this point, I believe the book. However, after that they say that the last equation implies that s = |\beta_{i+1}^H|. Looking more closely, recall that by definition H = G_{\beta_1, \beta_2, \ldots, \beta_{i-1}, \beta_{i+1}} and G^{(i+2)} = G_{\beta_1, \beta_2,\ldots, \beta_{i+1}}. Hence, G^{(i+2)} is the stabilizer of \beta_i in H, so by the orbit-stabilizer theorem we have |H| = |\beta_i^H||G^{(i+2)}|, and therefore we must have |\beta_i^H| = s, not |\beta_{i+1}^H|=s.
This same mistake (\beta_{i+1} instead of \beta_i) appears several other times (in fact, all the times) in the discussion of BASESWAP and once in the pseudocode. Now that I changed it to \beta_i, the implementation doesn’t break and behaves as expected. I also implemented the randomized version described in [1], p.103 and [2], p.98, and it also behaves as expected. I’d be extremely happy if anyone else is willing to go over this and check whether what I’m saying is true; I’m pretty sure it is, but I didn’t expect to find such a serious mistake in that book. I’m willing to provide their argument in its entirety or clarify the notation, just shoot me a comment below.
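The claim |\beta_i^H| = s is easy to sanity-check by brute force on small groups. The sketch below is my own pure-Python illustration (the function names are hypothetical and positions are 0-indexed, so i = 0 corresponds to swapping \beta_1 and \beta_2). It lists all group elements, computes s from the formula above, and compares it with both candidate orbit sizes. Note that H stabilizes \beta_{i+1} by definition, so the orbit of \beta_{i+1} under H always has size 1.

```python
def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def generate_group(gens, degree):
    """List all elements of the group generated by gens (brute force)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for g in gens:
            new = compose(g, current)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return elements


def stabilizer(group, points):
    """Elements of `group` fixing every point in `points`."""
    return {g for g in group if all(g[p] == p for p in points)}


def orbit(group, point):
    """Orbit of `point` under a group given as a full list of elements."""
    return {g[point] for g in group}


def baseswap_orbit_sizes(gens, base, i, degree):
    """Return (s, |beta_i^H|, |beta_{i+1}^H|) for the 0-indexed position i."""
    group = generate_group(gens, degree)
    g_i = stabilizer(group, base[:i])          # G^{(i)}
    g_next = stabilizer(group, base[:i + 1])   # G^{(i+1)}
    delta_i = orbit(g_i, base[i])
    delta_next = orbit(g_next, base[i + 1])
    swapped = orbit(g_i, base[i + 1])          # orbit of beta_{i+1} under G^{(i)}
    s, remainder = divmod(len(delta_i) * len(delta_next), len(swapped))
    assert remainder == 0                      # s is an index, hence an integer
    h = stabilizer(g_i, [base[i + 1]])         # H
    return s, len(orbit(h, base[i])), len(orbit(h, base[i + 1]))


s4 = [(1, 2, 3, 0), (1, 0, 2, 3)]
d4_x_c2 = [(1, 2, 3, 0, 4, 5), (0, 3, 2, 1, 4, 5), (0, 1, 2, 3, 5, 4)]
print(baseswap_orbit_sizes(s4, [0, 1, 2], 0, 4))
print(baseswap_orbit_sizes(d4_x_c2, [0, 1, 4], 1, 6))
```

In both examples the first two numbers agree and the third is 1, which is consistent with the corrected condition.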

So, that’s it for now. I’m in the process of furnishing my code for the next pull request (which will hopefully be submitted tomorrow), and then I’ll resume my work on subgroup intersections.

Edit: My pull request has not been submitted yet since writing the docstrings and tests took me longer than expected. The current state of it is available here, if anyone wants to take a look at how things are going. I still have to write some more tests, and hopefully will push it today for review.

Edit#2: The pull request is finally out. It is some 1300 lines of code, so if people object I can remove some of the stuff and save them for a future pull request.

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

[2] Permutation Group Algorithms, Ákos Seress, Cambridge University Press

Google Summer of Code 2012: Week 5

Hi everyone,

here’s a brief summary of what I’ve been doing for the fifth week of my GSoC.

  • Firstly, I finally had my first pull request merged; it’s all my fault I didn’t complain about it on the mailing list earlier : ) I had to fix a bunch of things on it (stuff like documentation using sphinx and some improvements in code quality, mainly docstrings), and now I’m really happy to have my first major contribution to sympy merged with the master branch. Thanks to everyone who helped in reviewing it – Stefan, Mario, Tom, Matthew, and of course my mentor David.
  • Secondly, my next pull request, mainly containing work from weeks 2 and 5. After some moderate rebasing over the fixes from my first pull request, it is available here. Apart from implementations of what I did in week 2, it addresses the issue of testing randomized algorithms ( which is still being discussed here ), and splits the generators for the named groups (symmetric, dihedral,…) in a separate file (which is probably going to contain more and more constructors for some well-known groups as time goes by). The PR looks longer than it is ( : ) read: any help in the review process will be appreciated), mainly because some 250 lines were copied to a new file in order to accommodate the named groups module. I hope that this time I did a better job at splitting the different parts of the PR into several commits.
  • Finally, I started work on algorithms for backtrack searches in groups. These include stuff like printing all group elements (sort of boring, but you have to start somewhere), searching for subgroups of elements satisfying a given property, finding normalizers and centralizers,  intersection of subgroups,… In general, backtrack searches tend to be slow since all the elements in the group have to be visited, but there are ways of skipping large numbers of them. Also, for some problems in computational group theory, backtrack searches are the best we have today. They are described in [1], 4.6., and I’m currently following the exposition offered there. After two days of debugging, I finally got the function PRINTELEMENTS described in 4.6.1 of [1] to work; it turned out that the current implementation of the Schreier-Sims algorithm sets the field _coset_repr of an object of class PermutationGroup in sympy.combinatorics.perm_groups.py in a way that was unexpected to me. This little digression might help anyone else trying to understand the perm_groups file better. So for example consider the following:

In [297]: S = SymmetricGroup(4)

In [298]: S.schreier_sims()

In [299]: S._coset_repr
Out[299]:
[[[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 0, 1], [3, 0, 1, 2]],
 [[0, 1, 2, 3], [0, 2, 1, 3], [0, 3, 1, 2]],
 [[0, 1, 2, 3], [0, 1, 3, 2]]]

In [300]: S._base
Out[300]: [0, 1, 2]

From this and similar examples I concluded that the i-th component of _coset_repr is a transversal of the i-th basic orbit of the group S and tried to use this in PRINTELEMENTS. However, consider the following example:


In [302]: G = PermutationGroup([Permutation([[0, 1, 2, 3], [4], [5]]), Permutation([[1, 3], [0], [2], [4], [5]]), Permutation([[0], [1], [2], [3], [4, 5]])])

In [303]: G.schreier_sims()

In [304]: G._base
Out[304]: [0, 1, 4]

In [305]: G._coset_repr
Out[305]:
[[[0, 1, 2, 3, 4, 5],
  [1, 2, 3, 0, 4, 5],
  [2, 3, 0, 1, 4, 5],
  [3, 0, 1, 2, 4, 5]],
 [[0, 1, 2, 3, 4, 5], [0, 3, 2, 1, 4, 5]],
 [[0, 1, 2, 3, 4, 5]],
 [[0, 1, 2, 3, 4, 5]],
 [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 5, 4]]]

Here, the first two components of _coset_repr are as I expected, but the 3rd and 4th are something I didn’t expect to be there (I expected to get the 5th component instead of the 3rd, and no more components). Hence at present the behaviour of _coset_repr is not too clear to me. The current solution is to use the randomized version of the Schreier-Sims algorithm to get a base and a strong generating set. Another option would be to use the generators from the attribute _stabilizers_gens, but I haven’t tried that yet. Anyway, PRINTELEMENTS works now (edit: there are many other algorithms present for printing all the elements of a group, but this one is significant for the implementation of backtrack searches), and the order in which the elements of the group are visited (lexicographically with respect to the image of the base, in an ordering of \Omega = \{0, 1,\ldots, n-1\} in which base points come first) is used in most of the following backtrack searches, so a large part of this algorithm will be reused in subsequent algorithms ( I hope : ) ). My description of the situation assumed some knowledge of the theory behind the Schreier-Sims algorithm, so if something is not quite clear feel free to ask in the comments!
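To illustrate why transversals are enough to visit every group element, here is a small pure-Python sketch that takes the three components of _coset_repr from the SymmetricGroup(4) session above and multiplies one element from each level. This is not PRINTELEMENTS from [1] (which walks the elements in order without building them all first); the function name elements_from_transversals is hypothetical, and the sketch simply sorts the products by the image of the base afterwards.

```python
from itertools import product


def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


# Copied from the S._coset_repr output above, as tuples.
coset_repr = [
    [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)],
    [(0, 1, 2, 3), (0, 2, 1, 3), (0, 3, 1, 2)],
    [(0, 1, 2, 3), (0, 1, 3, 2)],
]
base = [0, 1, 2]


def elements_from_transversals(transversals):
    """Yield compose(u_1, compose(u_2, ...)) for one u_i chosen per level."""
    for choice in product(*transversals):
        g = choice[-1]
        for u in reversed(choice[:-1]):
            g = compose(u, g)   # deeper levels are applied first
        yield g


elements = list(elements_from_transversals(coset_repr))
# Order the elements lexicographically by the image of the base.
ordered = sorted(elements, key=lambda g: [g[b] for b in base])
print(len(set(elements)), ordered[0], ordered[-1])
```

Every element appears exactly once because each level contributes one coset representative, so the number of products is the product of the basic orbit sizes.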

That’s it for now. Next week, I’ll continue with backtrack searches, and hopefully will implement the subgroup intersection routine (it seems formidable right now)… and put some more effort into getting the second pull request merged – I’ve got a lot to catch up with in terms of getting my code in sympy : ) .

[1] Derek F. Holt, Bettina Eick, Eamonn A. O’Brien, “Handbook of computational group theory”, Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 1-58488-372-3

Google Summer of Code 2012: Week 4

Hi everyone,

Here’s a brief summary of what I’ve been doing for the 4-th week of my GSoC.

This week, like the previous one, was not intense in terms of coding. Here’s what I have up and running, basically what I was talking about last week:

  • A working implementation of the randomized Schreier-Sims algorithm. This still needs to be integrated with the deterministic version of the algorithm, using the fields _base, _coset_repr, _coset_repr_n, _stabilizers_gens so that a result from the randomized algorithm can be verified deterministically. Also, it’s been suggested that we have a function that determines the base and transversal elements for the basic orbits from a given generating set that is known to be strong. This won’t be hard to implement, and will be helpful for the Butler-Portugal algorithm for tensor canonicalization – see this pull request for more information if you are interested! For more on bases and strong generating sets, see [1], pp.101-119.
  • A function \text{DirectProduct(*groups)} that constructs the direct product of several groups. For more than two groups, \text{DirectProduct}(G_1,\ldots, G_n) is several times faster than calling G_1*\ldots * G_n (benchmarked it), thus it makes sense to have such a function. This is later used in constructing an arbitrary abelian group by its cycle decomposition.
  • A function for calculating the degree of transitivity of a permutation group. The idea is very brute-force: we look at the orbit of the (k+1)-tuple (0, 1, \ldots, k) for k = 0, 1, \ldots, n-1 and check whether it contains all possible (k+1)-tuples of distinct points. This is really bad since the number of tuples grows like n^{k+1}, hence the complexity is O(n^{k+1}r), where k is the degree of transitivity and r is the number of generators. It seems that some sort of randomization that checks only several randomly chosen tuples for membership in the orbit will decrease the complexity, but to make sure we still need to do all the checks if the random tuples pass, which is again O(n^{k+1}r). Some bound on the probability will be good to know here.
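The brute-force transitivity computation from the last bullet can be sketched as follows. This is a minimal pure-Python illustration and not the function from my branch: permutations are tuples in array form, the name transitivity_degree is only for this example, and the loop counts tuples of length k = 1, 2, ... rather than k+1.

```python
def transitivity_degree(gens, degree):
    """Largest k such that the group is transitive on k-tuples of distinct points."""
    best = 0
    expected = 1
    for k in range(1, degree + 1):
        expected *= degree - k + 1        # number of k-tuples of distinct points
        start = tuple(range(k))
        seen = {start}
        queue = [start]
        for current in queue:             # orbit of (0, ..., k-1), componentwise action
            for g in gens:
                image = tuple(g[x] for x in current)
                if image not in seen:
                    seen.add(image)
                    queue.append(image)
        if len(seen) != expected:
            break
        best = k
    return best


symmetric = [(1, 2, 3, 0), (1, 0, 2, 3)]      # S_4
alternating = [(1, 2, 0, 3), (0, 2, 3, 1)]    # A_4
cyclic = [(1, 2, 3, 0)]                       # C_4
print(transitivity_degree(symmetric, 4),
      transitivity_degree(alternating, 4),
      transitivity_degree(cyclic, 4))
```

The cost is visible in the code: the set seen can hold up to n(n-1)...(n-k+1) tuples at level k, and every tuple is pushed through every generator.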

The main focus this week was on several discussions about future changes in the permutation groups module, and on making some more effort to get my code so far merged 🙂 :

  • In this post to the mailing list, it was suggested to implement an algorithm for intersecting subgroups of a given group so that it can be used in the tensor canonicalization algorithm (again, see here). This is done in [1] but seems fairly complicated and opens the subject of backtrack searches in permutation groups; I’ll try to figure it out and implement it this coming week.
  • In this post to the mailing list, we discussed ways of testing randomized algorithms (and there are a lot of them involved in computational group theory), and an agreement was reached that some sort of manual setting of the randomized output (via an additional argument) is a sensible approach.
  • In this post to the mailing list, we discussed some changes in interface in the permutations module. Even though not everybody agrees with what I last suggested, I’ll carry these changes out and see how things unfold (i.e., whether people are happy).
  • Finally, my work from week 1 is, I hope, ready to be merged, now that I’ve made the changes suggested in the discussion of the pull request. I (finally) got familiar with the sphinx system and building the docs for sympy, and with all the conventions for writing docstrings (and convinced myself that I’ve been writing them the wrong way; I’ll fix all the docstrings in the module in the future). By the way, I installed the sphinx system in a virtualenv at the suggestion of S. Krastanov, and found the following guide really helpful in the process. When the week1 branch gets merged, a pull request with the week2 code will follow shortly, and then with the rest of the code so far…
  • And in this discussion, there were some more changes suggested, for example David proposed isolating the named groups (Symmetric, Dihedral, …) in a separate module, which I’m going to do in one of the next pull requests.
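The approach to testing randomized algorithms agreed on in the list above (letting the caller fix the random choices through an extra argument) might look roughly like this. It is only a sketch of the pattern: the function random_word and the argument name _random_indices are hypothetical and not part of sympy.

```python
import random


def compose(p, q):
    """Return the permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


def random_word(gens, length, _random_indices=None):
    """Multiply `length` randomly chosen generators together.

    If _random_indices is given, it replaces the random draws, which makes
    the result reproducible in a test.
    """
    if _random_indices is None:
        indices = [random.randrange(len(gens)) for _ in range(length)]
    else:
        indices = list(_random_indices)
    result = tuple(range(len(gens[0])))
    for index in indices:
        result = compose(gens[index], result)
    return result


gens = [(1, 2, 3, 0), (1, 0, 2, 3)]
# In a test, the "random" choices are pinned down explicitly:
fixed = random_word(gens, 2, _random_indices=[0, 1])
# In normal use, the extra argument is simply left out:
sample = random_word(gens, 5)
print(fixed, sample)
```

A test can then assert on the exact value obtained with the pinned choices, while ordinary callers never see the extra argument.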

So, that’s it for now. Next week I’ll focus on getting some more of my code merged and backtrack searches.

[1] Holt, D., Eick, B., O’Brien, E. “Handbook of computational group theory”
