Python, Perl: Partition a List by Equivalence

By Xah Lee. Date:

Here's another interesting algorithmic exercise, again from part of a larger program in the previous series.

Problem Spec

Here's the original Perl documentation:

merge($pairings) takes a list of pairs, each pair indicates the sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

For those of you unfamiliar with math lingoes, partition a list means to create sublists, based on some criteria. In our case, the input list comes in the form of pairs, and its members are the union of all members in the pairs. The criterion for partition in our case is that of equivalence, specified in the pairs input.

Try to write a Python code for this. In the Python code, the input should be a list of couples. (for Perlers, sketch out the algorithm on paper and try to code in Python directly.)

I'll post Perl and Python code tomorrow.

Answer below.

Perl Solution

# -*- coding: utf-8 -*-
# perl

sub merge($) {
my @pairings = @{$_[0]}; # for example: ([a,b], [c,d], …)

my @interm; # array of hashs. For the hash, Keys are numbers, values are dummy 'x'.

# chop the first value of @pairings into @interm
$interm[0]={$pairings[0][0]=>'x'}; ${interm[0]}{$pairings[0][1]}='x'; shift @pairings;

 N1: for my $aPair (@pairings) {
     for my $aGroup (@interm) {
         if (exists ${$aGroup}{$aPair->[0]}) {${$aGroup}{$aPair->[1]}='x'; next N1}
         if (exists ${$aGroup}{$aPair->[1]}) {${$aGroup}{$aPair->[0]}='x'; next N1}
     }
     push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
 }

my @fin = shift @interm;

N2: for my $group (@interm) {
    for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
            if (exists ${$newcoup}{$k}) {map { ${$newcoup}{$_}='x'} (keys %$group); next N2;}
  }
    }
    push @fin, $group;
}
return map {[keys (%$_)]} @fin;
}

Python Solution

Here's a direct translation of the Perl code above into Python:

# -*- coding: utf-8 -*-
# python

def merge(pairings): # pairings is a list of couples. For example, [(9,2),(7,6),…]

    # interm is a list of groups. Each group is a list that hold
    # equivalent numbers. interm stands for interim result. Each group
    # is a dictionary. Keys are numbers, values are all dummy
    # 'x'. Dictionary is used for ease of dealing with duplicates or
    # checking existence.
    interm=[];

    # move first pair of pairings into interm as the first group
    interm.append({pairings[0][0]:'x', pairings[0][1]:'x'}) ; del pairings[0]

    # go thru pairings. For each pair, check if it is in any group in
    # interm. If any part of pair is in a group, then add the other
    # part into that group. If no part of the pair is in any group,
    # then add this pair into interm as a new group.
    for aPair in pairings:
        for aGroup in interm:
            if (aGroup.has_key(aPair[0])): aGroup[aPair[1]]='x'; break
            if (aGroup.has_key(aPair[1])): aGroup[aPair[0]]='x'; break
        else: interm.append( {aPair[0]:'x'} ); interm[-1][aPair[1]]='x'

    # now make another pass of the groups in interm, because some pair
    # that may connect two groups (i.e. with one element in one group,
    # and second element in another group), yet the pair is simply
    # consumed by one group.
    # This pass will check if there are any element in any other
    # group, if so, such two groups will be unioned. In this pass, we
    # move things from interm into fin. fin==final.


    fin=[]; fin.append(interm.pop(0))
    goto=False
    for group in interm:
        for newcoup in fin:
            for k in group.keys():
                if newcoup.has_key(k):
                    newcoup.update(group);
                    goto=True
                    break
            if (goto):
                goto=False
                break
        else:
            fin.append(group)

    # now turn the dictionaries into lists for return value
    result=[];
    for group in fin: result.append(group.keys())
    return result

I wrote this (Perl) program in 2003-09, and now basically forgot all about the internals. The original Perl code does not have inline comments, nor public consumable documentation as this is my own project. In the process of translation and the publication and explanation on this page, i eventually have to re-acquaint the algorithm i used as i go thru the lines. I was thinking of a quick brainless translation word-for-word, but that turned out not possible as i run into problems.

(While i'm learning Python, i run into frustrations with the Python Documentation. (although it has far more quality than Perl documentations). The frustrations with documentations will be appended to this page later: Python Documentation Problems.)

The Problem of Translating Goto Statement

The translation problem i run into is this. In Perl, there's a GOTO construct where in a loop one can say "next XYZ" to jump to a arbitrary outer loop labeled XYZ. Python has “break” but does not provide a GOTO jump as in Perl. In the process, i have to actually figure out (for the first time for me) how loops with GOTO jumps can be translated to alternative structure. This turned out not too hard. For a GOTO jump to a far outer group, one can use multiple if/break at the end of each loop, possibly in addiction adding a “else” clause to the different levels of the loops. (Python language can have a “else” clause for “for” loops. It is executed when the loop completes. (as opposed to when a break inside jumped out))

Here is a loop with GOTO, translated into Python without:

 N1: for my $aPair (@pairings) {
     for my $aGroup (@interm) {
         if (exists ${$aGroup}{$aPair->[0]}) {${$aGroup}{$aPair->[1]}='x'; next N1}
         if (exists ${$aGroup}{$aPair->[1]}) {${$aGroup}{$aPair->[0]}='x'; next N1}
     }
     push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
 }
    for aPair in pairings:
        for aGroup in interm:
            if (aGroup.has_key(aPair[0])): aGroup[aPair[1]]='x'; break
            if (aGroup.has_key(aPair[1])): aGroup[aPair[0]]='x'; break
        else: interm.append( {aPair[0]:'x'} ); interm[-1][aPair[1]]='x'

Below is another loop with GOTO that jumps 2 levels out. Notice the use of a goto marker.

N2: for my $group (@interm) {
    for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
            if (exists ${$newcoup}{$k}) {map { ${$newcoup}{$_}='x'} (keys %$group); next N2;}
  }
    }
    push @fin, $group;
}
    goto=False
    for group in interm:
        for newcoup in fin:
            for k in group.keys():
                if newcoup.has_key(k):
                    newcoup.update(group);
                    goto=True
                    break
            if (goto):
                goto=False
                break
        else:
            fin.append(group)

The Python version above has not been tested much, but is a direct translation of the Perl code. The Perl version is fairly reliable, because i've used it over the year as part of a larger program. However, no any formal proof or exhaustive machine testing has been done. Later i might write some program to auto-test them … but that'd be another day. The Python version can also use some clean up, or even rewrite as a program in the Python language proper. Addenda or Errata will be added on this page.

If you have a question, put $5 at patreon and message me.