Python, Perl: Comparison Pairings Reduce

By Xah Lee. Date: 2005-02-14

Here's a interesting real-world algorithm to have fun with.

Attached below is the Perl documentation that i wrote for a function called “reduce”, which is really the heart of a software.

The implementation is really simple, but the key is to understand what the function should be. I'll post Perl and Python codes tomorrow.

=pod

reduce( $pairings, $a_pair) return the first argument with some pairs 
deleted.

Detail:

we have n things, represented by numbers 1 to n. Some of these are
identical. We want to partition the range of numbers 1 to n so that
identical ones are grouped together.

To begin comparison, we generate a list of pairings that's all
possible parings of numbers 1 to n. (of course order does not matter,
and the pairing does not contain repetitions) This is the first
argument to reduce.

Now suppose we know that 2 and 4 are identical, then we want to reduce
the pairings so that we don't have to do extra comparison. For
example, if the pairing list contains (2,3) and (4,3), one of them can
be deleted because now 2 and 4 have been determined identical.

(We do this because we expect the comparison will be expensive.)

reduce( $pairings, $a_pair) returns a reduced $pairings knowing that
$a_pair are identical.

The first argument $pairings must be in the form of a hash like this:

 {'1,5' => [1,5],'3,5' => [3,5],'2,4' => [2,4],'4,5' => [4,5],'1,3' =>
 [1,3],'2,5' => [2,5],'1,2' => [1,2],'3,4' => [3,4],'2,3' =>
 [2,3],'1,4' => [1,4]}

(Note that keys are strings of the pairs separated by a comma.)

$a_pair is a reference to a list of the form [$a,$b].

The return value is a reference to a hash.

=cut

Answer below.

↓
↓
↓
↓
↓
↓
↓
↓
↓

↓
↓
↓
↓
↓
↓
↓
↓
↓

↓
↓
↓
↓
↓
↓
↓
↓
↓

↓
↓
↓
↓
↓
↓
↓
↓
↓

↓
↓
↓
↓
↓
↓
↓
↓
↓

↓
↓
↓
↓
↓
↓
↓
↓
↓

Perl

# -*- coding: utf-8 -*-
# perl

sub reduce ($$) {
my %hh= %{$_[0]}; # for example: {'1,2'=>[1,2],'5,6'=>[5,6],…}
my ($j1,$j2)=($_[1]->[0],$_[1]->[1]);  # for example: [3,4]
delete $hh{"$j1,$j2"};
foreach my $k (keys %hh) {
  $k=~m/^(\d+),(\d+)$/;
  my ($k1,$k2)=($1,$2);
  if ($k1==$j1) {
            if ($j2 < $k2) {
                delete $hh{"$j2,$k2"};
            }
            else {
                delete $hh{"$k2,$j2"};
            };
        };
  if ($k2==$j1) {
            if ($k1 < $j2) {
                delete $hh{"$k1,$j2"};
            } 
            else {
                delete $hh{"$j2,$k1"};
            };
        };
    }
return \%hh;
}

Python

# -*- coding: utf-8 -*-
# python 2

def reduce(pairings, pair):
    ps=pairings.copy(); j=pair;
    ps.pop("%d,%d"%(j[0],j[1]),0)
    for k in pairings.itervalues():
  if (k[0]==j[0]):
            if (j[1] < k[1]):
                ps.pop("%d,%d"%(j[1],k[1]),0)
            else:
                ps.pop("%d,%d"%(k[1],j[1]),0)
        if (k[1]==j[0]):
            if (k[0] < j[1]):
                ps.pop("%d,%d"%(k[0],j[1]),0)
            else:
                ps.pop("%d,%d"%(j[1],k[0]),0)
    return ps

When looping thru a list-like entity and delete elements in the vary list, there's a question whether it will change the iteration. (For example, if one loops thru 1 to 9, and deleted 8 while at 2, should the loop still do 8?) This is a design issue. Both behavior are useful. For some languages and or list entities, the answer may be yes or no. However, in imperative languages such as Perl and Python and Java, often modifying a list while looping simply cannot be done, justified as a protection to safeguard programers as ignoramuses, but partially because the internal issues of these languages. (These languages have molded a generation of programers to question and discourse man-made complexities.)

The work around in these languages is always to make a copy of the list-entity, and work with the copy.

Note also that in Python there's already a function called reduce. (it is equivalent to Mathematica's Fold.) In Python, looks like user can override default functions.

2005-02-16

Sean Gugler supplied this version of reduce:

# -*- coding: utf-8 -*-
# python 2

def reduce2( pairings, pair ):
     result={}
     for i,j in pairings.itervalues():
         if i in pair: i=pair[0]
         if j in pair: j=pair[0]
         if i>j: (i,j) = (j,i)
         if i!=j: result["%d,%d"%(i,j)] = (i,j)
     return result

It works entirely differently. Instead of deleting pairs, it built the dictionary. Very nice. I'm not sure which is algorithmically more efficient…

This reduce2 produces identical results with the 'reduce' above, for input of combo(119) and [2,3]. I haven't tested exhaustively or thought about their algorithmic equivalence… Also note the syntax:

  for i,j in pairings.itervalues():

instead of

    for k in pairings.itervalues():
        k[0], k[1]…

So in Python, one can loop thru a list of tuples with variables for each element in tuple. Very convenient.