PWC 233

PWC 233

I tried to restrict to Perl 4 syntax for both challenges this week, as documented in Rex Swain's Perl 4 reference or the 1st edition of Programming perl. 

As I have mentioned before, my aims in using the old syntax are just mild nostalgia plus to add a bit of additional challenge to the task. But I should observe that it is really cool that I am able to write code using 30-year old syntax, and run it unmodified on the latest Perl 5.38. You cannot do this with Python 3 for example. This very deep backward compatibility is one of those unsung underappreciated beauties of Perl 5. In my opinion, it is a very cool thing indeed.

Update and disclaimer: These programs will not work on Perl 4 as I discovered later when I found and installed a Perl 4.019 executable. For one thing, the ability to assign a subroutine to a locally scoped typeglob was not available in Perl 4.

Challenge 1 (Similar Words)

We are given an array of strings with every string consisting of alphabetical characters only. We are asked to count how many pairs of elements of the array are built up from the same set of alphabetical characters. For example, 'aab' and 'aba' share the same set of alphabetical characters {a,b}, while 'aab' and 'bbc' do not share the same set of alphabetical characters ({a,b} in the first case, and {b,c} in the second).

I make a single pass through the array creating a hash %similar_words counting the number of times a particular combination of alphabetical characters is represented. For example, if 'aab' and 'aba' are in the array, then $similar_words{'ab'} increases by 2.

We then just need to count the number of pairs corresponding to each key of the hash. For example, if the hash key 'ab' has a value of 3, or represents three elements of the array, then these correspond to 3 pairs. In general, if the number of elements represented by a key  is n, then the number of pairs is given by n(n-1)/2.

sub nC2 {
    #-- combinations of n items 2 at a time (pairs)
    ($_[0] < 2) ? 0 : ($_[0]*($_[0]-1)/2);
}

To identify the character set corresponding to a particular element, I cobble up a homemade version of the uniq routine from List::Util. This just uses a hash to record membership of items in the array, and then returns the hash keys (unique items).

sub my_uniq {

    #-- homemade uniq (drop duplicates from an array)
    local (%my_uniq);
    foreach (@_) {$my_uniq{$_}=1}
    keys %my_uniq;
}

Then the helper sub-sub &chars that creates a hash key from the unique characters in a string, like say 'ab' from 'baaba'. Split the string, call my_uniq on the resulting array to get its unique elements, sort these unique elements, and then join them in a string again.

 #-- helper sub to extract unique chars
    local *chars = sub {
        join '',
         sort {$a cmp $b}
         &my_uniq( split //, $_[0] );
    };

And here is the core subroutine, looping through the array, counting occurrences of each character set, and then calling nC2 to count the number of pairs.

sub similar_words {
    # chars helper sub-sub routine
    ...
     
    #-- trunk of subroutine
    local %similar_words;
    foreach (@_) {
        $similar_words{ &chars($_) }++;
    }
    
    local $similar_words=0; #-- return value
    foreach (keys %similar_words) {
        $similar_words += &nC2( $similar_words{$_} );
    }
    
    $similar_words;
}

Challenge 2 (Frequency Sort)

We are asked to sort an array by the frequency of its elements, in ascending order. If there are ties, they are sorted in descending order of the tied elements.

I use a hash %frequency_sort to store frequencies, then one just needs to sort by $frequency_sort{$a} <=> $frequency_sort{$b} or $b <=> $a if there are ties (the <=> spaceship operator gives 0 for a tie, interpreted as False in a Perl conditional statement). Here is the key subroutine.

sub frequency_sort {
    local (%frequency_sort) = &my_freq( @_ );
    sort {
        ($frequency_sort{$a} <=> $frequency_sort{$b} ) ||
        ($b <=> $a);
    } @_;
}

Here is the subroutine &my_freq that uses a hash to count frequencies.

sub my_freq {
    #-- count frequencies
    local (%my_freq);
    foreach (@_) {$my_freq{$_}++}
    %my_freq;
}



Comments

Popular posts from this blog

PWC 258

PWC 249

PWC 253