Perl Weekly Challenge: You might think I’m crazy, but I don’t even care

Perl Weekly Challenge 376’s tasks are “Chessboard Squares” and “Doubled Words”.

I already picked Chess the musical back in PWC 281, double for 290, and word for 255 and 360, but I haven’t given any musical love to squares, which is a shame, because it’s hip to be square… (or, wait, do I mean it’s hip to be A square?).

Task 1: Chessboard Squares

You are given the coordinates of two squares on an 8×8 chessboard.

Write a script to find whether the two given squares have the same colour.

8 W B W B W B W B
7 B W B W B W B W
6 W B W B W B W B
5 B W B W B W B W
4 W B W B W B W B
3 B W B W B W B W
2 W B W B W B W B
1 B W B W B W B W
  a b c d e f g h

Example 1

Input: $c1 = "a7", $c2 = "f4"
Output: true

Example 2

Input: $c1 = "c1", $c2 = "e8"
Output: false

Example 3

Input: $c1 = "b5", $c2 = "h2"
Output: false

Example 4

Input: $c1 = "f3", $c2 = "h1"
Output: true

Example 5

Input: $c1 = "a1", $c2 = "g8"
Output: false

Approach

I mean, the brute force solution would be to encode the chessboard color matrix as a 2D array/hash and just look up the color given the coordinates.

But I think it will take less space and be just as readable if we note the pattern of colors: columns a, c, e, & g have black squares for odd numbered rows and white squares for even numbered ones, and columns b, d, f, & h have black squares for even numbered rows and white for odd.

But we don’t even need to worry about what the colors actually are. All we care about is whether they’re the same.
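
One way to see why the actual colors don't matter: number the columns a through h as 0 through 7, add the row number, and check whether the sum is odd or even. Squares whose sums have the same parity share a color. Here's a minimal Python sketch of that idea. It is not one of the solutions submitted for the challenge, and the names square_parity and same_colour are made up for illustration.

```python
def square_parity(coord):
    """Return 1 for squares coloured like a1 (black in the diagram), else 0."""
    col, row = coord[0], int(coord[1])
    # Column letters a..h become 0..7; adding the row gives a sum whose
    # parity flips with every single step along a row or a column.
    return (ord(col) - ord("a") + row) % 2


def same_colour(c1, c2):
    # Two squares share a colour exactly when their parities agree.
    return square_parity(c1) == square_parity(c2)


print(same_colour("a7", "f4"))  # True
print(same_colour("c1", "e8"))  # False
```

The sketch assumes well-formed two-character input such as "a7"; it does no validation.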

Raku

Because the coordinates are always two characters, with the column first and then the row, I decided to just use .substr to grab each character.

sub isBlack($c) {
  my ($col, $row) = ($c.substr(0..0), $c.substr(1..1));
  ($col ~~ /<[aceg]>/ && $row % 2 == 1) ||
  ($col ~~ /<[bdfh]>/ && $row % 2 == 0) ?? 1 !! 0;
}

sub chessboardSquares($c1, $c2) {
  isBlack($c1) == isBlack($c2) ?? 'true' !! 'false';
}

View the entire Raku script for this task on GitHub.

$ raku/ch-1.raku
Example 1:
Input: $c1 = "a7", $c2 = "f4"
Output: true

Example 2:
Input: $c1 = "c1", $c2 = "e8"
Output: false

Example 3:
Input: $c1 = "b5", $c2 = "h2"
Output: false

Example 4:
Input: $c1 = "f3", $c2 = "h1"
Output: true

Example 5:
Input: $c1 = "a1", $c2 = "g8"
Output: false

Perl

The only differences between the Perl and Raku solutions are the syntax of substr, the match operator (=~ instead of ~~), the regex syntax for an enumerated character class, and the ternary operator.

sub isBlack($c) {
  my ($col, $row) = (substr($c, 0, 1), substr($c, 1, 1));
  ($col =~ /[aceg]/ && $row % 2 == 1) ||
  ($col =~ /[bdfh]/ && $row % 2 == 0) ? 1 : 0;
}

sub chessboardSquares($c1, $c2) {
  isBlack($c1) == isBlack($c2) ? 'true' : 'false';
}

View the entire Perl script for this task on GitHub.

Python

Python’s conditional expression (x if cond else y) isn’t as easy to read as a ternary operator, so I just made it two ifs and a catch-all. Also, I had to remember that I needed to cast the row as an integer.

import re

def is_black(c):
  col, row = c[0:1], int(c[1:2])
  if re.match("[aceg]", col) and (row % 2 == 1): return 1
  if re.match("[bdfh]", col) and (row % 2 == 0): return 1
  return 0

def chessboard_squares(c1, c2):
  return 'true' if is_black(c1) == is_black(c2) else 'false'

View the entire Python script for this task on GitHub.

Elixir

I have to require the Integer module (on line 4 of the full script) because…

** (UndefinedFunctionError) function Integer.is_odd/1 is undefined or private.
However, there is a macro with the same name and arity. Be sure to
require Integer if you intend to invoke this macro


But in addition, when I was piping the numeric character through String.to_integer/1 so I could check to see if it was odd or even, I realized I didn’t need to do two regex matches and two modulo checks. I could store the result of each check and then test if both were true or both were false.

require Integer

def is_black(c) do
  col = String.at(c, 0) |> String.match?(~r/[aceg]/)
  row = String.at(c, 1) |> String.to_integer |> Integer.is_odd
  if (col && row) || (!col && !row), do: true, else: false
end

def chessboard_squares(c1, c2) do
  if is_black(c1) == is_black(c2), do: "true", else: "false"
end

View the entire Elixir script for this task on GitHub.
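
The "both true or both false" test in the Elixir is_black amounts to asking whether two booleans are equal. As a rough illustration, here is that observation as a Python sketch. The function name and variable names are hypothetical, chosen to mirror the Elixir version, and this is not the script linked above.

```python
def is_black(square):
    # Store the result of each check once, as the Elixir version does.
    col_in_aceg = square[0] in "aceg"
    row_is_odd = int(square[1]) % 2 == 1
    # (a and b) or (not a and not b) is true exactly when a == b,
    # so a single equality comparison covers both cases.
    return col_in_aceg == row_is_odd


print(is_black("a1"))  # True
print(is_black("a8"))  # False
```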


Task 2: Doubled Words

You are given a string (which may contain embedded newlines) which is taken from a page on a website. The string will not contain brackets qw{ [ ] }.

Write a script that will find doubled words (such as “this this”) and highlight (wrap in brackets) each doubled word.

The script should:

- Work across lines, even finding situations where a word at the end of
  one line is repeated at the beginning of the next.

- Find doubled words despite capitalization differences, such as with
  'The the...', as well as allow differing amounts of whitespace (spaces,
  tabs, newlines, and the like) to lie between the words.

- Find doubled words even when separated by HTML tags. For example, to
  make a word bold: '...it is <B>very</B> very important...'. Only show
  lines containing doubled words.

Adapted from Mastering Regular Expressions, Third Edition by Jeffrey E. F. Friedl

Example 1

Input: $str = "you're given the job of checking the pages on a\nweb server for doubled words (such as 'this this'), a common problem\nwith documents subject to heavy editing."
Output: "web server for doubled words (such as '[this] [this]'), a common problem"

Example 2

Input: $str = "Find doubled words despite capitalization differences, such as with 'The\nthe...', as well as allow differing amounts of whitespace (spaces,\ntabs, newlines, and the like) to lie between the words."
Output: "Find doubled words despite capitalization differences, such as with '[The]\n[the]...', as well as allow differing amounts of whitespace (spaces,"

Example 3

Input: $str = "to make a word bold: '...it is <B>very</B> very important...'."
Output: "to make a word bold: '...it is <B>[very]</B> [very] important...'."

Example 4

Input: $str = "Perl officially stands for Practical Extraction and Report Language, except when it doesn't."
Output: ""

Example 5

Input: $str = "There's more than one one way to do it.\nEasy things should be easy and hard things should be possible."
Output: "There's more than [one] [one] way to do it."

Approach

Since this came from a book on mastering regular expressions, let’s use regular expressions. We want to match a word on word boundaries, then allow any number of HTML tags or whitespace characters, and then match the word again. But we need to be able to ignore case when matching the word (in example 2 we have to match The and the).
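
To make the pieces of that description concrete, here is the same idea spelled out as an annotated pattern. This is only a sketch in Python, using re.VERBOSE so each part can carry a comment. The name DOUBLED is made up for illustration, and the solutions below build their patterns in each language's own syntax.

```python
import re

DOUBLED = re.compile(r"""
    \b(\w+)\b          # a whole word, captured as group 1
    (                  # group 2: whatever separates the two words
      (?: <[^>]+>      #   an HTML tag such as </B>
        | \s           #   or one whitespace character (space, tab, newline)
      )+               #   one or more of either, in any mix
    )
    \b(\1)\b           # the same word again, captured as group 3
""", re.IGNORECASE | re.VERBOSE)

m = DOUBLED.search("...it is <B>very</B> very important...")
print(m.groups())  # ('very', '</B> ', 'very')
```

With re.IGNORECASE the backreference \1 also matches regardless of case, which is what lets "The" pair up with "the".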

Raku

This took me a while because I still haven’t internalized Raku’s Regex syntax. <?wb> matches word boundaries instead of \b. <-[ ]> is a negated character class instead of [^ ]. Adverbs like i and g are prefixed with a colon and go before or within the regex. And using a Regex Boolean condition check wasn’t my idea; it was a suggestion in the documentation about the limitations of the Ignorecase adverb.

sub doubleDouble($str is copy) {
  # find doubled words and wrap them
  $str ~~ s:i/<?wb>(\w+)<?wb>
  ([ \< <-[\>]>+ \> | \s | \n ]+)
  {}<?wb>(\w+)<?{$0.fc eq $2.fc}><?wb>/\[$0\]$1\[$2\]/;

  # strip away lines that were not changed
  $str ~~ s:g/^  <-[\[]>+ \n//; # lines starting
  $str ~~ s:g/\n <-[\[]>+  $//; # lines ending
  $str ~~ s:g/^  <-[\[]>+  $//; # no subs
  $str;
}

View the entire Raku script for this task on GitHub.

$ raku/ch-2.raku
Example 1:
Input: $str = "you're given the job of checking the pages on a\nweb server for doubled words (such as 'this this'), a common problem\nwith documents subject to heavy editing."
Output: "web server for doubled words (such as '[this] [this]'), a common problem"

Example 2:
Input: $str = "Find doubled words despite capitalization differences, such as with 'The\nthe...', as well as allow differing amounts of whitespace (spaces,\ntabs, newlines, and the like) to lie between the words."
Output: "Find doubled words despite capitalization differences, such as with '[The]\n[the]...', as well as allow differing amounts of whitespace (spaces,"

Example 3:
Input: $str = "to make a word bold: '...it is <B>very</B> very important...'."
Output: "to make a word bold: '...it is <B>[very]</B> [very] important...'."

Example 4:
Input: $str = "Perl officially stands for Practical Extraction and Report Language, except when it doesn't."
Output: ""

Example 5:
Input: $str = "There's more than one one way to do it.\nEasy things should be easy and hard things should be possible."
Output: "There's more than [one] [one] way to do it."

Perl

In Perl’s regex, however, I didn’t need a condition check asserting that the first and second doubled words fold to the same case. I just let the /i modifier do that for me, and I captured the second occurrence into $3.

sub doubleDouble($str) {
  # find doubled words and wrap them
  $str =~ s/\b(\w+)\b((?:<[^>]+>|\s|\n)+)\b(\1)\b/\[$1\]$2\[$3\]/i;

  # strip away lines that were not changed
  $str =~ s/^[^\[]+\n//g; # lines starting
  $str =~ s/\n[^\[]+$//g; # lines ending
  $str =~ s/^[^\[]+$//g;  # no subs
  $str;
}

View the entire Perl script for this task on GitHub.

Python

Converting to Python was easy because, for this pattern, the regex syntax is the same as Perl’s.

import re

def double_double(string):
  # find doubled words and wrap them
  string = re.sub(
    r'\b(\w+)\b((?:<[^>]+>|\s|\n)+)\b(\1)\b',
    r'[\1]\2[\3]', string, flags=re.I
  )

  # strip away lines that were not changed
  string = re.sub(r'^[^\[]+\n', '', string) # lines starting
  string = re.sub(r'\n[^\[]+$', '', string) # lines ending
  string = re.sub(r'^[^\[]+$',  '', string) # no subs
  return string

View the entire Python script for this task on GitHub.
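
The three line-stripping substitutions could also be replaced by a plain line filter. Because the task guarantees the input has no brackets, any line containing "[" after the first substitution must hold a highlighted word. The following is a sketch of that alternative, not the script linked above; DOUBLED and doubled_lines are illustrative names.

```python
import re

# Same pattern as in the solutions above (\s already covers newlines).
DOUBLED = re.compile(r'\b(\w+)\b((?:<[^>]+>|\s)+)\b(\1)\b', re.IGNORECASE)


def doubled_lines(text):
    # Wrap both copies of each doubled word in brackets.
    marked = DOUBLED.sub(r'[\1]\2[\3]', text)
    # The input never contains brackets, so a "[" can only come from
    # the substitution. Keep just the lines that have one.
    kept = [line for line in marked.split("\n") if "[" in line]
    return "\n".join(kept)


print(doubled_lines("There's more than one one way to do it.\nEasy things."))
# There's more than [one] [one] way to do it.
```

One difference in behavior: the filter also drops an unchanged line that sits between two highlighted lines, which the three substitutions would leave in place.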

Elixir

And Elixir’s regexes are PCRE-based, so the same Perl-style pattern works there as well. I really like that I was able to pipe the result of one String.replace/4 into the next instead of having to assign the results to variables.

import String

def double_double(str) do
  str # find doubled words and wrap them
  |> replace(~r/\b(\w+)\b((?:<[^>]+>|\s|\n)+)\b(\1)\b/i,
             "[\\1]\\2[\\3]")
  # strip away lines that were not changed
  |> replace(~r/^[^\[]+\n/, "") # lines starting
  |> replace(~r/\n[^\[]+$/, "") # lines ending
  |> replace(~r/^[^\[]+$/,  "") # no subs
end

View the entire Elixir script for this task on GitHub.


Here are all my solutions on GitHub: https://github.com/packy/perlweeklychallenge-club/tree/challenge-376-packy-anderson/challenge-376/packy-anderson
