Perl Weekly Challenge: Ba-a-nking Day! Ba-a-nking Day! That’s all I really wanted to say…

Oh, I’m weary from waiting
In Washington, D.C.
I’m coming to see my congressman
But he’s avoiding me
Weary from waiting down in Washington, D.C.

Oh, Congresswoman
Won’t you tell that congressman
I’ve waited such a long time
I’ve about waited all I can…

La plus ça change, plus c’est la même chose. But this week, things are not the same with Perl Weekly Challenge 259!

Task 1: Banking Day Offset

You are given a start date and offset counter. Optionally you also get bank holiday date list.

Given a number (of days) and a start date, return the number (of days) adjusted to take into account non-banking days. In other words: convert a banking day offset to a calendar day offset.

Non-banking days are:

a) Weekends
b) Bank holidays

Example 1

Input: $start_date = '2018-06-28', $offset = 3, $bank_holidays = ['2018-07-03']
Output: '2018-07-04'

Thursday bumped to Wednesday (3 day offset, with Monday a bank holiday)

Example 2

Input: $start_date = '2018-06-28', $offset = 3
Output: '2018-07-03'

Approach

We’re back to date manipulation, which were the crux of PWC 227 and 237, so we’re probably going to be using date manipulation modules…

Raku

In Raku, the manipulation we need is built in, so we only need Tom Browder’s Date::Names module for getting the day-of-the-week names I use in my explanation.

use Date::Names;

sub bankingDayOffset($start, $offset, @holidays) {
  my $date = Date.new($start); # convert string to Date
  my $off  = $offset;
  my $cnt  = 0;

  # convert holidays to Date objects
  @holidays = map { Date.new($_) }, @holidays;

  # instantiate a Date::Names object
  my $dn = Date::Names.new;

  my @explain;
  my $this_day = $dn.dow($date.day-of-week);
  while ($off) {
    $date++;
    my $next_day = $dn.dow($date.day-of-week);
    if (
      $date.day-of-week == 6 || # it's a Saturday
      $date.day-of-week == 7    # it's a Sunday
    ) { 
      @explain.push:
        "$next_day skipped because it's a weekend";
    }
    elsif ($date == @holidays.any) { # it's a Holiday
      @explain.push:
        "$next_day skipped because it's a holiday";
    }
    else {
      $off--; $cnt++;
      @explain.push:
        "$this_day bumped to $next_day (offset $cnt)";
      $this_day = $next_day;
    }
  }
  return $date.gist, @explain.join("\n");
}
$ raku/ch-1.raku
Example 1:
Input: $start_date = '2018-06-28', $offset = 3,
       $bank_holidays = ['2018-07-03']
Output: '2018-07-04'

Thursday bumped to Friday (offset 1)
Saturday skipped because it's a weekend
Sunday skipped because it's a weekend
Friday bumped to Monday (offset 2)
Tuesday skipped because it's a holiday
Monday bumped to Wednesday (offset 3)

Example 2:
Input: $start_date = '2018-06-28', $offset = 3
Output: '2018-07-03'

Thursday bumped to Friday (offset 1)
Saturday skipped because it's a weekend
Sunday skipped because it's a weekend
Friday bumped to Monday (offset 2)
Monday bumped to Tuesday (offset 3)

Example 3:
Input: $start_date = '2023-12-29', $offset = 5,
       $bank_holidays = ['2024-01-01']
Output: '2024-01-08'

Saturday skipped because it's a weekend
Sunday skipped because it's a weekend
Monday skipped because it's a holiday
Friday bumped to Tuesday (offset 1)
Tuesday bumped to Wednesday (offset 2)
Wednesday bumped to Thursday (offset 3)
Thursday bumped to Friday (offset 4)
Saturday skipped because it's a weekend
Sunday skipped because it's a weekend
Friday bumped to Monday (offset 5)

View the entire Raku script for this task on GitHub.

Perl

As usual, we don’t have to adjust the approach for converting from Raku to Perl, only the particulars. Perl doesn’t have a built-in module for Date manipulation, but Time::Piece and Time::Seconds are in the core modules, and Time::Piece has a method for generating the name of the day of the week. any we get from List::Util.

use List::Util qw( any );
use Time::Piece;
use Time::Seconds qw( ONE_DAY );

sub bankingDayOffset($start, $offset, @holidays) {
  # convert string to Date
  my $date = Time::Piece->strptime($start, "%Y-%m-%d")
                        ->truncate(to => 'day');
  my $cnt  = 0;

  # convert holidays to Date objects
  @holidays = map {
    Time::Piece->strptime($_, "%Y-%m-%d")
               ->truncate(to => 'day')
  } @holidays;

  my @explain;
  my $this_day = $date->fullday;
  while ($offset) {
    $date += ONE_DAY; # add 1 day
    my $next_day = $date->fullday;
    if (
      $date->wday == 7 || # it's a Saturday
      $date->wday == 1    # it's a Sunday
    ) { 
      push @explain,
        "$next_day skipped because it's a weekend";
    }
    elsif (any { $date == $_ } @holidays) { # it's a Holiday
      push @explain,
        "$next_day skipped because it's a holiday";
    }
    else {
      $offset--; $cnt++;
      push @explain,
        "$this_day bumped to $next_day (offset $cnt)";
      $this_day = $next_day;
    }
  }
  return $date->strftime('%F'), join("\n", @explain);
}

View the entire Perl script for this task on GitHub.

Python

Again, I’m using Python’s really robust datetime module.

from datetime import date, timedelta

def bankingDayOffset(start, offset, holidays):
  d = date.fromisoformat(start) # convert string to Date
  cnt = 0

  # convert holidays to Date objects
  holidays = [ date.fromisoformat(h) for h in holidays ]

  explain = []
  this_day = d.strftime('%A')
  while offset:
    d += timedelta(days = 1) # add 1 day
    next_day = d.strftime('%A')
    if (
      d.isoweekday() == 6 or # it's a Saturday
      d.isoweekday() == 7    # it's a Sunday
    ): 
      explain.append(
        f"{next_day} skipped because it's a weekend"
      )
    elif any([d == h for h in holidays]): # it's a Holiday
      explain.append(
        f"{next_day} skipped because it's a holiday"
      )
    else:
      offset -= 1
      cnt += 1
      explain.append(
        f"{next_day} bumped to {next_day} (offset {cnt})"
      )
      this_day = next_day
  return d.strftime('%F'), "\n".join(explain)

View the entire Python script for this task on GitHub.


Task 2: Line Parser

You are given a line like below:

{%  id   field1="value1"    field2="value2"  field3=42 %}

Where

a) "id" can be \w+.
b) There can be 0  or more field-value pairs.
c) The name of the fields are \w+.
b) The values are either number in which case we don't need
   double quotes or string in which case we need double quotes
   around them.

The line parser should return structure like below:

{
       name => id,
       fields => {
           field1 => value1,
           field2 => value2,
           field3 => value3,
       }
}

It should be able to parse the following edge cases too:

{%  youtube title="Title \"quoted\" done" %}

and

{%  youtube title="Title with escaped backslash \\" %}

BONUS: Extend it to be able to handle multiline tags:

{% id  filed1="value1" ... %}
LINES
{% endid %}

You should expect the following structure from your line parser:

{
       name => id,
       fields => {
           field1 => value1,
           field2 => value2,
           field3 => value3,
       }
       text => LINES
}

Approach

Wow! This seems like a pretty big task for a PWC, but I’m up for it.

Raku

It seems pretty obvious to me that, in Raku at least, the tool for this job is a grammar. I’ve never actually written a Raku grammar before, so I went through the grammar tutorial.

The really tricky part was allowing for strings that contained escaped quotes and escaped backslashes. Finally, after a lot of confusion, I stopped trying to keep the data I was parsing in the program file itself, since I couldn’t be certain that "Title with escaped backslash \\" was indeed an escaped backslash and not being interpreted by Raku as escaping the quote. You can look at my input files here.

grammar Parser {
  rule TOP { [ <line> | <text> ] }

  rule line { '{%' <id> [ <field-value> ]* '%}' }

  # negative lookbehind and negative lookahead
  rule text { <!after 「{%」 > <-[ \n ]>+ <!before 「%}」 >}

  token id    { \w+ }
  token field { \w+ }

  token number { \d+ [ \. \d+ ]? }

  token quoted-string { '"' <string> '"' }
  token string {
    [
      <-[ " ]> # any character not a quote
      |
      「\\」     # an escaped backslash
      |
      \\\"     # an escaped quote
    ]*
    }
  
  rule field-value { <field> '=' [<number> | <quoted-string>] }
}

sub MAIN($file) {
  my %data;
  my @ids;
  my $in_id = '';
  for $file.IO.lines -> $line {
    # parse this line of the file
    my $p = Parser.parse($line);

    # is there a line with {% ... %} ?
    if ($p<line>) {
      my $id = $p<line><id>.Str;
      # is the id the end of a block?
      if (my $c = ($id ~~ / end(\w+) /)) { # capture after end
        if (%data{$c[0]}:exists) { # it is!
          $id = $c[0]; 
          $in_id = ''; # clear the id we're processing
          if (%data{$id}{'text'}) {
            # if there's text, remove the final "newline"
            %data{$id}{'text'} ~~ s/\\n$//;
          }
          next; # skip to next line of file
        }
      }
      @ids.push($id); # keep list of ids in order
      $in_id = $id; # keep track of the current id for text
      # initialize base data for this id
      %data{$id} = { name => $id };
      # if we have fields...
      if ($p<line><field-value>) {
        # loop over them and store them in the data
        for $p<line><field-value> -> $fv {
          my $field = $fv<field>;
          my $value = $fv<number> ?? $fv<number> 
                   !! $fv<quoted-string><string>;
          %data{$id}{'fields'}{$field} = $value;
        }
      }
    }
    # if we have non-{% ... %} lines and we have an ID
    elsif ($p<text> && $in_id) {
      # append a "newline" to the end
      %data{$in_id}{'text'} ~= $p<text> ~ "\\n";
    }
  }

  # dump the data
  for @ids -> $id {
    my %group = %data{$id};
    say "\{";
    say "  name => %group{'name'},";
    say "  fields => \{";
    for %group{'fields'}.keys.sort -> $k {
      say "    $k => %group{'fields'}{$k},";
    }
    say "  }";
    say "  text => %group{'text'}" if %group{'text'};
    say "\}";
  }
}

And here’s my output:

$ raku/ch-2.raku data/parser-1.txt
{
  name => id,
  fields => {
    field1 => value1,
    field2 => value2,
    field3 => 42,
  }
}

$ raku/ch-2.raku data/parser-2.txt
{
  name => youtube,
  fields => {
    title => Title \"quoted\" done,
  }
}

$ raku/ch-2.raku data/parser-3.txt
{
  name => youtube,
  fields => {
    title => Title with escaped backslash \\,
  }
}

$ raku/ch-2.raku data/parser-4.txt
{
  name => id,
  fields => {
    filed1 => value1,
  }
  text => LINES
}

$ raku/ch-2.raku data/parser-5.txt
{
  name => id,
  fields => {
    filed1 => value1,
  }
  text => LINES\nLINES\nLINES
}
{
  name => foo,
  fields => {
    foonum => 3,
  }
  text => FOO\nBAR\nBAZ
}

View the entire Raku script for this task on GitHub.

Perl

It’s after the submission deadline, but I finally got around to implementing the parser in Perl:

use v5.38;

my $ID     = qr/ (?<ID> \w+) /x;
my $FIELD  = qr/ (?<FIELD> \w+) /x;
my $NUMBER = qr/ (?<NUMBER> \d+ [ \. \d+ ]? ) /x;
my $STRING = qr/ (?<STRING> ([^"] | \\ | \\\" )+ ) /x;

my $QUOTED_STRING = qr/ (?<QUOTED_STRING> " $STRING " ) /x;
my $FIELD_VALUE = qr/ $FIELD \s* = \s* ( $NUMBER | $QUOTED_STRING ) \s* /x;
my $FIELD_VALUES = qr/ (?<FIELD_VALUES> (?: $FIELD_VALUE \s* )* ) /x;

# negative lookbehind and negative lookahead
my $TEXT = qr/ (?<TEXT> (?<! {% ) .+ (?! %} ) ) /x;
my $LINE = qr/ (?<LINE> \{% \s* $ID \s* $FIELD_VALUES \s* %\} ) /x;

my $TOP  = qr/^ (?: $LINE | $TEXT ) $/x;

my $file = shift @ARGV;
open my $fh, '<', $file;

my %data;
my @ids;
my $in_id;

while (my $line = <$fh>) {
  $line =~ /$TOP/;

  # is there a line with {% ... %} ?
  if ($+{LINE}) {
    my $id = $+{ID};
    # is the id the end of a block?
    if ($id =~ /^ end(\w+) $/x) { # capture after end
      if (exists $data{$1}) { # it is!
        $id = $1; 
        undef $in_id; # clear the id we're processing
        if ($data{$id}{'text'}) {
          # if there's text, remove the final "newline"
          $data{$id}{'text'} =~ s/\\n$//;
        }
        next; # skip to next line of file
      }
    }
    push @ids, $id; # keep list of ids in order
    $in_id = $id; # keep track of the current id for text
    # initialize base data for this id
    $data{$id} = { name => $id };
    # if we have fields...
    my $field_values = $+{FIELD_VALUES};
    # loop over field values and store them in the data
    while ($field_values =~ /$FIELD_VALUE/g) {
      my $value = $+{STRING} ? $+{STRING} : $+{NUMBER};
      if ($+{NUMBER}) {
        $value =~ s/\s+$//; # we're picking up trailing spaces
      }
      $data{$id}->{'fields'}->{ $+{FIELD} } = $value;
    }

  }
  # if we have non-{% ... %} lines and we have an ID
  elsif ($+{TEXT} && $in_id) {
    # append a "newline" to the end
    $data{$in_id}{'text'} .= $+{TEXT} . "\\n";
  }
}

use Data::Dumper::Concise;
foreach my $id (@ids) {
  print Dumper($data{$id});
}
$ perl/ch-2.pl data/parser-1.txt
{
  fields => {
    field1 => "value1",
    field2 => "value2",
    field3 => 42,
  },
  name => "id",
}

$ perl/ch-2.pl data/parser-2.txt
{
  fields => {
    title => "Title \\",
  },
  name => "youtube",
}

$ perl/ch-2.pl data/parser-3.txt
{
  fields => {
    title => "Title with escaped backslash \\\\",
  },
  name => "youtube",
}

$ perl/ch-2.pl data/parser-4.txt
{
  fields => {
    filed1 => "value1",
  },
  name => "id",
  text => "LINES",
}

$ perl/ch-2.pl data/parser-5.txt
{
  fields => {
    filed1 => "value1",
  },
  name => "id",
  text => "LINES\\nLINES\\nLINES",
}
{
  fields => {
    foonum => 3,
  },
  name => "foo",
  text => "FOO\\nBAR\\nBAZ",
}

View the entire Perl script for this task on GitHub.


Here’s all my solutions in GItHub: https://github.com/packy/perlweeklychallenge-club/tree/master/challenge-259/packy-anderson