With PWC 350‘s tasks being “Good Substrings” and “Shuffle Pairs”, I really couldn’t avoid the musical theme being Boz Scaggs’ Lido Shuffle.
So let’s do one more for the road in the Perl Weekly Challenge.
Task 1: Good Substrings
You are given a string.
Write a script to return the number of good substrings of length three in the given string.
A string is good if there are no repeated characters.
Example 1
Input: $str = "abcaefg"
Output: 5
Good substrings of length 3: abc, bca, cae, aef and efg
Example 2
Input: $str = "xyzzabc"
Output: 3
Good substrings of length 3: "xyz", "zab" and "abc"
Example 3
Input: $str = "aababc"
Output: 1
Good substrings of length 3: "abc"
Example 4
Input: $str = "qwerty"
Output: 4
Good substrings of length 3: "qwe", "wer", "ert" and "rty"
Example 5
Input: $str = "zzzaaa"
Output: 0
Approach
I think this is a good use for a bag: loop over the string and extract three-character substrings, and then use a bag to determine if any of the characters occur more than once.
Raku
We can’t push the substring into the Bag to count the characters, we have to break the substring into its composite characters. I used .comb to do that, and then .values to get the character counts and .any to determine if any of those counts were greater than 1.
sub goodString($str) {
my $good = 0;
for 0 .. $str.chars - 3 -> $i {
my $substr = $str.substr($i..$i+2);
my %bag = $substr.comb().Bag;
$good++ unless %bag.values.any > 1;
}
return $good;
}View the entire Raku script for this task on GitHub.
$ raku/ch-1.raku
Example 1:
Input: $str = "abcaefg"
Output: 5
Example 2:
Input: $str = "xyzzabc"
Output: 3
Example 3:
Input: $str = "aababc"
Output: 1
Example 4:
Input: $str = "qwerty"
Output: 4
Example 5:
Input: $str = "zzzaaa"
Output: 0Perl
I’m once again stealing Matthias Muth‘s idea of using List::MoreUtils’ frequency to create a bag/multiset, as well as importing any from List::MoreUtils as well. Otherwise, it’s pretty much the same as the Raku solution.
use List::MoreUtils qw( any frequency );
sub goodString($str) {
my $good = 0;
foreach my $i (0 .. length($str) - 3) {
my $substr = substr($str, $i, $i+2);
my %bag = frequency split //, $substr;
$good++ unless any { $_ > 1 } values %bag;
}
return $good;
}View the entire Perl script for this task on GitHub.
Python
As usual, I’m using collections's Counter to make our multiset, and the built-in any() to mke sure none of the counts were greater than 1.
from collections import Counter
def good_string(mystr):
good = 0
for i in range(len(mystr)-2):
substr = mystr[i:i+3]
bag = Counter(list(substr))
if not any([x > 1 for x in bag.values()]):
good += 1
return goodView the entire Python script for this task on GitHub.
Elixir
Once again, I’m using recursion to do my looping, and Enum.frequencies/1 to make my bag/multiset.
def good_string(_, good, i, m) when i >= m, do: good
def good_string(str, good, i, m) do
bag = str
|> String.slice(i..i+2) # get the substring
|> String.graphemes # split into characters
|> Enum.frequencies # make the map
good = if Enum.any?(Map.values(bag), fn x -> x > 1 end) do
good
else
good+1
end
good_string(str, good, i+1, m)
end
def good_string(str) do
good_string(str, 0, 0, String.length(str) - 2)
endView the entire Elixir script for this task on GitHub.
Task 2: Shuffle Pairs
If two integers A <= B have the same digits but in different orders, we say that they belong to the same shuffle pair if and only if there is an integer k such that A = B * k. k is called the witness of the pair.
For example, 1359 and 9513 belong to the same shuffle pair, because 1359 * 7 = 9513.
Interestingly, some integers belong to several different shuffle pairs. For example, 123876 forms one shuffle pair with 371628, and another with 867132, as 123876 * 3 = 371628, and 123876 * 7 = 867132.
Write a function that for a given $from, $to, and $count returns the number of integers $i in the range $from <= $i <= $to that belong to at least $count different shuffle pairs.
PS: Inspired by a conversation between Mark Dominus and Simon Tatham at Mastodon.
Example 1
Input: $from = 1, $to = 1000, $count = 1
Output: 0
There are no shuffle pairs with elements less than 1000.
Example 2
Input: $from = 1500, $to = 2500, $count = 1
Output: 3
There are 3 integers between 1500 and 2500 that belong to shuffle pairs.
1782, the other element is 7128 (witness 4)
2178, the other element is 8712 (witness 4)
2475, the other element is 7425 (witness 3)
Example 3
Input: $from = 1_000_000, $to = 1_500_000, $count = 5
Output: 2
There are 2 integers in the given range that belong to 5 different shuffle pairs.
1428570 pairs with 2857140, 4285710, 5714280, 7142850, and 8571420
1429857 pairs with 2859714, 4289571, 5719428, 7149285, and 8579142
The witnesses are 2, 3, 4, 5, and 6 for both the integers.
Example 4
Input: $from = 13_427_000, $to = 14_100_000, $count = 2
Output: 11
6 integers in the given range belong to 3 different shuffle pairs, 5 integers belong to 2 different ones.
Example 5
Input: $from = 1030, $to = 1130, $count = 1
Output: 2
There are 2 integers between 1020 and 1120 that belong to at least one shuffle pair:
1035, the other element is 3105 (witness k = 3)
1089, the other element is 9801 (witness k = 9)
Approach
I’m sure there’s some clever way to do this, but I can’t think of it. I’m thinking that the best way to handle this is to loop over the values between $from and $to, generate their multiples with increasing witness values ($k) and see if any of those are shuffles of the starting value. I’m checking for shuffles by sorting the digits in each number to see if they’re the same (i.e., 3421 and 2413 would both, when their digits are sorted, become 1234, so they’re shuffled versions of each other) and keep going until the number of digits in $A * $k is larger than the number of digits in $A (meaning it can’t be a shuffled version of $A because it has too many digits).
Raku
In Raku, I made helper functions to do the length comparison and detecting if the numbers are potential shuffles of each other. Then I put the logic for checking a particular number into a hasShufflePair function. The script, however, takes over a minute to run!
sub isSameLength($A, $B) {
$A.Str.chars == $B.Str.chars;
}
sub isPair($A, $B) {
$A.comb.sort.join eq $B.comb.sort.join;
}
sub hasShufflePair($A, $min) {
my $count = 0;
my $k = 2;
while (True) {
my $B = $A * $k++;
# stop processing $A if $B has more digits than $A
return 0 unless isSameLength($A, $B);
# go to the next $k if $A & $B aren't combinations
# of the same digits
next unless isPair($A, $B);
# it's a shuffle pair, count it
if (++$count >= $min) {
# abort searching if we found the minimum count of pairs
return 1;
}
}
return 0;
}
sub shufflePairs($from, $to, $count) {
sum([$from .. $to].map({hasShufflePair($_, $count)}));
}$ time raku/ch-2.raku
Example 1:
Input: $from = 1, $to = 1000, $count = 1
Output: 0
Example 2:
Input: $from = 1500, $to = 2500, $count = 1
Output: 3
Example 3:
Input: $from = 1000000, $to = 1500000, $count = 5
Output: 2
Example 4:
Input: $from = 13427000, $to = 14100000, $count = 2
Output: 11
Example 5:
Input: $from = 1030, $to = 1130, $count = 1
Output: 2
real 1m13.067s
user 1m12.044s
sys 0m0.267sBut, in Raku, there’s a way to process these in parallel to produce faster results.
sub shufflePairs($from, $to, $count) {
sum([$from .. $to].race(:degree(16))
.map({hasShufflePair($_, $count)}));
}In Raku, objects with the role Iterable have methods hyper and race that allow the iterable to be processed in parallel (hyper keeps the output order the same as the input order, and race the output order isn’t preserved; the documentation says the mnemonic is “in a race, you never know who will arrive first”). Doing this with parallel processing, it’s much faster:
View the entire Raku script for this task on GitHub.
$ time raku/ch-2.raku
Example 1:
Input: $from = 1, $to = 1000, $count = 1
Output: 0
Example 2:
Input: $from = 1500, $to = 2500, $count = 1
Output: 3
Example 3:
Input: $from = 1000000, $to = 1500000, $count = 5
Output: 2
Example 4:
Input: $from = 13427000, $to = 14100000, $count = 2
Output: 11
Example 5:
Input: $from = 1030, $to = 1130, $count = 1
Output: 2
real 0m16.659s
user 2m8.096s
sys 0m1.723sPerl
Interestingly, running the non-parallel version in Perl was way faster: 0m18.853s. With that kind of speed, I didn’t feel the need to try to make it use more than one core of my computer.
Note: on line 7, I’m appending an empty string to explicitly cast the values as strings.
use List::AllUtils qw( sum );
sub isSameLength($A, $B) {
length($A."") == length($B."");
}
sub isPair($A, $B) {
join('', sort split //, $A) eq join('', sort split //, $B);
}
sub hasShufflePair($A, $min) {
my $count = 0;
my $k = 2;
while (1) {
my $B = $A * $k++;
# stop processing $A if $B has more digits than $A
return 0 unless isSameLength($A, $B);
# go to the next $k if $A & $B aren't combinations
# of the same digits
next unless isPair($A, $B);
# it's a shuffle pair, count it
if (++$count >= $min) {
# abort searching if we found the minimum count of pairs
return 1;
}
}
return 0;
}
sub shufflePairs($from, $to, $count) {
sum(map { hasShufflePair($_, $count) } $from .. $to);
}View the entire Perl script for this task on GitHub.
Python
Python was also able to run the non-parallel version in 0m20.718s, so I didn’t bother trying to speed it up.
def is_same_length(A, B):
return len(str(A)) == len(str(B))
def is_pair(A, B):
return(
"".join(sorted(list(str(A))))=="".join(sorted(list(str(B))))
)
def has_shuffle_pair(A, m):
count = 0
k = 2
while True:
B = A * k
k += 1
# stop processing $A if $B has more digits than $A
if not is_same_length(A, B):
return 0
# go to the next $k if $A & $B aren't combinations
# of the same digits
if is_pair(A, B):
# it's a shuffle pair, count it
count += 1
if (count >= m):
# abort searching if we found the minimum count of pairs
return 1
def shuffle_pairs(frm, to, count):
return sum([
has_shuffle_pair(i, count) for i in range(frm, to+1)
])View the entire Python script for this task on GitHub.
Elixir
For the non-parallel version, script execution took 0m42.355s, so I set out to figure out how to execute has_shuffle_pairs/2 asynchronously. But running it asynchronously didn’t get much of a time gain: 0m30.461s.
def is_same_length(a, b) do
a |> Integer.to_string |> String.length ==
b |> Integer.to_string |> String.length
end
def is_pair(a, b) do
a |> Integer.to_string |> String.codepoints |> Enum.sort ==
b |> Integer.to_string |> String.codepoints |> Enum.sort
end
def has_shuffle_pair(a, m, count, k) do
b = a * k
if not is_same_length(a, b) do
0 # stop processing a if b has more digits than a
else
if not is_pair(a, b) do
# go to the next k if a & b aren't combinations
# of the same digits
has_shuffle_pair(a, m, count, k+1)
else
# it's a shuffle pair, count it
count = count + 1
if count >= m do
1 # abort searching if we found the min count of pairs
else
has_shuffle_pair(a, m, count, k+1)
end
end
end
end
def has_shuffle_pair(a, m) do
has_shuffle_pair(a, m, 0, 2)
end
def shuffle_pairs(from, to, count) do
stream = from .. to
|> Task.async_stream(
fn i -> has_shuffle_pair(i, count) end, ordered: false
)
Enum.sum_by(stream, fn {:ok, num} -> num end)
endView the entire Elixir script for this task on GitHub.
Here’s all my solutions in GItHub: https://github.com/packy/perlweeklychallenge-club/tree/master/challenge-350/packy-anderson