Pages

Hackerrank Strings: Making Anagrams

Given two strings in input, tell how many characters we should remove from both ones to leave just the same characters even if a different order.

This HackerRank problem is meant to be about strings. however, I solved it using Python, and in this case I ended up seeing the two strings not differently as they was lists of whatever elements.

One aspect I have overseen initially was that in a string there could be more than an occurrence of a character. To ensure that my code would work correctly I added a test to the one suggested by Hackerrank:
def test_provided_1(self):
    self.assertEqual(4, solution('cde', 'abc'))

def test_many_a(self):
    self.assertEqual(3, solution('aaaa', 'a'))
In the one provided, I have to remove 'd' and 'e' from the first string, plus 'a' and 'b' from the second one to get to the same 'c' in both of them, total, 4 eliminations.
In my test, I just have to remove 3 'a' from the first string.

I decided to implement my solution using a dictionary to store the letters in the first string and their numbers. I could have written a piece of code like this:
counter = {}
for ch in a:
    counter[ch] = counter.setdefault(ch, 0) + 1
The handy setdefault() function return the value for the specified key, or second parameter, instead of the default None, if there is not such key in the dictionary.

But the job of counting the number of elements in a collection is so common, that the standard python collections library give us a class, Counter, that works exactly in this way. So I saved same typing and I used it instead.
from collections import Counter


def solution(a, b):
    counter = Counter(a)
    # ...
Now I have to sort of compare this counting with the values store in the other string. I decided to loop on all the characters in it, and decrease the associated value if the key exists and the value is not zero, otherwise to increase a buffer variable to keep track of all the characters that are in the second string and not in the first one.
extra = 0  # 1
for ch in b:
    if counter.get(ch):  # 2
        counter[ch] -= 1
    else:
        extra += 1
1. Number of characters that are in the second string without a match in the first one.
2. Remember that get() on dictionary returns the associated value to the passed key or None. Here I decrease the value only if I get a value (not None) and it is different from zero.

Finally, I add up all the values in the counter dictionary, plus the extra count coming from the second string, and I return it.
return sum(counter.values()) + extra
I pushed both the unit test and the python script to GitHub for full reference.

No comments:

Post a Comment