This is not real course work, and I'm not taking any CS 102 class. I thought it might be worthwhile to post notes on basic CS concepts once in a while. Like most devs I already know this sort of stuff by heart, but I do LeetCode problems sometimes to stay sharp in different programming languages. I did this simple problem today and decided to write this explainer for a general audience.
Assuming you know what these words mean: string, substring, word, map, index, and pop.
Given a string and a list of equal length words, find all substrings that are a combination of all words.
A naive approach might check a lot of potential combinations. A more standard approach would use comparison between maps.
String: barfoothefoobarman
Words: ["foo", "bar"]
One starting index at a time, scan to the right while comparing with candidate available words from the list. If a word is found, eliminate it and keep scanning. If all words are found, we found an overall match at the starting index, otherwise we found nothing.
index 0
[bar] foothefoobarman ✓ match → pop "bar"
remaining words: {"foo"}; continue at next 3-letter section
bar [foo] thefoobarman ✓ match → pop "foo"
all words used → match at index 0
index 1
b [arf] oothefoobarman ✗ not in word list → advance
index 2
ba [rfo] othefoobarman ✗ not in word list → advance
index 3
bar [foo] thefoobarman ✓ match → pop "foo"
remaining words: {"bar"}; continue
barfoo [the] foobarman ✗ not in word list → fail at this start
...continue.
This approach is much faster to compute because there are less steps.
String: barfoothefoobarman
Words: ["foo", "bar"]
Word length L: 3
First, count how many times each word appears in the word list and put it in a map:
{"foo": 1, "bar": 1}
We split the string into words of length L:
bar foo the foo bar man
We slide a window of size 2 (number of words) from left to right, checking each window to see if it has the same word frequency as our word list:
[bar foo] the foo bar man | {"bar": 1, "foo": 1} == {"foo": 1, "bar": 1} ✓ Match at index 0
bar [foo the] foo bar man | {"foo": 1, "the": 1} != {"foo": 1, "bar": 1} ✗ No match
bar foo [the foo] bar man | {"the": 1, "foo": 1} != {"foo": 1, "bar": 1} ✗ No match
bar foo the [foo bar] man | {"foo": 1, "bar": 1} == {"foo": 1, "bar": 1} ✓ Match at index 9
bar foo the foo [bar man] | {"bar": 1, "man": 1} != {"foo": 1, "bar": 1} ✗ No match
We also have to check at other starting points like so:
b arf oot hef oob arm an
and
ba rfo oth efo oba rma n
Checking windows on each of these.
Note that these are dumbed down and contextual to this post.
"hello" or "barfoothefoobarman"barfoothefoobarmanbar[foo]thefoobarman → foo is a substringbarfoo[the]foobarman → the is a substringbarfoothe[foobar]man → foobar is a substringbarfoothefoobar[man] → man is a substring["foo", "bar"]
{"foo": 1, "bar": 1}["foo", "bar"]["foo"]