We're into double figures! This challenge was /u/jordanreiter's idea: turn a sentence into a number of "tokens" that would be suitable for something like a search engine to use as keywords. For example, "don't tell Suzie Smith-Hopper that I broke Daniel's toy horse" would be turned into "don't,tell,Suzie,Smith-Hopper,that,I,broke,Daniel's,toy,horse" and "other "big name" items" would be turned into "other,big name,items".
The four criteria are as follows:
If words are in quotes, treat them as a single token: e.g. "This "huge test" is pointless" would be changed to "This,huge test,is,pointless". This applies to both single quotes and double quotes.
Hyphenated last names (such as "Smith-Hopper") should be a single token, but words with more than one hyphen, or with hyphens at the beginning or end of the word, should have the hyphens stripped and the parts treated as separate tokens: "Suzie Smith-Hopper test--hyphens" should be changed to "Suzie,Smith-Hopper,test,hyphens".
Contractions should be treated as a single token; "I can't do it" would be changed to "I,can't,do,it".
All other punctuation should be removed (hyphens and quotes are handled as above); "Too long; didn't read" would turn into "Too,long,didn't,read".
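
To make the four criteria concrete, here is one possible approach in Python. It is only a sketch, not an official solution. The names `TOKEN_RE`, `split_word` and `tokenize` are made up for this example. It assumes straight quotes only (no curly quotes), treats a single quote as a phrase delimiter only when it is not attached to a letter or digit, and does not try to handle a contraction inside a single-quoted phrase.

```python
import re

# One alternative per kind of token. Anything that matches none of them
# (spaces, semicolons, stray quotes, ...) is skipped by finditer.
TOKEN_RE = re.compile(
    r'"([^"]*)"'                       # group 1: double-quoted phrase
    r"|(?<![\w'])'([^']*)'(?![\w'])"   # group 2: single-quoted phrase
    r"|([\w'-]+)"                      # group 3: raw word (may hold ' and -)
)


def split_word(raw):
    """Apply the hyphen and contraction rules to one raw word."""
    word = raw.strip("-'")  # drop leading/trailing hyphens and apostrophes
    if not word:
        return []
    if word.count("-") == 1:
        return [word]  # exactly one inner hyphen, e.g. Smith-Hopper
    # Zero hyphens, or more than one: split on hyphens, drop empty pieces.
    parts = [part.strip("'") for part in word.split("-")]
    return [part for part in parts if part]


def tokenize(sentence):
    """Return the list of tokens for a sentence (assumed rules, see above)."""
    tokens = []
    for match in TOKEN_RE.finditer(sentence):
        double_quoted, single_quoted, raw = match.groups()
        if raw is not None:
            tokens.extend(split_word(raw))
            continue
        phrase = double_quoted if double_quoted is not None else single_quoted
        phrase = phrase.strip()
        if phrase:  # ignore empty quotes such as ""
            tokens.append(phrase)
    return tokens


if __name__ == "__main__":
    print(",".join(tokenize("Too long; didn't read")))
```

Calling `",".join(tokenize(...))` on each example sentence gives the comma-separated form used in this post. The regex does the coarse splitting and `split_word` does the hyphen clean-up afterwards, which keeps the pattern short; doing everything in one regex is possible but much harder to read.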
This challenge is challenging! It is certainly possible, though.