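It should be pretty easy, actually. First, go through every character and strip out punctuation and quotation marks. Then go through the text again looking for whitespace: each time you hit a whitespace character, subtract the position of the previous whitespace character from the position of the new one, then subtract 1 more to get the length of the word in between. Treat the start and end of the text as whitespace so the first and last words get counted, and skip any result of 0 (that just means two whitespace characters in a row). Make an integer array and do something like this each time you find new whitespace: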
Code:
frequencyArray[currentWhitespacePos - lastWhitespacePos - 1]++;
The frequency of 1-letter words ends up in frequencyArray[1],
the frequency of 2-letter words in frequencyArray[2], and so on.
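Here is a rough sketch of the idea in C, in case it helps. It is not exactly the two-pass version described above: it skips punctuation and counts letters in a single pass instead of comparing whitespace positions, which gives the same counts. The function name count_word_lengths and the MAX_WORD_LEN cap are made up for this example, and it assumes plain ASCII text where digits count as letters.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed cap for this sketch: longer words are counted in the last slot. */
#define MAX_WORD_LEN 32

/* Fills freq[0..MAX_WORD_LEN] so that freq[n] is the number of n-letter words.
 * Punctuation (including quotes) is ignored; whitespace separates words. */
void count_word_lengths(const char *text, int freq[MAX_WORD_LEN + 1])
{
    for (size_t i = 0; i <= MAX_WORD_LEN; i++)
        freq[i] = 0;

    size_t len = 0; /* non-punctuation characters seen in the current word */
    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (c == '\0' || isspace(c)) {
            /* End of a word; runs of whitespace leave len at 0 and are skipped. */
            if (len > 0) {
                freq[len > MAX_WORD_LEN ? MAX_WORD_LEN : len]++;
                len = 0;
            }
            if (c == '\0')
                break;
        } else if (!ispunct(c)) {
            len++;
        }
    }
}
```

To use it, declare int freq[MAX_WORD_LEN + 1], call count_word_lengths(text, freq), and read the counts starting at freq[1]. freq[0] always stays 0 because empty runs are never counted.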