John Philpott on 14 Jun 2016 00:02:39
Was happy to find a WordCloud visual by Microsoft in the gallery. Then I tried using it. The tool needs some more customizations, most importantly some smarts to ignore common words such as 'the', 'and', 'it', etc. Secondary to this, the ability to set a minimum number of occurrences before a word appears would be helpful as well.
- Comments (4)
RE: Improve the WordCloud visual
You can copy and paste the NLTK stop words in (a bit of a faff), and also set a minimum word frequency (I don't know if that's new), but an n-gram separation option would be great.
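For anyone who wants to prepare the data before it reaches the visual, here is a rough Python sketch of the three ideas in this thread: stop word filtering, a minimum frequency, and n-grams. It is only an illustration and not how the visual works internally. `STOP_WORDS` is a tiny made-up stand-in for a real list such as NLTK's, and the names `simple_tokens` and `ngram_counts` are hypothetical.

```python
import re
from collections import Counter

# Illustrative stand-in for a real stop word list (for example NLTK's English list).
STOP_WORDS = {"the", "and", "it", "a", "of", "to", "is", "in"}


def simple_tokens(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(text, n=1, min_count=1, stop_words=STOP_WORDS):
    """Count n-grams of non-stop-words, keeping those seen at least min_count times."""
    words = [w for w in simple_tokens(text) if w not in stop_words]
    # Slide a window of n words over the filtered list and join each window.
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(grams)
    return {gram: c for gram, c in counts.items() if c >= min_count}


sample = "The word cloud is great and the word cloud is simple."
print(ngram_counts(sample, n=1, min_count=2))  # single words seen at least twice
print(ngram_counts(sample, n=2, min_count=2))  # bi-grams seen at least twice
```

Note that the stop words are removed before the n-grams are built, so a bi-gram here can join two words that had a stop word between them in the original text. Building the n-grams first and filtering afterwards is the other reasonable choice.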
RE: Improve the WordCloud visual
Need a bi-gram feature.
RE: Improve the WordCloud visual
The stop words work well, but apostrophes do not get filtered out...
I also agree with Joie's comments.
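One possible workaround for the apostrophe problem is to normalise the text before it is counted. The sketch below is an assumption about what "filtering apostrophes" could mean in practice: it maps the typographic apostrophe to the plain one, drops a trailing 's, and then applies the stop words. The stop word set and the function name `content_words` are made up for the example.

```python
import re

# Illustrative stop words; note that the contraction "don't" is listed explicitly.
STOP_WORDS = {"it", "the", "that", "don't"}


def content_words(text):
    """Return lowercase tokens with apostrophes normalised and stop words removed."""
    # Treat the typographic apostrophe (U+2019) the same as the ASCII one.
    text = text.lower().replace("\u2019", "'")
    # Keep apostrophes that sit inside a word, so "don't" stays one token.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text)
    # Drop a trailing "'s" so "visual's" counts as "visual" and "it's" as "it".
    words = [w[:-2] if w.endswith("'s") else w for w in words]
    return [w for w in words if w not in STOP_WORDS]


print(content_words("It\u2019s the visual's colours that don\u2019t work"))
```

Stripping every trailing 's is a blunt rule, since it treats possessives and contractions such as "it's" the same way, but for a word cloud that is usually the behaviour you want.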
RE: Improve the WordCloud visual
I found that enabling the default stopwords in the visual did the trick for filtering out most common English words. I do agree, though, that it is missing many customizations. Advanced stopwords, alternate stopword lists, string length on stopwords, color customization - these are all lacking in the current visualization.
Currently I am playing with the tm and wordcloud packages in R to get what I need for wordclouds. They provide everything I mentioned above, plus you can filter out numbers and punctuation, sort out case inconsistencies, and plot to an image if needed (although I have not played with that yet).
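To show what those clean-up steps amount to, here is a small sketch of the same idea: lowercase everything, strip numbers and punctuation, then count words. It is written in Python with only the standard library rather than R, so it is an analogue of that kind of pipeline and not the tm API itself. The function names `clean` and `word_frequencies` are hypothetical.

```python
import string
from collections import Counter


def clean(text):
    """Lowercase the text, then delete digits and ASCII punctuation."""
    drop = str.maketrans("", "", string.digits + string.punctuation)
    return text.lower().translate(drop)


def word_frequencies(text):
    """Count whitespace-separated words in the cleaned text."""
    return Counter(clean(text).split())


freqs = word_frequencies("Power BI, power bi and POWER BI 2016!")
print(freqs.most_common())
```

Because punctuation is deleted rather than replaced with a space, a hyphenated term like "word-cloud" ends up as "wordcloud". Swapping punctuation for spaces instead would split it into two words, so pick whichever suits your data.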