Solr 3.5 – Stop words are not removed
Stop word files for several languages can be downloaded from http://wiki.apache.org/solr/LanguageAnalysis. The English list there is also more extensive than the default one shipped with Solr.
However, if you enable one of these files with solr.StopFilterFactory, the stop words are still not removed. The cause is the “|” pipe character after each word, which introduces a comment in the downloaded files. Solr expects every word on its own line with nothing else on that line. Simply replacing | with # does not work either, because a # that follows a word on the same line is not treated as a comment. This problem can be hard to discover if you expect files offered on the Solr website to just work.
To keep the useful comments, it’s best to move each one to its own line above the word it describes. In addition, I replaced every | character with #, so that each comment line starts with #.
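Doing this by hand for a long list is tedious, so the conversion can be scripted. The following is only a minimal sketch of the manual steps described above, not an official tool: the function name `convert_snowball_stopwords` is made up for this post, and it assumes that everything after the first | on a line is a comment and that the words before it are separated by whitespace.

```python
def convert_snowball_stopwords(text):
    """Rewrite stop word text that uses trailing '|' comments.

    Assumption (hypothetical helper, not part of Solr): everything after
    the first '|' on a line is a comment. Each comment is moved to its own
    line starting with '#', placed above the word(s) it followed, and each
    word ends up alone on its own line.
    """
    out = []
    for line in text.splitlines():
        # Split the line into the word part and the optional comment part.
        word_part, _sep, comment = line.partition("|")
        comment = comment.strip()
        if comment:
            out.append("# " + comment)
        # A line may hold zero, one, or several whitespace-separated words.
        out.extend(word_part.split())
    # Blank lines are dropped; the result ends with a single newline.
    return "\n".join(out) + "\n"
```

To use it, read the downloaded file into a string, pass it through the function, and save the result as the file that the `words` attribute of your solr.StopFilterFactory points to. It is worth checking the output by eye before deploying it, since the sketch does not try to handle anything beyond the simple "word | comment" layout.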
Stop words should be working perfectly now! If you want to use multiple stop word files, check out http://pietervogelaar.nl/solr-multiple-stop-word-files.