It's an unanticipated result, but it's a unanimous result...

Details:

This page determines whether "a" or "an" should precede a word, using the method described in this Stack Overflow response. The dataset is the Wikipedia article-text dump, preprocessed with regular expressions to remove as much wiki markup as possible and to extract only text vaguely resembling sentences. If the word following "a" or "an" started with a quote or parenthesis, that initial quote or parenthesis was ignored. The resulting prefix list, together with the code to query it, is less than 10KB in size; excluding the actual counts would reduce the size still further.
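As a rough illustration of the idea, here is a minimal Python sketch of a prefix-count lookup. It is not the code used on this page: the toy corpus, the regular expression, the names `build_counts` and `classify`, the lowercasing, and the rule of falling back to a shorter prefix on a tie are all assumptions made for the example. The real prefix list is also pruned to stay small, which this sketch does not attempt.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; the page itself used a Wikipedia article-text dump.
CORPUS = [
    "It was an unanticipated result.",
    "It was a unanimous result.",
    "She waited an hour for a house tour.",
    "He bought a user manual and an umbrella.",
    'They called it a "unicorn" and an (unusual) find.',
]

# "a" or "an", whitespace, an optional leading quote or parenthesis
# (which is skipped), then the following word.
ARTICLE_RE = re.compile(r"\b(an?)\s+[\"'(]?([A-Za-z][\w-]*)", re.IGNORECASE)


def build_counts(sentences):
    """Map every prefix of each following word to [a_count, an_count]."""
    counts = defaultdict(lambda: [0, 0])
    for sentence in sentences:
        for article, word in ARTICLE_RE.findall(sentence):
            idx = 0 if article.lower() == "a" else 1
            word = word.lower()  # simplification: case is ignored here
            # range(len + 1) includes the empty prefix as a global fallback.
            for end in range(len(word) + 1):
                counts[word[:end]][idx] += 1
    return counts


def classify(word, counts):
    """Return "a" or "an" based on the longest prefix with a clear majority."""
    if word[:1] in ('"', "'", "("):
        word = word[1:]  # ignore one initial quote or parenthesis
    word = word.lower()
    for end in range(len(word), -1, -1):
        a_count, an_count = counts.get(word[:end], (0, 0))
        if a_count != an_count:  # unseen or tied prefixes fall through
            return "an" if an_count > a_count else "a"
    return "a"  # assumed default when no prefix is decisive


COUNTS = build_counts(CORPUS)
print(classify("unanimous", COUNTS), classify("unanticipated", COUNTS))
```

With such a tiny corpus many words hit only tied or unseen prefixes and fall back to the default, so the answers are only meaningful for words close to the training examples; a large corpus is what makes the approach work.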

You may use, modify, redistribute, and do whatever you want with the data and script used on this page, but please don't misrepresent its source (license: Apache 2.0). If you have any questions, you can mail me at <firstname>@<lastname>.org.

Downloads:

The implementations are efficient: on a single thread of a 2.5GHz Q9300, a benchmark classifying all words of an English dictionary achieves about 15 million words per second; that's roughly 166 clock cycles per word. The JavaScript implementations were benchmarked on Chrome 26, Firefox 22, IE 10, and Opera 12, and are about 5-10 times slower, at approximately 2 million classifications per second.
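The arithmetic behind these figures is easy to check. The short Python sketch below just divides the quoted numbers; the helper name `cycles_per_item` is made up for the example, and the inputs are the approximate values stated above, not new measurements.

```python
def cycles_per_item(clock_hz: float, items_per_second: float) -> float:
    """Clock cycles spent per item at a given clock rate and throughput."""
    return clock_hz / items_per_second


# 2.5GHz clock, about 15 million words classified per second.
native_cycles = cycles_per_item(2.5e9, 15e6)

# Ratio of the two quoted throughputs (about 15 million vs about 2 million).
slowdown = 15e6 / 2e6

print(f"{native_cycles:.1f} cycles per word, {slowdown:.1f}x slower in the browser")
```

The ratio of the two throughputs falls inside the quoted 5-10 times range.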

--Eamon Nerbonne