It's an unanticipated result, but it's a unanimous result...
This page determines whether "a" or "an" should precede a word, using the method described in this Stack Overflow response. The dataset is the wikipedia-article-text dump, preprocessed with regular expressions to remove as much wiki markup as possible and to extract only text vaguely resembling sentences. If the word following "a" or "an" started with a quote or parenthesis, that initial quote or parenthesis was ignored. The resulting prefix list, together with the code to query it, is less than 10KB in size; excluding the actual counts would reduce the size still further.
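To make the idea of querying a prefix list concrete, here is a minimal sketch in Python. It is not the script used on this page: the table `PREFIX_COUNTS`, its counts, and the function names are all made up for illustration, and the lookup rule (take the longest stored prefix of the word and pick whichever article was counted more often for it) is an assumption about how such a list could be queried.

```python
# Hypothetical toy table: prefix -> (times seen after "a", times seen after "an").
# These counts are invented for illustration; the real list is derived from Wikipedia text.
PREFIX_COUNTS = {
    "": (1000, 300),   # fallback for words matching no longer prefix
    "a": (10, 500),
    "e": (20, 400),
    "i": (15, 450),
    "o": (25, 350),
    "h": (400, 20),
    "hour": (1, 60),
    "u": (120, 200),
    "un": (5, 180),
    "uni": (90, 10),
    "unani": (40, 1),
}


def strip_leading_punctuation(word):
    """Ignore an initial quote or parenthesis, as the page describes."""
    return word.lstrip("\"'([")


def choose_article(word, table=PREFIX_COUNTS):
    """Return "a" or "an" using the longest prefix of `word` found in `table`."""
    # Lowercasing is a simplification assumed for this sketch only.
    cleaned = strip_leading_punctuation(word).lower()
    # Try the whole word first, then ever shorter prefixes, down to the empty prefix.
    for end in range(len(cleaned), -1, -1):
        counts = table.get(cleaned[:end])
        if counts is not None:
            a_count, an_count = counts
            return "an" if an_count > a_count else "a"
    return "a"  # only reached if the table has no empty-prefix fallback


print(choose_article("unanticipated"))  # an
print(choose_article("unanimous"))      # a
print(choose_article("(hour"))          # an
```

With this toy table, "unanticipated" falls back to the prefix "un", while "unanimous" matches the longer and more specific prefix "unani". Storing only the prefixes where the answer differs from that of a shorter prefix is one way such a list could stay small.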
You may use, modify, redistribute and do whatever you want with the data and script used on this page, but please don't misrepresent their source (license: Apache 2.0). If you have any questions, you can mail me at <firstname>@<lastname>.org.