A formant-based vowel classifier

Roy Becker's corpus of vowel formants was used as the reference dataset. Vowels with fewer than 10 entries in the corpus were removed. The formant plot of the training data:

Formant plot of Becker's vowel data
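
The filtering step described above (dropping vowels with fewer than 10 entries) could look roughly like the sketch below. The row layout (label, F1, F2, F3), the function name drop_rare_vowels and the sample values are assumptions made for illustration; they are not taken from the corpus or from the project's code.

```python
from collections import Counter

# Hypothetical rows: (vowel label, F1, F2, F3), all formants in Hz.
# The values are made up for illustration, not taken from the corpus.
rows = [("a", 700.0 + i, 1200.0 + i, 2500.0 + i) for i in range(12)]
rows += [("i", 300.0 + i, 2200.0 + i, 3000.0 + i) for i in range(10)]
rows += [("y", 310.0 + i, 1900.0 + i, 2300.0 + i) for i in range(3)]

MIN_ENTRIES = 10  # threshold stated in the text


def drop_rare_vowels(rows, min_entries=MIN_ENTRIES):
    """Keep only rows whose vowel label occurs at least min_entries times."""
    counts = Counter(label for label, *_ in rows)
    return [row for row in rows if counts[row[0]] >= min_entries]


filtered = drop_rare_vowels(rows)
kept_labels = sorted({row[0] for row in filtered})
print(kept_labels)  # the 3-entry vowel "y" is dropped
```

A vowel with exactly 10 entries is kept here, since only vowels with fewer than 10 are removed.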

Two separate nearest-neighbour classifiers were trained, one for two-formant queries and one for three-formant queries, using KNeighborsClassifier from the Python package scikit-learn. The number of neighbours was set, rather arbitrarily, to 7.
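
As a minimal sketch of this setup, the snippet below trains two KNeighborsClassifier instances with n_neighbors=7, one on (F1, F2) and one on (F1, F2, F3), and picks the right one by query length. The synthetic training data, the vowel centres and the helper name classify are assumptions for illustration only; the actual program reads its training data from the reference table.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Hypothetical vowel centres (F1, F2, F3 in Hz); not values from the corpus.
centres = {
    "a": (750, 1300, 2500),
    "i": (300, 2300, 3000),
    "u": (320, 800, 2400),
}

# Draw 20 noisy synthetic samples around each centre.
formants, labels = [], []
for vowel, centre in centres.items():
    formants.append(rng.normal(loc=centre, scale=30.0, size=(20, 3)))
    labels.extend([vowel] * 20)
X = np.vstack(formants)
y = np.array(labels)

# One classifier per query type, both with 7 neighbours as in the text.
clf_two = KNeighborsClassifier(n_neighbors=7).fit(X[:, :2], y)
clf_three = KNeighborsClassifier(n_neighbors=7).fit(X, y)


def classify(query):
    """Classify a (F1, F2) or (F1, F2, F3) query with the matching model."""
    if len(query) not in (2, 3):
        raise ValueError("expected two or three formant values")
    model = clf_two if len(query) == 2 else clf_three
    return model.predict([list(query)])[0]


print(classify((310, 2250)))
print(classify((740, 1320, 2480)))
```

Keeping two models avoids having to impute a missing F3 for two-formant queries, at the cost of fitting the same data twice.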

The classifiers are retrained every time the program starts, so you can supply your own reference data table. The code and the data are available in a GitHub repo. To run a local copy, clone the repo, install the dependencies using pipenv, and either execute the command indicated in the Procfile or run heroku local. (Don’t forget to modify the JavaScript in the driver site if you want to use it too.)