Coding Responses

While listening to all of the sound files was extremely time consuming (and kind of mind-numbing), coding proved to be almost as bad. Since this was essentially a pilot study (i.e. we weren’t working from much background and were using it to plan further research), I had nothing to base the coding on other than our raw data. After talking with my advisor, we decided to use 4 main categories – 1 = perfect, 2 = minor errors, 3 = major errors/unintelligible, and 4 = metathesis. The plan was to refine these categories as I started listening to the data so the coding would effectively summarize it.

Honing down these categories ended up being quite the task, because the spectrum of responses was enormous. Sometimes a word had a voicing error and a place error, but was almost perfect otherwise. Other times there’d be one completely wrong sound but everything else was exactly as it was supposed to be. Sometimes the recording was hard to hear, and determining what sort of error (if any) had been made became almost a guessing game. The categories ended up being slightly broader than anticipated (with the exception of the metathesis category), but I think each served its own purpose, and a massive amount of time was saved by not getting down to nitty-gritty details for each group.

The categories ended up as…

1. Perfect or near perfect: words with no errors or with only one voicing error (e.g. ‘p’ instead of ‘b’)

2. Minor errors: any other word with 3 or fewer mistakes (so a response could have any combination of errors – usually voicing or place – as long as there were 3 or fewer)

3. Major errors: any words that were impossible to understand or had more than 3 mistakes

4. Metathesis: any word that would have been in the perfect or near perfect category if two sounds had not metathesized.
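
To make the decision order explicit, here is a minimal sketch of the four categories as a small Python function. This is only an illustration, not the procedure I actually used (the real coding was done by ear). The `Response` fields and the function names are hypothetical, the error counts are assumed to exclude the metathesized pair itself, and the handling of a metathesized word that also has several other errors is my own assumption, since the categories above don’t spell it out.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One coded response. All field names are hypothetical."""
    voicing_errors: int = 0     # e.g. 'p' produced instead of 'b'
    other_errors: int = 0       # place errors, wrong sounds, etc.
    intelligible: bool = True   # False if the word could not be understood
    metathesized: bool = False  # True if two sounds swapped positions


def is_near_perfect(r: Response) -> bool:
    """Category 1 test: no errors, or exactly one error that is a voicing error."""
    total = r.voicing_errors + r.other_errors
    return total == 0 or (total == 1 and r.voicing_errors == 1)


def code_response(r: Response) -> int:
    """Return the category (1-4) for a response."""
    if not r.intelligible:
        return 3  # impossible to understand
    if r.metathesized and is_near_perfect(r):
        return 4  # would have been category 1 without the swap
    # Assumption: a metathesized word with further errors is coded
    # by its remaining error count, like any other response.
    if is_near_perfect(r):
        return 1
    if r.voicing_errors + r.other_errors <= 3:
        return 2  # minor errors
    return 3      # more than 3 mistakes
```

For example, `code_response(Response(voicing_errors=1, other_errors=1))` falls into category 2, while `code_response(Response(metathesized=True))` falls into category 4.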

I did my best to keep these categories clear-cut, but when working with hugely varied human speech, it’s impossible to keep things in black and white. ‘2’ ended up being sort of a catch-all – the majority of responses fell into this category. The categories were also assigned intuitively in some cases – when it was hard to get an exact number of errors for a word, comparing the sound of the target word to the response word was useful in determining how glaring the errors were. This system of coding was useful for approximating how ‘successful’ each person was at giving responses – so for example, if someone got a lot of 3s and 2s, no 1s, and also a lot of 4s, we would take that into account when determining whether something was actually metathesis or just a regular error. Most importantly, this system of coding allowed us to easily separate the cases of metathesis from the rest of the data.
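
As a rough sketch of how the per-person picture could be tallied once every response has a code, the snippet below counts codes per participant and pulls out the metathesis cases. The participant IDs and codes are made up purely for illustration (they are not our data), and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def tally_by_participant(coded):
    """Count how often each code (1-4) occurs for each participant.

    `coded` is a list of (participant_id, code) pairs.
    """
    tallies = defaultdict(Counter)
    for participant, code in coded:
        tallies[participant][code] += 1
    return tallies


def metathesis_responses(coded):
    """Keep only the responses coded as metathesis (category 4)."""
    return [(participant, code) for participant, code in coded if code == 4]


# Made-up example codes, not real data.
coded = [
    ("P1", 2), ("P1", 3), ("P1", 4), ("P1", 4),
    ("P2", 1), ("P2", 4),
]

tallies = tally_by_participant(coded)
for participant, counts in sorted(tallies.items()):
    # A participant with many 2s, 3s, and 4s but no 1s might warrant a
    # second look at whether their 4s are really metathesis.
    print(participant, {code: counts[code] for code in (1, 2, 3, 4)})
```

A `Counter` returns 0 for codes a participant never received, so the "no 1s" case shows up directly in the printed summary.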