It’s that time of year again. I’m back in the office, working on #amoresproject. This post is only tangentially related to the project so this is not going to be an update – if you’re new to my blog or missed my earlier posts on the topic, you might want to have a look here and here.
I have previously said – well, I can’t remember if I’ve said it in that many words on the blog, but certainly plenty of times in other contexts – that it’s been really interesting working on the project in a non-teaching role. I enjoy teaching, but it’s generally familiar ground. It was great to have the opportunity to work in a different environment and deal with tasks a teacher does not normally do in the course of a day. Not just because it was novel, but also because I now feel more confident that I would still be employable if, at some point, teaching, for whatever reason, was no longer an option.
For instance, something I recently did was transcribe interviews. I don’t know that this particular skill will contribute significantly to my employability; however, I was surprised to discover that there are agencies out there that offer this as a billable service, at roughly $1-4 per audio minute, depending on how quickly you need the finished product and the level of accuracy you’re happy with.
These were interviews with teachers on the project – we needed to find out how they felt about a number of aspects of the pilot now that it’s over. The first one was with the Croatian teachers: we thought it would make more sense to interview them in Croatian, as they’re not teachers of English, and thus might find it easier to express themselves and feel more comfortable in their mother tongue, particularly as the interview was recorded. The plan was that my boss would do the interview and share the audio files with me, and I would translate them so that they could be used in the pilot evaluation.
This was a first for me – translating audio (into writing) – and I’m not even sure it qualifies as transcription. I tweeted this as I was working, so if you know a more accurate term, please let me know in the comments.
The recording was about 35 minutes long and seemed easy enough to follow content-wise, so I translated directly; I listened to the original in Croatian and typed out the English translation. The second interview was with the UK teachers and that, obviously, was transcribing. The audio was roughly the same length, maybe 5 minutes longer, and I should point out that another researcher met with the teachers in person, so I wasn’t present at that interview either.
I was fascinated by the differences between the two resulting texts. Although the recordings were of similar length, the Croatian document was just 7 pages long, compared to 15 for the UK one. I thought I would describe some of my observations about working on both interviews, and recommend a pretty useful tool at the end.
The Croatian interview
- The first thing I noted was the length of turn-taking: the interviewer asked a question and one of the teachers would answer in some detail, speaking generally for around a minute. Okay, perhaps this didn’t stand out quite so much until I transcribed the UK interview, about which more below. Just to illustrate what I mean though, the Croatian interviewer had 24 turns, while the UK one had 85!
- What did stand out from the start was that none of the interlocutors spoke across each other. I remember thinking that they were almost too polite in terms of waiting for the other speakers to complete their sentences, at least compared with a typical (not overly formal) conversation. I suppose this could have been out of consideration for the translator, but I think it would be a little difficult to keep that up for a whole conversation for that reason alone.
- There’s this thing you’re taught as a first grader in Croatia – answer in full sentences. “What do you think was the most important part of the book?” “The most important part of the book was…” – takes me back to elementary school. 🙂 I got the impression it was sometimes used in the interview to buy the interlocutor some time and allow them to organize their thoughts.
- Because I was translating and wanted to convey the message in the source language as accurately as possible, I wasn’t sure initially if I should ignore fillers like “um”. There were, however, relatively few ums, and as they didn’t add anything to the meaning I eventually decided against including them in the translation. Other words/phrases that did seem to serve the purpose of fillers were “also”, “in a way” and “certain” (as in “certain level”, “certain content”) – i.e. their Croatian equivalents, and I included those in the text.
- It was easy to follow the speakers; they enunciated clearly and, apart from one short segment that was inaudible because it hadn’t recorded properly, I was able to hear and understand everything that was said in the interview. The audio was broken up into 20 short files and thus easy to manipulate.
The UK interview
- The turns were noticeably shorter. The interlocutors rarely spoke for a minute; there were some longer turns, but generally this was much more like a real conversation with people cutting across each other and speaking at the same time – even though the interviewer had asked them at the very beginning to try and avoid that. The UK teachers had 76 and 84 turns, while the Croatian ones had 19 and 20.
- I noticed the fillers a lot more. Maybe I should have been more selective about including them in the text because, again, not all added significantly to the meaning, but since I was transcribing and didn’t have to process the content as much as when I was translating, I found it easier to just type what I heard. Those that seem to have been used frequently are “um”, “okay”, “yeah”, “you know” and “obviously”.
- The interlocutors spoke in short bursts, compared to the first interview. By this I mean that they tended to utter a segment and then pause, which meant ellipsis featured quite heavily in the text.
- The teachers occasionally asked each other questions, unlike in the first interview where only a couple of references to what their colleague had said a little while previously indicated that the two teachers had actually been interviewed together.
- It was at times quite difficult to follow the speakers. A spot of googling revealed that when transcribing you will either not hear fragments (or longer utterances), which you should indicate in the text as “inaudible”, or you won’t be able to make out exactly what you’ve heard, which you should indicate as “unintelligible”. It’s apparently also good practice to note down approximately how many words you missed. There were several times when I very reluctantly put down [unintelligible 2-3 words] because the speakers either trailed off or ran their words together to the extent that I couldn’t decipher what they were saying even when I slowed the recording down. I say very reluctantly not because I think those fragments added something significant to the meaning, but because of this nagging feeling – hard to shake off – that someone will say it was unintelligible to me because I’m not a native speaker. Although rationally, of course, people have better things to do with their time.
One final thing – the useful tool I mentioned. I was about 10 minutes into the UK transcription – that would be 10 minutes of audio and 2 hours of typing – when I came across this page, containing tips on transcribing interviews. Some helpful soul had shared it on Twitter. I was suddenly dizzy with the discovery of machine transcription! OMG! I could sit back and let YouTube do the work for me. To cut a long story short – and uploading a 40-minute audio file to YouTube as a video takes a surprisingly long time – YouTube could not even attempt a transcription of the interview. Possible reasons are that the sound is too low, speech too indistinct, or the recording is simply too long.
I tried out this service as well; I’m not sure if it’s available to everyone, but it was free of charge once I registered. I received the transcription in a few hours, but it was unfortunately inaccurate to the point of being pretty much useless.
Finally, I signed up for a free trial of Transcribe, and was absolutely thrilled with the results. After my unsuccessful forays into machine transcription territory, their claim that “automatic audio to text conversion is largely science fiction” made it sound like they might really know what they were talking about. Anyway, I can’t recommend it highly enough. You upload the audio and type the text yourself, so, obviously, it works with any language, and being able to manipulate the audio and type on the same page is already a huge time-saver. In addition, you can slow down or speed up the recording, activate a loop which basically allows you to keep typing instead of wasting time replaying the audio manually, move back and forth in tiny increments… I love it!
Have you ever had to transcribe audio? Was it in one or more languages and did you notice any differences? Any tips for effective transcribing?