Digital File Types for Audio Transcription

Aug 11


Anne Hickley PhD


There are a variety of recording systems available, suitable for everything from dictation to conference recording. If you already have a digital recording mechanism, you can probably record a variety of different file types for different purposes. This article aims to discuss the different types and suggest the right one for you, depending on your circumstances. If you are still considering which digital recording device to purchase, then you should consider the file types it will produce before you buy.



If you do not know what file types you are working with, you can tell by looking at the file extension. This is the set of letters (usually three) that follows the final dot in the file name, for example the 'wav' in 'interview.wav'.
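If you have a whole folder of recordings to check, the same rule can be applied automatically. The following is only a minimal sketch in Python; the function name `file_type` and the example file names are my own illustrations, not part of any transcription package.

```python
from pathlib import Path


def file_type(filename: str) -> str:
    """Return the extension in lower case without the dot, or '' if there is none."""
    # Path.suffix gives everything from the final dot onwards, e.g. '.wav'
    return Path(filename).suffix.lower().lstrip(".")


# Hypothetical file names, used purely for illustration
for name in ["interview.wav", "Dictation.DSS", "focus.group.mp3"]:
    print(name, "is a", file_type(name), "file")
```

Note that the extension only tells you what the file claims to be; as discussed later, two files with the same extension may still need different software to play them.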

The different file types all have advantages and disadvantages for transcription services, the most obvious of which is a trade-off between quality and file size. Sound files can be very, very large if they are not compressed, but most of the compression discussed here is 'lossy'. In other words, a complete or 'lossless' audio file has been taken and compressed by removing data that is considered redundant. This can reduce the audio quality, which can cause problems for the transcriber.

It may initially seem obvious that you and your transcriptionist want the best quality, but in fact many lossy formats lose a negligible amount of quality while producing much smaller files. If you are planning to email files to your transcriptionist, the advantage of a 2MB file, as opposed to one 40MB in size, should be obvious! No sound file of any length is small, but at least it is possible to email a 2MB file for transcription. Most email service providers will not allow a 50MB file through, and even if they did it could take hours to download, blocking both your email and your transcriptionist's. More and more transcriptionists are using a system which bypasses email; you can either upload files directly to their website or send your files using a simple file transfer programme. As a rule, however, even these options have file size limits.

It is also worth noting that, depending on the playback software being used for transcription, your transcriptionist may only be able to play back certain file types. Some packages cover practically all digital file types while others are more limited, so it is worth checking first.

The 'right' file type and attributes for you and your transcriptionist will also depend on what the purpose of your recording is. If it is a dictation, a lower sound quality will still provide a clear enough recording for a digital transcription.

If you are recording a focus group, for example, where several people are seated at different distances from the recorder and speaking at different levels and pitches, you will probably need a higher sound quality to accommodate this.

Your recording equipment may allow you to set different attributes for the same file type. This can make an enormous difference to the sound quality and size of the file, and consequently the transcription quality. In some cases, for example dictation (one person speaking into a machine, in a quiet environment), you can probably afford to lose sound quality and the recording will still be clear for transcription. In other cases (focus groups, noisy environments) you may find you need to choose a slightly larger file size in order to maintain decent sound quality.

Attributes are often shown as a sampling rate in Hz or kHz. 8,000 Hz (8 kHz) mono is suitable for dictation, and the range goes up to 44,100 Hz (44.1 kHz) stereo, which is the top of that range and the quality used for music CDs. Examples of different file formats and some details about their use in transcription follow.

WAVeform Audio (.wav)

WAVeform Audio (.wav) is a common file format and was one of the first audio file types developed for use with the PC. It is lossless, but generally very large. This means that you will probably need to send the files on a CD, rather than emailing them, although some transcriptionists, including myself, on my site, have a system whereby you can send large files via the internet without using email. You certainly need a broadband connection, or similar, to use these effectively though.

Warning! Not all wav files are the same! Although they all end in .wav, depending on the recorder, you and your transcriber may need a special 'codec' to play a file back. An example is Sanyo, a popular and moderately priced recording system, but one that records specifically Sanyo wav files. You should check that your transcriptionist has the ability to transcribe Sanyo wavs. If not, Sanyo may oblige by sending out the relevant codec on CD, if you ask them nicely!

The following types are all lossy, but generally the change in sound quality is negligible and you will save significant time and money with reduced transfer times. This is not an exhaustive list of all audio file types; there are a huge number. It aims to cover most of the types handled by available transcription software.

MPEG-1 Audio Layer 3 (.mp3)

This is a compressed format often used for music. Many digital dictation recorders will also record mp3, or allow you to record WAV and then compress to mp3 to send on for transcription. The compressed files will be around a twelfth of the size of WAV files.

Windows Media Audio (.wma)

Windows Media Audio (.wma) was developed for Windows Media Player, which is bundled with all Windows-based PCs these days. It is even more compressed than an MP3, to about one thirty-sixth of the size of a .wav, while apparently retaining the original sound quality. I have to say that, in my experience of transcribing, I am not sure whether this has always been the case.

Digital Speech Standard (.dss)

In my experience, most playback software used for transcription will play .dss files. If not, there is a free download available on the Olympus site. I believe .dss was developed by Olympus, and almost all, but not all, Olympus recorders will record .dss files. Lanier and Grundig recorders also generally use .dss. The file size is reduced by twelve to twenty times compared with a WAV file, and the format is ideal for transcription as it is small and easy to email.
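To get a feel for the sizes involved, the arithmetic for an uncompressed recording is simply sampling rate × channels × bytes per sample × seconds. The sketch below works this through in Python. It assumes 16-bit samples, which is my assumption rather than something stated above, and the reduction factors are only the rough figures quoted in this article, so treat the results as ballpark estimates. The function names are illustrative.

```python
def wav_size_bytes(sample_rate_hz: int, channels: int, minutes: int,
                   bits_per_sample: int = 16) -> int:
    """Approximate size of uncompressed audio data (file header ignored)."""
    seconds = minutes * 60
    return sample_rate_hz * channels * (bits_per_sample // 8) * seconds


# Rough reduction factors quoted in this article (not measured values)
ROUGH_REDUCTION = {"mp3": 12, "wma": 36, "dss (best case)": 20}


def to_mb(size_bytes: float) -> float:
    """Convert bytes to megabytes (1 MB taken as 1,000,000 bytes)."""
    return size_bytes / 1_000_000


# A one-hour recording at the two ends of the range described above
cd_quality = wav_size_bytes(44_100, channels=2, minutes=60)
dictation = wav_size_bytes(8_000, channels=1, minutes=60)

print(f"44.1 kHz stereo WAV: {to_mb(cd_quality):.1f} MB")
print(f"8 kHz mono WAV:      {to_mb(dictation):.1f} MB")
for fmt, factor in ROUGH_REDUCTION.items():
    print(f"8 kHz mono as {fmt}: roughly {to_mb(dictation / factor):.1f} MB")
```

Even at dictation quality, an hour of uncompressed audio is far too large for most email systems, which is why the compressed formats are so useful for transcription.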


This is Sony’s answer to the .dss file. It is a very highly compressed file, but fine for voice, and its small size makes it very easy to send by email.

Encrypted dictation (.dct)

Often used for medical transcription, which requires very high confidentiality, these recordings are encrypted at the recording end and need to be decrypted on receipt by the transcriber. A wide variety of playback software will deal with these files.

TrueSpeech from DSP Group

TrueSpeech, from the DSP Group, was designed for personal computers and personal communications devices. It has very high compression ratios, ranging from 15:1 to 27:1. If you are able to record this format, it is probably best restricted to dictation or one-to-one interviews in a quiet environment. It is probably too lossy for focus groups etc.

There are a whole host of other file types available, so don't worry if the type that your machine produces is not listed above. Contact your VA and s/he will probably be able to assist you, or at least point you in the right direction.

CD Audio file (.cda)

These files are standard recordings onto CD, and are generally the file type of music files bought on CD. As I understand it (not too well!), the CDA is actually just a sort of cover file that says this is a file on a CD, and the underlying file is probably a WAV file or one of the other file types listed above. Most transcription recorders will not record to CDA, but if you are having a professional recording made of a conference or series of lectures, for example, you may well find yourself with CDA files. Most transcription software will not work with CDA format. There are a number of ways to convert CDA files, but these can be expensive and/or time consuming. Transcriptionists specialising in digital transcription will probably be able to transcribe these files, but you may be charged a surcharge for the time taken in converting them to a useable format.

There are a variety of converters available, but one that I have used successfully is Easy CD Ripper, shareware with a fully functional trial version that can be downloaded from various places.

You will probably have only a limited range of file types available for your transcription, depending on your recording equipment, but any equipment should provide you with a range of options depending on your needs. If you have the opportunity, it is always a good idea to make a test recording with the settings that you think will be right. Then play it back, or send it to your transcriptionist to play back, in order to check that the sound quality is acceptable.

Finally, I would just like to say that this information is based on my understanding and experience of digital transcription. I cannot absolutely guarantee its accuracy but would be very interested to hear of any errors and happy to correct them. Similarly, if you feel that there is an important area I have not covered, please let me know and I will do my best to incorporate it.
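If your test recording is a standard uncompressed WAV file, you can also confirm which attributes were actually used. This is a minimal sketch using Python's built-in `wave` module, which only reads plain PCM WAV files; a recorder-specific wav that needs its own codec will most likely raise a `wave.Error` instead. The helper names and the file name `test.wav` are illustrative, and the silent file is generated here only so that the example can run on its own.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Read the basic attributes of an uncompressed (PCM) WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),            # 1 = mono, 2 = stereo
            "sample_rate_hz": rate,                    # e.g. 8000 or 44100
            "bits_per_sample": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / rate,
        }


def write_silent_wav(path: str, sample_rate_hz: int = 8000,
                     channels: int = 1, seconds: int = 1) -> None:
    """Create a short silent 16-bit WAV file to stand in for a test recording."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * channels * seconds)


with tempfile.TemporaryDirectory() as folder:
    test_file = os.path.join(folder, "test.wav")  # hypothetical file name
    write_silent_wav(test_file)
    info = describe_wav(test_file)
    print(info)
```

To check a real recording, call `describe_wav` with the path to your own file in place of the generated one.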