If you’ve decided to transcribe audio to text yourself, this complete guide can save you countless hours.

Fortunately, transcribing audio or video isn’t rocket science, and by covering the basics and learning a few insider tips, you can cut your transcription time in half.

We’ve outlined easy-to-understand steps to help you:

  1. Increase your transcription speed and efficiency.
  2. Go from a novice to a pro transcriber in the shortest time possible.

Before you begin transcribing audio to text, though, it helps to take care of a few things that will make your life much easier down the line.

Pre Audio Transcription Checklist

1. Find or Create a Quiet Workplace

Nothing adds more time to transcribing audio than background disturbance or distractions, whether it’s in the audio or in your surroundings.

Try transcribing an audio file in a busy coffee shop and you’ll understand what we mean.

And unless you have industrial-strength noise-canceling headphones, it helps to find a quiet spot to work.

So, make sure your home office or desk at work is noise-free!

If you live in a noisy area, perhaps even look into cost-effective ways to soundproof your room.

2. Make Sure You Have the Right Equipment

You can’t efficiently transcribe audio on a smartphone or tablet, unless you can connect a physical keyboard to it.

That’s why a PC, Mac or laptop is best suited to long hours of transcription, especially if you want to get things done faster.

Many transcribers I know connect external full-sized keyboards and mice to their laptops to be more efficient and productive.

3. Warm up with Typing Tests & Games

You need to be a fast typist to be a good transcriber, especially if you want to make any decent money from transcription.

Audio transcription is time-sensitive work: time is money, and the faster you transcribe, the more you earn.

Here are a few typing tests and games that can help you warm up your knuckles before you dive in head first.

How to Dictate and Transcribe Audio to Text for Free?

PLEASE NOTE:  All free transcription software listed below uses speech-to-text AI similar to ours.

At the time of publishing this article, speech-to-text AI works best when there’s only one speaker, talking at a very slow pace, close to the microphone, with no background disturbance.

If there’s more than one speaker, overlapping conversation, fast talking or background disturbance, any speech-to-text AI will struggle.

In that case, we suggest looking at human audio transcription services.

You will also be sending your private data to Google and Microsoft with no promise of confidentiality.

Google Docs Voice Typing

Google Docs has a built-in dictation tool called Voice Typing.

It transcribes what you say into a microphone straight into a Google Doc in real time.

Voice Typing isn’t transcription software, so it can’t transcribe an already-recorded interview.

It’s intended for people who cannot easily type, or who prefer to dictate notes.

Since this is a dictation tool, as long as you talk slowly, it should transcribe everything accurately.

However, much like other speech-to-text AI, if you pick up the pace, introduce more participants, or use any difficult, technical words, this dictation tool will struggle.

One workaround is for you to listen to an interview with multiple speakers, then repeat what they’re saying into a microphone into Google Docs.

That should be faster than transcribing the interview yourself.

If you think that sounds like a bit of a hassle, the best way for you to transcribe audio may be to hire a transcription service to do the heavy lifting, so you can concentrate on more important tasks related to your project.

Transcribe Audio in Microsoft Word

Microsoft Word 365 now has a Transcribe feature that can help you transcribe audio using a very lightweight speech-to-text AI model.

Similar to Google Docs Voice Typing, you can dictate into a Word document.

But on top of that, it also allows you to upload an audio file for transcription.

If you have more than one speaker or background noise, results are underwhelming, in line with other speech-to-text AI.

How to Automatically Transcribe Audio to Text?

Automated speech-to-text transcription software has changed the way most people transcribe audio these days.

Our article on the Best Speech Recognition Software covers our top recommendations.

However, while speech-to-text AI sounds amazing in theory, it has its limitations.  Below are some things to note:

  1. For best results, have only one speaker, speaking slowly, next to the microphone, without any background noise.
  2. If it’s a two-person interview, make sure both of you speak slowly and there’s no overlapping conversation.
  3. Try not to use difficult words, super niche terms, or industry jargon.
  4. Convert your audio or video files to a smaller file format like MP3.

If you can take care of a couple of those things, most speech-to-text software should do a decent enough job.
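As a rough illustration of the MP3 conversion step in the list above, here is a minimal sketch that builds an ffmpeg command from Python. It assumes the free ffmpeg command-line tool is installed; the file name `interview.mp4`, the helper names, and the 64k mono settings are our own illustrative choices, not requirements of any particular speech-to-text software.

```python
import subprocess
from pathlib import Path


def build_mp3_command(source, bitrate="64k"):
    """Build an ffmpeg command that writes a mono MP3 next to `source`."""
    source = Path(source)
    target = source.with_suffix(".mp3")
    if target == source:
        raise ValueError("source is already an .mp3 file")
    return [
        "ffmpeg",
        "-i", str(source),  # input audio or video file
        "-vn",              # ignore any video stream
        "-ac", "1",         # mix down to one channel (mono)
        "-b:a", bitrate,    # target audio bitrate; 64k is an assumed value
        str(target),
    ]


def convert_to_mp3(source, bitrate="64k"):
    """Run the conversion; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_mp3_command(source, bitrate), check=True)


# Print the command for a hypothetical file instead of running it.
print(" ".join(build_mp3_command("interview.mp4")))
```

Calling `convert_to_mp3("interview.mp4")` would then write `interview.mp3` alongside the original. A lower bitrate gives a smaller file, but very low values may hurt recognition accuracy, so treat the default as a starting point.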

How to Transcribe an Interview Yourself?

If you’ve scrolled this far down, it probably means you haven’t had success with transcribing audio using the free dictation or automated speech-to-text options.

Don’t worry. Our goal with this article and our blog is to make it as easy as possible for you to transcribe audio without hassle, even on files with difficult audio.

We’ve done detailed guides for complete beginners in the past that can be helpful, like:

One of the questions we get asked the most is:

How long does it take to transcribe 1 hour of audio?

For a novice (no transcription playback software or hardware, moderate typing speed): 12-14 hours

For an experienced transcriber (transcription equipment, fast typing speed): 6-7 hours

If you haven’t replayed the same part of a recording at least 10 times, you haven’t really transcribed. :)
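To apply those ratios to a recording of any length, a small calculation helps. The sketch below is only an illustration: the ratios are the estimates quoted above, and the function and dictionary names are made up for this example.

```python
# Working hours needed per hour of audio, taken from the estimates above.
RATIOS = {
    "novice": (12, 14),
    "experienced": (6, 7),
}


def estimate_hours(audio_minutes, level):
    """Return a (low, high) estimate of working hours for a recording."""
    low_ratio, high_ratio = RATIOS[level]
    audio_hours = audio_minutes / 60
    return (audio_hours * low_ratio, audio_hours * high_ratio)


# Example: a 30-minute interview for each experience level.
for level in RATIOS:
    low, high = estimate_hours(30, level)
    print(f"{level}: roughly {low:.1f} to {high:.1f} hours")
```

Treat the result as a planning range rather than a promise, since difficult audio can push the real figure higher.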

Whether you’re just starting out, or have been trying to increase your efficiency as a transcriber without success, it’s important you take care of the basics of transcribing audio.

The steps below can help you do that:

1. Have a Big Cup of Coffee Ready

While not strictly necessary, it can help.

To transcribe audio efficiently, you will have to sustain intense concentration for hours on end.

If you’re not a fan of caffeine or stimulants in general, substitute tea, water or short workouts during your breaks to keep yourself energized.

Be careful not to drink too much coffee though, as it can leave you jittery and then sleepy once it wears off.

2. Take Regular Breaks

Nothing hampers your productivity more than stress.

So, take regular breaks, breathe in some fresh air, go out for a walk, just do something to break the monotony.

A good rule of thumb that has brought me great results is to work for 45 minutes and then take a 15-minute break, every hour.
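If you like to plan your session up front, the 45/15 rule above can be laid out as a simple timetable. This is just a sketch; the function name and the example start time are arbitrary.

```python
from datetime import datetime, timedelta


def work_break_schedule(start, hours, work_minutes=45, break_minutes=15):
    """Return (work_start, break_start, break_end) tuples, one per hour."""
    blocks = []
    current = start
    for _ in range(hours):
        break_start = current + timedelta(minutes=work_minutes)
        break_end = break_start + timedelta(minutes=break_minutes)
        blocks.append((current, break_start, break_end))
        current = break_end
    return blocks


# Example: a three-hour session starting at 9:00 on an arbitrary date.
for work, pause, resume in work_break_schedule(datetime(2024, 1, 1, 9, 0), 3):
    print(f"type {work:%H:%M}-{pause:%H:%M}, break {pause:%H:%M}-{resume:%H:%M}")
```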

3. Choose the Right Transcription Playback Software

If you’ve been using Windows Media Player to transcribe audio without success, there is a reason.

You shouldn’t control file playback with your hands.  (Unless you’re an alien typist with 4 hands and 20 fingers.)

It usually takes an experienced transcriber (with the right equipment) 1-2 hours to transcribe 20 minutes of audio or video.

That’s because you have to play, pause and rewind the audio or video file several times while transcribing.

So you need playback controls like keyboard shortcuts that work even when the playback software window is minimized.

The software we’ve been using for the last 17 years to transcribe audio is called Express Scribe.

The link above will help you install the free version of the software, which is more than good enough to get you started for now.

Once installed, please refer to our guide on How to Use Express Scribe for a quick overview and some tips and tricks before you begin to transcribe audio.

4. Invest in Good Transcription Hardware

If you’re going to transcribe audio often, for example on a daily basis for a period of time, please invest in transcription playback hardware.

And by that we mean transcription foot pedals.

These are a game changer when it comes to speeding up the transcription process.

They’re more efficient than keyboard shortcuts because you control playback with your feet while your hands are free to type.

A foot pedal helps you get in a rhythm and flow where you don’t have to stop typing every time you need to use playback controls like play, pause, rewind etc.

Why invest in a transcription foot pedal?

It cuts your time in half, literally.

Without a foot pedal, it will take you 10-12 hours to transcribe 60 minutes of audio or video.

For the same duration, a transcription foot pedal cuts the time down to 5-6 hours.

So if you’re in it for the long haul, this investment will save you TONS of time.

5. One Step at a Time

Understand that in order to transcribe audio efficiently, you’re going to have to be patient.

Audio transcription can be a tedious and often frustrating process that requires a lot of concentration.

Take breaks often, get enough sleep, and take it easy at first.

You can build up speed as you get used to it all.

A lot of people jump in with plenty of enthusiasm but get discouraged when they see that 2 hours of effort have produced only 5 minutes of transcribed audio.

But Rome wasn’t built in a day.

How to Transcribe a Video?

There is no dedicated video transcription software that offers more features than audio transcription tools.

It’s often more practical to extract the audio from the video and load it in transcription playback software like Express Scribe, especially if you want to add timestamps in Express Scribe to make an SRT file.

Having said that, Express Scribe allows you to play video and transcribe text next to it in the same window with playback controls, so that makes things a bit easier.
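If you ever need to assemble an SRT file by hand from your own timestamps, the format itself is simple: a running number, a start and end time written as hours:minutes:seconds,milliseconds, and the caption text. The sketch below shows that layout; the helper names and sample captions are invented for illustration, and this is not how Express Scribe itself produces its output.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm, the SRT timestamp layout."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index, start, end, text):
    """Build one numbered SRT caption block."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


# Hypothetical captions: (start seconds, end seconds, text).
cues = [
    (0.0, 2.5, "Thanks for joining me today."),
    (2.5, 5.0, "Happy to be here."),
]
# Cues in an SRT file are separated by a blank line.
print("\n".join(srt_cue(i, s, e, t) for i, (s, e, t) in enumerate(cues, start=1)))
```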

How to Transcribe a YouTube Video?

If YouTube’s buggy and grossly inaccurate closed captions don’t float your boat, you may want to transcribe a YouTube video yourself.

YouTube doesn’t allow keyboard shortcuts or foot pedals for playback controls, so it’d be fairly difficult to transcribe it that way.

However, you can do a Google search for “youtube to mp3” and find websites which will convert a YouTube video to MP3 or another audio format online.

This is getting more difficult, though, as YouTube’s copyright protections may prevent you from extracting MP3 audio from its videos.

Online YouTube to MP3 converters still exist though, and you should take advantage while you can.

Most of them allow you to paste the video link and convert to MP3 in a few seconds or minutes.

Once you have the MP3, you can load it in Express Scribe and start transcribing.


  1. Dictation software and speech-to-text AI are good, but don’t expect any miracles or 100% accuracy.
  2. Be patient when you start to transcribe audio.
  3. Invest in the right software and hardware to increase efficiency and make your life a little easier.
  4. Use online tools to your advantage for video or YouTube transcription.

If you think this guide was helpful, please share it with your friends.

Have any suggestions to improve this article and make it more complete?  Please leave a comment below.

Our transcription services have been rated excellent with 100% 5-star reviews.  Let us know if we can help you out too.