Case Study: Captioning Emotions Rather Than Words

This case study proved two things: humans still beat machines when nuance is the brief, and McGowan Transcriptions can evolve traditional transcription skills into something far more bespoke when a project demands it.

A client approached McGowan Transcriptions with an unconventional request. Instead of traditional subtitles, they needed SRT files that timestamped only the changes in an actor’s emotions, ignoring the spoken words entirely.

The Brief

  • 80 videos, 15 minutes each
  • 15 emotions defined by the client
  • 240 files in total (3 per video)

The videos featured actors speaking direct-to-camera.

Rather than traditional subtitles, the client needed emotion-only captions: clear markers pinpointing the exact moment an actor’s emotion shifted, selected strictly from the 15 pre-approved emotions the client provided.

The Challenge

Emotion is subjective. What seems confident to one viewer may look nervous to another. We explained this to the client, who acknowledged the issue but felt that in their videos, the changes would be clear enough.

With that reassurance, and the understanding that our interpretation would always be subject to their review, we agreed to proceed.

The Approach

After analysis of the project, we proposed a three-file structure for each video:

  • Full transcript
  • Standard closed captions (SRT)
  • Emotion-only SRT, formatted in square brackets
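As an illustration of the third file type, here is a minimal Python sketch of how a list of emotion shifts could be written out as SRT cues with the label in square brackets. It is not the tooling used on the project. The function names, the labels "confident" and "nervous", and the choice to run each cue until the next shift are all assumptions made for the example; the source does not specify how cue end times were set.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def emotion_srt(shifts, video_end):
    """Build emotion-only SRT text.

    shifts: list of (start_seconds, label) tuples, sorted by start time.
    video_end: end of the video in seconds, used to close the last cue.
    Assumption: each cue lasts until the next emotion shift.
    """
    blocks = []
    for i, (start, label) in enumerate(shifts):
        end = shifts[i + 1][0] if i + 1 < len(shifts) else video_end
        blocks.append(
            f"{i + 1}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"[{label}]\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Placeholder labels; the client's actual 15 emotions are not listed in the source.
    print(emotion_srt([(0.0, "confident"), (12.5, "nervous")], video_end=20.0))
```

Keeping the caption text to a single bracketed label per cue makes the emotion file easy to tell apart from the standard closed captions for the same video.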

The client provided two test videos, each capturing one side of a two-person conversation, giving us the perfect setup to trial and demonstrate our process.

The test proved our in-house transcribers could spot emotion shifts reliably and log them consistently in the agreed SRT format. Once the client signed off the results, we scaled the process across the full 80-video project.

Keeping 20+ Transcribers in Sync

To maintain consistency across a project built on interpretation, we:

  • Expert Oversight: Formed a senior group of transcribers
  • Dedicated Client Line: Created a client-specific feedback channel
  • Team Calibration: Trained the wider team using real video examples
  • Rigorous Quality Control: Ran QC on every single file to remove stray text and verify emotion timestamps and taxonomy accuracy
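Some of the mechanical parts of that QC step (stray text, malformed or out-of-order timestamps, labels outside the agreed list) lend themselves to an automated pre-check before human review. The sketch below is a hypothetical example under stated assumptions, not a description of the checks actually run: it assumes each cue is exactly three lines (index, timing, one bracketed label), and the function name and approved labels are placeholders.

```python
import re

TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")
TIMING_LINE_RE = re.compile(r"^(\S+) --> (\S+)$")
LABEL_RE = re.compile(r"^\[([^\[\]]+)\]$")


def to_ms(stamp):
    """Convert an SRT timestamp to milliseconds, or None if malformed."""
    m = TIME_RE.fullmatch(stamp)
    if m is None:
        return None
    h, mi, s, ms = map(int, m.groups())
    return ((h * 60 + mi) * 60 + s) * 1000 + ms


def qc_emotion_srt(text, approved):
    """Return a list of problems found in an emotion-only SRT file."""
    problems = []
    prev_end = 0
    # Cues are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for n, block in enumerate(blocks, start=1):
        lines = [line.strip() for line in block.strip().splitlines()]
        if len(lines) != 3:
            problems.append(f"cue {n}: expected 3 lines, found {len(lines)}")
            continue
        index, timing, caption = lines
        if index != str(n):
            problems.append(f"cue {n}: index is '{index}'")
        tm = TIMING_LINE_RE.match(timing)
        start = to_ms(tm.group(1)) if tm else None
        end = to_ms(tm.group(2)) if tm else None
        if start is None or end is None:
            problems.append(f"cue {n}: malformed timing line")
        else:
            if end <= start:
                problems.append(f"cue {n}: end is not after start")
            if start < prev_end:
                problems.append(f"cue {n}: overlaps the previous cue")
            prev_end = end
        label = LABEL_RE.match(caption)
        if label is None:
            problems.append(f"cue {n}: caption is not a single bracketed label")
        elif label.group(1) not in approved:
            problems.append(f"cue {n}: '{label.group(1)}' is not an approved emotion")
    return problems
```

A check like this can only confirm that a file is well formed and uses the agreed labels. Whether the chosen emotion is the right one at that moment remains a human judgement.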

This final layer of scrutiny meant the client received a fully brief-compliant set of files, ready for immediate use with complete confidence in their accuracy and consistency.

Human vs AI

Although AI is advancing rapidly, this was not a project that could ever be automated. Detecting emotion on screen is not a fixed science; it relies on subtle cues such as tone of voice, body language, and facial expression, which can often be ambiguous or overlapping. A computer might detect a smile, but it cannot tell if it reflects happiness, nervousness, or even sarcasm.

What the client required wasn’t detection; it was judgement: thoughtful, reasoned, and applied the same way across 80 videos. Only experienced human transcribers, working to a strict 15-emotion framework, could interpret those shifts reliably and timestamp them in a way that was useful for research and training.

In this case, 100% human input was essential. It caught the clear changes, navigated the unclear ones, and brought logic where emotions overlapped or blurred. The result: captions that actually meant something, built on 100% human judgement, 100% consistently applied.

The Result

We delivered bespoke emotion captions, standard captions, and transcripts for all 80 videos:

  • Project Delivery at Scale: Delivered 240 files across 80 videos
  • Right First Time: On time and in the exact format required
  • Ready to Use, No Rework: Immediately usable for the client

The emotion-only files are now being used for research and training within a much larger project rolling out later this year. The client didn’t just get files. They got a consistent foundation they could trust and build on.

The Reflection

This project highlights what McGowan Transcriptions does best. Drawing on our 30+ years of experience, we were able to:

  • Adapt quickly when projects go off-script
  • Train and unify a team across remote platforms
  • Apply decades of expertise with structured consistency
  • Deliver work that’s useful, not just accurate

Sometimes the brief isn’t words; it’s emotion. Either way, we caption what matters.


