Multimedia Processing

The Associated Press wished to automate the creation of video shotlists to increase the throughput of assets they would be able to sell to their customers. They turned to Vidrovr to deliver that solution.

The Challenge

The Associated Press provides raw content, video and writing to over 10,000 customers around the world. This makes them one of the most prolific and important sources of news worldwide.

It takes an army of editors to make sure the thousands of hours of video content that they send to customers are properly annotated and well described, so that their customers can quickly make use of those videos in their television broadcasts and online news articles.

Typically, videos are supplied to customers as a series of "shots", each of which is one visually distinct section of video. Customers then use one or more of the shots when they are producing a story. The AP team has to ensure each shot is properly described with a phrase like "Jen Psaki, White House Press Secretary arriving at the podium for a briefing".

This is an incredibly time-consuming process, and they would like to alleviate this work for their editors so they can undertake more creative tasks.

Our Solution

In conjunction with Limecraft and the IBC, Vidrovr developed an "Automated shotlisting" solution that relieves editors of manually describing each of the shots within a video that they publish to their customers. When the raw video is sent to Vidrovr, Vidrovr extracts timestamps for when each shot starts and ends, along with a detailed description of what happens in that shot.

Example shot caption

This example shot, which has been analyzed by Vidrovr AI and annotated with a description for future search purposes, demonstrates the powerful capabilities of our technology.

Description: "Donald Trump, former US President, shakes hands with Germany's Chancellor Angela Merkel"


The Vidrovr shotlist action then provided the following information, which the AP team published to their clients alongside the raw video asset:

  • Exact start and end frame for a given shot
  • The political or popular figure(s) that are appearing in a shot
  • The language being spoken in a shot
  • A human-readable sentence highlighting the figures, setting and any actions that are taking place
  • Additional knowledge-graph-powered person and entity titling, such as "Jen Psaki": "White House Press Secretary"
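
To make the list above concrete, here is a minimal sketch of how one shot record could be represented in code. It is not Vidrovr's actual output format: the class names `Person` and `Shot`, the field names, the inclusive end frame, and the `format_shotlist` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """A recognized figure plus a title (hypothetical structure)."""
    name: str    # e.g. "Jen Psaki"
    title: str   # e.g. "White House Press Secretary"


@dataclass
class Shot:
    """One visually distinct section of video (hypothetical structure)."""
    start_frame: int          # first frame of the shot
    end_frame: int            # last frame of the shot (assumed inclusive)
    language: str             # language spoken in the shot, e.g. "en"
    description: str          # human-readable sentence describing the shot
    people: List[Person] = field(default_factory=list)

    def duration_seconds(self, fps: float) -> float:
        """Length of the shot in seconds at the given frame rate."""
        return (self.end_frame - self.start_frame + 1) / fps


def format_shotlist(shots: List[Shot]) -> str:
    """Render shots as a numbered, plain-text shotlist."""
    lines = []
    for number, shot in enumerate(shots, start=1):
        lines.append(
            f"{number}. [{shot.start_frame}-{shot.end_frame}] {shot.description}"
        )
    return "\n".join(lines)
```

An editor-facing tool could build a list of such `Shot` objects for each video and call `format_shotlist` to produce the text that is reviewed before publication.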

Further shot caption examples

"President candidate Bernie Sanders, delivers a speech in the US state of Nevada"
"SOUNDBITE (ENGLISH): Joe Biden, U.S. President"


Because this was a human-in-the-loop system built to augment editors' workflows, the AP team inspected the shot lists generated by Vidrovr and corrected or changed any issues.

With each correction, the Vidrovr system incorporated the change and adjusted the automated shot captioning model, creating a feedback loop that leads to improved performance.

Contact us today to learn more: email – or, if you prefer, request a demonstration on our website.
