Transcription

This section describes features that are coming in 4.0.

With Transcription in place, a content instance will automatically convert an audio source attachment into a text file containing a transcription of the speech in the source, in the language detected. The transcript contains all of the spoken words identified in the audio file. It is saved onto the original node as a text/plain attachment named transcription.

Any audio content that you place into Cloud CMS will automatically have a transcription generated for it. As you update or modify the audio, the transcription will be kept in sync.

In addition, the transcription text will be indexed for full-text search. As such, your editorial team will be able to search for content based on the spoken audio.

For example, you might define a content type for audio files where the editor can fill in attributes about the audio itself. As part of this, you may wish to have a transcript generated. Adding the f:auto-transcribe feature to this content type makes that happen automatically.

Triggering Transcription

To trigger transcription, you will need to add the f:auto-transcribe feature to your content instance.

When you add the f:auto-transcribe feature to a content instance, Transcription will run every time you make a change to the audio attachment source.
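As a rough illustration of what "adding the feature" means at the data level, the sketch below marks a node's JSON with f:auto-transcribe. It assumes that features are recorded under a `_features` key, keyed by feature id with a (possibly empty) configuration object; that key, the sample node, and the helper name `with_auto_transcribe` are assumptions for illustration and should be checked against the formal feature documentation.

```python
import copy
import json

FEATURE_ID = "f:auto-transcribe"


def with_auto_transcribe(node: dict) -> dict:
    """Return a copy of the node JSON with the feature added (hypothetical helper)."""
    updated = copy.deepcopy(node)
    # Assumption: features live under "_features", keyed by feature id,
    # each mapped to a configuration object (empty here).
    updated.setdefault("_features", {}).setdefault(FEATURE_ID, {})
    return updated


if __name__ == "__main__":
    # Sample node; the title and type name are made up for this example.
    node = {"title": "Interview recording", "_type": "my:audio"}
    print(json.dumps(with_auto_transcribe(node), indent=2))
```

The helper copies the node rather than mutating it, so the original JSON can still be compared against the updated version before it is saved.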

For more information on this feature, please check out our formal documentation on the f:auto-transcribe feature.

Using a Drop Zone

We recommend considering the use of a "Drop Zone" folder to let your editorial users drop audio files into Cloud CMS whenever they'd like. A Drop Zone folder can be configured to execute a Rule when new content arrives. The rule can then execute the Add a Feature action to add the f:auto-transcribe feature to newly arrived items.

To do so, set up a Rule on the folder that is bound to the p:afterAssociateNode policy and have it run the Add a Feature action. Now, when a user drops an audio file into that folder, Transcription will run.

You might then configure a follow-on action that moves the document to a different folder.
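The Drop Zone flow described above can be pictured with a small in-memory model. This is not the Cloud CMS API: the `Folder` class, the `add_feature` and `move_to` actions, and the `_features` key are all hypothetical stand-ins, meant only to show the order of events (a node arrives, the rule adds the feature, a follow-on action moves the node).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Node = Dict[str, Any]
Action = Callable[[Node], None]


@dataclass
class Folder:
    """Toy folder whose rules fire after a node is associated with it."""
    name: str
    children: List[Node] = field(default_factory=list)
    rules: List[Action] = field(default_factory=list)

    def associate(self, node: Node) -> None:
        # Stand-in for the p:afterAssociateNode policy: run every rule
        # action once the node has been placed in the folder.
        self.children.append(node)
        for action in self.rules:
            action(node)


def add_feature(feature_id: str) -> Action:
    """Model of the Add a Feature action (hypothetical)."""
    def action(node: Node) -> None:
        node.setdefault("_features", {}).setdefault(feature_id, {})
    return action


def move_to(source: Folder, target: Folder) -> Action:
    """Model of a follow-on action that moves the node elsewhere."""
    def action(node: Node) -> None:
        source.children.remove(node)
        target.children.append(node)
    return action


if __name__ == "__main__":
    drop_zone = Folder("Drop Zone")
    processed = Folder("Processed")
    drop_zone.rules = [add_feature("f:auto-transcribe"),
                       move_to(drop_zone, processed)]
    drop_zone.associate({"title": "podcast-episode.mp3"})
    print(processed.children)
```

In this model the actions run in list order, so the feature is added before the node is moved out of the Drop Zone.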

Transcription Results

Once Transcription has completed, there will be an attachment stored on your node with the results:

  • transcription - a text file

transcription

The transcription attachment is a plain-text file (text/plain format) containing the transcript.
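As a hedged sketch of how a client might read the result, the code below builds a URL for the transcription attachment and downloads it as text. The URL layout (repositories, branches, nodes, attachments), the bearer-token header, and the UTF-8 decoding are assumptions about the REST API rather than facts stated in this section; the base URL and ids in the example are placeholders.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

ATTACHMENT_ID = "transcription"  # attachment name given in this section


def transcription_url(base_url: str, repository_id: str,
                      branch_id: str, node_id: str) -> str:
    """Build the attachment URL (path layout is an assumption)."""
    parts = [
        ("repositories", repository_id),
        ("branches", branch_id),
        ("nodes", node_id),
        ("attachments", ATTACHMENT_ID),
    ]
    # Percent-encode each id so unusual characters cannot break the path.
    path = "/".join(f"{key}/{quote(value, safe='')}" for key, value in parts)
    return f"{base_url.rstrip('/')}/{path}"


def fetch_transcription(url: str, access_token: str) -> str:
    """Download the text/plain attachment; assumes bearer auth and UTF-8."""
    request = Request(url, headers={"Authorization": f"Bearer {access_token}"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder values; no network call is made here.
    print(transcription_url("https://api.example.com", "repo1", "master", "node1"))
```

Keeping URL construction separate from the network call makes the path logic easy to check without credentials.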

Transcription Providers

The following Transcription Service Providers are available: