Services

Speech, audio and audio-visual datasets for AI and research

FYI Africa collects African speech, audio and video data and processes it through transcription, annotation, metadata structuring and quality control.

Speech: Collected
Audio: Interviews, mobile recordings, noisy environments (Structured)
Video: Speaker video, UX tasks, interaction recordings (Reviewed)
Text: Transcripts, translations, labels, timestamps (Annotated)
Delivery: Files, metadata, consent tracking and QC report (Ready)

What we do

Four core service areas

Each project is structured around the client’s use case, target languages, data type, consent requirements, metadata needs and delivery specification.

01 / Speech Data Collection

Speech datasets for systems that need to understand African speakers, languages and accents.

FYI Africa collects speech data for voice AI, speech recognition, conversational systems, call-centre AI, voice assistants and multilingual model evaluation.

ASR TTS Voice assistants Call-centre AI Accent adaptation

Includes

  • Read/scripted speech
  • Spontaneous/natural speech
  • Conversational speech
  • Call-centre/telephony-style audio
  • Command and control speech
  • Wake-word recordings
  • Code-switching speech

Use cases

  • Automatic speech recognition
  • Text-to-speech support
  • Conversational AI
  • Speech analytics
  • Voice search
  • Automotive voice systems
  • Low-resource language modelling

02 / Audio Data Collection

Audio datasets where sound, speech behaviour, environment or acoustic context matters.

FYI Africa collects audio data in controlled, semi-controlled and real-world environments, depending on the project specification.

Human voice Multi-speaker audio Mobile recordings Noisy environments

Includes

  • Human voice recordings
  • Multi-speaker audio
  • Interview recordings
  • Group discussions
  • Product feedback recordings
  • Task-based user recordings
  • Mobile-device recordings
  • Noisy-environment audio

Use cases

  • Speech model robustness
  • AI model training and evaluation
  • Acoustic testing
  • User research
  • Product testing
  • Localisation research

03 / Audio-Visual Data Collection

Audio-visual datasets for multimodal AI, user research and real-world evaluation.

FYI Africa collects consented audio-visual data where spoken content, visual context, task behaviour or interaction environment matters.

Video interviews UX recordings Multimodal AI Interaction data

Includes

  • Video interviews
  • Speaker videos paired with audio
  • Product interaction recordings
  • Customer service simulations
  • User experience research recordings
  • Instruction-following tasks
  • Screen-and-camera recordings
  • Multilingual video responses
  • Code-switching video responses

Use cases

  • Multimodal AI
  • Speech-plus-video model testing
  • Human interaction datasets
  • Behaviour and context-aware model evaluation
  • User experience research
  • Localisation testing

04 / Dataset Processing and Delivery

Turning raw recordings into structured, usable datasets.

FYI Africa can support the processing layer that makes collected data usable for training, testing, evaluation, localisation and research.

Transcription Translation Annotation Metadata QC

Includes

  • Verbatim transcription
  • Clean transcription
  • Local-language transcription
  • Translation into English
  • Speaker-labelled transcription
  • Timestamped transcription
  • Code-switching transcription

Dataset outputs

  • Annotation and labelling
  • Metadata structuring
  • Quality-control reporting
  • Structured file delivery
  • Consent tracking sheets
  • Delivery summary

Dataset delivery

Complete, usable datasets — not just recordings

FYI Africa structures delivery around the client’s technical requirements, consent needs, metadata fields and quality criteria.

Files

Audio, video or audio-visual files delivered in agreed technical formats.

Text

Transcripts, translations, speaker labels, timestamps and text outputs.

Metadata

Structured fields for language, accent, region, speaker profile, device and environment.
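
As a rough illustration of what one structured metadata record could look like, here is a minimal Python sketch. The field names mirror the categories listed above, but the schema itself, the `RecordingMetadata` class, the `to_json_line` helper and every sample value are hypothetical; actual fields are agreed per project.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RecordingMetadata:
    """One metadata record per delivered file (hypothetical schema)."""
    file_name: str
    language: str
    accent: str
    region: str
    speaker_profile: dict  # e.g. age band and gender, as agreed per project
    device: str
    environment: str


def to_json_line(record: RecordingMetadata) -> str:
    """Serialise a record as one JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


# Placeholder values for illustration only, not real project data.
example = RecordingMetadata(
    file_name="sample_0001.wav",
    language="isiZulu",
    accent="placeholder-accent",
    region="placeholder-region",
    speaker_profile={"age_band": "25-34", "gender": "female"},
    device="mobile",
    environment="noisy",
)

print(to_json_line(example))
```

Writing one JSON line per file keeps the metadata easy to validate against the agreed field list and easy to join to the audio or video files by name.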

QC

Quality-control reporting, consent tracking and delivery summaries where required.

Start a project

Need a custom African dataset?

Tell us the data type, use case, target languages, sample design and delivery requirements. We’ll help define the right project scope.
