Marco Ramponi

Developer Educator at AssemblyAI

AI trends in 2024: Graph Neural Networks

From fundamental research to productionized AI models, let’s discover how this cutting-edge technology powers production applications today and may shape the future of AI.

AI for Universal Audio Understanding: Qwen-Audio Explained

Researchers have recently made progress towards universal audio understanding, a step towards foundational audio models. The approach is based on joint audio-language pre-training that improves performance without task-specific fine-tuning.

Introducing Our New Punctuation Restoration and Truecasing Models

We’ve trained new Punctuation and Truecasing models on 13 billion words to achieve a 39% F1 score improvement for mixed-case words. A novel application of a hybrid architecture for a character-level classifier reduces inference time and improves the scalability of our Speech AI systems.

Combining Speech Recognition and Diarization in one model

A new approach to multi-speaker speech processing integrates Speaker Diarization and Automatic Speech Recognition in a unified framework. We discuss the key insights from this exciting recent development in Speech AI research.

What AI Music Generators Can Do (And How They Do It)

Text-to-music models are advancing rapidly, with new platforms for AI-generated music recently released. This guide focuses on MusicLM, MusicGen, and Stable Audio, exploring the technical breakthroughs and challenges in creating music with AI.