
Speed / Pitch

Change playback speed (keeps pitch) and optionally shift pitch in semitones.

Max 500 MB
Speed (0.5×–2.0×)

Speed uses atempo, which preserves pitch.

Pitch (semitones)

Pitch shift uses asetrate + aresample. Combine with speed as needed.

Processed server-side

This tool uses a server-side service for processing; uploaded files or requests are not kept for long-term storage.

About

This tool lets you speed up or slow down audio for practical listening and practice workflows. Make a long podcast faster to get through it on a commute, slow a language lesson down so you can catch every phoneme, take a recorded lecture at 1.5× and cut an hour off your review time, or drop a musical backing track by a few semitones to fit your vocal range. The workflow is the same in every case: upload a file, pick a speed (and optionally a pitch change), and download an MP3 at the new tempo.

Two related but independent settings do the work: speed (or tempo), and pitch. Speed is how fast the waveform plays back, measured as a ratio relative to the original (0.5× is half speed, 2.0× is double speed). Pitch is how high or low the tone sounds, measured in semitones where 12 semitones equal a full octave. In general, a speed change can either preserve pitch (time‑stretch) or shift pitch along with it (the classic “tape speed” effect). The Speed setting here uses atempo, so it preserves pitch, and the Pitch setting shifts pitch independently without changing speed.
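The settings above name three FFmpeg audio filters: atempo, asetrate, and aresample. The sketch below shows one way they could be combined into a single filter string. It is an illustration under stated assumptions, not the tool's actual implementation: the 44100 Hz sample rate, the function names, and the step that compensates the tempo change caused by asetrate are all assumptions made for the example.

```python
def atempo_chain(tempo: float) -> list[str]:
    """Split a tempo factor into atempo stages that each stay within 0.5-2.0."""
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    stages = []
    while tempo > 2.0:
        stages.append(f"atempo={2.0:.6f}")
        tempo /= 2.0
    while tempo < 0.5:
        stages.append(f"atempo={0.5:.6f}")
        tempo /= 0.5
    stages.append(f"atempo={tempo:.6f}")
    return stages


def build_filter(speed: float, semitones: float = 0.0, sample_rate: int = 44100) -> str:
    """Build a hypothetical FFmpeg -af string for a speed ratio and a pitch shift.

    sample_rate is an assumed input rate; a real pipeline would probe the file.
    """
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within 0.5-2.0")
    filters = []
    tempo = speed
    if semitones:
        ratio = 2.0 ** (semitones / 12.0)  # equal-temperament frequency ratio
        # asetrate reinterprets the sample rate, which shifts pitch AND tempo by `ratio`.
        filters.append(f"asetrate={round(sample_rate * ratio)}")
        # aresample converts back to the original rate so players handle the file normally.
        filters.append(f"aresample={sample_rate}")
        # Assumption: divide out the tempo change asetrate introduced.
        tempo = speed / ratio
    filters.extend(atempo_chain(tempo))
    return ",".join(filters)


if __name__ == "__main__":
    print(build_filter(1.5))        # speed only
    print(build_filter(1.0, -3))    # pitch only, tempo compensated
```

The resulting string would be passed to FFmpeg's `-af` option. Chaining several atempo stages keeps each stage inside the 0.5 to 2.0 range, which matters when the pitch compensation pushes the combined tempo factor outside it.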

Time‑stretch (preserve pitch) is the right mode for almost all listening use cases. Doubling a podcast’s speed without pitch preservation turns every voice into a chipmunk; with pitch preservation the speaker just sounds faster, which is usually what you want. For language learning, 0.75× or 0.5× with pitch preservation gives you clearer consonants and vowels without the weird low‑pitched drone that a naïve slowdown would produce.

Pitch shifting in semitones is where musical use cases live. A singer whose range doesn’t quite fit a recording can drop the backing track by 2 or 3 semitones to get a comfortable key. Guitar players practicing along with a reference can shift the track up or down to match their tuning. Songwriters sketching cover ideas can try a song in different keys in seconds. One semitone is the smallest step between adjacent notes in Western music; twelve semitones make a full octave.
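To make the semitone arithmetic concrete, here is a small sketch that assumes twelve-tone equal temperament, where each semitone multiplies frequency by the twelfth root of two. The function names and the sharp-only note spelling are choices made for this example, not part of the tool.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def transpose_key(key: str, semitones: int) -> str:
    """Return the key name reached by moving `key` up or down by `semitones`."""
    index = NOTE_NAMES.index(key)  # raises ValueError for unknown names
    return NOTE_NAMES[(index + semitones) % 12]


if __name__ == "__main__":
    # A backing track in C dropped by 3 semitones lands in A.
    print(transpose_key("C", -3))
    # A 440 Hz tone shifted up one octave (12 semitones) doubles in frequency.
    print(440.0 * semitone_ratio(12))
```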

Safe ranges and artifact behavior: small changes (0.85×–1.25× speed, ±2 semitones pitch) are almost transparent on most sources. Moderate changes (0.6×–1.6× speed, ±5 semitones) still sound natural for speech but start showing processing artifacts on music: slight “swishing,” transient smearing on drums, or a metallic tone on complex mixes. Larger changes (toward the 0.5× and 2.0× limits of the speed control, or pitch shifts of an octave or more) sound noticeably processed and are usually only useful for effect or analysis rather than enjoyable listening.

Practical listening presets people actually use: 1.25× is a gentle podcast speedup that still feels natural. 1.5× is the sweet spot for review and study content, shaving a third off the runtime. 1.75× and 2.0× work for information‑dense material but require more attention. 0.75× is a comfortable slowdown for dense technical explanations. 0.5× is useful for transcription and language drills but sounds obviously slowed. For music, ±1–2 semitones is standard for key matching; ±3 is the upper end before a song stops feeling like itself.

Why audio sounds “robotic” at extreme settings: time‑stretch algorithms work by analyzing short windows of the signal and repeating or thinning them to change duration without changing pitch. When changes are small, those repetitions are imperceptible; when they are large, the windows start to become audible as a texture, and transient events (drum hits, hard consonants) smear across adjacent windows. Complex polyphonic music stresses the algorithm most; plain speech stresses it least. A smaller change always sounds cleaner than an extreme one.
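The windowing idea can be sketched with a deliberately naive overlap-add stretch. This is not the algorithm the tool or atempo uses; it is a simplified illustration in which windows are read from the input at one spacing and written to the output at another. The window size, the Hann window, and the function names are assumptions for the example.

```python
import math


def hann(n: int) -> list[float]:
    """Periodic Hann window; copies at 50% overlap sum to 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_stretch(samples: list[float], speed: float, window: int = 1024) -> list[float]:
    """Naive overlap-add time-stretch.

    Windows are read every hop_out * speed samples and written every hop_out
    samples, so speed > 1 skips input (thinning) and speed < 1 re-reads
    overlapping input (repeating). Pitch inside each window is unchanged.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if len(samples) < window:
        raise ValueError("input is shorter than one window")
    hop_out = window // 2
    hop_in = hop_out * speed
    win = hann(window)
    n_frames = int((len(samples) - window) / hop_in) + 1
    out = [0.0] * ((n_frames - 1) * hop_out + window)
    for frame in range(n_frames):
        read_at = int(frame * hop_in)
        write_at = frame * hop_out
        for i in range(window):
            out[write_at + i] += samples[read_at + i] * win[i]
    return out


if __name__ == "__main__":
    rate = 8000  # assumed sample rate for the demo tone
    tone = [math.sin(2.0 * math.pi * 220.0 * i / rate) for i in range(2 * rate)]
    for speed in (0.5, 1.0, 2.0):
        stretched = ola_stretch(tone, speed)
        print(speed, len(stretched) / len(tone))
```

Because this version never aligns the waveforms of neighboring windows, the seams between them become audible quickly, which is a crude version of the texture described above. Production time-stretchers add a step that searches for well-matched window positions to reduce it.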

Quality and workflow tips: test a short region first before committing to the full file, especially on music. Work from the highest‑quality source you have — a heavily compressed MP3 will show more artifacts after stretching than a WAV or lossless source. Do the speed/pitch change once, not in multiple passes, because each pass adds both algorithmic artifacts and (for this tool’s MP3 export) lossy encoding. If the final file will sit next to other audio, follow up with Normalize so the loudness matches.

A sensible pipeline: (1) if your source is a video, extract the audio first with MP4 → MP3 or MP4 → M4A, (2) trim the exact region you want to play with, (3) apply speed and/or pitch changes here, (4) normalize loudness at the very end. Doing the speed change before normalization is important because changing tempo and pitch can shift perceived loudness; normalizing afterwards produces a consistent final level. Save the original source file so you can rerun with different settings without compounding quality loss.

Things this tool is not for: elaborate multi‑band pitch correction, vocal tuning, removing vocals from a track, or changing speed for only a section of the file. It applies one speed and (optionally) one pitch shift uniformly to the entire file. For surgical musical tasks — auto‑tune‑style pitch correction, key‑change automation inside a song, isolating a specific part — a full DAW with pitch/time plugins is the right tool. This tool is built for simple, whole‑file transformations you want done in seconds.

Duration math: the new duration equals the original duration divided by the speed ratio. A 60‑minute lecture at 1.5× becomes 40 minutes; at 2.0× it becomes 30 minutes. A 3‑minute song at 0.75× becomes 4 minutes. This is exact (the tool doesn’t round), so you can plan review windows or playlist slots confidently. If you need a specific target duration, divide the original duration by the target to get the speed: a 48‑minute target from a 60‑minute source needs 60 ÷ 48 = 1.25×. The result has to fall within the 0.5×–2.0× range the tool supports.
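The duration arithmetic can be written as a short helper. The function names are hypothetical; the 0.5 to 2.0 limits come from the speed range shown on this page.

```python
MIN_SPEED = 0.5
MAX_SPEED = 2.0


def new_duration(original_seconds: float, speed: float) -> float:
    """Duration after a speed change: original divided by the speed ratio."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError("speed must be within 0.5-2.0")
    return original_seconds / speed


def speed_for_target(original_seconds: float, target_seconds: float) -> float:
    """Speed ratio needed to fit the original into a target duration."""
    if target_seconds <= 0:
        raise ValueError("target duration must be positive")
    speed = original_seconds / target_seconds
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError("required speed falls outside 0.5-2.0")
    return speed


def as_clock(seconds: float) -> str:
    """Format seconds as M:SS for quick reading."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    print(as_clock(new_duration(60 * 60, 1.5)))   # 60-minute lecture at 1.5x
    print(as_clock(new_duration(3 * 60, 0.75)))   # 3-minute song at 0.75x
    print(speed_for_target(60 * 60, 48 * 60))     # 60 minutes into 48 minutes
```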

How it works

  1. Open Speed / Pitch and choose your file or enter the required input.
  2. Check the settings and start the process.
  3. The tool creates the result with temporary server-side processing.
  4. Download the output or copy the result when it is ready.

FAQ

Can I speed up without changing pitch?
Yes. The Speed setting preserves pitch while changing tempo (it uses atempo). For listening workflows (podcasts, lectures, language practice), this is almost always the right choice; without pitch preservation, speeding up makes voices sound high and squeaky.
Why does the audio sound robotic or metallic?
At large speed or pitch changes, the algorithm’s short analysis windows become audible as a texture, and sharp transients smear across them. Smaller changes sound cleaner. Music and complex mixes stress the algorithm more than plain speech, so expect more artifacts there.
What does semitone shift actually mean?
A semitone is the smallest step between adjacent notes in Western music (for example C to C♯). Twelve semitones equal a full octave. Shifting by semitones is how musicians talk about key changes — ±2 semitones is “a tone up/down.”
Does changing the speed change the file duration?
Yes, exactly. A 60‑minute file at 1.5× becomes 40 minutes, at 2.0× becomes 30 minutes, at 0.75× becomes 80 minutes. The tool computes this exactly, so you can plan target durations confidently.
Will this work for speech and music?
Both, but artifacts are more noticeable on music and especially on complex polyphonic mixes. For speech, time‑stretch with pitch preserved sounds natural at a wide range of speeds; for music, keep speed changes within ±25% and pitch shifts within ±3 semitones for clean results.
What output format do I get?
An MP3 at a sensible bitrate. If you need a lossless stretched file (for archiving or further editing), do the speed/pitch change in a DAW and export to WAV or FLAC.
What’s a good speedup for podcasts and lectures?
1.25× is a gentle, natural increase. 1.5× is the sweet spot for learning and review content. 1.75× and 2.0× work for information‑dense material but demand more attention; try 1.5× first and step up if you still feel it’s slow.
What’s a good slowdown for language learning?
0.75× keeps the recording natural‑sounding while giving you more time to parse each word. 0.5× is useful for drilling tricky pronunciations but sounds obviously slowed. Always use pitch preservation for language work.
Can I shift pitch without changing speed?
Yes. The tool lets you shift pitch independently, which is what you want for key matching — raising or lowering the tone while keeping the tempo identical to the original. Useful for singers, instrumentalists, and songwriters testing alternate keys.
Should I apply speed and pitch changes in multiple passes?
No. Do it once with the final settings. Each pass introduces both stretching artifacts and (for MP3 export) a lossy re‑encode; stacking passes quickly degrades quality compared to a single combined transform.
Do I need to normalize after changing the speed?
Often yes. Speed and pitch changes can shift perceived loudness subtly. If the file will sit next to other audio, run it through Normalize afterwards so the final clip matches the rest of your material.
Is my file stored?
The upload is processed to generate your download and isn’t retained for long‑term storage. For very sensitive material, prefer a local DAW so the file never leaves your device.