How to Create a Webcam Audio Visualizer with Three.js

A tutorial on how to create a Three.js powered audio visualizer that takes input from the user’s webcam.

In this tutorial you’ll learn how to create an interesting-looking audio visualizer that also takes input from the web camera. The result is a creative visualizer with a depth distortion effect. Although the final result looks complex, the Three.js code that powers it is straightforward and easy to understand.

So let’s get started.

Processing flow

The processing flow of our script is going to be the following:

  1. Create a vertex from every pixel of the image we get from the web camera input
  2. Use the image data from the web camera and apply the magnitude value of the sound frequency to the Z coordinate of each particle
  3. Draw
  4. Repeat steps 2 and 3 on every frame (a minimal sketch of this loop follows below)
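
As a rough orientation, here is a minimal sketch of how these steps map onto an animation loop. The helper names used here (createParticles, getImageDataFromVideo, updateParticles) are placeholders for the functions we build step by step in the rest of the tutorial.

// Minimal sketch of the overall flow (the helpers are built in the following sections)
const start = () => {
    createParticles();                               // step 1: one vertex per camera pixel

    const loop = () => {
        const imageData = getImageDataFromVideo();   // step 2: read the current camera frame
        updateParticles(imageData, analyser);        //         apply the audio magnitude to Z
        renderer.render(scene, camera);              // step 3: draw
        requestAnimationFrame(loop);                 // step 4: repeat
    };
    loop();
};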

Now, let’s have a look at how we can get and use the data from the web camera.

Web camera

First of all, let’s see how to access the web camera and get an image from it.

Camera access

For camera access in the browser, we can use navigator.mediaDevices.getUserMedia(). Note that it only works in a secure context (HTTPS or localhost).

<video id="video" autoplay style="display: none;"></video>
video = document.getElementById("video");

const option = {
    video: true,
    audio: false
};

// Get image from camera
navigator.mediaDevices.getUserMedia(option).then((stream) => {
    video.srcObject = stream;  // Use the stream as the source of the video tag
    video.addEventListener("loadeddata", () => {
        // ready
    });
}).catch((error) => {
    console.log(error);
});

Draw camera image to canvas

Once camera access has succeeded, we grab the current frame from the camera and draw it onto a canvas.

const getImageDataFromVideo = () => {
    const w = video.videoWidth;
    const h = video.videoHeight;
    
    canvas.width = w;
    canvas.height = h;
    
    // Flip the image horizontally so it behaves like a mirror
    ctx.translate(w, 0);
    ctx.scale(-1, 1);

    // Draw the current video frame to the canvas
    ctx.drawImage(video, 0, 0);

    // Get image as array
    return ctx.getImageData(0, 0, w, h);
};
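
The function above assumes a canvas and its 2D context that were created once beforehand. Since the canvas is only used to read back pixels, it does not have to be attached to the DOM; a minimal setup could look like this (the names canvas and ctx match the ones used above):

// Offscreen canvas used only to read back the camera pixels
canvas = document.createElement("canvas");
ctx = canvas.getContext("2d");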

About acquired imageData

ctx.getImageData() returns an ImageData object whose data property is a flat array containing the RGBA values of every pixel in order.

[0]  // R
[1]  // G
[2]  // B
[3]  // A

[4]  // R
[5]  // G
[6]  // B
[7]  // A...

And this is how you can access the color information of every pixel.

for (let i = 0, len = imageData.data.length / 4; i < len; i++) {
    const index = i * 4;  // Index of "R", so each iteration covers one RGBA set: 0, 4, 8, 12...
    const r = imageData.data[index];
    const g = imageData.data[index + 1];
    const b = imageData.data[index + 2];
    const a = imageData.data[index + 3];
}

Accessing image pixels

Since the origin of the Three.js scene is at the center of the screen, we calculate shifted X and Y coordinates so that the image ends up centered.

const imageData = getImageDataFromVideo();
for (let y = 0, height = imageData.height; y < height; y += 1) {
    for (let x = 0, width = imageData.width; x < width; x += 1) {
        const vX = x - imageData.width / 2;   // Shift in X direction since the origin is the center of the screen
        const vY = -y + imageData.height / 2; // Shift in Y direction the same way (y is negated because canvas Y grows downward while scene Y grows upward)
    }
}

Create particles from image pixels

For creating the particles, we can use THREE.Geometry() and THREE.PointsMaterial(). (THREE.Geometry was removed from Three.js in r125, so this code assumes an older release; a BufferGeometry alternative is sketched at the end of the article.)

Each pixel is added to the geometry as a vertex.

const geometry = new THREE.Geometry();
geometry.morphAttributes = {};  // Workaround to avoid an error when THREE.Points is used with THREE.Geometry
const material = new THREE.PointsMaterial({
    size: 1,
    color: 0xff0000,
    sizeAttenuation: false
});

const imageData = getImageDataFromVideo();
for (let y = 0, height = imageData.height; y < height; y += 1) {
    for (let x = 0, width = imageData.width; x < width; x += 1) {
        const vertex = new THREE.Vector3(
            x - imageData.width / 2,
            -y + imageData.height / 2,
            0
        );
        geometry.vertices.push(vertex);
    }
}
particles = new THREE.Points(geometry, material);
scene.add(particles);
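
The snippet assumes that scene, and later camera and renderer, already exist. As a rough sketch (the exact camera distance is just an example value, adjust it so the particle grid fills the view), a standard setup could look like this:

// Basic Three.js setup assumed by the snippets in this tutorial
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 0.1, 10000);
camera.position.z = 900;  // Example value, tune to taste

const renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);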

Draw

In the drawing stage, we grab the current image data from the camera, convert each pixel to a grayscale value, and use that value to update the particles.

By calling this process on every frame, the screen visual is updated just like a video.

const imageData = getImageDataFromVideo();
for (let i = 0, length = particles.geometry.vertices.length; i < length; i++) {
    const particle = particles.geometry.vertices[i];
    let index = i * 4;

    // Take an average of RGB and make it a gray value.
    let gray = (imageData.data[index] + imageData.data[index + 1] + imageData.data[index + 2]) / 3;

    let threshold = 200;
    if (gray < threshold) {
        // If the pixel is darker than the threshold, apply its gray value to the Z coordinate.
        particle.z = gray * 50;
    } else {
        // If it is brighter than the threshold, push the particle far away so it is effectively hidden.
        particle.z = 10000;
    }
}
particles.geometry.verticesNeedUpdate = true;

Audio

In this section, let’s have a look at how the audio is processed.

Loading of the audio file and playback

For audio loading, we can use THREE.AudioLoader().

const audioListener = new THREE.AudioListener();
audio = new THREE.Audio(audioListener);

const audioLoader = new THREE.AudioLoader();
// Load audio file inside asset folder
audioLoader.load('asset/audio.mp3', (buffer) => {
    audio.setBuffer(buffer);
    audio.setLoop(true);
    audio.play();  // Start playback
});
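
One thing to keep in mind: most browsers block audio from playing before the user has interacted with the page, so calling audio.play() directly in the loader callback may silently fail. A simple workaround (not part of the original demo, just a suggestion) is to start playback on the first click:

// Start playback only after a user gesture to satisfy browser autoplay policies
document.body.addEventListener("click", () => {
    if (!audio.isPlaying) {
        audio.play();
    }
}, { once: true });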

For getting the average frequency, analyser.getAverageFrequency() comes in handy.

By applying this value to the Z coordinate of our particles, the depth effect of the visualizer is created.

Getting the audio frequency

And this is how we get the audio frequency:

// About fftSize https://developer.mozilla.org/en-US/docs/Web/API/AnalyserNode/fftSize
const fftSize = 2048;  // Must be a power of 2 between 32 and 32768
analyser = new THREE.AudioAnalyser(audio, fftSize);

// analyser.getFrequencyData() returns array of half size of fftSize.
// ex. if fftSize = 2048, array size will be 1024.
// data includes magnitude of low ~ high frequency.
const data = analyser.getFrequencyData();

for (let i = 0, len = data.length; i < len; i++) {
    // data[i] is the magnitude (0-255) of one frequency bin, from low to high frequencies.
}
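
analyser.getAverageFrequency() returns the mean of this array as a value between 0 and 255, which is what we will feed into the particle depth. Computing it by hand would look roughly like this, which helps when deciding how strongly the visual should react:

// Equivalent of analyser.getAverageFrequency(): the mean magnitude across all bins (0-255)
let sum = 0;
for (let i = 0, len = data.length; i < len; i++) {
    sum += data[i];
}
const averageFreq = sum / data.length;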

Combining web camera input and audio

Finally, let’s see how the drawing process uses both the camera image and the audio data.

Manipulate the image by reacting to the audio

By combining the techniques we’ve seen so far, we can now draw an image of the web camera with particles and manipulate the visual using audio data.

const draw = () => {
    // Audio
    const data = analyser.getFrequencyData();
    let averageFreq = analyser.getAverageFrequency();

    // Video
    const imageData = getImageDataFromVideo();
    for (let i = 0, length = particles.geometry.vertices.length; i < length; i++) {
        const particle = particles.geometry.vertices[i];
    
        let index = i * 4;
        let gray = (imageData.data[index] + imageData.data[index + 1] + imageData.data[index + 2]) / 3;
        let threshold = 200;
        if (gray < threshold) {
            // Combine the gray value of the camera pixel with the average frequency to set the Z coordinate of the particle.
            particle.z = gray * (averageFreq / 255);
        } else {
            particle.z = 10000;
        }
    }
    particles.geometry.verticesNeedUpdate = true;  // Necessary to update

    renderer.render(scene, camera);

    requestAnimationFrame(draw);
};

And that’s all. That wasn’t too complicated, was it? Now you know how to create your own audio visualizer using webcam and audio input.

We’ve used THREE.Geometry and THREE.PointsMaterial here but you can take it further and use Shaders. Demo 2 shows an example of that.
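
One more note: THREE.Geometry was removed from Three.js in r125, so the snippets above assume an older release. On a current version the same idea can be expressed with THREE.BufferGeometry; here is a rough sketch (not taken from the demo) of the equivalent particle setup and per-frame update:

// One vertex per camera pixel with BufferGeometry (modern Three.js)
const geometry = new THREE.BufferGeometry();
const positions = new Float32Array(imageData.width * imageData.height * 3);
let p = 0;
for (let y = 0; y < imageData.height; y++) {
    for (let x = 0; x < imageData.width; x++) {
        positions[p++] = x - imageData.width / 2;
        positions[p++] = -y + imageData.height / 2;
        positions[p++] = 0;
    }
}
geometry.setAttribute("position", new THREE.BufferAttribute(positions, 3));
particles = new THREE.Points(geometry, material);

// Per frame: write the new Z value for particle i and flag the attribute for upload
const position = particles.geometry.attributes.position;
position.setZ(i, newZ);        // replaces particle.z = ...
position.needsUpdate = true;   // replaces verticesNeedUpdate = true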

We hope you enjoyed this tutorial and get inspired to create something with it.

Ryota Takemoto

Founder/developer of the digital art platform NEORT. Loves creative coding and making digital art.
