In this tutorial you’ll learn how to create an interesting-looking audio visualizer that also takes input from the web camera. The result is a creative visualizer with a depth distortion effect. Although the final result looks complex, the Three.js code that powers it is straightforward and easy to understand.
So let’s get started.
Processing flow
The processing flow of our script is going to be the following:
- Create a vertex from every pixel of the image we get from the web camera input
- Use the image data from the web camera and apply the magnitude value of the sound frequency to the Z coordinate of each particle
- Draw
- Repeat steps 2 and 3
Now, let’s have a look at how we can get and use the data from the web camera.
Web camera
First of all, let’s see how to access the web camera and get an image from it.
Camera access
For camera access in the browser, simply use getUserMedia().
<video id="video" autoplay style="display: none;"></video>
const video = document.getElementById("video");
const option = {
  video: true,
  audio: false
};
// Get image from camera.
// navigator.getUserMedia() is deprecated, so the promise-based
// navigator.mediaDevices.getUserMedia() is used here.
navigator.mediaDevices.getUserMedia(option)
  .then((stream) => {
    video.srcObject = stream; // Load as source of video tag
    video.addEventListener("loadeddata", () => {
      // ready
    });
  })
  .catch((error) => {
    console.log(error);
  });
Draw camera image to canvas
Once camera access has succeeded, we’ll get the image from the camera and draw it on the canvas.
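// canvas and ctx (its 2D context) are assumed to be created elsewhere.
const getImageDataFromVideo = () => {
  const w = video.videoWidth;
  const h = video.videoHeight;
  // Setting the canvas size also resets the context transform,
  // so the mirroring below does not accumulate between calls.
  canvas.width = w;
  canvas.height = h;
  // Reverse image like a mirror
  ctx.translate(w, 0);
  ctx.scale(-1, 1);
  // Draw the current video frame to the canvas
  ctx.drawImage(video, 0, 0);
  // Get image as array
  return ctx.getImageData(0, 0, w, h);
};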
About acquired imageData
ctx.getImageData() returns an ImageData object whose data property is a flat array holding the RGBA values of every pixel in order.
[0] // R
[1] // G
[2] // B
[3] // A
[4] // R
[5] // G
[6] // B
[7] // A...
And this is how you can access the color information of every pixel.
for (let i = 0, len = imageData.data.length; i < len; i += 4) {
  // i is the index of "R" for each pixel (0, 4, 8, 12...),
  // so one set of RGBA is read in every iteration.
  const r = imageData.data[i];
  const g = imageData.data[i + 1];
  const b = imageData.data[i + 2];
  const a = imageData.data[i + 3];
}
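The same layout also lets you look up a single pixel by its coordinates. The following is a minimal sketch, assuming the pixels are stored row by row (which is how ImageData lays them out). getPixel is a hypothetical helper name, not part of the Canvas API, and the plain object stands in for a real ImageData so that the snippet also runs outside the browser.

```javascript
// Hypothetical helper: read the RGBA values of the pixel at (x, y).
// Pixels are stored row by row, with 4 values (R, G, B, A) per pixel.
const getPixel = (imageData, x, y) => {
  const index = (y * imageData.width + x) * 4; // index of "R"
  const d = imageData.data;
  return { r: d[index], g: d[index + 1], b: d[index + 2], a: d[index + 3] };
};

// Stand-in for a 2x2 ImageData object (made-up values for illustration).
const samplePixels = {
  width: 2,
  height: 2,
  data: new Uint8ClampedArray([
    255, 0, 0, 255,   0, 255, 0, 255, // row 0: red, green
    0, 0, 255, 255,   10, 20, 30, 40  // row 1: blue, arbitrary values
  ])
};

console.log(getPixel(samplePixels, 1, 1)); // { r: 10, g: 20, b: 30, a: 40 }
```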
Accessing image pixels
We are going to calculate the X and Y coordinates so that the image can be placed in the center.
const imageData = getImageDataFromVideo();
for (let y = 0, height = imageData.height; y < height; y += 1) {
  for (let x = 0, width = imageData.width; x < width; x += 1) {
    const vX = x - imageData.width / 2; // Shift in X direction since origin is center of screen
    const vY = -y + imageData.height / 2; // Shift in Y direction in the same way (you need -y)
  }
}
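To see what the shift does, it helps to try a few positions by hand. The sketch below wraps the two expressions in a hypothetical toSceneCoords helper; the 640x480 frame size is only an example value.

```javascript
// Hypothetical helper: map a pixel position to coordinates whose origin
// is the center of the image, with Y pointing up instead of down.
const toSceneCoords = (x, y, width, height) => ({
  vX: x - width / 2,
  vY: -y + height / 2
});

// For a 640x480 frame:
console.log(toSceneCoords(0, 0, 640, 480));     // { vX: -320, vY: 240 }  top-left
console.log(toSceneCoords(320, 240, 640, 480)); // { vX: 0, vY: 0 }       center
console.log(toSceneCoords(639, 479, 640, 480)); // { vX: 319, vY: -239 }  bottom-right
```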
Create particles from image pixels
For creating a particle, we can use THREE.Geometry() and THREE.PointsMaterial().
Each pixel is added to the geometry as a vertex.
const geometry = new THREE.Geometry();
geometry.morphAttributes = {};
const material = new THREE.PointsMaterial({
  size: 1,
  color: 0xff0000,
  sizeAttenuation: false
});
const imageData = getImageDataFromVideo();
for (let y = 0, height = imageData.height; y < height; y += 1) {
  for (let x = 0, width = imageData.width; x < width; x += 1) {
    const vertex = new THREE.Vector3(
      x - imageData.width / 2,
      -y + imageData.height / 2,
      0
    );
    geometry.vertices.push(vertex);
  }
}
particles = new THREE.Points(geometry, material);
scene.add(particles);
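Note that THREE.Geometry was removed from the Three.js core in r125, so the code above needs an older release. With newer releases, the same vertices can be stored in a flat Float32Array and attached to a THREE.BufferGeometry. The following sketch only builds that array in plain JavaScript; buildPositions is a hypothetical name, and the commented lines at the end show how the array would typically be handed to Three.js.

```javascript
// Hypothetical helper: one (x, y, z) triple per pixel, row by row,
// centered on the origin in the same way as the vertex loop above.
const buildPositions = (width, height) => {
  const positions = new Float32Array(width * height * 3);
  let offset = 0;
  for (let y = 0; y < height; y += 1) {
    for (let x = 0; x < width; x += 1) {
      positions[offset] = x - width / 2;
      positions[offset + 1] = -y + height / 2;
      positions[offset + 2] = 0;
      offset += 3;
    }
  }
  return positions;
};

const demoPositions = buildPositions(2, 2);
console.log(Array.from(demoPositions).join(", "));
// -1, 1, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0

// In a Three.js scene (not run here) the array could be used like this:
// const geometry = new THREE.BufferGeometry();
// geometry.setAttribute("position", new THREE.BufferAttribute(demoPositions, 3));
// const particles = new THREE.Points(geometry, material);
```

With this layout, the Z value of particle i lives at index i * 3 + 2 of the array.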
Draw
In the drawing stage, the updated image is drawn using particles by getting the image data from the camera and calculating a grayscale value from it.
By calling this process on every frame, the screen visual is updated just like a video.
const imageData = getImageDataFromVideo();
for (let i = 0, length = particles.geometry.vertices.length; i < length; i++) {
  const particle = particles.geometry.vertices[i];
  // Vertices were created row by row, so vertex i matches pixel i (4 values per pixel).
  let index = i * 4;
  // Take an average of RGB and make it a gray value.
  let gray = (imageData.data[index] + imageData.data[index + 1] + imageData.data[index + 2]) / 3;
  let threshold = 200;
  if (gray < threshold) {
    // Apply the value to the Z coordinate if the gray value of the target pixel is less than the threshold.
    particle.z = gray * 50;
  } else {
    // Otherwise, give the particle a large Z value.
    particle.z = 10000;
  }
}
particles.geometry.verticesNeedUpdate = true;
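The per-pixel rule used above can be isolated into a small function, which makes the threshold easy to experiment with. This is a minimal sketch: toGray and depthFromGray are hypothetical names, and the default values (200, 50 and 10000) are simply the ones from the code above.

```javascript
// Average the RGB channels into a single gray value (0..255).
const toGray = (r, g, b) => (r + g + b) / 3;

// Hypothetical helper mirroring the rule above: pixels darker than the
// threshold get a Z value based on their gray value, and brighter pixels
// get a large Z value instead.
const depthFromGray = (gray, threshold = 200, scale = 50, far = 10000) =>
  gray < threshold ? gray * scale : far;

console.log(depthFromGray(toGray(90, 100, 110)));  // 5000
console.log(depthFromGray(toGray(250, 250, 250))); // 10000
```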
Audio
In this section, let’s have a look at how the audio is processed.
Loading of the audio file and playback
For audio loading, we can use THREE.AudioLoader().
const audioListener = new THREE.AudioListener();
audio = new THREE.Audio(audioListener);
const audioLoader = new THREE.AudioLoader();
// Load audio file inside asset folder
audioLoader.load('asset/audio.mp3', (buffer) => {
  audio.setBuffer(buffer);
  audio.setLoop(true);
  audio.play(); // Start playback
});
For getting the average frequency, analyser.getAverageFrequency() comes in handy.
By applying this value to the Z coordinate of our particles, the depth effect of the visualizer is created.
Getting the audio frequency
And this is how we get the audio frequency:
// About fftSize https://developer.mozilla.org/en-US/docs/Web/API/AnalyserNode/fftSize
const fftSize = 2048; // Example value; must be a power of 2
analyser = new THREE.AudioAnalyser(audio, fftSize);
// analyser.getFrequencyData() returns an array half the size of fftSize.
// e.g. if fftSize = 2048, the array size will be 1024.
// The data holds the magnitude of each frequency, from low to high.
const data = analyser.getFrequencyData();
for (let i = 0, len = data.length; i < len; i++) {
  // Access the magnitude of each frequency with data[i].
}
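To make the relationship between the two analyser methods concrete: getAverageFrequency() returns the mean of the values that getFrequencyData() provides. The sketch below reproduces that calculation on a plain Uint8Array and, as an optional variation, averages only a slice of the spectrum (for example the low end). averageOf and averageBand are hypothetical helper names, and the sample values are made up.

```javascript
// Mean of all magnitudes (each value is a byte, 0..255).
const averageOf = (data) => {
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum += data[i];
  }
  return data.length > 0 ? sum / data.length : 0;
};

// Hypothetical variation: average only the bins from start (inclusive)
// to end (exclusive), e.g. to react to the low frequencies only.
const averageBand = (data, start, end) => averageOf(data.subarray(start, end));

// Made-up frequency data standing in for analyser.getFrequencyData().
const fakeFrequencyData = new Uint8Array([200, 100, 50, 50, 0, 0, 0, 0]);

console.log(averageOf(fakeFrequencyData));         // 50
console.log(averageBand(fakeFrequencyData, 0, 2)); // 150
```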
Combining web camera input and audio
Finally, let’s see how the drawing process uses both the camera image and the audio data.
Manipulate the image by reacting to the audio
By combining the techniques we’ve seen so far, we can now draw an image of the web camera with particles and manipulate the visual using audio data.
const draw = () => {
  // Audio
  const data = analyser.getFrequencyData();
  let averageFreq = analyser.getAverageFrequency();
  // Video
  const imageData = getImageDataFromVideo();
  for (let i = 0, length = particles.geometry.vertices.length; i < length; i++) {
    const particle = particles.geometry.vertices[i];
    let index = i * 4;
    let gray = (imageData.data[index] + imageData.data[index + 1] + imageData.data[index + 2]) / 3;
    let threshold = 200;
    if (gray < threshold) {
      // Apply the gray value of each web camera pixel, scaled by the average frequency, to the Z coordinate of the particle.
      particle.z = gray * (averageFreq / 255);
    } else {
      particle.z = 10000;
    }
  }
  particles.geometry.verticesNeedUpdate = true; // Necessary to update
  renderer.render(scene, camera);
  requestAnimationFrame(draw);
};
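Because averageFreq is an average of byte values, averageFreq / 255 is a factor between 0 and 1, so louder audio pushes the darker pixels further along Z. Here is a minimal sketch of that mapping, with a hypothetical zForPixel helper and made-up input values:

```javascript
// Hypothetical helper: the Z rule from draw(), isolated for experimenting.
const zForPixel = (gray, averageFreq, threshold = 200) =>
  gray < threshold ? gray * (averageFreq / 255) : 10000;

console.log(zForPixel(100, 0));     // 0      silence: the image stays flat
console.log(zForPixel(100, 127.5)); // 50     half level: half the depth
console.log(zForPixel(100, 255));   // 100    maximum level: full depth
console.log(zForPixel(230, 255));   // 10000  bright pixel above the threshold
```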
And that’s all. That wasn’t so complicated, was it? Now you know how to create your own audio visualizer using web camera and audio input.
We’ve used THREE.Geometry and THREE.PointsMaterial here but you can take it further and use Shaders. Demo 2 shows an example of that.
We hope you enjoyed this tutorial and feel inspired to create something with it.