Audio data fundamentals
First, let us understand some basic terminology in audio data analysis:
- Amplitude: Sound travels as waves, and the height of a wave is its amplitude: the maximum extent of a vibration or oscillation, measured from its position of equilibrium. The larger the amplitude, the louder the sound. Think of a person on a swing: the distance from the resting position at the bottom to the highest point they reach is the amplitude, so the higher they swing, the greater the amplitude of their motion.
- RMS calculation: To measure loudness using the root mean square (RMS), we first square each amplitude value of the sound wave. Squaring makes every value non-negative, so the positive and negative halves of the wave do not cancel each other out when averaged. It also matches the physics, because the intensity (power) of a sound is proportional to the square of its amplitude.
- Average power: After squaring the amplitudes, we calculate the average (mean) of these squared values. It’s like finding the typical size...
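
The squaring and averaging steps described above, followed by the square root that gives RMS its name, can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: the function name `rms` and the synthetic sine-wave test signal are assumptions made for this example, and real audio would normally be loaded from a file as an array of samples.

```python
import math


def rms(samples):
    """Return the root mean square of a sequence of amplitude samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Step 1: square each sample so every value is non-negative.
    squared = [s * s for s in samples]
    # Step 2: average the squared values (the average power).
    mean_power = sum(squared) / len(squared)
    # Step 3: take the square root to return to amplitude units.
    return math.sqrt(mean_power)


# Hypothetical test signal: one full cycle of a sine wave, peak amplitude 0.5.
n = 1000
peak = 0.5
sine = [peak * math.sin(2 * math.pi * i / n) for i in range(n)]

peak_amplitude = max(abs(s) for s in sine)  # largest distance from zero
level = rms(sine)                           # RMS level of the whole cycle

print(f"peak amplitude: {peak_amplitude:.3f}")
print(f"RMS level:      {level:.3f}")
```

For a pure sine wave the RMS level equals the peak amplitude divided by the square root of 2, so the RMS value printed here is lower than the peak. That is the expected behavior: RMS summarizes the signal's overall energy rather than its single highest point.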