DC removal

What problem are you trying to solve?

When an analog AC signal is converted to digital, the result often carries a DC offset. The offset can come from the original signal or from the conversion process itself. Our task is to remove not only a fixed offset but every component below 100 Hz, which also removes low-frequency noise. This filtering has several benefits:

- Preventing Speaker Damage: A DC component can cause the speaker diaphragm to stay permanently displaced, potentially leading to overheating and physical damage to the speaker over time.
- Improving Sound Quality: Removing the DC component allows the audio signal to oscillate around zero, enhancing the accuracy of sound reproduction and preventing distortions.
- Accurate Signal Processing: Signal processing operations like equalization, compression, or other effects work more precisely when the signal is free of DC components.
- Better Recordings: During recording, removing the DC component ensures that the recorded signal accurately represents the original signal without unwanted offsets.

In practice, we are asked to implement a High-Pass Filter (HPF) that performs this task in real time. Because samples arrive continuously and the full signal is never available at once, removing low frequencies in the frequency domain is not feasible; the filtering must be done in the time domain.

How does the code you wrote solve the problem?

- Read the WAV file and extract the necessary data (sampling rate, number of channels, etc.).
- Convert the data to a NumPy array.
- Create a high-pass filter using a sinc function.
- Apply the filter to the signal using convolution.
- Convert the filtered signal back to WAV format.

The high-pass filter removes low-frequency components (including DC) from the signal while preserving higher-frequency components.
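
The file-handling steps above (read the WAV file, convert to a NumPy array, write the result back) might look roughly like the sketch below. It assumes 16-bit PCM input and uses Python's standard `wave` module; the helper names `read_wav` and `write_wav` are hypothetical and not taken from the project code.

```python
import wave
import numpy as np

def read_wav(path):
    """Return (signal, fs, n_channels, sampwidth) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        fs = wf.getframerate()
        n_channels = wf.getnchannels()
        sampwidth = wf.getsampwidth()
        raw = wf.readframes(wf.getnframes())
    if sampwidth != 2:
        # Assumption of this sketch: other sample widths are not handled.
        raise ValueError("this sketch only handles 16-bit PCM")
    # Interleaved samples -> one row per frame, one column per channel.
    signal = np.frombuffer(raw, dtype="<i2").reshape(-1, n_channels)
    return signal.astype(np.float64), fs, n_channels, sampwidth

def write_wav(path, signal, fs, n_channels):
    """Round, clip to the 16-bit range, and write the samples as PCM."""
    pcm = np.clip(np.rint(signal), -32768, 32767).astype("<i2")
    with wave.open(path, "wb") as wf:
        wf.setnchannels(n_channels)
        wf.setsampwidth(2)  # 2 bytes per sample (16-bit)
        wf.setframerate(fs)
        wf.writeframes(pcm.tobytes())
```

Working in floating point between the two calls keeps rounding to a single step at the end, when the filtered signal is converted back to integers.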

In more detail, a sinc function in the time domain corresponds to a rectangular (brick-wall) function in the frequency domain. We choose the rectangle so that it passes frequencies up to 100 Hz; convolving the signal with the matching sinc therefore keeps only frequencies up to 100 Hz, which is a low-pass filter. Convolving with a unit impulse leaves the signal unchanged, so subtracting the sinc from the impulse gives a filter that keeps everything except the frequencies below 100 Hz, which is the high-pass filter we need. Finally, we multiply the resulting coefficients by a Hamming window to reduce the ripple caused by truncating the sinc to a finite length.
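
As an illustration of the design just described, here is a minimal sketch of a windowed-sinc high-pass filter. The function name `design_high_pass` is hypothetical. Two details are assumptions of this sketch rather than facts about the project code: the Hamming window is applied to the low-pass sinc before the subtraction, and the low-pass is normalized so that the final DC gain is exactly zero. The sketch also requires an odd `numtaps` so that the impulse sits on the centre sample.

```python
import numpy as np

def design_high_pass(cutoff_frequency, fs, numtaps):
    """Windowed-sinc high-pass FIR via spectral inversion (odd numtaps)."""
    if numtaps % 2 == 0:
        raise ValueError("this sketch needs an odd numtaps")
    n = np.arange(numtaps) - (numtaps - 1) / 2   # sample index centred on 0
    fc = cutoff_frequency / fs                   # cutoff in cycles per sample
    low_pass = 2 * fc * np.sinc(2 * fc * n)      # ideal low-pass, truncated
    low_pass *= np.hamming(numtaps)              # taper the truncated sinc
    low_pass /= low_pass.sum()                   # unity low-pass gain at DC
    impulse = np.zeros(numtaps)
    impulse[(numtaps - 1) // 2] = 1.0            # unit impulse at the centre
    return impulse - low_pass                    # high-pass = impulse - low-pass
```

Because the low-pass coefficients sum to one, the high-pass coefficients sum to zero, so a constant (DC) input produces zero output once the buffer is full.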

Our filter introduces a delay that depends on the number of samples (taps) it uses. Increasing the number of taps improves filtering quality but also increases the delay. At run time, we store the most recent samples in a buffer and convolve them with the coefficients computed earlier.
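
The buffering idea can be sketched as follows. The class name `StreamingFIR` and the helper `group_delay_seconds` are hypothetical, and the delay formula assumes symmetric (linear-phase) coefficients, which holds for a windowed-sinc design.

```python
import numpy as np

class StreamingFIR:
    """Sample-by-sample FIR filter that keeps the most recent inputs."""

    def __init__(self, coefficients):
        self.coefficients = np.asarray(coefficients, dtype=np.float64)
        self.buffer = np.zeros(len(self.coefficients))

    def process(self, sample):
        # Shift the history by one and put the newest sample at index 0.
        self.buffer = np.roll(self.buffer, 1)
        self.buffer[0] = sample
        # y[n] = sum over k of h[k] * x[n - k]
        return float(np.dot(self.coefficients, self.buffer))

def group_delay_seconds(numtaps, fs):
    """Delay of a linear-phase FIR filter: (numtaps - 1) / 2 samples."""
    return (numtaps - 1) / 2 / fs
```

With the default of 4400 taps and an assumed sampling rate of 44.1 kHz, this formula gives a delay of roughly 50 ms. A production version would likely use a circular index instead of `np.roll` to avoid copying the buffer on every sample.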

Integration & Research
What types of inputs does the code accept and what is expected in the output?

Inputs:
- `input_file`: File object in WAV format.
- `cutoff_frequency`: Cutoff frequency of the filter (default: 100 Hz).
- `numtaps`: Number of filter coefficients (default: 4400).

Output:
- The filtered WAV file.

The output is expected to be a filtered version of the input file, with reduced low-frequency components.

Explain the parameter names in the code and their functions

- `input_file`: The input file in WAV format.
- `cutoff_frequency`: The cutoff frequency of the filter in Hz. Frequencies below this value are attenuated.
- `numtaps`: The number of coefficients in the filter. A higher value gives a sharper cutoff but requires more computation and adds more delay.
- `fs`: The sampling rate of the audio file.
- `n_channels`: The number of channels in the audio file (mono or stereo).
- `sampwidth`: The number of bytes per sample.
- `signal`: A NumPy array containing the audio data.
- `high_pass_filter`: Array of coefficients for the high-pass filter.
- `output_signal`: The filtered signal.
- `buffer`: A temporary buffer that stores previous input values while the filter is applied.

Running the Code

Let us show you how it works.