
DC removal

What problem are you trying to solve?

The code takes a WAV file and a speed factor and returns a WAV file. The task is to either slow down or speed up the given speech signal.

How does the code you wrote solve the problem?

Functions:
1. Read_wav -
Reads the input: first the header, which contains the metadata, and then the rest of the file (the audio data). It prints an error message if reading fails.

2. Write_wav -
Creates a new WAV file (output).

3. Change_speed -
This is the main part of the code.
   - First, compute the number of samples in the original file:
     `num_samples`: The total number of samples in the audio data, i.e. the length of the data in bytes divided by the sample width. This determines how much audio has to be processed.
   - Next, adjust the number of samples according to the speed factor:
     `new_num_samples`: The number of samples after the speed change, computed as `num_samples / speed_factor`.
   - Finally, build `new_data`:
     Iterate over the new number of samples (`new_num_samples`) and, for each one, calculate the position (`sample_index`) in the original data from which to copy the sample. Then convert `new_data` back to bytes and return it as the processed audio data.

4. Update_header -
Updates the header so that it describes the new audio data.

5. Process_wav -
The main function.
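
To make items 2 and 4 more concrete, here is a minimal sketch of how a header could be updated and written out together with new audio data. It assumes the canonical 44-byte PCM header with no extra chunks, which may not match what the original code handles; the names `update_header_sketch` and `write_wav_sketch` are hypothetical and not taken from the original code.

```python
import io
import struct

HEADER_SIZE = 44  # assumption: canonical PCM WAV header, no extra chunks


def update_header_sketch(header: bytes, new_data_size: int) -> bytes:
    """Return a copy of a 44-byte header with both size fields recomputed."""
    updated = bytearray(header)
    # RIFF chunk size (offset 4): everything after the first 8 bytes of the file.
    struct.pack_into("<I", updated, 4, HEADER_SIZE - 8 + new_data_size)
    # "data" sub-chunk size (offset 40): number of audio bytes that follow.
    struct.pack_into("<I", updated, 40, new_data_size)
    return bytes(updated)


def write_wav_sketch(header: bytes, audio_data: bytes) -> bytes:
    """Assemble a complete WAV file in memory and return its bytes."""
    output_buffer = io.BytesIO()
    output_buffer.write(update_header_sketch(header, len(audio_data)))
    output_buffer.write(audio_data)
    return output_buffer.getvalue()
```

Only the two size fields change here; the channel count, sample rate, and bits per sample are copied unchanged.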

Example for illustration: Suppose the file contains 20 bytes of audio data and the speed factor is 0.5.
1) `num_samples` is 10, because `sample_width` is 2 bytes per sample (common in 16-bit audio files):
   [S0, S1, S2, S3, S4, S5, S6, S7, S8, S9]
2) With a `speed_factor` of 0.5, `new_num_samples` is calculated as 10/0.5, which is 20. This means the new file will contain twice as many samples as the original file.
3) In the loop: `sample_index = int(i * speed_factor) * sample_width`, so we get:
   For i = 0: `sample_index = int(0 * 0.5) * 2 = 0`, choose S0
   For i = 1: `sample_index = int(1 * 0.5) * 2 = 0`, choose S0
   For i = 2: `sample_index = int(2 * 0.5) * 2 = 2`, choose S1
   For i = 3: `sample_index = int(3 * 0.5) * 2 = 2`, choose S1
   For i = 4: `sample_index = int(4 * 0.5) * 2 = 4`, choose S2
   For i = 5: `sample_index = int(5 * 0.5) * 2 = 4`, choose S2
   And so on...
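   (Note that `sample_index` is a byte offset, so offset 2 corresponds to S1 and offset 4 to S2. Because `int` rounds down, two consecutive values of `i` map to the same original sample, which is why each sample is selected twice.)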

   Iteration 0:
   i = 0
   `sample_index = int(0 * 0.5) * 2 = 0`
   `data[0:2]` extracts S0
   `new_data.extend(data[0:2])` adds S0 to `new_data`
   `new_data` now contains: [S0]

   Iteration 1:
   i = 1
   `sample_index = int(1 * 0.5) * 2 = 0`
   `data[0:2]` extracts S0
   `new_data.extend(data[0:2])` adds S0 to `new_data`
   `new_data` now contains: [S0, S0]

   Iteration 2:
   i = 2
   `sample_index = int(2 * 0.5) * 2 = 2`
   `data[2:4]` extracts S1
   `new_data.extend(data[2:4])` adds S1 to `new_data`
   `new_data` now contains: [S0, S0, S1]

   After slowing down, the new sample sequence looks like this:
   [S0, S0, S1, S1, S2, S2, S3, S3, S4, S4, S5, S5, S6, S6, S7, S7, S8, S8, S9, S9]
   The new file contains twice as many samples as the original. Played back at the same sample rate, it lasts twice as long, so the speech sounds slower.
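
The worked example above can be reproduced with a short sketch. This is an illustration of the described steps, not the original code: the function name `change_speed_sketch` is hypothetical, the data is assumed to be mono, and the `min()` clamp is an added safety guard that the text does not mention.

```python
import struct


def change_speed_sketch(data: bytes, speed_factor: float, sample_width: int = 2) -> bytes:
    """Build new audio data by copying one original sample per output sample."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    num_samples = len(data) // sample_width
    new_num_samples = int(num_samples / speed_factor)
    new_data = bytearray()
    for i in range(new_num_samples):
        # Added guard (assumption): never read past the last original sample.
        source = min(int(i * speed_factor), num_samples - 1)
        sample_index = source * sample_width  # byte offset into the original data
        new_data.extend(data[sample_index:sample_index + sample_width])
    return bytes(new_data)


# Ten 16-bit samples with the values 0..9 stand in for S0..S9.
data = struct.pack("<10h", *range(10))
slowed = change_speed_sketch(data, 0.5)
print(list(struct.unpack("<20h", slowed)))
# [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]
```

With a `speed_factor` of 2 the same function keeps every second sample (S0, S2, S4, S6, S8), which halves the duration.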

What were the challenges that required you to use existing modules, and what do these modules do?

Because the code operates on a variable holding the file's bytes rather than on a file on disk, I used the `struct` library to read fields from the WAV header, such as the number of channels, the sampling rate, and the bits per sample. The same library is used to write these fields back when creating the new file.
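
As a rough illustration of reading those header fields with `struct`, the sketch below unpacks them from fixed offsets. It assumes the canonical 44-byte PCM layout, in which the format fields start at byte 20; the function name `read_header_sketch` is hypothetical, and the original `Read_wav` may be structured differently.

```python
import struct


def read_header_sketch(input_wav_data: bytes) -> dict:
    """Parse the format fields of a canonical 44-byte WAV header."""
    if len(input_wav_data) < 44:
        raise ValueError("input is too short to contain a WAV header")
    if input_wav_data[0:4] != b"RIFF" or input_wav_data[8:12] != b"WAVE":
        raise ValueError("input is not a RIFF/WAVE file")
    # "<" selects little-endian without padding; H is 2 bytes, I is 4 bytes.
    (audio_format, channels, sample_rate,
     byte_rate, block_align, bits_per_sample) = struct.unpack_from(
        "<HHIIHH", input_wav_data, 20)
    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "bits_per_sample": bits_per_sample,
        "sample_width": bits_per_sample // 8,
    }
```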

Additionally, the `BytesIO` object from the `io` library was used to create an in-memory file in which the processed WAV data is stored. This allows the header and the new audio data to be written without touching a file on disk.

What types of inputs does the code accept and what is expected in the output?

The code accepts a variable named `input_wav_data`, which contains the WAV file whose speed we want to change, and a variable named `speed_factor`, which contains the factor by which to speed up or slow down the audio. The output is a new variable named `result` containing the data of the new WAV file after the speed change.

Explain the parameter names in the code and their functions:

- `input_wav_data`: A bytes-type input containing all the data of the WAV file passed to the function, including the header and the audio data itself.
- `speed_factor`: The acceleration/deceleration factor for changing the playback speed of the audio. For example, a value of 0.5 will slow down the audio to half the original speed, while 2 will double the playback speed.
- `data`: A local variable used within various functions to read WAV data. It contains all the data from the original file.
- `channels`: The number of channels in the WAV file (e.g., mono is 1, stereo is 2). This information is taken from the WAV file header.
- `sample_rate`: The sampling rate of the WAV file, i.e., how many samples are taken per second, in terms of Hertz (Hz). This information is also taken from the WAV file header.
- `bits_per_sample`: The number of bits per audio sample. This determines the audio quality and the size of each sample. This information is taken from the WAV file header.
- `sample_width`: The width of each sample in bytes, calculated based on the number of bits per sample (`bits_per_sample`). For example, if there are 16 bits per sample, then the sample width would be 2 bytes (16/8).
- `audio_data`: The audio data of the WAV file that starts after the header. This is the part of the file where the actual audio is located.
- `processed_data`: The audio data after changing the speed according to the `speed_factor`.
- `output_buffer`: A `BytesIO` object used to store the data of the new WAV file created after changing the speed. It is an in-memory file.
- `result`: The final data of the new WAV file, after the header and audio data have been adjusted according to the `speed_factor`. This is what the function returns.
- `new_num_samples`: The number of samples in the output after the speed change, computed from the original number of samples and the `speed_factor`.
- `sample_index`: The byte offset of a sample in the original audio data, used to select which original sample is copied for each output sample.
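
To show how these names might fit together, here is a compact end-to-end sketch. It is not the original `Process_wav`: the name `process_wav_sketch` is hypothetical, a canonical 44-byte PCM header is assumed, and copying whole frames (`frame_width`, so that the channels of a stereo file stay together) is an added assumption not stated in the text.

```python
import io
import struct
import wave

HEADER_SIZE = 44  # assumption: canonical PCM WAV header, no extra chunks


def process_wav_sketch(input_wav_data: bytes, speed_factor: float) -> bytes:
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    header = bytearray(input_wav_data[:HEADER_SIZE])
    (channels,) = struct.unpack_from("<H", header, 22)
    (bits_per_sample,) = struct.unpack_from("<H", header, 34)
    sample_width = bits_per_sample // 8
    frame_width = sample_width * channels  # assumption: copy whole frames
    audio_data = input_wav_data[HEADER_SIZE:]

    num_samples = len(audio_data) // frame_width
    new_num_samples = int(num_samples / speed_factor)
    processed_data = bytearray()
    for i in range(new_num_samples):
        # Added guard (assumption): clamp to the last original frame.
        sample_index = min(int(i * speed_factor), num_samples - 1) * frame_width
        processed_data.extend(audio_data[sample_index:sample_index + frame_width])

    # Recompute the RIFF size (offset 4) and the data size (offset 40).
    struct.pack_into("<I", header, 4, HEADER_SIZE - 8 + len(processed_data))
    struct.pack_into("<I", header, 40, len(processed_data))

    output_buffer = io.BytesIO()
    output_buffer.write(header)
    output_buffer.write(processed_data)
    result = output_buffer.getvalue()
    return result


# Usage: build a 10-sample mono test file in memory, then slow it down.
source = io.BytesIO()
with wave.open(source, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(struct.pack("<10h", *range(10)))

result = process_wav_sketch(source.getvalue(), 0.5)
with wave.open(io.BytesIO(result), "rb") as reader:
    print(reader.getnframes())  # 20
```

The standard-library `wave` module is used here only to create and inspect the test file; the processing itself relies on `struct` and `io` alone, as in the description above.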
