Audio Data Preprocessing and Serialization for SEGAN
import os
import librosa
import numpy as np
from tqdm import tqdm
#clean_train_folder = 'data/test/clean_trainset_56spk'
#noisy_train_folder = 'data/test/noisy_trainset_56spk'
#clean_test_folder = 'data/test/clean_testset'
#noisy_test_folder = 'data/test/noisy_testset'
#serialized_train_folder = 'data/test/serialized_train_data'
#serialized_test_folder = 'data/test/serialized_test_data'
clean_train_folder = 'D:/test_module/SEGAN/data/clean_trainset'
noisy_train_folder = 'D:/test_module/SEGAN/data/noisy_trainset'
clean_test_folder = 'data/test2/clean_testset'
noisy_test_folder = 'data/test2/noisy_testset'
serialized_train_folder = 'data/test2/serialized_train_data'
serialized_test_folder = 'data/test2/serialized_test_data'
window_size = 2 ** 14 # about 1 second of samples
sample_rate = 16000
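

def slice_signal(file, window_size, stride, sample_rate):
    '''
    Helper function for slicing the audio file into windows of window_size samples.
    Consecutive windows overlap by a fraction (1 - stride) of the window
    (50% overlap for stride = 0.5).
    '''
    # librosa resamples the file to sample_rate while loading
    wav, sr = librosa.load(file, sr=sample_rate)
    hop = int(window_size * stride)
    slices = []
    # end_idx stays below len(wav), so every slice is a full window
    # and any trailing remainder is dropped
    for end_idx in range(window_size, len(wav), hop):
        start_idx = end_idx - window_size
        slice_sig = wav[start_idx:end_idx]
        slices.append(slice_sig)
    return slices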


def process_and_serialize(data_type):
    '''
    Slice the clean and noisy recordings (resampled to sample_rate on load)
    and save each clean/noisy slice pair in a separate folder.
    '''
    stride = 0.5
    if data_type == 'train':
        clean_folder = clean_train_folder
        noisy_folder = noisy_train_folder
        serialized_folder = serialized_train_folder
    else:
        clean_folder = clean_test_folder
        noisy_folder = noisy_test_folder
        serialized_folder = serialized_test_folder
    if not os.path.exists(serialized_folder):
        os.makedirs(serialized_folder)
    # walk through the path, slice the audio file, and save the serialized result
    for root, dirs, files in os.walk(clean_folder):
        if len(files) == 0:
            continue
        for filename in tqdm(files, desc='Serialize and down-sample {} audios'.format(data_type)):
            # assumes a flat folder layout and a noisy file with the same name
            clean_file = os.path.join(clean_folder, filename)
            noisy_file = os.path.join(noisy_folder, filename)
            # slice both clean signal and noisy signal
            clean_sliced = slice_signal(clean_file, window_size, stride, sample_rate)
            noisy_sliced = slice_signal(noisy_file, window_size, stride, sample_rate)
            # serialize - file format goes [original_file]_[slice_number].npy
            # ex) p293_154.wav_5.npy holds the slice with index 5 (zero-based) of p293_154.wav
            for idx, slice_tuple in enumerate(zip(clean_sliced, noisy_sliced)):
                pair = np.array([slice_tuple[0], slice_tuple[1]])
                np.save(os.path.join(serialized_folder, '{}_{}'.format(filename, idx)), arr=pair)


def data_verify(data_type):
    '''
    Verify that every serialized snippet has window_size samples.
    '''
    if data_type == 'train':
        serialized_folder = serialized_train_folder
    else:
        serialized_folder = serialized_test_folder
    for root, dirs, files in os.walk(serialized_folder):
        for filename in tqdm(files, desc='Verify serialized {} audios'.format(data_type)):
            data_pair = np.load(os.path.join(root, filename))
            if data_pair.shape[1] != window_size:
                print('Snippet length not {} : {} instead'.format(window_size, data_pair.shape[1]))
                # stop checking the remaining files in this folder
                break


if __name__ == '__main__':
    process_and_serialize('train')
    data_verify('train')
    process_and_serialize('test')
    data_verify('test')

Code Breakdown:
Import necessary libraries:
- os: For file system operations.
- librosa: For audio loading and manipulation.
- numpy: For numerical operations and array handling.
- tqdm: For displaying progress bars.
Set audio slicing parameters:
- window_size: The length of each audio slice, 2 ** 14 = 16384 samples (about 1 second at 16 kHz).
- sample_rate: The sampling rate (16000 Hz) that every file is resampled to when it is loaded.
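The "about 1 second" figure can be checked with a few lines of arithmetic. The sketch below reuses window_size and sample_rate from the script, plus the stride of 0.5 that process_and_serialize sets; the variable names window_seconds, hop_samples and hop_seconds are introduced here only for illustration.

```python
window_size = 2 ** 14   # samples per slice, as in the script
sample_rate = 16000     # Hz, as in the script
stride = 0.5            # value set inside process_and_serialize

# Duration of one slice and distance between the starts of consecutive slices
window_seconds = window_size / sample_rate
hop_samples = int(window_size * stride)
hop_seconds = hop_samples / sample_rate

print(window_size, window_seconds, hop_samples, hop_seconds)
```

So each slice covers slightly more than one second, and a new slice starts roughly every half second.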
slice_signal function:
- This function takes an audio file, window size, stride, and sample rate as input.
- It loads the audio file using librosa.load() and slices it into fixed-length segments. Consecutive segments start window_size * stride samples apart, so a stride of 0.5 gives 50% overlap.
- It returns a list of audio slices. Samples left over after the last full window are discarded.
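To see which samples end up in which slice, the loop bounds of slice_signal can be reproduced without loading any audio. The following is a minimal sketch: slice_bounds is a hypothetical helper (not part of the script) that returns the (start, end) index pairs the loop would produce for a signal of a given length.

```python
def slice_bounds(n_samples, window_size=2 ** 14, stride=0.5):
    """Return (start, end) index pairs using the same loop bounds as slice_signal."""
    hop = int(window_size * stride)
    return [(end - window_size, end) for end in range(window_size, n_samples, hop)]


# A hypothetical 3-second clip at 16 kHz (48000 samples)
bounds = slice_bounds(3 * 16000)
for start, end in bounds:
    print(start, end)

# A clip that is exactly one window long yields no slice,
# because range() excludes its upper bound
empty = slice_bounds(2 ** 14)
```

For the 3-second clip this gives four overlapping slices, the last one ending at sample 40960, so the final 7040 samples are not used. Clips shorter than or equal to one window contribute nothing to the serialized data.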
process_and_serialize function:
- This function handles the main processing and serialization of audio data.
- It takes the data_type (either 'train' or 'test') as input.
- It determines the appropriate folders for clean audio, noisy audio, and serialized data based on the data_type.
- It iterates through the clean audio folder, slices each clean file and the noisy file of the same name using slice_signal, and combines each clean/noisy slice pair into a numpy array of shape (2, window_size).
- It saves the array as a .npy file in the serialized folder, with a filename that includes the original filename and the zero-based slice index.
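The layout of one serialized file can be illustrated with synthetic data. This sketch assumes random arrays as stand-ins for real audio slices and writes to a temporary directory rather than the script's folders; the filename p293_154.wav and index 5 are taken from the comment in the script.

```python
import os
import tempfile

import numpy as np

window_size = 2 ** 14

# Stand-ins for one clean slice and its noisy counterpart (random data, not real audio)
rng = np.random.default_rng(0)
clean_slice = rng.standard_normal(window_size).astype(np.float32)
noisy_slice = clean_slice + 0.1 * rng.standard_normal(window_size).astype(np.float32)

# Row 0 holds the clean slice, row 1 the noisy slice
pair = np.array([clean_slice, noisy_slice])

with tempfile.TemporaryDirectory() as out_dir:
    # np.save appends '.npy' because the given name does not already end with it
    target = os.path.join(out_dir, '{}_{}'.format('p293_154.wav', 5))
    np.save(target, arr=pair)
    saved_files = os.listdir(out_dir)
    loaded = np.load(target + '.npy')

print(pair.shape, saved_files)
```

A training loop can therefore load one file and split it back into its clean and noisy halves with loaded[0] and loaded[1].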
data_verify function:
- This function verifies that the serialized data has the correct length.
- It takes the data_type as input and accesses the corresponding serialized folder.
- It iterates through the serialized files, loads each file, and checks the second dimension of the loaded array against the window_size.
- If the length doesn't match, it prints a message and stops checking the remaining files in that folder.
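Because data_verify stops at the first mismatch in a folder, it does not report how many files are affected. One possible variation, shown here only as a sketch, collects every mismatching file instead. find_bad_snippets is a hypothetical helper that is not part of the script, and it checks the full (2, window_size) shape rather than only the second dimension.

```python
import os
import tempfile

import numpy as np


def find_bad_snippets(serialized_folder, window_size):
    """Return (path, shape) for every .npy file whose shape is not (2, window_size)."""
    bad = []
    for root, _dirs, files in os.walk(serialized_folder):
        for filename in sorted(files):
            if not filename.endswith('.npy'):
                continue
            path = os.path.join(root, filename)
            shape = np.load(path).shape
            if shape != (2, window_size):
                bad.append((path, shape))
    return bad


# Small self-check with a tiny window so it runs quickly
with tempfile.TemporaryDirectory() as folder:
    np.save(os.path.join(folder, 'good.wav_0'), np.zeros((2, 8)))
    np.save(os.path.join(folder, 'short.wav_0'), np.zeros((2, 5)))
    report = find_bad_snippets(folder, window_size=8)
    bad_names = [(os.path.basename(path), shape) for path, shape in report]

print(bad_names)
```

Returning a list instead of printing makes it easy to fail a preprocessing run when the list is non-empty.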
Main execution block:
- It calls process_and_serialize and data_verify for the 'train' dataset and then for the 'test' dataset, so each dataset is verified right after it is serialized.