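Since no specific audio file was provided, the following is general-purpose endpoint detection code based on the two-level decision method: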

% Read the audio file
[y, fs] = audioread('your_audio_file.wav');
y = y(:, 1); % keep only the first channel if the file is multichannel

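% Set parameters
win_len = 0.02;       % window length, in seconds
win_overlap = 0.5;    % window overlap ratio
energy_thres = 0.1;   % energy threshold (relative factor)
zcr_thres = 10;       % zero-crossing threshold (crossings per window)
silence_thres = 0.5;  % silence threshold, in seconds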

% Compute the window size and step size, in samples
win_size = round(win_len * fs);
step_size = round((1 - win_overlap) * win_size);

% Initialize the thresholds and the endpoint list
energy_thresh_1 = energy_thres * max(y.^2);
energy_thresh_2 = energy_thres * energy_thresh_1;
end_points = []; % one row per segment: [start time, end time], in seconds

% Compute the energy and zero-crossing count of each window
for i = 1:step_size:length(y)-win_size+1
    % Energy and zero-crossing count within the current window
    y_win = y(i:i+win_size-1);
    energy = sum(y_win.^2);
    zcr = sum(abs(diff(sign(y_win))))/2;
    t_start = (i-1)/fs; % start time of the window, in seconds

    % Decide whether the window is silent
    if energy < energy_thresh_1
        % Silent interval: start a new segment or extend the last one
        if isempty(end_points) || t_start - end_points(end,2) > silence_thres
            end_points = [end_points; t_start, t_start+win_len];
        else
            end_points(end,2) = t_start+win_len;
        end
    else
        % Non-silent interval
        if energy >= energy_thresh_2 || zcr > zcr_thres
            end_points = [end_points; t_start, t_start+win_len];
        end
        % Update the thresholds: the first tracks the frame energy,
        % the second tracks the first
        energy_thresh_1 = 0.2 * energy_thresh_1 + 0.8 * energy;
        energy_thresh_2 = 0.2 * energy_thresh_2 + 0.8 * energy_thresh_1;
    end
end

% Plot the speech waveform and the thresholds
t = (0:length(y)-1)/fs;
figure;
plot(t, y);
hold on;
plot([t(1), t(end)], [energy_thresh_1, energy_thresh_1], 'b--');
plot([t(1), t(end)], [energy_thresh_2, energy_thresh_2], 'b--');
ylim([-1, 1]);

% Plot the endpoints
for i = 1:size(end_points,1)
    plot([end_points(i,1), end_points(i,1)], [-1,1], 'r');
    plot([end_points(i,2), end_points(i,2)], [-1,1], 'r');
end

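Here, energy_thresh_1 and energy_thresh_2 are the first-level and second-level thresholds, and end_points holds the detected endpoints. In the plot, the blue dashed lines mark the thresholds and the red vertical lines mark the endpoints. Note that the two-level decision method adjusts its thresholds dynamically, so the threshold lines show only the values reached at the end of the run and may differ from one recording to another.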
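To make the per-window features and the dynamic threshold update easier to inspect without an audio file, here is a minimal sketch on a synthetic signal. The sampling rate of 8000 Hz, the test signal (a quiet 1500 Hz tone around a loud 200 Hz tone), and the names quiet, loud, starts, E, th and th_history are all assumptions made for illustration; only the energy, zero-crossing and smoothing formulas are taken from the code above.

```matlab
% Synthetic stand-in for a recording (assumption): quiet, loud, quiet
fs = 8000;                               % assumed sampling rate, in Hz
t = (0:round(0.5*fs)-1)' / fs;           % time axis of one 0.5 s segment
quiet = 0.005 * sin(2*pi*1500*t);        % low-energy, high zero-crossing part
loud = 0.5 * sin(2*pi*200*t);            % high-energy, low zero-crossing part
y = [quiet; loud; quiet];

% Same framing as above: 20 ms windows with 50% overlap
win_size = round(0.02 * fs);
step_size = round(0.5 * win_size);
starts = 1:step_size:(length(y) - win_size + 1);

% Per-window energy and zero-crossing count
energy = zeros(numel(starts), 1);
zcr = zeros(numel(starts), 1);
for k = 1:numel(starts)
    frame = y(starts(k):starts(k) + win_size - 1);
    energy(k) = sum(frame.^2);
    zcr(k) = sum(abs(diff(sign(frame)))) / 2;
end

% Apply the smoothing rule th = 0.2*th + 0.8*E repeatedly with a fixed
% frame energy E to see how quickly the threshold approaches it
E = max(energy);
th = 0.1 * E;                            % hypothetical starting threshold
th_history = zeros(5, 1);
for n = 1:5
    th = 0.2 * th + 0.8 * E;
    th_history(n) = th;
end

fprintf('windows: %d, peak window energy: %.3f\n', numel(starts), E);
fprintf('threshold after 5 updates: %.4f\n', th);
```

With a weight of 0.8 on the current frame, the gap between the threshold and a constant frame energy shrinks by a factor of five on every update, so the threshold follows loud frames almost immediately. That is worth keeping in mind when reading the final threshold lines in the plot.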


Original source: https://www.cveoy.top/t/topic/nY12 Copyright belongs to the author. Do not reprint or scrape.
