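These are my study notes, including links to related material. Some of it I did not read closely at the time, so I recorded it here to come back to when needed.
Some parts are still messy and will be updated later.
Anyone interested is welcome to join the discussion, and pointers from experts are much appreciated~
tags: voice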
pyAudio
http://old.sebug.net/paper/books/scipydoc/wave_pyaudio.html
Note: this part streams audio from the recording device to the voice activity detector (VAD).

path/to/vad/audio_stream.py:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Written for Python 2 (the Tkinter module is named tkinter on Python 3).
    import logging
    import sys
    import wave
    from datetime import datetime

    from pyaudio import PyAudio, paInt16
    from Tkinter import Tk, Label, Button
    from ffnn import FFNNVADGeneral
    # import chardet  # for checking the encoding

    # Parameters
    NUM_SAMPLES = 160    # samples read per buffer
    FRAMERATE = 16000    # sampling rate in Hz
    CHANNELS = 1
    SAMPWIDTH = 2        # bytes per sample (16-bit)
    FORMAT = paInt16
    TIME = 125           # the loop below reads TIME*4 buffers
    FRAMESHIFT = 160

    def save_wave_file(filename, data):
        '''Save the data to a WAV file.'''
        wf = wave.open(filename, 'wb')
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPWIDTH)
        wf.setframerate(FRAMERATE)
        # The separator must be empty (no space inside the quotes);
        # otherwise the recording is full of interruptions.
        wf.writeframes(b"".join(data))
        wf.close()

    def my_button(root, label_text, button_text, button_stop, button_func, stop_func):
        '''Create the label and the two buttons.'''
        Label(root, text=label_text, width=30, height=3).pack()
        Button(root, text=button_text, command=button_func, anchor='center', width=30, height=3).pack()
        Button(root, text=button_stop, command=stop_func, anchor='center', width=30, height=3).pack()

    def record_wave():
        '''Open the audio input, run the VAD on every buffer, then save the recording.'''
        pa = PyAudio()
        # Recording stream: each buffer holds NUM_SAMPLES samples and is treated as one frame.
        stream = pa.open(format=FORMAT, channels=CHANNELS, rate=FRAMERATE,
                         input=True, frames_per_buffer=NUM_SAMPLES)
        # FFNNVADGeneral is the class implementing the neural-network VAD method.
        vad = FFNNVADGeneral('/path/to/VAD/alex-master/alex/tools/vad_train/model_voip/vad_nnt_546_hu32_hl1_hla6_pf10_nf10_acf_1.0_mfr20000_mfl20000_mfps0_ts0_usec00_usedelta0_useacc0_mbo1_bs100.tffnn',
                             filter_length=2, sample_rate=16000, framesize=512, frameshift=160,
                             usehamming=True, preemcoef=0.97, numchans=26, ceplifter=22, numceps=12,
                             enormalise=True, zmeansource=True, usepower=True, usec0=False,
                             usecmn=False, usedelta=False, useacc=False,
                             n_last_frames=10, n_prev_frames=10, lofreq=125, hifreq=3800,
                             mel_banks_only=True)
        save_buffer = []
        count = 0
        # Logging setup: one message per buffer, written to log.txt
        logging.basicConfig(level=logging.INFO, filename='log.txt', filemode='w', format='%(message)s')
        while count < TIME*4:
            string_audio_data = stream.read(NUM_SAMPLES)
            result = vad.decide(string_audio_data)
            frame = count*NUM_SAMPLES/float(FRAMESHIFT)
            time = count*NUM_SAMPLES/float(FRAMERATE)  # time = frame*frameshift/framerate
            # logging takes a single string, built here with '+'
            logging.info('frame: '+str(frame)+' time: '+str(time)+' prob: '+str(result))
            save_buffer.append(string_audio_data)
            count += 1
            # chardet.detect(string_audio_data)  # check the encoding type
            print(".")
        # Release the audio device before writing the file.
        stream.stop_stream()
        stream.close()
        pa.terminate()
        filename = datetime.now().strftime("%Y-%m-%d_%H_%M_%S")+".wav"
        save_wave_file(filename, save_buffer)
        save_buffer = []
        print(filename + " saved.")

    def record_stop():
        # Stop recording by exiting the program.
        sys.exit(0)

    def main():
        root = Tk()
        root.geometry('300x200+200+200')
        root.title('record wave')
        my_button(root, "Record a wave", "click to record", "stop recording", record_wave, record_stop)
        root.mainloop()

    if __name__ == "__main__":
        main()

Errors and warnings seen when running the script:

    # Error
    $ bt_audio_service_open: connect() failed: Connection refused (111)
    # Fix: apparently a Bluetooth library is installed although there is no Bluetooth device
    $ sudo apt-get purge bluez-alsa

    # Warning
    $ ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started
    # Caused by the default settings in /usr/share/alsa/alsa.conf
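The frame and time values logged in the recording loop follow directly from the constants at the top of the script. As a small sketch (the helper names `buffer_position` and `total_duration` are made up for illustration and are not part of the original script), the arithmetic can be checked without any audio hardware:

```python
NUM_SAMPLES = 160   # samples read per buffer, as in the script
FRAMERATE = 16000   # sampling rate in Hz
FRAMESHIFT = 160    # samples between consecutive VAD frames
TIME = 125          # the script reads TIME * 4 buffers


def buffer_position(count):
    """Return (frame index, time in seconds) for the count-th buffer."""
    samples = count * NUM_SAMPLES
    frame = samples / float(FRAMESHIFT)
    seconds = samples / float(FRAMERATE)
    return frame, seconds


def total_duration(n_buffers=TIME * 4):
    """Length in seconds of a recording made of n_buffers buffers."""
    return n_buffers * NUM_SAMPLES / float(FRAMERATE)


if __name__ == "__main__":
    print(buffer_position(100))   # (100.0, 1.0)
    print(total_duration())       # 5.0
```

With these values one buffer covers 10 ms of audio, so the 500 buffers read by the loop add up to a 5 second recording.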
    sudo apt-get update     # refresh the package lists; a mirror in China works best (look up how to configure one on Baidu)
    sudo apt-get upgrade    # upgrade the installed packages
    sudo apt-get install alsa-utils alsa-tools alsa-tools-gui alsamixergui    # install the required packages

    # List the capture devices
    $ arecord -l
    > card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
        Subdevices: 1/1
        Subdevice #0: subdevice #0
      card 0: PCH [HDA Intel PCH], device 2: ALC887-VD Alt Analog [ALC887-VD Alt Analog]
        Subdevices: 1/1
        Subdevice #0: subdevice #0

    # If the machine has more than one sound card, this command lists them
    $ cat /proc/asound/cards
    > 0 [PCH    ]: HDA-Intel - HDA Intel PCH
                   HDA Intel PCH at 0xf7210000 irq 29
      1 [NVidia ]: HDA-Intel - HDA NVidia
                   HDA NVidia at 0xf7080000 irq 17

    # Each sound card has a card number and a device number, which this command shows
    $ aplay -l
    > card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
        Subdevices: 1/1
        Subdevice #0: subdevice #0
      card 0: PCH [HDA Intel PCH], device 1: ALC887-VD Digital [ALC887-VD Digital]
        Subdevices: 1/1
        Subdevice #0: subdevice #0
      card 1: NVidia [HDA NVidia], device 3: HDMI 0 [HDMI 0]
        Subdevices: 1/1
        Subdevice #0: subdevice #0
      card 1: NVidia [HDA NVidia], device 7: HDMI 1 [HDMI 1]
        Subdevices: 1/1
        Subdevice #0: subdevice #0

    # Record
    $ arecord -D "plughw:0,0" -f S16_LE -r 16000 -d 5 -t wav file.wav
    # -D        select the device; I tried hw:1,0 and hw:0,2, but only hw:0,0 could record
    # -f        sample format; S16_LE means signed 16-bit little-endian
    # -r        sampling rate
    # -d        recording duration in seconds
    # -t        file type
    # file.wav  output file name
    # Without "plug" there is a warning (I put this down to the external sound card);
    # plughw lets ALSA convert the sampling rate automatically
    Warning: rate is not accurate (requested = 16000Hz, got = 44100Hz)
    please, try the plug plugin

    # Verify the recording
    $ aplay file.wav
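To avoid reading the card and device numbers off by eye, the `arecord -l` listing can be parsed. The following is a minimal sketch, assuming the output format shown above; `parse_alsa_devices` and `plughw_name` are hypothetical helper names, and the sample text is copied from the listing in these notes rather than read from a live system.

```python
import re

# Matches lines such as:
# card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
DEVICE_LINE = re.compile(
    r"card (\d+): (\S+) \[(.*?)\], device (\d+): (.*?) \[(.*?)\]"
)


def parse_alsa_devices(listing):
    """Return a list of (card, device, description) from arecord/aplay -l text."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_LINE.search(line)
        if match:
            card, _card_id, _card_name, device, desc, _long_desc = match.groups()
            devices.append((int(card), int(device), desc))
    return devices


def plughw_name(card, device):
    """Build the value passed to arecord -D, for example plughw:0,0."""
    return "plughw:%d,%d" % (card, device)


SAMPLE = """\
card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: PCH [HDA Intel PCH], device 2: ALC887-VD Alt Analog [ALC887-VD Alt Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

if __name__ == "__main__":
    for card, device, desc in parse_alsa_devices(SAMPLE):
        print(plughw_name(card, device), desc)
```

On a real machine the text could instead be obtained with `subprocess.check_output(["arecord", "-l"], universal_newlines=True)`; the parsing step stays the same.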
ALSA normally defines a defaults device, and audio players send their output to it unless told otherwise. The defaults device is defined in alsa.conf, as follows:
    #
    # defaults
    #

    # show all name hints also for definitions without hint {} section
    defaults.namehint.showall off
    # show just basic name hints
    defaults.namehint.basic on
    # show extended name hints
    defaults.namehint.extended off
    #
    defaults.ctl.card 0
    defaults.pcm.card 0
    defaults.pcm.device 0
    defaults.pcm.subdevice -1
    ……
By default, defaults matches the sound card with the lowest card number and device number.
To change this, edit /etc/asound.conf or ~/.asoundrc. For example, to point defaults at card 1, device 3, add the following lines:
    $ sudo vim /etc/asound.conf
    defaults.pcm.card 1
    defaults.pcm.device 3
    defaults.ctl.card 1
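The three override lines can also be generated for any card and device pair. This is only a sketch: `asound_defaults` is a hypothetical helper that just formats text, and writing the result to /etc/asound.conf or ~/.asoundrc is left to the reader.

```python
def asound_defaults(card, device):
    """Return asound.conf lines that point the ALSA defaults at card/device."""
    lines = [
        "defaults.pcm.card %d" % card,
        "defaults.pcm.device %d" % device,
        "defaults.ctl.card %d" % card,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Card 1, device 3 is the "HDMI 0" output in the aplay -l listing of these notes.
    print(asound_defaults(1, 3))
```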
https://github.com/aaronaanderson/ofxPortSF
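For some of this material I may have forgotten to note the URL it came from; please point out anything that is wrong~~
(Unless stated otherwise, these posts are Vanessa's personal notes. Please credit the source when reposting. If anything here infringes your rights, contact me and I will remove it promptly.)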