This week I built a new feature: fully automatically extracting the voice files from WeChat and converting them to text, partly so the content is readable at a glance, and partly to make it searchable. Unexpectedly, I hit a few bumps and stepped into a few pits along the way; I'm recording the solutions here one by one for future reference.
To try this feature out, see this link.
Following the approach described in iOS App Reverse Engineering, locating WeChat's voice files is not hard. They usually live at
/WeChat Path/Documents/Random Serial/Audio/Random Serial/random.aud
Every voice message sent or received produces a .aud file. Since aud is not a common extension, we don't know how to play such a message, so let's search Baidu first.
Baidu turns up plenty of results:
[Screenshot: Baidu search results for the aud extension]
As you can see, most of the posts say WeChat voice messages are in amr format. One of them claims that an aud file is simply an amr file missing its amr header, and that all you need to do is prepend the amr magic at the start of the file:
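The forum post's claim is easy to try. Here is a minimal sketch in Python, assuming the standard AMR-NB magic `#!AMR\n`; the function name and paths are placeholders of mine, not from the post:

```python
# Sketch of the forum post's suggestion: prepend the standard AMR-NB
# magic ("#!AMR\n") to a WeChat .aud file. Paths are placeholders.
AMR_MAGIC = b"#!AMR\n"

def prepend_amr_header(aud_path: str, amr_path: str) -> None:
    with open(aud_path, "rb") as f:
        payload = f.read()
    with open(amr_path, "wb") as f:
        f.write(AMR_MAGIC + payload)
```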
But I found that an aud file with the amr header added still cannot be played by OS X's built-in preview, while under normal circumstances preview plays amr files just fine. So a file with the amr magic added is not correctly recognized by preview, and the problem is most likely the aud file itself: it is not an amr file.
I reopened the aud file in MacVim and found that, apart from the amr header we added ourselves, the aud file itself carries a marker reading

#!SILK_V3

Although I know little about audio codecs and file formats, the amr header also begins with #!, so I guessed that #!SILK_V3 is a file header too. Google it:
[Screenshot: Google search results for #!SILK_V3]
At this point we can be fairly sure that WeChat's voice files are silk files, of a variant kind, and that we need to turn them into pure silk files. Why "variant"? We'll see in a moment.
According to an analysis by Guokr user NetCharm:
[Screenshot: NetCharm's analysis of the aud file format]
WeChat prepends 1 extra byte at the very beginning of the aud file; delete that byte and what remains is a pure silk-format file. NetCharm compiled a decoder.exe that converts silk files into pcm files, but I'm on OS X and cannot run exe files. Fine, let's keep looking for a way to convert silk to pcm.
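NetCharm's finding translates directly into a few lines. A minimal sketch, assuming only what the analysis above states (one extra leading byte, then the `#!SILK_V3` magic); the function name and paths are mine:

```python
# Per NetCharm's analysis: WeChat's .aud is a silk file with one extra
# leading byte. Drop that byte and verify the silk magic remains.
SILK_MAGIC = b"#!SILK_V3"

def aud_to_silk(aud_path: str, silk_path: str) -> None:
    with open(aud_path, "rb") as f:
        data = f.read()[1:]  # skip WeChat's extra first byte
    if not data.startswith(SILK_MAGIC):
        raise ValueError("not a WeChat silk-variant .aud file")
    with open(silk_path, "wb") as f:
        f.write(data)
```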
After another round of Googling I found this article, whose author had a goal similar to mine. Following his steps: first install ffmpeg via Homebrew with brew install ffmpeg, then download the SILKCodec project from GitHub (incidentally, its owner is a Chinese developer and an audio-processing expert; I have since gotten in touch with him to ask a few audio questions). In SILK_SDK_SRC_ARM, run make lib and then make decoder, which produces a command-line decoder for OS X. Run it on a pure silk file:

/path/to/decoder /path/to/silk /path/to/pcm

and you get a pcm file.
Edit (2016.7.21): You can also convert silk to pcm with this static library; @hangcom has verified that it works.
Per that blog post, the author's script converts the pcm file to a wav file, and his tool of choice is again ffmpeg:

ffmpeg -f s16le -ar 24000 -i /path/to/pcm -f wav /path/to/wav
Play this wav file and you hear the voice message I sent in WeChat: "定期整理用户昵称中含有的手机号、用户名、公司。" ("Regularly sort out the phone numbers, user names, and companies contained in user nicknames.") (attachment: 21.wav, 267.3 KB)
Download iFLYTEK's official iOS demo from here (make sure to tick "speech dictation", 语音听写), and in the downloaded project open MSCDemo -> MSCDemo -> business -> isr -> IATViewController.h, where pcmFilePath is the path of the audio file.
At line 42 of IATViewController.m, change pcmFilePath to the path of our wav file, then tap "audio stream recognition" as shown below:
[Screenshot: the audio stream recognition button in the iFLYTEK demo]
This gives a quick way to test recognition (note: the iFLYTEK iOS SDK only supports pcm and wav audio files):
[Screenshot: initial recognition result in the iFLYTEK demo]
Compared against the wav file above, the transcription is way off. iFLYTEK's Mandarin speech recognition is already world-leading, so recognition errors this large are unlikely to be its fault. I asked iFLYTEK's engineers and learned the problem probably lies in the following areas:
[Screenshot: iFLYTEK engineers' reply on the expected audio format]
OK, let's keep digging and see how to deal with this. With guidance from Aipai's CTO, I learned that the "mono, 16-bit, 16000/8000" the iFLYTEK folks mentioned refer to channels, bit rate/sample size, and sample rate respectively, which map to ffmpeg's -ac channels, -b:a bitrate, and -ar rate options. Let's give it a try:

ffmpeg -f s16le -ar 24000 -i /path/to/pcm -f wav -ar 16000 -b:a 16 -ac 1 /path/to/wav
The recognition result:
[Screenshot: recognition result after adjusting the parameters]
Recognition is now nearly perfect.
Still, because "昵称" (nickname) in the run above was heard as "旅程" (journey), I was not quite satisfied, so I kept tweaking ffmpeg's parameters on my own to see whether 100% accuracy was achievable. In the end, the following configuration gave the highest recognition rate:

ffmpeg -f s16le -ar 12k -ac 2 -i /path/to/pcm -f wav -ar 16k -ac 1 /path/to/wav
As for why this works, though, I only know that it does, not the reason behind it; pointers from audio experts are very welcome.
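For reference, here are the two ffmpeg invocations from above written out as argument lists, which makes the flag changes easy to diff; the function names and paths are placeholders, and the flags simply reproduce the commands shown earlier:

```python
# The two ffmpeg invocations from the text, as argv lists. Flags before
# -i describe the raw pcm input; flags after set the wav output format
# (iFLYTEK wants 16 kHz, mono, 16-bit). Paths are placeholders.
def first_attempt(pcm: str, wav: str) -> list:
    return ["ffmpeg", "-f", "s16le", "-ar", "24000", "-i", pcm,
            "-f", "wav", "-ar", "16000", "-b:a", "16", "-ac", "1", wav]

def best_found(pcm: str, wav: str) -> list:
    # empirically the most accurate: read as 12 kHz stereo, emit 16 kHz mono
    return ["ffmpeg", "-f", "s16le", "-ar", "12k", "-ac", "2", "-i", pcm,
            "-f", "wav", "-ar", "16k", "-ac", "1", wav]
```

Either list can be executed with subprocess.run(cmd, check=True).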
To sum up the operations above, the overall sequence is roughly:

1. Locate WeChat's .aud voice file;
2. Convert the aud file to a silk file;
3. Convert the silk file to a pcm file;
4. Convert the pcm file to a wav file;
5. Feed the wav file to iFLYTEK for recognition.

I have completed these 5 steps semi-automatically on OS X. To do the whole thing automatically on iOS, we again need 5 steps, corresponding one to one with the above:

1. Hook WeChat (the relevant class's name ends in Mgr) and obtain the aud file from it;
2. Process the aud with NSMutableData to produce the silk;
3. Convert the silk to pcm inside the app;
4. Convert the pcm to wav inside the app;
5. Call [IFlySpeechRecognizer writeAudio:] and its callbacks.

With that, the skeleton of the fully automated iOS solution is in place; putting flesh on the bones is left to you, dear readers.
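The OS X half of the pipeline (aud to silk to pcm to wav) can be strung together as a sketch. Here `decoder` stands for the binary built from SILKCodec earlier; the injectable `run` parameter is my own convenience for inspecting the external commands, not part of any tool discussed above, and all paths are placeholders:

```python
# End-to-end sketch of the OS X pipeline: strip WeChat's extra byte,
# decode silk to pcm with the SILKCodec decoder, then produce a wav
# with the ffmpeg parameters that recognized best.
import subprocess

SILK_MAGIC = b"#!SILK_V3"

def wechat_aud_to_wav(aud, silk, pcm, wav, decoder, run=subprocess.run):
    with open(aud, "rb") as f:
        data = f.read()[1:]              # drop WeChat's extra first byte
    if not data.startswith(SILK_MAGIC):
        raise ValueError("not a WeChat voice file")
    with open(silk, "wb") as f:          # aud -> silk
        f.write(data)
    run([decoder, silk, pcm], check=True)            # silk -> pcm
    run(["ffmpeg", "-f", "s16le", "-ar", "12k", "-ac", "2", "-i", pcm,
         "-f", "wav", "-ar", "16k", "-ac", "1", wav], check=True)  # pcm -> wav
```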
Thanks for reading~
References:
http://bbs.feng.com/read-htm-tid-6186081.html
http://kronopath.net/blog/extracting-audio-messages-from-wechat/#extracting-and-converting-wechat-audio-messages
https://github.com/Kronopath/SILKCodec
https://github.com/pereckerdal/silk-arm-ios
Further reading:
"In Practice: A Fully Automated Solution for Converting WeChat Voice to Text (Part 2)"
In "A fully automated solution for extracting voice files from WeChat and converting them to text", I used the iFLYTEK iOS SDK to turn WeChat voice messages into text. In reply #7 of that thread, @everettjf pointed out that WeChat itself offers a long-press voice-to-text feature, which I had not discovered during my earlier research. Some investigation showed that WeChat only enables this feature when the system language is Chinese, and since I have always run an English system, I had never seen it. Today we'll use this feature as our entry point, drop the iFLYTEK SDK, and use WeChat's built-in mechanism to convert voice to text fully automatically.
I won't repeat that process here; the core code is:
%hook VoiceMessageNodeView

- (BOOL)canShowVoiceTransMenu
{
    %orig;
    return YES;
}

%end
Compile this into a tweak and install it, and you can enable the feature in the English-language WeChat:
[Screenshot: the Convert to Text menu in English-language WeChat]
As usual, we first locate this UIMenuItem with Cycript. Stay on the screen shown above, then do a little hacking with the choose command:
cy# choose(UIMenuItem)
[#"<UIMenuItem: 0x15621200>",#"<UIMenuItem: 0x1562d020>",#"<UIMenuItem: 0x15680080>",#"<UIMenuItem: 0x1577f9b0>",#"<UIMenuItem: 0x157afe90>"]
cy# [#0x15621200 title]
@"Favorite"
cy# [#0x1562d020 title]
@"Turn Off Speaker"
cy# [#0x15680080 title]
@"More..."
cy# [#0x1577f9b0 title]
@"Convert to Text"
Good: we've found the "Convert to Text" UIMenuItem. Let's see what its action is:
cy# [#0x1577f9b0 action]
@selector(onVoiceTrans:)
So onVoiceTrans: is our answer. Let's grep WeChat's header files to see which class implements this method:
FunMaker-MBP:~ snakeninny$ grep -r onVoiceTrans: /Users/snakeninny/Code/RE/WeChat
/Users/snakeninny/Code/RE/WeChat/VoiceMessageNodeView.h:- (void)onVoiceTrans:(id)arg1;
Binary file /Users/snakeninny/Code/RE/WeChat/WeChat_arm64.decrypted matches
Binary file /Users/snakeninny/Code/RE/WeChat/WeChat_armv7.decrypted matches
It looks like the class is VoiceMessageNodeView. Judging by the name, this should be the view class for voice messages; we will verify this guess later.
Drag the decrypted WeChat executable into Hopper and, once the analysis finishes, look at the decompiled C-like code:
[Screenshot: Hopper pseudocode of onVoiceTrans:]
Two branches stand out clearly: one simply calls showVoiceTransView, while the other does something more involved. Let's start with the simple one and see what showVoiceTransView does.
[Screenshot: Hopper pseudocode of showVoiceTransView]
From this incomplete screenshot of the implementation, showVoiceTransView appears to perform mostly UI-level operations. What exactly are they? We'll reveal that in a bit; first let's look at the other, more complex branch.
From keywords in the screenshot such as getStringForCurLanguage:defaultTo:, initWithTitle:andImageName:andContent:andCancelText:, m_voiceTransIntro, and show, we can roughly guess that this code also shows a view, again seemingly UI-level work. If both the if and the else are pure UI, where does the core voice-conversion code live? Time to bring out LLDB and dig one level deeper.
Attach LLDB to WeChat, then set a breakpoint on [VoiceMessageNodeView onVoiceTrans:] and trigger it:
(lldb) br s -a 0x00064000+0x0183e950
Breakpoint 1: where = WeChat`___lldb_unnamed_function88615$$WeChat, address = 0x018a2950
Process 3098 stopped
* thread #1: tid = 0x3a3d5, 0x018a2950 WeChat`___lldb_unnamed_function88615$$WeChat, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x018a2950 WeChat`___lldb_unnamed_function88615$$WeChat
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2950 <+0>: push {r4, r5, r6, r7, lr}
    0x18a2952 <+2>: add r7, sp, #0xc
    0x18a2954 <+4>: push.w {r8, r10, r11}
    0x18a2958 <+8>: sub sp, #0x1c
(lldb)
Let's single-step through the function with ni and see what each objc_msgSend is doing:
* thread #1: tid = 0x3a3d5, 0x018a2974 WeChat`___lldb_unnamed_function88615$$WeChat + 36, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2974 WeChat`___lldb_unnamed_function88615$$WeChat + 36
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2974 <+36>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2978 <+40>: mov r7, r7
    0x18a297a <+42>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a297e <+46>: mov r5, r0
(lldb) p (char *)$r1
(char *) $0 = 0x0237235d "getMainSettingExt"
(lldb) po $r0
SettingUtil

* thread #1: tid = 0x3a3d5, 0x018a2998 WeChat`___lldb_unnamed_function88615$$WeChat + 72, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2998 WeChat`___lldb_unnamed_function88615$$WeChat + 72
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2998 <+72>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a299c <+76>: mov r7, r7
    0x18a299e <+78>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a29a2 <+82>: mov r6, r0
(lldb) p (char *)$r1
(char *) $2 = 0x0238a4f8 "theadSafeGetObject:"
(lldb) po $r0
<CSettingExt: 0x15722990>
(lldb) po $r2
SETTINGEXT_VOICE_TRANS_TIP_TIMES

* thread #1: tid = 0x3a3d5, 0x018a29b8 WeChat`___lldb_unnamed_function88615$$WeChat + 104, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a29b8 WeChat`___lldb_unnamed_function88615$$WeChat + 104
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a29b8 <+104>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a29bc <+108>: cmp r0, #0x0
    0x18a29be <+110>: ble 0x18a29d0 ; <+128>
    0x18a29c0 <+112>: movw r0, #0xae4
(lldb) p (char *)$r1
(char *) $7 = 0x33c13528 "integerValue"
(lldb) po $r0
<nil>

* thread #1: tid = 0x3a3d5, 0x018a2a04 WeChat`___lldb_unnamed_function88615$$WeChat + 180, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2a04 WeChat`___lldb_unnamed_function88615$$WeChat + 180
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2a04 <+180>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2a08 <+184>: str r0, [sp, #0x10]
    0x18a2a0a <+186>: movw r0, #0xca5e
    0x18a2a0e <+190>: movt r0, #0x10c
(lldb) p (char *)$r1
(char *) $9 = 0x33c09603 "alloc"
(lldb) po $r0
MMTipsViewController

* thread #1: tid = 0x3a3d5, 0x018a2a26 WeChat`___lldb_unnamed_function88615$$WeChat + 214, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2a26 WeChat`___lldb_unnamed_function88615$$WeChat + 214
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2a26 <+214>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2a2a <+218>: mov r7, r7
    0x18a2a2c <+220>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2a30 <+224>: mov r5, r0
(lldb) p (char *)$r1
(char *) $11 = 0x33c0c435 "defaultCenter"
(lldb) po $r0
MMServiceCenter

* thread #1: tid = 0x3a3d5, 0x018a2a50 WeChat`___lldb_unnamed_function88615$$WeChat + 256, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2a50 WeChat`___lldb_unnamed_function88615$$WeChat + 256
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2a50 <+256>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2a54 <+260>: mov r2, r0
    0x18a2a56 <+262>: movw r0, #0xca22
    0x18a2a5a <+266>: movt r0, #0x10c
(lldb) p (char *)$r1
(char *) $13 = 0x33c0951e "class"
(lldb) po $r0
MMLanguageMgr

* thread #1: tid = 0x3a3d5, 0x018a2a68 WeChat`___lldb_unnamed_function88615$$WeChat + 280, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2a68 WeChat`___lldb_unnamed_function88615$$WeChat + 280
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2a68 <+280>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2a6c <+284>: mov r7, r7
    0x18a2a6e <+286>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2a72 <+290>: str r0, [sp, #0xc]
(lldb) p (char *)$r1
(char *) $15 = 0x0236945e "getService:"
(lldb) po $r0
<MMServiceCenter: 0x156dd840>
(lldb) po $r2
MMLanguageMgr

* thread #1: tid = 0x3a3d5, 0x018a2a8e WeChat`___lldb_unnamed_function88615$$WeChat + 318, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2a8e WeChat`___lldb_unnamed_function88615$$WeChat + 318
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2a8e <+318>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2a92 <+322>: mov r7, r7
    0x18a2a94 <+324>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2a98 <+328>: str r0, [sp, #0x8]
(lldb) p (char *)$r1
(char *) $18 = 0x0236a7db "getStringForCurLanguage:defaultTo:"
(lldb) po $r0
<MMLanguageMgr: 0x1573a430>
(lldb) po $r2
Voice_Trans_Tips_Title
(lldb) po $r3
Voice_Trans_Tips_Title

* thread #1: tid = 0x3a3d5, 0x018a2aa8 WeChat`___lldb_unnamed_function88615$$WeChat + 344, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2aa8 WeChat`___lldb_unnamed_function88615$$WeChat + 344
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2aa8 <+344>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2aac <+348>: mov r7, r7
    0x18a2aae <+350>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2ab2 <+354>: mov r6, r0
(lldb) p (char *)$r1
(char *) $22 = 0x33c0c435 "defaultCenter"
(lldb) po $R0
MMServiceCenter

* thread #1: tid = 0x3a3d5, 0x018a2ac2 WeChat`___lldb_unnamed_function88615$$WeChat + 370, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2ac2 WeChat`___lldb_unnamed_function88615$$WeChat + 370
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2ac2 <+370>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2ac6 <+374>: mov r2, r0
    0x18a2ac8 <+376>: mov r0, r6
    0x18a2aca <+378>: mov r1, r10
(lldb) p (char *)$r1
(char *) $24 = 0x33c0951e "class"
(lldb) po $R0
MMLanguageMgr

* thread #1: tid = 0x3a3d5, 0x018a2ace WeChat`___lldb_unnamed_function88615$$WeChat + 382, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2ace WeChat`___lldb_unnamed_function88615$$WeChat + 382
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2ace <+382>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2ad2 <+386>: mov r7, r7
    0x18a2ad4 <+388>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2ad8 <+392>: movw r2, #0x6b2c
(lldb) p (char *)$r1
(char *) $26 = 0x0236945e "getService:"
(lldb) po $r0
<MMServiceCenter: 0x156dd840>
(lldb) po $r2
MMLanguageMgr

* thread #1: tid = 0x3a3d5, 0x018a2ae8 WeChat`___lldb_unnamed_function88615$$WeChat + 408, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2ae8 WeChat`___lldb_unnamed_function88615$$WeChat + 408
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2ae8 <+408>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2aec <+412>: mov r7, r7
    0x18a2aee <+414>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2af2 <+418>: mov r10, r0
(lldb) po $r0
<MMLanguageMgr: 0x1573a430>
(lldb) p (char *)$r1
(char *) $30 = 0x0236a7db "getStringForCurLanguage:defaultTo:"
(lldb) po $r2
Voice_Trans_Tips_Content
(lldb) po $r3
Voice_Trans_Tips_Content

* thread #1: tid = 0x3a3d5, 0x018a2b10 WeChat`___lldb_unnamed_function88615$$WeChat + 448, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2b10 WeChat`___lldb_unnamed_function88615$$WeChat + 448
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2b10 <+448>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2b14 <+452>: ldr.w r1, [r4, r8]
    0x18a2b18 <+456>: str.w r0, [r4, r8]
    0x18a2b1c <+460>: mov r0, r1
(lldb) p (char *)$r1
(char *) $33 = 0x023b452b "initWithTitle:andImageName:andContent:andCancelText:"
(lldb) po $r0
<MMTipsViewController: 0x16544950>
(lldb) po $r2
Audio to Text
(lldb) po $r3
<nil>
(lldb) x/10 $sp
0x27d99c88: 0x160e8500 0x00000000 0x160d0a90 0x1573a430
0x27d99c98: 0x16544950 0x156dd840 0x00000000 0x157367a0
0x27d99ca8: 0x0000010f 0x1563d790
(lldb) po 0x160e8500
This feature only supported for Mandarin Chinese, and the result is for reference only.
(lldb) po 0x00000000
<nil>

* thread #1: tid = 0x3a3d5, 0x018a2b5a WeChat`___lldb_unnamed_function88615$$WeChat + 522, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2b5a WeChat`___lldb_unnamed_function88615$$WeChat + 522
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2b5a <+522>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2b5e <+526>: movw r0, #0xcc36
    0x18a2b62 <+530>: movt r0, #0x10c
    0x18a2b66 <+534>: add r0, pc
(lldb) p (char *)$r1
(char *) $40 = 0x0236cc3c "setM_delegate:"
(lldb) po $r0
<MMTipsViewController: 0x16544950>
(lldb) po $r2
<VoiceMessageNodeView: 0x157087c0; frame = (183 0; 128 59); layer = <CALayer: 0x15708950>>

* thread #1: tid = 0x3a3d5, 0x018a2b6e WeChat`___lldb_unnamed_function88615$$WeChat + 542, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2b6e WeChat`___lldb_unnamed_function88615$$WeChat + 542
WeChat`___lldb_unnamed_function88615$$WeChat:
->  0x18a2b6e <+542>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2b72 <+546>: mov r0, r6
    0x18a2b74 <+548>: add sp, #0x1c
    0x18a2b76 <+550>: pop.w {r8, r10, r11}
(lldb) p (char *)$r1
(char *) $43 = 0x33c0f712 "show"
(lldb) po $r0
<MMTipsViewController: 0x16544950>
Let's pause here. As we can see, this run actually went down the else branch, the more complex one. Notably, the objc_msgSend at 0x18a2b10 produced a key sentence, "This feature only supported for Mandarin Chinese, and the result is for reference only.", which we can see on screen after a c (continue):
[Screenshot: WeChat's "Audio to Text" alert]
There we go: a WeChat custom alert appears. Let's use Cycript to see what tapping "OK" triggers.
Generally speaking, alerts don't appear on the keyWindow; we have to look for them in the other windows. So we inspect each window in turn:
cy# [[UIApp windows][1] recursiveDescription].toString()
`<SvrErrorTipWindow: 0x1601d640; baseClass = UIWindow; frame = (0 0; 320 45); hidden = YES; gestureRecognizers = <NSArray: 0x1601d180>; layer = <UIWindowLayer: 0x1601d500>>
   | <UIImageView: 0x1601bc50; frame = (12.5 2.5; 40 40); opaque = NO; userInteractionEnabled = NO; layer = <CALayer: 0x1601b8d0>>
   | <UIButton: 0x1601a5a0; frame = (295 12.5; 20 20); opaque = NO; layer = <CALayer: 0x1601a4f0>>
   |    | <UIImageView: 0x160194b0; frame = (0 0; 20 20); clipsToBounds = YES; opaque = NO; userInteractionEnabled = NO; layer = <CALayer: 0x16019480>>
   | <RichTextView: 0x160189f0; baseClass = UILabel; frame = (65 5; 0 0); opaque = NO; layer = <CALayer: 0x16018890>>`
cy# [[UIApp windows][2] recursiveDescription].toString()
"<UITextEffectsWindow: 0x156acc50; frame = (0 0; 320 480); hidden = YES; opaque = NO; gestureRecognizers = <NSArray: 0x16455e60>; layer = <UIWindowLayer: 0x1623d730>>"
cy# [[UIApp windows][3] recursiveDescription].toString()
"<UITextEffectsWindow: 0x162e66e0; frame = (0 0; 320 480); hidden = YES; gestureRecognizers = <NSArray: 0x15615570>; layer = <UIWindowLayer: 0x1604ba00>>"
cy# [[UIApp windows][4] recursiveDescription].toString()
`<MMUIWindow: 0x162cd5d0; baseClass = UIWindow; frame = (0 0; 320 480); gestureRecognizers = <NSArray: 0x1572f4c0>; layer = <UIWindowLayer: 0x1645eb70>>
   | <UIView: 0x1602d620; frame = (0 0; 320 480); autoresize = W+H; layer = <CALayer: 0x1602d680>>
   |    | <UIButton: 0x16021780; frame = (0 0; 320 480); opaque = NO; autoresize = W+H; layer = <CALayer: 0x156dd060>>
   |    |    | <UIView: 0x15705000; frame = (20 144; 280 192); layer = <CALayer: 0x156e3fc0>>
   |    |    |    | <UIImageView: 0x165457f0; frame = (0 0; 280 192); opaque = NO; userInteractionEnabled = NO; layer = <CALayer: 0x156e3f40>>
   |    |    |    | <MMUILabel: 0x165456d0; baseClass = UILabel; frame = (15 22; 250 22); text = 'Audio to Text'; userInteractionEnabled = NO; layer = <CALayer: 0x156d7050>>
   |    |    |    | <MMUILabel: 0x16545350; baseClass = UILabel; frame = (16.5 59; 247 62); text = 'This feature only support...'; userInteractionEnabled = NO; layer = <CALayer: 0x156e3090>>
   |    |    |    | <FixTitleColorButton: 0x1604ae00; baseClass = UIButton; frame = (0 142; 280 50); opaque = NO; layer = <CALayer: 0x156e3480>>
   |    |    |    |    | <UIButtonLabel: 0x162220c0; frame = (127 14; 26 22); text = 'OK'; clipsToBounds = YES; opaque = NO; userInteractionEnabled = NO; layer = <CALayer: 0x156de430>>
   |    |    |    |    | <UIView: 0x16035ce0; frame = (0 0; 280 0.5); autoresize = W; layer = <CALayer: 0x156da050>>`
cy# [#0x1604ae00 allTargets]
[NSSet setWithArray:@[#"<MMTipsViewController: 0x16544950>"]]]
cy# [#0x1604ae00 allControlEvents]
64
cy# [#0x1604ae00 actionsForTarget:#0x16544950 forControlEvent:64]
@["onClickBtn:"]
In the 5th window (note: [UIApp windows][0] is the keyWindow, so this is the 5th window) we find the OK button, and its action method is [MMTipsViewController onClickBtn:]; let's go look at its implementation details.
The implementation of [MMTipsViewController onClickBtn:]
(lldb) br s -a 0x01cd8120+0x00064000
Breakpoint 2: where = WeChat`___lldb_unnamed_function108280$$WeChat, address = 0x01d3c120
Process 3098 stopped
* thread #1: tid = 0x3a3d5, 0x01d3c120 WeChat`___lldb_unnamed_function108280$$WeChat, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
    frame #0: 0x01d3c120 WeChat`___lldb_unnamed_function108280$$WeChat
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c120 <+0>: push {r4, r5, r6, r7, lr}
    0x1d3c122 <+2>: add r7, sp, #0xc
    0x1d3c124 <+4>: push.w {r8, r10, r11}
    0x1d3c128 <+8>: mov r4, r0
(lldb) ni
* thread #1: tid = 0x3a3d5, 0x01d3c140 WeChat`___lldb_unnamed_function108280$$WeChat + 32, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x01d3c140 WeChat`___lldb_unnamed_function108280$$WeChat + 32
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c140 <+32>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x1d3c144 <+36>: tst.w r0, #0xff
    0x1d3c148 <+40>: bne 0x1d3c15c ; <+60>
    0x1d3c14a <+42>: movw r0, #0x40ee
(lldb) p (char *)$r1
(char *) $45 = 0x023e6510 "bIsForbidCancelBtn"
(lldb) po $r0
<MMTipsViewController: 0x16544950>

* thread #1: tid = 0x3a3d5, 0x01d3c158 WeChat`___lldb_unnamed_function108280$$WeChat + 56, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x01d3c158 WeChat`___lldb_unnamed_function108280$$WeChat + 56
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c158 <+56>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x1d3c15c <+60>: movw r0, #0xf490
    0x1d3c160 <+64>: movt r0, #0xc7
    0x1d3c164 <+68>: add r0, pc
(lldb) p (char *)$r1
(char *) $47 = 0x023e643d "hideTips"
(lldb) po $r0
<MMTipsViewController: 0x16544950>

* thread #1: tid = 0x3a3d5, 0x01d3c1a4 WeChat`___lldb_unnamed_function108280$$WeChat + 132, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x01d3c1a4 WeChat`___lldb_unnamed_function108280$$WeChat + 132
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c1a4 <+132>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x1d3c1a8 <+136>: mov r4, r0
    0x1d3c1aa <+138>: mov r0, r6
    0x1d3c1ac <+140>: blx 0x20eb5c0 ; symbol stub for: objc_release
(lldb) p (char *)$r1
(char *) $49 = 0x33c098a1 "respondsToSelector:"
(lldb) p (char *)$r2
(char *) $51 = 0x023b4abf "onClickTipsBtn:"

* thread #1: tid = 0x3a3d5, 0x01d3c1c2 WeChat`___lldb_unnamed_function108280$$WeChat + 162, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x01d3c1c2 WeChat`___lldb_unnamed_function108280$$WeChat + 162
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c1c2 <+162>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x1d3c1c6 <+166>: mov r0, r4
    0x1d3c1c8 <+168>: blx 0x20eb5c0 ; symbol stub for: objc_release
    0x1d3c1cc <+172>: mov r0, r5
(lldb) p (char *)$r1
(char *) $52 = 0x023b4abf "onClickTipsBtn:"
(lldb) po $r0
<VoiceMessageNodeView: 0x157087c0; frame = (183 0; 128 59); layer = <CALayer: 0x15708950>>
(lldb) po $r2
1
Note that [VoiceMessageNodeView onClickTipsBtn:] gets called here. Let's pause the debugging session at this point and examine the implementation of [VoiceMessageNodeView onClickTipsBtn:].
The implementation of [VoiceMessageNodeView onClickTipsBtn:]
[Screenshot: Hopper pseudocode of onClickTipsBtn:]
This code is easy to restore; its core operations are:
[[%c(SettingUtil) getMainSettingExt] theadSafeSetObject:@"1" forKey:@"SETTINGEXT_VOICE_TRANS_TIP_TIMES"];
AccountStorageMgr *manager = [[%c(MMServiceCenter) defaultCenter] getService:[%c(AccountStorageMgr) class]];
[manager SaveSettingExt];
Judging from the function and argument names, this code appears to serve as the on/off switch for voice conversion. Let's continue with ni and see whether anything else is going on.
The implementation of [MMTipsViewController onClickBtn:], continued
* thread #1: tid = 0x3a3d5, 0x01d3c1e8 WeChat`___lldb_unnamed_function108280$$WeChat + 200, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x01d3c1e8 WeChat`___lldb_unnamed_function108280$$WeChat + 200
WeChat`___lldb_unnamed_function108280$$WeChat:
->  0x1d3c1e8 <+200>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x1d3c1ec <+204>: mov r4, r0
    0x1d3c1ee <+206>: mov r0, r6
    0x1d3c1f0 <+208>: blx 0x20eb5c0 ; symbol stub for: objc_release
(lldb) p (char *)$r1
(char *) $55 = 0x33c098a1 "respondsToSelector:"
(lldb) p (char *)$r2
(char *) $56 = 0x023b4acf "onClickTipsBtn:Index:"
(lldb) po $r0
<VoiceMessageNodeView: 0x157087c0; frame = (183 0; 128 59); layer = <CALayer: 0x15708950>>
Because [VoiceMessageNodeView respondsToSelector:@selector(onClickTipsBtn:Index:)] returns NO, the core code distilled from this long analysis is just the snippet extracted above, and its job is to turn on WeChat's voice-conversion switch. How do we use it? We simply run this code the first time WeChat launches, and WeChat's voice-conversion feature is enabled.
With voice conversion switched on, we've scored a stage victory. But the next question arrives: which code performs the actual conversion? That is the crux of the matter.
Back in [VoiceMessageNodeView onClickTipsBtn:], besides the code that enables voice conversion, there is also an unassuming showVoiceTransView whose caller is a VoiceMessageNodeView object. My guess is that the core conversion code hides right here. Let's look at its decompiled code:
[Screenshot: Hopper pseudocode of showVoiceTransView]
Quite complex. Let's trace it dynamically:
(lldb) br s -a 0x0183eb84+0x64000
Breakpoint 3: where = WeChat`___lldb_unnamed_function88616$$WeChat, address = 0x018a2b84
Process 3098 stopped
* thread #1: tid = 0x3a3d5, 0x018a2b84 WeChat`___lldb_unnamed_function88616$$WeChat, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
    frame #0: 0x018a2b84 WeChat`___lldb_unnamed_function88616$$WeChat
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2b84 <+0>: push {r4, r5, r6, r7, lr}
    0x18a2b86 <+2>: add r7, sp, #0xc
    0x18a2b88 <+4>: str r8, [sp, #-4]!
    0x18a2b8c <+8>: vpush {d8}
(lldb)
* thread #1: tid = 0x3a3d5, 0x018a2bde WeChat`___lldb_unnamed_function88616$$WeChat + 90, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2bde WeChat`___lldb_unnamed_function88616$$WeChat + 90
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2bde <+90>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2be2 <+94>: mov r7, r7
    0x18a2be4 <+96>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2be8 <+100>: mov r8, r0
(lldb) p (char *)$r1
(char *) $59 = 0x0236deb3 "getAppViewControllerManager"
(lldb) ni
Process 3098 stopped
* thread #1: tid = 0x3a3d5, 0x018a2be2 WeChat`___lldb_unnamed_function88616$$WeChat + 94, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2be2 WeChat`___lldb_unnamed_function88616$$WeChat + 94
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2be2 <+94>: mov r7, r7
    0x18a2be4 <+96>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2be8 <+100>: mov r8, r0
    0x18a2bea <+102>: movw r0, #0xe05a
(lldb) po $r0
<CAppViewControllerManager: 0x1601dcb0>

* thread #1: tid = 0x3a3d5, 0x018a2bf8 WeChat`___lldb_unnamed_function88616$$WeChat + 116, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2bf8 WeChat`___lldb_unnamed_function88616$$WeChat + 116
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2bf8 <+116>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2bfc <+120>: mov r7, r7
    0x18a2bfe <+122>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2c02 <+126>: mov r6, r0
(lldb) p (char *)$r1
(char *) $61 = 0x023743eb "getMainWindow"
(lldb) po $r0
<CAppViewControllerManager: 0x1601dcb0>
(lldb) ni
Process 3098 stopped
* thread #1: tid = 0x3a3d5, 0x018a2bfc WeChat`___lldb_unnamed_function88616$$WeChat + 120, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2bfc WeChat`___lldb_unnamed_function88616$$WeChat + 120
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2bfc <+120>: mov r7, r7
    0x18a2bfe <+122>: blx 0x20eb610 ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x18a2c02 <+126>: mov r6, r0
    0x18a2c04 <+128>: cbz r5, 0x18a2c30 ; <+172>
(lldb) po $r0
<iConsoleWindow: 0x156b8ae0; baseClass = UIWindow; frame = (0 0; 320 480); autoresize = W+H; gestureRecognizers = <NSArray: 0x156b9290>; layer = <UIWindowLayer: 0x156b8df0>>

* thread #1: tid = 0x3a3d5, 0x018a2d0e WeChat`___lldb_unnamed_function88616$$WeChat + 394, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2d0e WeChat`___lldb_unnamed_function88616$$WeChat + 394
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2d0e <+394>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2d12 <+398>: movw r0, #0x930e
    0x18a2d16 <+402>: add.w r9, sp, #0x20
    0x18a2d1a <+406>: movt r0, #0x10e
(lldb) po $r2
{m_uiMesLocalID=2, m_ui64MesSvrID=6982651110964038564, m_nsFromUsr=wxi*f12~19, m_nsToUsr=we*in~6, m_uiStatus=2, type=34, msgSource="(null)"}
(lldb) po $r0
<VoiceTransFloatPreview: 0x16543cd0; baseClass = UIWindow; frame = (0 0; 320 480); hidden = YES; gestureRecognizers = <NSArray: 0x16053830>; layer = <UIWindowLayer: 0x15624d10>>
(lldb) p (char *)$r1
(char *) $77 = 0x02421864 "setVoiceMsg:"

* thread #1: tid = 0x3a3d5, 0x018a2d40 WeChat`___lldb_unnamed_function88616$$WeChat + 444, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x018a2d40 WeChat`___lldb_unnamed_function88616$$WeChat + 444
WeChat`___lldb_unnamed_function88616$$WeChat:
->  0x18a2d40 <+444>: blx 0x20eb570 ; symbol stub for: objc_msgSend
    0x18a2d44 <+448>: add sp, #0x30
    0x18a2d46 <+450>: vpop {d8}
    0x18a2d4a <+454>: ldr r8, [sp], #4
(lldb) p (char *)$r1
(char *) $79 = 0x02421881 "showWithAnimate:"
I have omitted some obviously UI-level operations and kept the objc_msgSend calls above. Recall the UI flow when WeChat converts voice to text: after we tap "Convert to Text", a new view shows "converting...", and a few seconds later the converted text appears. This suggests that WeChat probably draws the text UI first, shows "converting..." to keep the user waiting, and meanwhile starts another thread to convert the voice, displaying the text once the conversion finishes. Given that guess, the most suspicious of the objc_msgSend calls above are clearly [VoiceTransFloatPreview setVoiceMsg:(CMessageWrap *)] and [VoiceTransFloatPreview showWithAnimate:]. Let's look at each of their implementations.
The implementation of [VoiceTransFloatPreview setVoiceMsg:]
[Screenshot: Hopper pseudocode of setVoiceMsg:]
It looks like just an ordinary setter, nothing special; on to the next target.
The implementation of [VoiceTransFloatPreview showWithAnimate:]
[Screenshot: Hopper pseudocode of showWithAnimate:]
Of this series of functions, [r4 onStartGet] caught my attention: apart from IdleTimerUtil, it is the only one whose name contains no hint of UI. Let's look at its implementation.
The implementation of [VoiceTransFloatPreview onStartGet]
At the sight of the name VoiceTransHelper, I knew our analysis was nearing its end. Open VoiceTransHelper.h, and suspicious functions such as
- (id)initWithVoiceMsg:(id)arg1 VoiceID:(id)arg2;
- (void)startVoiceTrans;
- (void)stopVoiceTrans;
- (void)HandleGetVoiceTransResp:(id)arg1 Event:(unsigned long)arg2;
lie in plain view, awaiting inspection. I'll leave the investigation of this class as an exercise for you, the reader~!
WeChat's speech recognition is presumably no stronger than iFLYTEK's; in the granularity of recognition configuration alone, iFLYTEK is far more professional. But if our requirement is merely simple speech recognition, with little need for customization, then a dylib that bundles the iFLYTEK SDK ends up roughly 5 MB larger than one that uses WeChat's native conversion. This follows the same line of thinking I mentioned here: our dylib lives inside someone else's process, so we are guests in someone else's home, and not causing trouble is basic courtesy. Saving 5+ MB of memory shows respect for other people's work, and reflects our own insistence on doing things as well as they can be done. An engineer's craft shows in the smallest details, and it is by attending to the small that we accomplish the great; I wish everyone continued progress and ever higher peaks.