At the end of 2014, when the esp8266 had just arrived, I decided to make a universal IoT device with speech recognition and a speaker.
Unfortunately, the esp8266 hardware is not friendly to microphone connection. I tried to use the internal ADC, but it didn't work out. There is an option to use the internal I2S, but it is multiplexed with the UART, and no working code example has been available so far.
The next step was to use an external MCU with a good sigma-delta ADC. I tried the MSP430 for audio capture, streaming samples to the ESP8266 via SPI. With this configuration I recorded some audio for the first time. But the MSP430 is too slow, and I ran into serious performance problems with the SPI protocol. The sound quality was also poor, and whenever a WiFi transmission occurred, the voice was hidden by high-amplitude noise.
Finally, I managed to use an STM32F105 and a PDM microphone for audio capture, and then stream the audio via SPI to the ESP8266. The schematics: https://github.com/wiieva/schematics
This setup gives good sound quality and pretty stable voice recognition by Google.
Here is a sample video
The STM32 code does all the hard work. It captures the PDM signal, filters it to acquire PCM, and then encodes it to the SPEEX format, which is suitable for Google voice recognition. I also tried RAW WAV audio; it works too, but it's less stable, because it requires bigger buffers and is sensitive to network delays.
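For anyone new to PDM, the "filter it to acquire PCM" step can be illustrated with a minimal sketch. This is not the code from the stm32aio repository: it is the simplest possible boxcar decimator, the function name pdm_to_pcm is made up for illustration, and the decimation factor of 64 (for example a 1.024 MHz PDM clock giving 16 kHz PCM) is an assumption. A real firmware would normally use a higher-order CIC or FIR filter to get better quality.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed decimation factor: 64 PDM bits produce one PCM sample. */
#define PDM_DECIMATION 64

/* Count the set bits in one byte. */
static int popcount8(uint8_t b)
{
    int n = 0;
    while (b) {
        n += b & 1u;
        b >>= 1;
    }
    return n;
}

/*
 * Hypothetical helper: convert packed PDM bits to signed 16-bit PCM
 * with a boxcar (moving-average) filter. Each output sample is derived
 * from the number of ones in PDM_DECIMATION consecutive input bits.
 * Returns the number of PCM samples written.
 */
size_t pdm_to_pcm(const uint8_t *pdm, size_t pdm_bytes,
                  int16_t *pcm, size_t pcm_capacity)
{
    const size_t bytes_per_sample = PDM_DECIMATION / 8;
    size_t n_out = pdm_bytes / bytes_per_sample;
    if (n_out > pcm_capacity)
        n_out = pcm_capacity;

    for (size_t i = 0; i < n_out; i++) {
        int ones = 0;
        for (size_t j = 0; j < bytes_per_sample; j++)
            ones += popcount8(pdm[i * bytes_per_sample + j]);

        /* Map 0..64 ones onto the int16 range, centred on 32 ones. */
        int32_t v = (int32_t)(ones - PDM_DECIMATION / 2)
                    * (65536 / PDM_DECIMATION);
        if (v > INT16_MAX)
            v = INT16_MAX;   /* all-ones input would overflow by one */
        pcm[i] = (int16_t)v;
    }
    return n_out;
}
```

The density of ones in the PDM stream tracks the sound pressure, so counting ones over a window and subtracting the midpoint gives a signed sample. The resulting PCM buffer is what an encoder such as SPEEX would then consume.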
The ESP8266 sources:
Arduino sketch: https://github.com/wiieva/examples/blob ... ro.cpp#L77
SPI protocol implementation: https://github.com/wiieva/wiieva-varian ... Wiring.cpp
Here are the STM32 sources:
Audio capture and encoding: https://github.com/wiieva/stm32aio/blob ... audio_in.c
SPI protocol implementation: https://github.com/wiieva/stm32aio/blob ... o_server.c
Here are the schematics:
PS: If you are interested in the UI, the uGFX library is used: http://www.ugfx.io . It's really amazing too. As you can see, it's 100% compatible with the esp8266 and the Arduino environment.