I found one implementation on GitHub (which apparently was copied by several others, with exactly the same code). It used copy-and-paste code from Espressif (ugly...) and it used SLC (aka DMA) for the data transfers. I did not like that: with a few LED pixels in a string it takes a lot of memory, and the DMA is really not required. According to the author, he had trouble getting it to work completely as conceived, which I can understand, given the very poor documentation by Espressif.
So I decided to implement it using the FIFO (programmed I/O). The waveform only needs to be output when a value changes, so it doesn't need to be output continuously (which is what the DMA implementation does). The FIFO is quite large: AFAIK it's about 128 entries, and each of them is 4 bytes (one word) wide, so effectively it's 512 bytes. That is enough to hold all the data for 16 LEDs without having to wait for free space, which is enough for me.
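To make the FIFO arithmetic above concrete, here is a minimal sketch in C. The 128-entry depth is the estimate quoted above; the 24 data bits per LED and the `i2s_bits_per_data_bit` expansion factor are assumptions for illustration, and `fifo_words_needed` and `fits_in_fifo` are hypothetical helper names, not part of any Espressif API.

```c
#include <assert.h>
#include <stddef.h>

/* Figure taken from the text above (an "AFAIK" estimate, not a datasheet value). */
#define FIFO_WORDS        128u  /* FIFO depth in 32-bit entries */
#define BITS_PER_WORD     32u

/* Assumption: each LED takes 24 data bits (8 per colour channel). */
#define DATA_BITS_PER_LED 24u

/* Number of 32-bit FIFO words needed for led_count LEDs when every LED
 * data bit is expanded into i2s_bits_per_data_bit I2S bits.
 * The expansion factor is a hypothetical parameter; it depends on the
 * I2S clock and the LED timing you choose. */
size_t fifo_words_needed(size_t led_count, size_t i2s_bits_per_data_bit)
{
    assert(i2s_bits_per_data_bit > 0);
    size_t total_bits = led_count * DATA_BITS_PER_LED * i2s_bits_per_data_bit;
    return (total_bits + BITS_PER_WORD - 1u) / BITS_PER_WORD; /* round up */
}

/* Returns 1 if the whole frame fits in the FIFO in one go, 0 otherwise. */
int fits_in_fifo(size_t led_count, size_t i2s_bits_per_data_bit)
{
    return fifo_words_needed(led_count, i2s_bits_per_data_bit) <= FIFO_WORDS;
}
```

Under these assumptions, 16 LEDs at 4 I2S bits per data bit need 48 words and at 8 I2S bits per data bit need 96 words, so both fit in 128 entries.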
I then found that apparently nobody is using the programmed I/O FIFO mode for I2S (no results on Google). I assume that's caused by the lack of documentation by Espressif (once again...). So that was an interesting journey. But I got it working! So my message is really: if you're planning to do this, have a look at my code and save yourself a lot of frustration.
A few "interesting" things: after the last sample ("left" and "right", so actually 4 bytes) has been fetched from the FIFO, it is repeated indefinitely. So make sure it is all zeroes, and make sure it is in the FIFO before the FIFO runs out.
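One way to guarantee that the repeated last sample is all zeroes is to always append a zero word when building the frame. This is only a sketch: `frame_with_idle_tail` is a hypothetical helper that prepares a buffer in RAM, and writing that buffer into the actual FIFO register is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies count sample words into out and appends one all-zero word
 * (a zero "left" + "right" sample), so that the sample the hardware
 * keeps repeating after the data is idle-low.
 * Returns the number of words written, or 0 if out is too small. */
size_t frame_with_idle_tail(const uint32_t *samples, size_t count,
                            uint32_t *out, size_t out_capacity)
{
    assert(out != NULL);
    if (count + 1u > out_capacity)
        return 0;                      /* no room for data plus zero tail */
    if (count > 0)
        memcpy(out, samples, count * sizeof *out);
    out[count] = 0u;                   /* the sample that will be repeated */
    return count + 1u;
}
```

The caller then pushes all returned words into the FIFO, so the zero sample is already queued before the data runs out.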
Apparently it works best if you stop the transmitter before putting new data in the FIFO. Otherwise the order in which the data is output may get, ehrm... interesting. I found that out with an oscilloscope. But how do you know whether all data has been sent? I started out by stopping the transmitter just before inserting new data. This worked, but was inherently not correct. I searched for a queue counter, but found none. There are a few entries that suggest this function, but they don't work.
Also, as I am not using DMA, I could not use the demonstrated DMA interrupts. But there are interrupts specified for the I2S module. There just isn't a function for registering an ISR for them, anywhere. In the end it turns out that the I2S module shares its interrupts with the SPI slave modules (of both SPI and HSPI, watch out!). As my code crashed immediately after registering the ISR, I knew I was on to something. It turns out the SPI module has, by default, an interrupt active for slave actions. It triggers, the ISR is called, and my own ISR never resets the source, so the ISR keeps getting called. The solution was simply to disable the SPI slave interrupts before registering the ISR. I don't know what the ESP8266 would need them for; it uses the SPI module (not HSPI) only for master access. After that it works like a charm: the FIFO gets empty, the interrupt is triggered, and my code shuts down the transmitter, which is restarted after the complete FIFO has been filled again.
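The sequence described above (disable SPI slave interrupts, register the ISR, fill the FIFO, start, stop on "FIFO empty") can be sketched as a small host-side model in C. This is not ESP8266 register code: the `i2s_model_t` struct and every `model_*` function are hypothetical names that only mimic the ordering, and the 128-word depth is the estimate quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_WORDS 128u  /* FIFO depth quoted in the text (an estimate) */

/* Hypothetical software model of the peripheral state; not real registers. */
typedef struct {
    uint32_t fifo[FIFO_WORDS];
    size_t   level;                 /* words still waiting in the FIFO */
    bool     tx_running;
    bool     spi_slave_irq_enabled; /* active by default, as described above */
    bool     isr_registered;
} i2s_model_t;

void model_init(i2s_model_t *m)
{
    assert(m != NULL);
    m->level = 0;
    m->tx_running = false;
    m->spi_slave_irq_enabled = true;
    m->isr_registered = false;
}

void model_disable_spi_slave_irq(i2s_model_t *m)
{
    m->spi_slave_irq_enabled = false;
}

/* Registering while the shared SPI slave interrupt is still enabled is
 * refused in this model; on the real chip this is where the crash appeared. */
bool model_register_isr(i2s_model_t *m)
{
    if (m->spi_slave_irq_enabled)
        return false;
    m->isr_registered = true;
    return true;
}

/* "FIFO empty" handler: shut the transmitter down. */
static void model_fifo_empty_isr(i2s_model_t *m)
{
    m->tx_running = false;
}

/* Fill the FIFO completely first, then start the transmitter.
 * Refuses to touch the FIFO while the transmitter is still running. */
bool model_send(i2s_model_t *m, const uint32_t *words, size_t count)
{
    if (!m->isr_registered || m->tx_running || count == 0 || count > FIFO_WORDS)
        return false;
    for (size_t i = 0; i < count; i++)
        m->fifo[i] = words[i];
    m->level = count;
    m->tx_running = true;
    return true;
}

/* Simulate the hardware fetching one word from the FIFO. */
void model_clock_out_word(i2s_model_t *m)
{
    if (!m->tx_running)
        return;
    m->level--;
    if (m->level == 0)
        model_fifo_empty_isr(m);
}
```

In this model a second `model_send` is rejected until the "FIFO empty" handler has stopped the transmitter, which mirrors the rule of never refilling while it is still running.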
I thought it was important to share my experiences, as this mode is not documented anywhere and apparently used by nobody. I still have lots of questions about the functions of certain registers, but at least it works now.
I found the work on I2S very interesting, because it can be used to generate arbitrary waveforms, almost without limits. The ESP32 has a special module for that, but it can be done just as well using I2S. The only limitation is that you can't choose the output pins: if you use I2S data out, you lose UART 0 RxD (but that one may be reassigned, IIRC).