Exploring the I2S subsystem

1. Motivation

Originally I wanted to port the Arduino project Timer1 Square Wave Generator to the ESP8266. I soon realized that this controller has a completely different architecture and that a one-to-one port is not possible. But my efforts were not in vain, because in the process I discovered the I2S subsystem of the ESP8266. I had heard of this interface for exchanging digital audio data between ICs, but had never looked into it in detail. A closer look promised to be enlightening.

The I2S subsystem is not described in detail in the documentation of the chip manufacturer Espressif. Fortunately, however, there are hobbyists who kindly make their findings or entire projects available to the general public on the Internet.

2. Parts

Wemos D1 R2

3. Wiring

    WEMOS D1 R2            to scope   
     GPIO_15 / D8 o --> I2S_BCK   bit clock
      GPIO_2 / D4 o --> I2S_WS    word select 
      GPIO_3 / RX o --> I2S_DATA  serial data
                  |      _I_				  
      GPIO_0 / D3 o -----o o----/ GND				  

4. I2S Interface

The I2S interface is a 3-wire interface developed by Philips. The original names of the 3 signals are SCK (serial clock), WS (word select) and SD (serial data). Other manufacturers use other designations, e.g. BCLK (bit clock) or SCLK (serial clock) for SCK. WS is also called LRCLK (left/right clock) or frame clock, and SD can also be called SDATA, SDOUT or DACDAT.

Normally 16 bits per sample are transmitted for the right and for the left audio channel. So a sample of a stereophonic audio signal can be packed into a 32-bit double word. The differentiation between right and left channel is done with the help of the word-select signal. From this it follows that the frequency of the WS signal is 1/32 of the bit-clock signal.

I2S Protocol

From the diagram above we see that the WS signal transitions one clock period before the completion of a data word. One word (16 bits) of a data double word (32 bits) represents the right channel and is transmitted on WS = 1 with the MSB first. The other word represents the left channel and is transmitted on WS = 0.
The usual sampling rate for audio data is 44100 Hz, which results in a bit rate of 32 * 44100 = 1'411'200 Hz.
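
The relation between sampling rate and bit rate can be written down in a few lines. This is only a sketch of the arithmetic; the function name bitClockHz is made up for this illustration and is not part of any library.

```cpp
#include <cstdint>

// Bit clock in Hz for a given sampling rate, assuming two channels with
// bitsPerChannel bits each (16 bits per channel -> 32 bit clocks per WS period).
// bitClockHz is a hypothetical helper name used only in this sketch.
constexpr uint32_t bitClockHz(uint32_t sampleRateHz, uint32_t bitsPerChannel = 16)
{
  return sampleRateHz * 2u * bitsPerChannel;
}
```

With the default of 16 bits per channel, bitClockHz(44100) gives 1411200, the value stated above.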

On the ESP8266 the I2S bit clock and word select signals are derived from an internal base frequency I2SBASEFREQ of 160 MHz, which is divided by 32 to get f_baseclk = 5 MHz. Two further divisors div1 and div2 in the range 2..63 then determine the frequency of the word select signal. The internal hardware ensures that the frequency of the bit clock is 32 times the frequency of the word select signal.

Therefore the following formulas apply:
  f_baseclk = I2SBASEFREQ / 32 = 5 MHz

               f_baseclk
  f_ws      = -----------  with div1, div2 = 2..63
              div1 * div2
The difficulty now is to determine the two integer divisors in such a way that the desired sampling rate is approximated as closely as possible.
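
To get a feeling for the problem, the search can be done by brute force, since there are only 62 x 62 combinations. The following is a minimal sketch under the assumptions stated above (f_baseclk = 5 MHz, divisors 2..63). The names Divisors and findBestDivisors are hypothetical; this is not the library's code and not the findDivisors() function from my program.

```cpp
#include <cmath>
#include <cstdint>

// Result of the search (hypothetical helper type for this sketch).
struct Divisors {
  uint8_t div1;
  uint8_t div2;
  double  realRateHz;  // word select frequency actually obtained
};

// Exhaustive search over div1, div2 = 2..63 for the pair whose
// word select frequency f_baseclk / (div1 * div2) is closest to the target.
Divisors findBestDivisors(double targetRateHz, double baseClkHz = 160e6 / 32.0)
{
  Divisors best{2, 2, baseClkHz / 4.0};
  double bestDiff = std::fabs(best.realRateHz - targetRateHz);

  for (int d1 = 2; d1 <= 63; ++d1) {
    for (int d2 = 2; d2 <= 63; ++d2) {
      const double rate = baseClkHz / (d1 * d2);
      const double diff = std::fabs(rate - targetRateHz);
      if (diff < bestDiff) {  // strictly better: keep the first pair found
        bestDiff = diff;
        best = {static_cast<uint8_t>(d1), static_cast<uint8_t>(d2), rate};
      }
    }
  }
  return best;
}
```

For a target of 2500 Hz the search finds a pair with the product 2000 and no error at all. For 44100 Hz the best it can do is a product of 114.
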
Fortunately, the core_esp8266_i2s library provides us with functions that make our lives easier. For the experiments presented here, I use, among others, i2s_set_rate() and i2s_write_sample().

5. Experiments

1 - Setting the sampling rate to 44100 Hz

First I want to set a sampling rate of 44100 Hz and check how accurately the desired frequency is reached with i2s_set_rate(44100).

[Figure: Sampling rate 44100 Hz]

The oscilloscope shows the bit clock on channel 1 and the word select signal (sampling signal) on channel 2.
The exact frequencies would be 44100 Hz and 1411200 Hz, but the oscilloscope and the display on the monitor show that the actual frequency is 43859.648 Hz, which corresponds to a bit rate of 1403509 Hz. The exact divisor would be 5 MHz / 44100 Hz = 113.379. Unfortunately, the nearest integer, 113, is a prime number and therefore cannot be written as a product of two divisors in the range 2..63. The next best number is 114, which allows the divisor pairs (57, 2), (38, 3) and (19, 6).
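
Whether a given total divisor can be realized at all is easy to check by listing its factor pairs. A small sketch, with the hypothetical helper name divisorPairs, assuming both divisors must lie in 2..63:

```cpp
#include <utility>
#include <vector>

// All pairs (div1, div2) with div1 >= div2, both in 2..63, whose product is n.
// divisorPairs is a hypothetical helper for this illustration.
std::vector<std::pair<int, int>> divisorPairs(int n)
{
  std::vector<std::pair<int, int>> pairs;
  for (int d2 = 2; d2 <= 63; ++d2) {
    if (n % d2 != 0) continue;       // d2 does not divide n
    const int d1 = n / d2;
    if (d1 >= d2 && d1 <= 63) {      // skip mirrored pairs and out-of-range values
      pairs.push_back({d1, d2});
    }
  }
  return pairs;
}
```

divisorPairs(113) returns an empty list, while divisorPairs(114) returns the three pairs (57, 2), (38, 3) and (19, 6).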

2 - Setting the sampling rate to 2500 Hz

For the further experiments I set the sampling rate to 2500 Hz with i2s_set_rate(2500).

[Figure: Sampling rate 2500 Hz]

As expected, this frequency can be set precisely. The divisor is 2000, which is achieved with the number pair (50, 40). The bit rate (80 kHz) is 32 times the sampling rate (2.5 kHz).

3 - Identifying the stereo channels

In the further experiments I show the I2S data output (RX) on channel 1 of the oscilloscope.

The function i2s_write_sample() writes a 32 bit double word to the output. As the I2S timing showed above, one word (16 bit) contains the information of the right stereo channel and the other word that of the left channel. The selection is done with the wordselect signal.

I want to find out in which word the bits of the left channel are packed and in which those of the right channel. For this I write the bit pattern 0b10000000000000001010000000000000 to the output. This means, in one channel 1 bit is set (high word) and in the other 2 (low word).

  uint32_t sample = 0b10000000000000001010000000000000;

[Figure: Identifying the stereo channels]

Aha, ws=high selects the channel with 1 bit set, so the high word. This means that the information of the right channel is in the high word and that of the left channel in the low word.
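
With this result, packing and unpacking a stereo sample becomes straightforward. The helper names below (packStereo, rightWord, leftWord, countBits) are my own inventions for this sketch, and the layout (right channel in the high word) is the one observed in this experiment.

```cpp
#include <cstdint>

// Pack one stereo sample: right channel in the high word, left channel in the
// low word, as observed on the scope. All names here are hypothetical helpers.
uint32_t packStereo(int16_t left, int16_t right)
{
  return (static_cast<uint32_t>(static_cast<uint16_t>(right)) << 16)
         | static_cast<uint16_t>(left);
}

uint16_t rightWord(uint32_t sample) { return static_cast<uint16_t>(sample >> 16); }
uint16_t leftWord(uint32_t sample)  { return static_cast<uint16_t>(sample & 0xFFFFu); }

// Number of set bits in a 16-bit word.
int countBits(uint16_t word)
{
  int n = 0;
  for (; word != 0; word >>= 1) {
    n += word & 1u;
  }
  return n;
}
```

Applied to the test pattern 0x8000A000 (the hexadecimal form of the bit pattern above), rightWord() yields 0x8000 with 1 bit set and leftWord() yields 0xA000 with 2 bits set.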

6. User interface

I could not use my preferred Command Line Interface (CLI) here, because RX of the serial port is used for the I2S data output. Therefore I only used a simple push button to scroll through the 10 predefined patterns. One click selects the next pattern, a long click the previous and a double click displays the current pattern.
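
Scrolling through the patterns boils down to stepping an index forwards or backwards. The following sketch assumes that the index wraps around at both ends. The names kPatternCount, nextIndex and prevIndex are hypothetical and only illustrate the idea.

```cpp
#include <cstddef>

constexpr std::size_t kPatternCount = 10;  // 10 predefined patterns, indices 0..9

// Index of the next pattern; wraps from 9 back to 0 (assumed behaviour).
constexpr std::size_t nextIndex(std::size_t current)
{
  return (current + 1) % kPatternCount;
}

// Index of the previous pattern; wraps from 0 back to 9 (assumed behaviour).
constexpr std::size_t prevIndex(std::size_t current)
{
  return (current + kPatternCount - 1) % kPatternCount;
}
```

Adding kPatternCount before subtracting avoids an underflow of the unsigned index when stepping back from 0.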


The remaining samples 1..9 are shown here without further comment.

[Figures: Sample 1 to Sample 9]

7. Program Code

The functionality for the push button is implemented in the ButtonDebounced class. It debounces the push button and implements the onClick(), onLongClick() and onDoubleClick() methods. The main loop then essentially just writes the sample to the I2S output and calls button.loop().

  ButtonDebounced button(pinButton, nextSample, prevSample, showCurrent);
  bool done = false;   // set to true once the divisors have been determined

  void loop()
  {
    // delay(10000); return; // activate this line to investigate only the set frequencies
    if (!done)   // run the divisor search only once
    {
      findDivisors(div1xdiv2, div1, div2, zbest, diffbest);
      done = true;
    }

    i2s_write_sample(sample[currentSample].value); // write a double word (32 bits)
    button.loop();                                 // poll the push button
  }
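
The ButtonDebounced class itself is not listed here. To illustrate just the debouncing idea, here is a minimal, hardware-independent sketch. It is not the class from the download: the name Debouncer, the 20 ms default and the interface are assumptions made for this example. The raw pin level and the current time (e.g. from millis()) are passed in from outside.

```cpp
#include <cstdint>

// Minimal debouncer (hypothetical, for illustration only): a new level is
// accepted only after the raw input has been stable for debounceMs milliseconds.
class Debouncer {
public:
  explicit Debouncer(uint32_t debounceMs = 20, bool initialLevel = true)
    : debounceMs_(debounceMs), stable_(initialLevel), lastRaw_(initialLevel) {}

  // Feed the raw pin level and the current time in ms.
  // Returns true exactly once when the debounced level changes.
  bool update(bool raw, uint32_t nowMs)
  {
    if (raw != lastRaw_) {       // input moved: restart the stability timer
      lastRaw_ = raw;
      lastChangeMs_ = nowMs;
    }
    if (raw != stable_ && nowMs - lastChangeMs_ >= debounceMs_) {
      stable_ = raw;             // level has been stable long enough
      return true;
    }
    return false;
  }

  bool level() const { return stable_; }

private:
  uint32_t debounceMs_;
  bool     stable_;              // debounced level
  bool     lastRaw_;             // last raw level seen
  uint32_t lastChangeMs_ = 0;    // time of the last raw change
};
```

Since the push button connects GPIO_0 to GND, the released state is assumed to be high here (initialLevel = true), and a debounced transition to low would count as a press.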

My programming environment is not the native Arduino™ IDE but PlatformIO™ on top of Microsoft's Visual Studio Code™. This combination offers many advantages and allows a much better structuring of the code into several modules, especially when we adopt an object-oriented approach.

Interested? Please download the entire program code. The zip-file contains the complete PlatformIO project.