System Architecture
Timbre’s architecture separates control from sound. The ESP32 handles everything digital — MIDI parsing, voice allocation, envelope computation, modulation routing — and pushes parameter updates to each voice chip over I2C. The CY8C29466 chips handle everything analog — oscillation, filtering, amplification, waveshaping — in their reconfigurable switched-capacitor fabric.
This separation means the analog path never touches a DAC or ADC during normal operation. The digital controller sets what the analog fabric does; the fabric then does it entirely in the analog domain. Switched-capacitor stages sample the signal in time, but they never quantize it.
Block Diagram
graph TD
MIDI["USB / DIN MIDI Input"] --> ESP32["ESP32-S3 Master Controller"]
ESP32 --> |"I2C Bus"| MUX["TCA9548A I2C Mux<br/>(if >8 voices)"]
MUX --> V1["CY8C29466<br/>Voice 01"]
MUX --> V2["CY8C29466<br/>Voice 02"]
MUX --> VN["CY8C29466<br/>Voice N"]
V1 --> OUT["Audio Output Stage"]
V2 --> OUT
VN --> OUT
ESP32 --> |"Envelope, LFO,<br/>Mod CV values"| MUX
style ESP32 fill:#134e4a,stroke:#0d9488,color:#ccfbf1
style V1 fill:#1a3a37,stroke:#0d9488,color:#ccfbf1
style V2 fill:#1a3a37,stroke:#0d9488,color:#ccfbf1
style VN fill:#1a3a37,stroke:#0d9488,color:#ccfbf1
ESP32 Controller
The ESP32-S3 serves as the master controller. It handles all non-analog responsibilities:
| Function | Detail |
|---|---|
| MIDI input | USB MIDI (TinyUSB native) and DIN MIDI (UART) |
| MPE voice allocation | Assigns incoming notes to free voice slots; supports round-robin, lowest-available, and voice-stealing strategies |
| Envelope generation | ADSR computed in firmware at ~1 kHz update rate, CV values pushed to voice chips over I2C |
| Modulation routing | LFO, velocity, aftertouch, MPE slide/pressure mapped to filter cutoff, VCA gain, oscillation mode |
| I2C master | Issues configuration and CV update transactions to each voice chip |
| Preset storage | NVS or SD card for patch memory |
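As a rough illustration of the envelope generation row above, the sketch below evaluates an ADSR at 1 ms ticks and scales each level to a CV byte. The function names, the linear segment shapes, and the 8-bit CV range are assumptions made for this example; the table does not specify Timbre's envelope curve or CV resolution.

```python
def adsr_level(t_ms, gate_ms, attack_ms, decay_ms, sustain, release_ms):
    """Linear ADSR level (0.0 to 1.0) at time t_ms for a gate held gate_ms.

    Assumes attack_ms, decay_ms and release_ms are all greater than zero.
    """
    def held(t):
        # Level while the gate is still high.
        if t < attack_ms:
            return t / attack_ms
        if t < attack_ms + decay_ms:
            return 1.0 - (1.0 - sustain) * (t - attack_ms) / decay_ms
        return sustain

    if t_ms < gate_ms:
        return held(t_ms)
    # Release ramps down from whatever level was reached at gate-off.
    start = held(gate_ms)
    return max(0.0, start * (1.0 - (t_ms - gate_ms) / release_ms))


def cv_byte(level):
    """Scale a 0.0 to 1.0 level to an assumed 8-bit CV value."""
    return int(level * 255 + 0.5)


# One value per 1 ms tick, matching the ~1 kHz update rate in the table.
envelope = [
    cv_byte(adsr_level(t, gate_ms=100, attack_ms=10, decay_ms=20,
                       sustain=0.5, release_ms=50))
    for t in range(160)
]
```

In firmware, each tick's value would be one of the CV bytes pushed to a voice chip over I2C.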
The ESP32-S3 was chosen specifically for its native USB support (eliminating the need for a separate USB-MIDI bridge) and its two cores — one for MIDI processing, one for I2C bus management.
Voice Chips
Each CY8C29466 is an I2C slave running firmware that translates parameter updates into analog block reconfiguration. See Voice Engine for details on what happens inside each chip.
Each voice chip’s firmware sets its own I2C address at startup from an address jumper or configuration byte, so all chips can run an identical firmware image.
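To make the idea of a parameter update concrete, here is a minimal sketch of how one update could be packed into an 8-byte frame on the controller and decoded on the voice chip. The field order, field widths, and the reserved byte are hypothetical; the only detail taken from this page is that an update carries cutoff, resonance, VCA gain, and oscillation mode in roughly 8 bytes.

```python
import struct

# Assumed layout (little-endian): cutoff u16, resonance u16, VCA gain u16,
# oscillation mode u8, reserved u8. This is illustrative, not Timbre's format.
FRAME = struct.Struct("<HHHBB")


def pack_voice_update(cutoff, resonance, vca_gain, osc_mode):
    """Build the 8-byte payload the I2C master would write to a voice chip."""
    return FRAME.pack(cutoff, resonance, vca_gain, osc_mode, 0)


def unpack_voice_update(frame):
    """Decode a payload back into its four parameters (voice-chip side)."""
    cutoff, resonance, vca_gain, osc_mode, _reserved = FRAME.unpack(frame)
    return cutoff, resonance, vca_gain, osc_mode


frame = pack_voice_update(cutoff=0x1234, resonance=512, vca_gain=65535, osc_mode=2)
```

A fixed-size frame keeps the slave-side parser trivial, which matters on a small microcontroller, at the cost of resending unchanged parameters.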
The Voice Card Tradition
Timbre’s one-chip-per-voice architecture has deep roots in the polysynth tradition. The Sequential Prophet-5 Rev 3 (1980) used approximately five to seven dedicated ICs per voice: two CEM3340 VCOs, one CEM3320 VCF, two CEM3310 envelope generators, plus VCA circuitry. The Oberheim OB-Xa (1981) went further with two CEM3320 filter chips per voice, switchable between 12 dB and 24 dB slopes. The Roland Jupiter-8 (1981) used discrete VCOs paired with proprietary IR3109 quad-OTA filter ICs. The Moog Memorymoog (1982) was the most extravagant, deploying three CEM3340 VCOs per voice plus Moog’s discrete transistor ladder filter, and was notoriously unreliable due to the sheer component count and the thermal sensitivity of all those matched transistor pairs.
Voice cards were genuinely modular in several of these designs. The Prophet-5 became the Prophet-10 by doubling the voice boards. The OB-Xa shipped in four, six, or eight-voice configurations, with voices added by plugging in additional cards. The modern Sequential Rev2 scales from eight to sixteen voices via an add-on board. This scalable voice-card concept — polyphony as a function of how many identical cards you populate — is a direct ancestor of Timbre’s approach of adding PSoC chips to the I2C bus.
The Oberheim Xpander (1984) stands alone as the only production instrument that allowed completely different patches — including different filter topologies — on each voice simultaneously. Its six voices, each built around a CEM3374 dual VCO and CEM3372 filter/mixer/VCA, could run six entirely independent patches with different filter modes selected from fifteen options, different oscillator settings, different modulation routings, and separate audio outputs. The Matrix-12 (1985) doubled this to twelve voices. No modern production instrument has replicated this per-voice topology independence — the Moog One is tri-timbral but voices within each timbre share a patch; the Novation Summit is bi-timbral with shared patches per part.
Timbre extends beyond the Xpander paradigm in three ways. First, the Xpander’s fifteen filter modes were predefined combinations of pole-mixing coefficients: a sophisticated but finite menu. Timbre’s twelve reconfigurable analog blocks per voice (a mix of continuous-time and switched-capacitor blocks) can be rewired into arbitrary topologies, not merely selected from a fixed set. Second, the Xpander’s filter mode was a static patch parameter; Timbre can change topology dynamically mid-note in under twelve microseconds. Third, Timbre targets sixteen voices to the Xpander’s six, with each voice carrying the same independent-topology capability. The Lineage page traces this evolution in fuller detail.
I2C Bus
The I2C bus operates at 400 kHz (Fast Mode). At this speed, a full parameter update to one voice (cutoff frequency, resonance, VCA gain, and oscillation mode, roughly 8 bytes) takes approximately 200 μs. Updating all 16 voices for a full polyphonic chord therefore takes about 3.2 ms, short enough for the changes to be perceived as simultaneous.
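The timing figures above can be checked with a short calculation. This is an estimate under stated assumptions: each byte on the wire costs 9 clock periods (8 data bits plus ACK), START and STOP are counted as one period each, and clock stretching and gaps between transactions are ignored. The function name is hypothetical.

```python
def i2c_write_time_us(payload_bytes, clock_hz=400_000):
    """Estimate one I2C write: START + address byte + payload + STOP."""
    bits = 2 + 9 * (1 + payload_bytes)  # START/STOP plus 9 bits per byte
    return bits * 1_000_000 / clock_hz


per_voice_us = i2c_write_time_us(8)        # one 8-byte parameter update
all_voices_ms = 16 * per_voice_us / 1000   # all 16 voices, back to back

# How many full 8-byte updates fit in one 1 ms envelope tick on a single bus.
updates_per_tick = int(1000 // per_voice_us)
```

Under these assumptions one update comes to 207.5 μs and sixteen to about 3.3 ms, consistent with the rounded figures in the text. The same arithmetic gives four full updates per 1 ms tick on one bus, so a firmware running envelopes at ~1 kHz would presumably send shorter per-parameter writes or stagger voices across ticks.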
For systems beyond 8 voices, a TCA9548A I2C multiplexer splits the bus into segments, reducing capacitive loading and allowing up to 64 addressable voices across 8 bus segments.
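One way to picture the segmented bus is as a mapping from a global voice index to a mux channel plus a slave address on that segment. The sketch below assumes 8 voices per segment (implied by 64 voices across 8 segments) and a hypothetical base address of 0x10 for the voice chips. The TCA9548A detail it relies on is that the control register is a single byte in which bit n enables channel n.

```python
TCA9548A_ADDR = 0x70      # mux address with A0-A2 tied low
VOICES_PER_SEGMENT = 8    # implied by 64 voices across 8 segments
VOICE_BASE_ADDR = 0x10    # hypothetical base address for voice chips


def route_voice(voice_index):
    """Return (mux control byte, voice chip I2C address) for a voice index."""
    if not 0 <= voice_index < 8 * VOICES_PER_SEGMENT:
        raise ValueError("voice index outside the 64-voice address space")
    channel, slot = divmod(voice_index, VOICES_PER_SEGMENT)
    # Bit n of the control byte enables downstream channel n.
    return 1 << channel, VOICE_BASE_ADDR + slot


control_byte, address = route_voice(9)  # second voice on the second segment
```

Because the same slave addresses repeat on every segment, the controller has to write the control byte to the mux before addressing a voice on a different segment, which adds one short transaction per segment switch.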
Scalability
Voice count is a firmware pool size. Adding CY8C29466 chips to the I2C bus and registering their addresses with the ESP32 allocator increases available voices without architectural changes. The system was designed so that expansion is purely additive — no existing hardware or firmware changes are required.
| Configuration | Voices | I2C Segments | Notes |
|---|---|---|---|
| Minimal | 1 | Direct | Phase 1 PoC |
| Prototype | 4 | Direct | Phase 2 |
| Full | 16 | 2 segments | Phase 3 target |
| Extended | 32+ | 4+ segments | Expansion headers |
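The "voice count is a pool size" idea can be sketched as an allocator whose capacity is simply the number of registered chips. The class and method names here are hypothetical, and the strategy shown (lowest-available voice first, stealing the oldest note when the pool is full) is only one of the strategies the controller is described as supporting.

```python
class VoicePool:
    """Hypothetical voice pool: lowest-available allocation, oldest-note stealing."""

    def __init__(self):
        self.addresses = []  # I2C address of each registered voice chip
        self.notes = []      # MIDI note held by each voice, None when free
        self.started = []    # allocation counter when each voice last started
        self.clock = 0

    def register(self, i2c_address):
        """Grow the pool by one voice chip and return its voice index."""
        self.addresses.append(i2c_address)
        self.notes.append(None)
        self.started.append(0)
        return len(self.addresses) - 1

    def note_on(self, note):
        """Assign a note to a voice and return that voice's I2C address."""
        if not self.addresses:
            raise RuntimeError("no voice chips registered")
        self.clock += 1
        free = [i for i, held in enumerate(self.notes) if held is None]
        # Prefer the lowest free voice; otherwise steal the oldest note.
        if free:
            voice = free[0]
        else:
            voice = min(range(len(self.notes)), key=self.started.__getitem__)
        self.notes[voice] = note
        self.started[voice] = self.clock
        return self.addresses[voice]

    def note_off(self, note):
        """Free every voice currently holding this note."""
        for i, held in enumerate(self.notes):
            if held == note:
                self.notes[i] = None


pool = VoicePool()
for addr in (0x10, 0x11):
    pool.register(addr)
```

Registering a third address later makes a third voice available on the next `note_on` call without touching the allocation logic, which is the additive behaviour this section describes. A build that uses the multiplexer would likely store the bus segment alongside each address.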