How to speed things up (ST7789V2 + ESP32-C6-DevKitC-1) #142

@akx

Heya,

I'm wondering how to speed things up for an application that will likely need full-screen updates most of the time.

I have an ESP32-C6-DevKitC-1 and a WaveShare 280x240 1.69" LCD module, using esp-idf-svc (at present anyway).

My current experiment code (please excuse the mess, it's an experiment so far) is

use std::collections::VecDeque;
use std::time::Instant;
use esp_idf_svc::hal::gpio::PinDriver;
use esp_idf_svc::hal::prelude::*;
use crate::led::WS2812RMT;

use embedded_graphics::{
    prelude::*,
    primitives::Rectangle,
};
use embedded_hal::spi::MODE_3;
use esp_idf_svc::hal::delay::Ets;
use esp_idf_svc::hal::spi::config::{BitOrder, Config, Duplex};
use esp_idf_svc::hal::spi::{SpiDeviceDriver, SpiDriverConfig};
use display_interface_spi::SPIInterface;
use embedded_graphics::pixelcolor::{Rgb565, Rgb888};
use mipidsi::models::ST7789;
use mipidsi::options::{ColorInversion, Orientation, RefreshOrder, Rotation};
use rand::{Rng, SeedableRng};
use rand::rngs::SmallRng;
use crate::ticktock::Ticktock;

mod led;
mod ticktock;

fn main() -> anyhow::Result<()> {
    esp_idf_svc::sys::link_patches();
    esp_idf_svc::log::EspLogger::initialize_default();
    log::info!("Hello, world!");
    let peripherals = Peripherals::take()?;
    log::info!("Initializing LED");
    let mut led = WS2812RMT::new(peripherals.pins.gpio8, peripherals.rmt.channel0)?;
    led.set_pixel(50, 50, 0)?;
    let mut rng = SmallRng::from_entropy();
    log::info!("Initializing pins");
    let spi = peripherals.spi2;
    let mut backlight = PinDriver::output(peripherals.pins.gpio23)?;
    let rst = PinDriver::output(peripherals.pins.gpio22)?;
    let dc = PinDriver::output(peripherals.pins.gpio21)?; // DC (data/command)
    let sclk = peripherals.pins.gpio19; // CLK
    let sda = peripherals.pins.gpio18; // DIN
    let sdi = peripherals.pins.gpio15; // Tied low
    let cs = peripherals.pins.gpio20; // CS
    log::info!("Initializing SPI device");
    let mut delay = Ets;
    let config = Config::new()
        .bit_order(BitOrder::MsbFirst)
        .data_mode(MODE_3)
        .baudrate(80.MHz().into());

    let device = SpiDeviceDriver::new_single(
        spi,
        sclk,
        sda,
        Some(sdi),
        Some(cs),
        &SpiDriverConfig::new(),
        &config,
    )?;
    log::info!("Initializing display_interface_spi");
    let di = SPIInterface::new(device, dc);
    log::info!("Initializing display");
    let mut display = mipidsi::Builder::new(ST7789, di)
        .reset_pin(rst)
        .display_size(240, 320)
        .orientation(Orientation::new().rotate(Rotation::Deg90))
        .invert_colors(ColorInversion::Inverted)
        .init(&mut delay).unwrap();
    log::info!("Setting BL");
    backlight.set_high()?;

    let mut sw = Ticktock::new();
    let mut buf: Vec<Rgb565> = Vec::with_capacity(320*240);
    let alloc_time = sw.tick();
    for i in 0..240 {
        for j in 0..320u16 {
            let val = i ^ (j as u8);
            let px = Rgb888::new(val, val, val);
            buf.push(Rgb565::from(px));
        }
    }
    let initial_fill_time = sw.tick();
    log::info!("Allocated buffer in {alloc_time}us, filled buffer in {initial_fill_time}us",
        alloc_time = alloc_time,
        initial_fill_time = initial_fill_time,
    );

    let mut frame_times: VecDeque<u32> = VecDeque::with_capacity(20);
    let fullscreen = Rectangle::new(Point::new(0,0), Size::new(320,240));
    let mut last_fps_print_time = Instant::now();
    for _i in 1..10000 {
        let mut sw = Ticktock::new();
        for _ in 1..100 {
            let inoff = rng.gen_range(0..buf.len());
            let outoff = rng.gen_range(0..buf.len());
            buf.swap(inoff, outoff);
        }
        let render_time = sw.tick();
        let clon = buf.clone();
        let clone_time = sw.tick();
        display.fill_contiguous(&fullscreen, clon).unwrap();
        let blit_time = sw.tick();
        let frame_time = sw.total();
        frame_times.push_back((frame_time / 1000) as u32);
        if frame_times.len() > 20 {
            frame_times.pop_front();
        }
        if last_fps_print_time.elapsed().as_millis() > 500 {
            led.set_pixel(0, 50, 0)?;
            let fps = 1000.0 / (frame_times.iter().sum::<u32>() as f32 / (frame_times.len() as f32));
            log::info!(
                "FPS: {fps:.2} / render={render_time}us clone={clone_time}us blit={blit_time}us frame={frame_time}us",
                fps = fps,
                render_time = render_time,
                clone_time = clone_time,
                blit_time = blit_time,
                frame_time = frame_time,
            );
            last_fps_print_time = Instant::now();
            led.set_pixel(0, 0, 50)?;
        }
    }
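    // The loop above is bounded, so return normally once it finishes
    // instead of panicking via unreachable!().
    Ok(())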
}

and the interesting (performance) output for a --release build is:

I (742) telsu: Allocated buffer in 35us, filled buffer in 25514us
I (1292) telsu: FPS: 9.26 / render=120us clone=649us blit=107889us frame=108662us
I (1832) telsu: FPS: 9.26 / render=122us clone=649us blit=107897us frame=108672us

IOW, about 99.3% of the frame time is spent in fill_contiguous. Setting the SPI baudrate to something lower than 80 MHz (which, AIUI, is already pushing it, especially given my display is behind 10-centimeter DuPont wires 🤠) doesn't change things a lot; blit time becomes about 123547us.
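
For a rough sense of scale, here is a back-of-the-envelope sketch of how long the raw pixel data would take on the wire. It assumes the bus really runs at the configured 80 MHz, 16 bits per pixel, a single data line, and no command or chip-select overhead; the function names are made up for illustration. Under those assumptions a 320x240 frame is 1,228,800 bits, or about 15.4 ms of wire time, so the measured blit is roughly 7x longer than the clock alone would explain.

```rust
/// Number of bits in one full frame.
fn frame_bits(width: u64, height: u64, bits_per_pixel: u64) -> u64 {
    width * height * bits_per_pixel
}

/// Lower bound, in microseconds, for clocking `bits` out over a
/// single-data-line SPI bus at `clock_hz` (ignores all protocol overhead).
fn min_transfer_us(bits: u64, clock_hz: u64) -> f64 {
    bits as f64 / clock_hz as f64 * 1_000_000.0
}

fn main() {
    // Buffer size used in the experiment above: 320x240 Rgb565 pixels.
    let bits = frame_bits(320, 240, 16);
    // Assumption: the SPI clock is actually 80 MHz as configured.
    let wire_us = min_transfer_us(bits, 80_000_000);
    // Blit time taken from the log output above.
    let measured_us = 107_889.0;

    println!("bits per frame: {bits}");
    println!("wire time at 80 MHz: {wire_us:.0} us");
    println!("measured / wire: {:.1}x", measured_us / wire_us);
}
```

That gap, together with the blit time barely moving at a lower baudrate, suggests the time is probably going somewhere other than the SPI clock itself.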

Is there something obvious I'm doing wrong for an application like this, where I basically just have a buffer of Rgb565 to push to the screen?

And of course thank you for the work you've put into the library and the ecosystem at large! I was surprised to see things working at all (after, of course, having heeded the big red instructions on WaveShare's wiki and powered the display from 3V3 and not 5V...).
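
For anyone wanting to reproduce the timings: the `ticktock` module is not shown above. A minimal stand-in could look like the sketch below. It is hypothetical and only inferred from how `Ticktock` is used in the code, assuming `tick()` returns the microseconds since the previous tick and `total()` the microseconds since construction.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the `ticktock` module (not the original code).
pub struct Ticktock {
    start: Instant,
    last: Instant,
}

impl Ticktock {
    /// Start the stopwatch.
    pub fn new() -> Self {
        let now = Instant::now();
        Self { start: now, last: now }
    }

    /// Microseconds elapsed since the previous `tick()` (or since `new()`).
    pub fn tick(&mut self) -> u64 {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_micros() as u64;
        self.last = now;
        elapsed
    }

    /// Microseconds elapsed since `new()`.
    pub fn total(&self) -> u64 {
        self.start.elapsed().as_micros() as u64
    }
}

fn main() {
    let mut sw = Ticktock::new();
    std::thread::sleep(std::time::Duration::from_millis(5));
    let first = sw.tick();
    let second = sw.tick();
    println!("first={first}us second={second}us total={}us", sw.total());
}
```

With this shape, `render_time`, `clone_time` and `blit_time` are per-phase durations, and `frame_time` covers the whole iteration.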
