Transcription Segment does not have first and last received time #1319

@hahooh

I am using the Go SDK and I want to know a transcript's start and end time from ParticipantCallback.OnTranscriptionReceived. However, StartTime and EndTime are not supported yet.

So I would like FirstReceivedTime and LastReceivedTime to be set and returned by func ExtractTranscriptionSegments(transcription *livekit.Transcription) []*TranscriptionSegment.

I think this would be useful as a stand-in for the transcript's start and end time, since the first and last received times should be close enough to them.

So here is my suggestion.

Add first_received_time and last_received_time to the TranscriptionSegment message, as below.

// protocol/protobufs/livekit_models.proto

// ... existing messages ...

message TranscriptionSegment {
  string id = 1;
  string text = 2;
  uint64 start_time = 3;
  uint64 end_time = 4;
  bool final = 5;
  string language = 6;
  int64 first_received_time = 7;
  int64 last_received_time = 8;
}

// ... existing messages ...

Then handle them in the Go SDK (server-sdk-go), as below.

Add the properties below

	segmentTimestamps map[string]*segmentTimingInfo
	segmentMu         sync.RWMutex

to RTCEngine in https://github.com/livekit/server-sdk-go/blob/main/engine.go

Define a new type for the segment timestamps, as below.

type segmentTimingInfo struct {
	firstReceivedTime int64
	lastReceivedTime  int64
}

Then set firstReceivedTime and lastReceivedTime before calling OnTranscription in func (e *RTCEngine) handleDataPacket(msg webrtc.DataChannelMessage), as below.

// ... existing code ...
	case *livekit.DataPacket_Transcription:
		now := time.Now().UnixMilli()
		e.segmentMu.Lock()
		for _, segment := range msg.Transcription.Segments {
			if timing, exists := e.segmentTimestamps[segment.Id]; exists {
				// Segment already seen - update last received time
				segment.FirstReceivedTime = timing.firstReceivedTime
				segment.LastReceivedTime = now
				timing.lastReceivedTime = now
			} else {
				// New segment - set both times to now
				segment.FirstReceivedTime = now
				segment.LastReceivedTime = now
				e.segmentTimestamps[segment.Id] = &segmentTimingInfo{
					firstReceivedTime: now,
					lastReceivedTime:  now,
				}
			}

			if segment.Final {
				delete(e.segmentTimestamps, segment.Id)
			}
		}
		e.segmentMu.Unlock()
		e.engineHandler.OnTranscription(msg.Transcription)
	case *livekit.DataPacket_RpcRequest:
// ... existing code ...
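
To make the bookkeeping easier to test in isolation, the same logic could be pulled into a small helper. This is only a rough sketch: segmentTracker, newSegmentTracker and observe are hypothetical names that do not exist in server-sdk-go, and the current time is passed in as a parameter so the behaviour is deterministic. Note that the map has to be initialized before the first write, which the snippet above assumes happens wherever RTCEngine is constructed.

```go
package main

import (
	"fmt"
	"sync"
)

// segmentTimingInfo mirrors the struct proposed in this issue.
type segmentTimingInfo struct {
	firstReceivedTime int64
	lastReceivedTime  int64
}

// segmentTracker is a hypothetical helper (not part of server-sdk-go)
// that remembers when each non-final segment was first and last seen.
type segmentTracker struct {
	mu         sync.Mutex
	timestamps map[string]*segmentTimingInfo
}

// newSegmentTracker initializes the map so that writes do not panic.
func newSegmentTracker() *segmentTracker {
	return &segmentTracker{timestamps: make(map[string]*segmentTimingInfo)}
}

// observe records that the segment with the given id was received at
// now (Unix milliseconds) and returns its first and last received times.
// Once a segment is final, its entry is dropped so the map does not grow.
func (t *segmentTracker) observe(id string, final bool, now int64) (first, last int64) {
	t.mu.Lock()
	defer t.mu.Unlock()

	timing, ok := t.timestamps[id]
	if !ok {
		// New segment: the first received time is now.
		timing = &segmentTimingInfo{firstReceivedTime: now}
		t.timestamps[id] = timing
	}
	timing.lastReceivedTime = now

	if final {
		delete(t.timestamps, id)
	}
	return timing.firstReceivedTime, timing.lastReceivedTime
}

func main() {
	tracker := newSegmentTracker()

	first, last := tracker.observe("seg-1", false, 1000)
	fmt.Println(first, last) // 1000 1000

	first, last = tracker.observe("seg-1", true, 1500)
	fmt.Println(first, last) // 1000 1500

	fmt.Println(len(tracker.timestamps)) // 0, the final segment was removed
}
```

With something like this, handleDataPacket would only need to call observe for each segment and copy the two returned values onto the segment.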

Finally, expose the first and last received times through func ExtractTranscriptionSegments(transcription *livekit.Transcription) []*TranscriptionSegment by adding the two properties below to TranscriptionSegment and reading FirstReceivedTime and LastReceivedTime from the protobuf, as below.

type TranscriptionSegment struct {
	ID                string
	Text              string
	Language          string
	StartTime         uint64
	EndTime           uint64
	Final             bool
	FirstReceivedTime int64
	LastReceivedTime  int64
}

func ExtractTranscriptionSegments(transcription *livekit.Transcription) []*TranscriptionSegment {
	var segments []*TranscriptionSegment
	if transcription == nil {
		return segments
	}
	segments = make([]*TranscriptionSegment, len(transcription.Segments))
	for i := range transcription.Segments {
		segments[i] = &TranscriptionSegment{
			ID:                transcription.Segments[i].Id,
			Text:              transcription.Segments[i].Text,
			Language:          transcription.Segments[i].Language,
			StartTime:         transcription.Segments[i].StartTime,
			EndTime:           transcription.Segments[i].EndTime,
			Final:             transcription.Segments[i].Final,
			FirstReceivedTime: transcription.Segments[i].FirstReceivedTime,
			LastReceivedTime:  transcription.Segments[i].LastReceivedTime,
		}
	}
	return segments
}

What do you think?
