Missing `plain_text` attribute in hf dataset

Hello, thank you for sharing this great work!
I've been trying to run the pipeline using the command given in `README.md`: 
```bash
python -m zest.run_pipeline \
  --output_file ./outputs.jsonl \
  --language en \
  --examples_per_language 5 \
  --task full \
  --engine gpt-4.1-mini \
  --data_split dev
```

And I get the following error: 
```text
  File "/path/to/Lemonade/zest/run_pipeline.py", line 286, in main
    await runner.run_full_pipeline(
  File "/path/to/Lemonade/zest/run_pipeline.py", line 114, in run_full_pipeline
    examples, schema = await self.load_data(
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/Lemonade/zest/run_pipeline.py", line 44, in load_data
    examples = load_dataset(
               ^^^^^^^^^^^^^
  File "/path/to/Lemonade/event_dataset/example.py", line 74, in load_dataset
    article=row["plain_text"],
            ~~~^^^^^^^^^^^^^^
KeyError: 'plain_text'
```

It seems that the dataset on Hugging Face lacks the `plain_text` column that `event_dataset/example.py` expects:

<img width="1901" height="268" alt="Image" src="https://github.com/user-attachments/assets/67b64626-1065-4ef0-9220-d698ce0cee50" />

Could you please update the dataset (or corresponding code) to include `plain_text`?
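
In case a code-side fix is easier, here is a minimal sketch of a more defensive column lookup. It assumes each `row` is dict-like, as the traceback suggests. The helper name `get_article_text` and the `candidates` parameter are hypothetical and not part of the repository, and I don't know which column actually holds the article text, so the fallback name would need to be filled in by someone who knows the dataset schema.

```python
from typing import Any, Mapping, Sequence


def get_article_text(
    row: Mapping[str, Any],
    candidates: Sequence[str] = ("plain_text",),
) -> str:
    """Return the value of the first candidate column present in `row`.

    Hypothetical helper: raises a KeyError that lists the columns the row
    actually has, so a schema mismatch is easier to diagnose.
    """
    for key in candidates:
        if key in row:
            return row[key]
    raise KeyError(
        f"None of the columns {list(candidates)} found; "
        f"available columns: {sorted(row)}"
    )
```

With something like this, `article=row["plain_text"]` could become `article=get_article_text(row)`, and the error would at least show which columns the published dataset provides.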

This would be very helpful for reproducing and extending your results.
Thanks in advance! 🙏
