This is the official repository for the paper: "ASTRA: Adaptive Structured Tree Reasoning Architecture for Complex Table Question Answering", accepted at ACL 2026.
ASTRA converts complex tables into hierarchical tree structures and performs question answering over the tree instead of directly over a flat 2D table. The repository includes the main pipeline, baselines, evaluation tools, and an interactive demo.
Pipeline figures: Phase 1, Table-to-Tree Construction; Phase 2, Tree-based QA.
Key ideas:
- Convert flat tables into structured tree representations while preserving header and row hierarchies.
- Navigate relevant tree paths before answer generation.
- Support both rule-based and LLM-based tree construction.
- Add optional symbolic reasoning for numerical questions.
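To make the first two ideas concrete, here is a minimal sketch of nesting a flat table under its header hierarchy and then walking a path to a cell. It is an illustration only, not the conversion implemented in table2tree.py: the names `build_tree` and `lookup`, the cell format, and the toy revenue figures are all assumptions made up for this example.

```python
from typing import Any, Dict, List, Tuple

Tree = Dict[str, Any]
# Hypothetical cell format: (row-header path, column-header path, value).
Cell = Tuple[Tuple[str, ...], Tuple[str, ...], Any]


def build_tree(cells: List[Cell]) -> Tree:
    """Nest each cell under its row-header path, then its column-header path."""
    tree: Tree = {}
    for row_path, col_path, value in cells:
        path = row_path + col_path
        node = tree
        # Create (or reuse) one nested dict per header level.
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return tree


def lookup(tree: Tree, path: Tuple[str, ...]) -> Any:
    """Follow a header path from the root down to a single cell."""
    node: Any = tree
    for key in path:
        node = node[key]
    return node


# Toy data invented for illustration, not taken from any ASTRA dataset.
cells = [
    (("Asia", "Japan"), ("Revenue", "2023"), 120),
    (("Asia", "Japan"), ("Revenue", "2024"), 135),
    (("Asia", "India"), ("Revenue", "2023"), 80),
    (("Asia", "India"), ("Revenue", "2024"), 95),
]
tree = build_tree(cells)

# Navigate two paths first, then do the arithmetic symbolically.
growth = (
    lookup(tree, ("Asia", "Japan", "Revenue", "2024"))
    - lookup(tree, ("Asia", "Japan", "Revenue", "2023"))
)
```

Resolving the two paths before computing the difference mirrors the "navigate, then answer" ordering described above; a real pipeline would select those paths from the question rather than hard-code them.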
table2tree/
├── astra_config.py # Shared environment, dataset, and path helpers
├── model_clients.py # Shared OpenAI-compatible and local model clients
├── tableqa.py # Main batch pipeline
├── table2tree.py # Table-to-tree conversion
├── treeqa.py # Tree-based QA and symbolic reasoning
├── evaluate.py # Prediction evaluation
├── llm_select.py # Answer selector utility
├── demo/ # FastAPI + React demo
├── baseline/ # Direct and tree-direct baselines
├── quality_evaluate/ # Tree quality evaluation
├── Batch_evaluate/ # Multi-rollout stability evaluation
├── fig/ # README figures
└── requirements.txt
The codebase is now organized around three shared layers:
- astra_config.py: centralizes environment loading and common paths.
- model_clients.py: centralizes model endpoint selection and client behavior.
- Task modules (tableqa.py, treeqa.py, demo/, baseline/): focus on pipeline logic instead of local machine configuration.
conda create -n astra python=3.10
conda activate astra
pip install -r requirements.txt
Optional demo dependencies are listed in demo/requirements.txt.
Copy the template and fill only the keys you need:
cp .env.example .env
Most users only need:
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_BASE_URL=https://api.openai.com/v1
Optional variables:
- VOLCES_API_KEY, DEEPSEEK_API_KEY, ALIYUN_API_KEY: provider-specific aliases already used in the codebase.
- ASTRA_DATASET_DIR: dataset root. If unset, the code looks for ../dataset.
- ASTRA_EMBEDDING_MODEL_PATH: local embedding model path for retrieval.
- ASTRA_LOCAL_MODEL_BASE_URL: local generation server for open-source models.
- ASTRA_MODEL_PATH or ASTRA_MODEL_PATH_<MODEL_NAME>: local checkpoint path for model_deploy.py.
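The last pair of variables suggests a per-model override with a general fallback. The sketch below shows one way such a lookup could work. It is not the logic in model_deploy.py: the function name `resolve_model_path`, the rule for turning a model name into the `<MODEL_NAME>` suffix, and the precedence of the specific variable over the general one are all assumptions.

```python
import re
from typing import Mapping, Optional


def resolve_model_path(model_name: str, env: Mapping[str, str]) -> Optional[str]:
    """Return a checkpoint path for model_name, or None if nothing is configured.

    Assumed suffix rule: uppercase the name and replace each run of
    non-alphanumeric characters with one underscore.
    """
    suffix = re.sub(r"[^A-Za-z0-9]+", "_", model_name).strip("_").upper()
    # Assumed precedence: the model-specific variable wins over the general one.
    return env.get(f"ASTRA_MODEL_PATH_{suffix}") or env.get("ASTRA_MODEL_PATH")


# Hypothetical paths, for illustration only.
example_env = {
    "ASTRA_MODEL_PATH": "/models/default",
    "ASTRA_MODEL_PATH_QWEN2_5_7B": "/models/qwen",
}
specific = resolve_model_path("qwen2.5-7b", example_env)
fallback = resolve_model_path("llama", example_env)
```

Passing the environment as a mapping rather than reading `os.environ` directly keeps the lookup easy to test; check the repository code for the actual naming rule before relying on it.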
By default ASTRA expects datasets under ../dataset/:
dataset/
├── hitab/
│ ├── test_samples_clean.jsonl
│ └── tables/raw/*.json
├── AIT-QA/
│ └── aitqa_clean_questions.json
├── SSTQA-zh/
│ ├── test.jsonl
│ └── table/*.xlsx
├── RealHiTBench/
│ ├── QA_final_filter.json
│ └── csv/*.csv
└── MMQA/
    └── Synthesized_three_table.json
If your data is elsewhere, set ASTRA_DATASET_DIR in .env.
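The override-then-default behavior described here can be sketched as follows. This is an illustrative helper, not the code in astra_config.py: the name `resolve_dataset_dir` is hypothetical, and it assumes that ../dataset is taken relative to the repository root rather than the current working directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_dataset_dir(repo_root: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    """Prefer ASTRA_DATASET_DIR when set, otherwise fall back to ../dataset."""
    env = os.environ if env is None else env
    override = env.get("ASTRA_DATASET_DIR")
    if override:
        return Path(override).expanduser()
    # Assumption: the default is a sibling of the repository root.
    return repo_root.parent / "dataset"


# Hypothetical locations, for illustration only.
default_dir = resolve_dataset_dir(Path("/work/table2tree"), env={})
custom_dir = resolve_dataset_dir(Path("/work/table2tree"), env={"ASTRA_DATASET_DIR": "/data/astra"})

# File name taken from the layout listed above.
hitab_samples = default_dir / "hitab" / "test_samples_clean.jsonl"
```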
Run the main pipeline from the terminal:
python tableqa.py \
--dataset hitab \
--table2-tree-method llm_based \
--table2-tree-mode normal \
--model-name-table2tree gpt-4o \
--model-name-treeqa gpt-4o \
--model-type-treecons oai \
--model-type-qa oai \
--start-index 0 \
--end-index 100 \
--enable-quality-eval
Outputs are written to record/.
Useful optional flags:
- --force-generate: ignore cached tree tables and regenerate them.
- --no-embedding: disable embedding-assisted retrieval in TreeQA.
- --disable-quality-eval: skip tree quality evaluation for faster runs.
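For larger datasets it can be convenient to split a run into index ranges. The sketch below only builds the argument lists, reusing the flags shown above; the helper name `chunked_commands` and the chunk sizes are assumptions, and whether --end-index is inclusive or exclusive is not stated here, so check for overlap at chunk boundaries before relying on it.

```python
import sys
from typing import Iterator, List


def chunked_commands(
    dataset: str, total: int, chunk: int, model: str = "gpt-4o"
) -> Iterator[List[str]]:
    """Yield one tableqa.py argument list per index range of size `chunk`."""
    for start in range(0, total, chunk):
        end = min(start + chunk, total)
        yield [
            sys.executable, "tableqa.py",
            "--dataset", dataset,
            "--table2-tree-method", "llm_based",
            "--table2-tree-mode", "normal",
            "--model-name-table2tree", model,
            "--model-name-treeqa", model,
            "--model-type-treecons", "oai",
            "--model-type-qa", "oai",
            "--start-index", str(start),
            "--end-index", str(end),
            "--disable-quality-eval",  # skip quality evaluation for speed
        ]


# 250 samples in chunks of 100 gives three ranges.
commands = list(chunked_commands("hitab", total=250, chunk=100))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the corresponding range.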
Evaluate prediction files with:
python evaluate.py --input record/your_results.json
Tree quality evaluation is available through:
from quality_evaluate import evaluate_tree_quality
metrics = evaluate_tree_quality(
    original_table=table,
    tree_table=generated_tree,
    handle_combined_keys=True,
)
Multi-rollout stability evaluation is available through:
from Batch_evaluate import TreeBatchEvaluator
evaluator = TreeBatchEvaluator(output_dir="./batch_results")
Two baseline scripts are included:
- baseline/direct.py: direct table QA without tree conversion.
- baseline/tree_direct.py: tree construction followed by direct QA over the tree.
Both reuse the shared dataset loader and model configuration.
The demo visualizes upload, tree construction, and QA reasoning.
cd demo
pip install -r requirements.txt
cd frontend
npm install
npm run build
cd ..
python server.py
Then open http://localhost:8080.
For frontend development:
cd demo/frontend
npm run dev
@misc{guo2026astraadaptivestructuredtree,
title={ASTRA: Adaptive Structured Tree Reasoning Architecture for Complex Table Question Answering},
author={Xiaoke Guo and Songze Li and Zhiqiang Liu and Zhaoyan Gong and Yuanxiang Liu and Huajun Chen and Wen Zhang},
year={2026},
eprint={2604.08999},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2604.08999}
}
