HInT: Hypergraph Infusion at the Structural Layers
Improves Table Understanding

Wonjin Lee¹, Soomi Jeong¹, Kwang In Kim¹

¹POSTECH

Accepted to ICML 2026

Decoder-only large language models (LLMs) struggle with table reasoning because tables must be serialized, which can obscure row- and column-level structure. Prior graph and hypergraph approaches encode structure with an external encoder, but their gains are often inconsistent under autoregressive decoding. We analyze how tabular structure is represented inside decoder-only LLMs and find that row and column relations are concentrated in a small subset of layers and attention heads. Based on this observation, we propose HInT (Hypergraph Infusion for Table reasoning), which injects hypergraph-derived structural features directly into the layers where these relations are concentrated. HInT constructs a table hypergraph over cells and headers, applies lightweight message passing, and fuses the resulting structural features with token hidden states through gated fusion. Experiments across diverse table reasoning tasks show consistent improvements over text-only baselines and prior (hyper)graph-based methods.


HInT method overview

Method overview (Figure 3 in the paper). (a) Text-based methods process tables after linearization into token sequences. (b) Graph- and hypergraph-based methods first construct a graph or hypergraph from the table, encode it with a dedicated encoder, and pass the resulting embeddings to the LLM. (c) HInT injects hypergraph-based structural features directly into selected structural layers of the LLM, where they are fused with hidden states to enable structure-aware reasoning.

Introduction

Decoder-only LLMs reason over tables by reading them as serialized token sequences. This linearization obscures row- and column-level relations: tokens that are adjacent in a row or column may be far apart in the input sequence, so structure-sensitive reasoning becomes brittle. Recent work tries to compensate by introducing external graph or hypergraph encoders that inject structural features before LLM decoding.

We find that this is not enough. On WikiTQ, HiTab, and TabFact, (hyper)graph-augmented LLMs do not consistently outperform text-only baselines — in several cases, they match or underperform supervised fine-tuning on the same backbone. Our analysis suggests that the bottleneck is not just in extracting table structure, but in integrating it into the autoregressive computation inside the LLM.

Text vs graph-augmented LLMs

Figure 1 in the paper. Comparison of text-only and graph-augmented LLMs on the WikiTQ, HiTab, and TabFact datasets. Despite explicitly modeling table structure, graph-augmented methods do not consistently outperform the text-only baselines.

Where is table structure encoded inside the LLM?

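We define two binary structural masks, a row mask $\mathbf{M}^{\text{R}}$ and a column mask $\mathbf{M}^{\text{C}}$, that mark which token pairs belong to the same table row or column. For each instance, we then fit each mask as a sparse linear combination of attention maps across all layers and heads with a LASSO penalty. For the row mask, with $\mathbf{m}^{\text{R}}$ the vectorized mask, $\mathbf{a}_l^h$ the vectorized attention map of head $h$ in layer $l$, and $w_l^h$ its coefficient, the objective is (the column mask is fit analogously):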

$\mathcal{C}(\mathbf{w}) = \left\lVert \mathbf{m}^{\text{R}} - \sum_{l,h} w_l^h \, \mathbf{a}_l^h \right\rVert_2^2 + \lambda \lVert \mathbf{w} \rVert_1$.

Heads with large coefficients are the ones whose attention patterns best align with row- or column-wise structure. Visualizing the top-ranked heads recovers clear row-wise and column-wise patterns even though the input is a 1D token stream.
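The probing objective can be illustrated with a minimal, standard-library-only sketch. It minimizes the row-mask objective by plain coordinate descent on a toy 2x2 table with one token per cell. The solver choice, the function names `soft_threshold` and `lasso_probe`, the value of `lam`, and the three synthetic "heads" are all assumptions for illustration; the source does not say which solver or hyperparameters were used.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (the proximal operator of t * |x|)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_probe(mask, attn_maps, lam=0.1, n_sweeps=200):
    """Minimise ||mask - sum_k w_k * a_k||^2 + lam * ||w||_1 by coordinate descent.

    mask      : flattened 0/1 structural mask (list of floats)
    attn_maps : dict mapping (layer, head) -> flattened attention map
    """
    weights = {key: 0.0 for key in attn_maps}
    residual = list(mask)  # equals mask - sum_k w_k * a_k while all w_k are 0
    for _ in range(n_sweeps):
        for key, a in attn_maps.items():
            norm_sq = sum(x * x for x in a)
            if norm_sq == 0.0:
                continue
            # Correlation of this head with the residual that excludes its own term.
            rho = sum(x * (r + weights[key] * x) for x, r in zip(a, residual))
            new_w = soft_threshold(rho, lam / 2.0) / norm_sq
            delta = new_w - weights[key]
            if delta != 0.0:
                residual = [r - delta * x for r, x in zip(residual, a)]
                weights[key] = new_w
    return weights


# Toy example (hypothetical data): 4 tokens, one per cell of a 2x2 table, row-major.
row_mask = [1, 1, 0, 0,
            1, 1, 0, 0,
            0, 0, 1, 1,
            0, 0, 1, 1]
col_mask = [1, 0, 1, 0,
            0, 1, 0, 1,
            1, 0, 1, 0,
            0, 1, 0, 1]
attn_maps = {
    (0, 0): [0.25] * 16,                  # uniform attention
    (1, 2): [0.5 * m for m in row_mask],  # row-aligned head
    (2, 1): [0.5 * m for m in col_mask],  # column-aligned head
}
weights = lasso_probe([float(m) for m in row_mask], attn_maps)
best_head = max(weights, key=lambda k: abs(weights[k]))
```

In this toy setting the L1 penalty drives the uniform and column-aligned heads to zero, and only the row-aligned head `(1, 2)` keeps a nonzero coefficient. This is the behavior the probing relies on when ranking heads.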

Structural masks and attention maps from probed heads

Figure 2 in the paper. Left: row and column structural masks $\mathbf{M}^{\text{R}}$ (top) and $\mathbf{M}^{\text{C}}$ (bottom). Right: attention maps from heads selected by our structural probing, showing row-wise and column-wise patterns that best match the corresponding masks.
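As a rough illustration of how such masks could be built, the sketch below derives both masks from a per-token cell assignment. The input format `token_cells`, the function name `structural_masks`, and the choice to zero out tokens outside the table are assumptions; the source only states that the masks mark token pairs sharing a row or column.

```python
def structural_masks(token_cells):
    """Build binary row and column masks over token pairs.

    token_cells[i] is the (row, col) of the cell that token i belongs to,
    or None for tokens outside the table (assumed to get all-zero entries).
    """
    n = len(token_cells)
    row_mask = [[0] * n for _ in range(n)]
    col_mask = [[0] * n for _ in range(n)]
    for i, cell_i in enumerate(token_cells):
        for j, cell_j in enumerate(token_cells):
            if cell_i is None or cell_j is None:
                continue
            row_mask[i][j] = int(cell_i[0] == cell_j[0])
            col_mask[i][j] = int(cell_i[1] == cell_j[1])
    return row_mask, col_mask


# Hypothetical serialization: one prompt token, then a 2x2 table in row-major
# order where cell (0, 0) is split into two tokens.
token_cells = [None, (0, 0), (0, 0), (0, 1), (1, 0), (1, 1)]
row_mask, col_mask = structural_masks(token_cells)
```

Tokens 1 and 3 share a row, so `row_mask[1][3]` is 1, while tokens 1 and 4 share only a column, so `col_mask[1][4]` is 1 and `row_mask[1][4]` is 0.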

These heads are not just visually suggestive; they are functionally critical. Ablating the top-40 structurally aligned heads at inference time causes a much larger drop in WikiTQ accuracy than ablating the same number of randomly chosen heads: EM falls to 12.73% with the structural heads removed versus 32.29% with random heads removed, compared to 47.51% with all heads intact. Aggregating these heads across instances shows that they concentrate in a small, non-uniform subset of transformer layers, which we designate as the structural layers $\mathcal{L}^{\star}$.
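One plausible way to aggregate per-instance probing results into a layer set is sketched below. The selection rule is an assumption: for each instance, take the heads with the largest absolute coefficients, count how often each layer appears, and keep the most frequent layers. The function name `structural_layers` and its parameters are hypothetical, and the source does not specify the exact aggregation criterion.

```python
from collections import Counter


def structural_layers(per_instance_weights, top_k_heads=40, n_layers=12):
    """Pick the layers that most often host top-ranked heads.

    per_instance_weights : list of dicts mapping (layer, head) -> coefficient,
                           one dict per probed instance
    """
    layer_counts = Counter()
    for weights in per_instance_weights:
        ranked = sorted(weights, key=lambda k: abs(weights[k]), reverse=True)
        for layer, head in ranked[:top_k_heads]:
            if weights[(layer, head)] != 0.0:  # skip heads zeroed out by the L1 penalty
                layer_counts[layer] += 1
    return sorted(layer for layer, _ in layer_counts.most_common(n_layers))


# Hypothetical coefficients for two instances.
per_instance_weights = [
    {(3, 0): 0.9, (3, 1): 0.5, (7, 2): 0.0, (10, 4): 0.2},
    {(3, 0): 0.7, (10, 4): 0.6, (7, 2): 0.1},
]
layers = structural_layers(per_instance_weights, top_k_heads=2, n_layers=2)
```

With these toy inputs, layer 3 is counted three times and layer 10 once, so the selected set is `[3, 10]`.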

Head ablation results

Method: HInT

HInT (Hypergraph Infusion for Table reasoning) builds a table hypergraph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ in which every cell is a node and rows, columns, and hierarchical headers are hyperedges. At each structural layer $l \in \mathcal{L}^{\star}$, we (1) pool token hidden states into per-cell node features $\mathbf{v}_u$ and per-header hyperedge features $\mathbf{e}_v$; (2) run a lightweight HyperTrans message-passing block over $\mathcal{G}$ to obtain structure-enhanced features $(\widetilde{\mathbf{V}}, \widetilde{\mathbf{E}}) = \mathrm{HyperTrans}(\mathbf{V}, \mathbf{E})$; and (3) fuse the resulting structural feature $\widetilde{\mathbf{g}}_i$ back into each token hidden state $\mathbf{u}_i$ via a sigmoid gate $\alpha_i = \sigma(\mathbf{w}^{\top}[\mathbf{u}_i \| \widetilde{\mathbf{g}}_i])$, where $\|$ denotes concatenation, and a convex combination $\widetilde{\mathbf{u}}_i = (1-\alpha_i)\,\mathbf{u}_i + \alpha_i\,\widetilde{\mathbf{g}}_i$.

All other layers are left untouched, so the standard autoregressive computation is preserved. Only the HyperTrans block, the gate $\mathbf{w}$, and (optionally) the backbone are trained.
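The pooling and fusion steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: mean pooling is assumed for step (1), the HyperTrans block of step (2) is omitted entirely so that the pooled node feature stands in for $\widetilde{\mathbf{g}}_i$, and the names `mean_pool` and `gated_fusion` are hypothetical.

```python
import math


def mean_pool(hidden_states, token_to_node, n_nodes):
    """Average the hidden states of the tokens assigned to each node (assumed pooling)."""
    dim = len(hidden_states[0])
    sums = [[0.0] * dim for _ in range(n_nodes)]
    counts = [0] * n_nodes
    for h, node in zip(hidden_states, token_to_node):
        if node is None:  # token outside the table
            continue
        counts[node] += 1
        for d in range(dim):
            sums[node][d] += h[d]
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def gated_fusion(u, g, w):
    """alpha = sigmoid(w^T [u || g]); return (1 - alpha) * u + alpha * g and alpha."""
    z = sum(wi * xi for wi, xi in zip(w, u + g))  # u + g concatenates the two lists
    alpha = 1.0 / (1.0 + math.exp(-z))
    fused = [(1.0 - alpha) * ui + alpha * gi for ui, gi in zip(u, g)]
    return fused, alpha


# Toy example (hypothetical values): three tokens, two cells, hidden size 2.
hidden_states = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
token_to_node = [0, 0, 1]
node_feats = mean_pool(hidden_states, token_to_node, n_nodes=2)

# Placeholder: in HInT the structural feature would come from HyperTrans;
# here the pooled feature of the token's own cell is used instead.
fused, alpha = gated_fusion(hidden_states[0], node_feats[0], w=[0.0] * 4)
```

With a zero gate vector the gate evaluates to 0.5, so the fused state is the midpoint of the token state and the structural feature. Because the update is a convex combination, a gate near 0 leaves the token state essentially unchanged.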

Hypergraph construction from a table

Figure 4 in the paper. Example hypergraph construction. Each table cell is a node. Row-wise (orange) and column-wise (blue) hyperedges group cells that share a row or column, while hierarchical headers (green) induce higher-level hyperedges (e.g., Price ($) spanning Hot and Iced).
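A small sketch of this construction is given below. It treats only body cells as nodes and represents each hyperedge as a list of member cells. The function names, the `row:`/`col:`/`header:` key scheme, and the "Drink" column are hypothetical; only the Price ($), Hot, and Iced headers come from the caption.

```python
def build_table_hypergraph(n_rows, columns, header_groups):
    """Return (nodes, hyperedges) for a table with hierarchical headers.

    columns       : leaf column names, left to right
    header_groups : dict mapping a parent header to the leaf columns it spans
    """
    n_cols = len(columns)
    nodes = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    hyperedges = {}
    for r in range(n_rows):  # one hyperedge per row
        hyperedges[f"row:{r}"] = [(r, c) for c in range(n_cols)]
    for c, name in enumerate(columns):  # one hyperedge per leaf column
        hyperedges[f"col:{name}"] = [(r, c) for r in range(n_rows)]
    for parent, children in header_groups.items():  # higher-level header hyperedges
        cols = [columns.index(child) for child in children]
        hyperedges[f"header:{parent}"] = [(r, c) for r in range(n_rows) for c in cols]
    return nodes, hyperedges


def incidence_matrix(nodes, hyperedges):
    """Binary matrix H with H[i][j] = 1 if node i belongs to hyperedge j."""
    members = [set(cells) for cells in hyperedges.values()]
    return [[int(node in cells) for cells in members] for node in nodes]


# Two body rows; "Price ($)" spans the "Hot" and "Iced" leaf columns.
nodes, hyperedges = build_table_hypergraph(
    n_rows=2,
    columns=["Drink", "Hot", "Iced"],
    header_groups={"Price ($)": ["Hot", "Iced"]},
)
H = incidence_matrix(nodes, hyperedges)
```

Each cell belongs to its row and column hyperedges, and cells under Hot or Iced additionally belong to the Price ($) hyperedge. An incidence matrix like `H` is the usual input format for hypergraph message passing.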

Results

Using Qwen2.5-3B-Instruct as the backbone, HInT identifies 12 structural layers $\mathcal{L}^{\star} = \{3,4,5,6,8,9,10,12,15,17,24,30\}$. With this single layer set fixed across all benchmarks, HInT yields consistent gains across table QA (WikiTQ, HiTab), fact verification (TabFact, FEVEROUS), table–text QA (HybridQA, FinQA), and table-to-text generation (FeTaQA, ToTTo), and outperforms both text-only and prior (hyper)graph-based baselines (HeGTa, LLaSA, TaMo).

Main results across 8 benchmarks

Table 2 in the paper. Results on eight table-reasoning benchmarks across four task categories. HInT is the best or second-best method on every benchmark.

Qualitative example

On a representative WikiTQ instance that requires multi-step row-wise reasoning (find another player with the same tally as David Martin), the text-based baseline correctly retrieves David Martin's tally (1–6) but fails to propagate the row-wise constraint to the other player at the same rank. HInT preserves the row-level dependency and correctly answers Seán McLoughlin.

Qualitative reasoning comparison

Figure 5 in the paper. Reasoning comparison on a WikiTQ test example. Top-right: Qwen2.5-3B (SFT) finds the correct tally but fails to propagate the row-wise constraint, yielding an incorrect answer. Bottom-right: HInT preserves the row-level dependency and answers correctly.

Where to inject matters

Injecting hypergraph features into the identified structural layers outperforms (i) a capacity-matched Qwen2.5-3.3B baseline, (ii) injecting into randomly selected layers, and (iii) injecting into all layers. This confirms that the gains come from where structure is added, not from extra parameters or denser intervention.

The same recipe also transfers to a different backbone family: with Llama-3.1-8B-Instruct, the layer-frequency distribution stays sparse, and HInT improves over both the SFT and TaMo variants of the same backbone.

Conclusion

Tabular structure is already encoded inside decoder-only LLMs, but only in a small subset of layers and attention heads. HInT exploits this by injecting hypergraph-derived structural features directly into those structural layers via lightweight message passing and token-level gated fusion, without changing the underlying autoregressive computation. The result is consistent improvements over strong text-only and encoder-based baselines across diverse table reasoning tasks. The same principle — injecting structured relational information into internally aligned layers — is not inherently limited to tables and is a natural direction for other structured inputs such as knowledge graphs or program representations.

Citation

If you find our work useful, please cite it as:

@inproceedings{lee2026hint,
    title     = {HInT: Hypergraph Infusion at the Structural Layers Improves Table Understanding},
    author    = {Lee, Wonjin and Jeong, Soomi and Kim, Kwang In},
    booktitle = {Proceedings of the International Conference on Machine Learning (ICML)},
    year      = {2026}
}