Direct volume rendering (DVR) is a fundamental technique for visualizing volumetric data, where transfer functions (TFs) play a crucial role in extracting meaningful structures. However, designing effective TFs remains unintuitive due to the semantic gap between user intent and TF parameter space. Although numerous TF optimization methods have been proposed to mitigate this issue, existing approaches still face two major challenges: the vast exploration space and limited generalizability. To address these issues, we propose IntuiTF, a novel framework that leverages Multimodal Large Language Models (MLLMs) to guide TF optimization in alignment with user intent. Specifically, our method consists of two key components: (1) an evolution-driven explorer for effective exploration of the TF space, and (2) an MLLM-guided human-aligned evaluator that provides generalizable visual feedback on rendering quality. The explorer and the evaluator together establish an efficient Trial-Insight-Replanning paradigm for TF space exploration. We further extend our framework with an interactive TF design system. We demonstrate the broad applicability of our framework through three case studies and validate the effectiveness of each component through extensive experiments.
Our WYTWYG framework presents a TF optimization approach that bridges user intent and transfer function design. It consists of two key components: (A) a transfer function optimization approach that combines (A1) an evolution-based explorer, which follows the Trial-Insight-Replanning optimization formulation to iteratively generate and refine transfer functions with adaptive replanning, and (A2) an MLLM-based evaluator, which assesses visualizations according to both formal quality metrics and alignment with user intent; and (B) an interactive transfer function design system that implements a three-stage initialization-customization-refinement workflow, letting users express their visualization intentions through natural text descriptions or reference images and progressively refine the results toward the desired outcome.
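The explorer/evaluator loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the framework's actual implementation: the TF is encoded as a short list of opacities, `stub_evaluator` is a hypothetical stand-in for the MLLM-based evaluator (it simply measures distance to a known target), and the comments that label the steps "Trial", "Insight", and "Replanning" are only a loose reading of the paradigm's name.

```python
import random


def random_tf(rng, n_points=4):
    """Hypothetical TF encoding: one opacity in [0, 1] per control point."""
    return [rng.random() for _ in range(n_points)]


def mutate(tf, rng, sigma=0.1):
    """Perturb each opacity with Gaussian noise, clamped to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in tf]


def stub_evaluator(tf, target):
    """Stand-in for the MLLM evaluator (assumption): higher is better.

    Scores a candidate by negative squared distance to a target TF.
    A real evaluator would judge rendered images, not parameters.
    """
    return -sum((a - b) ** 2 for a, b in zip(tf, target))


def evolve(target, population_size=20, generations=30, seed=0):
    """Run a simple elitist evolutionary search over TF candidates."""
    rng = random.Random(seed)
    population = [random_tf(rng, len(target)) for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        # "Trial": score every candidate in the current population.
        scored = sorted(
            population, key=lambda tf: stub_evaluator(tf, target), reverse=True
        )
        best_history.append(stub_evaluator(scored[0], target))
        # "Insight": keep the better half as survivors.
        survivors = scored[: population_size // 2]
        # "Replanning": refill the population by mutating survivors.
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=lambda tf: stub_evaluator(tf, target))
    return best, best_history


if __name__ == "__main__":
    best_tf, history = evolve(target=[0.0, 0.2, 0.8, 0.1])
    print("best candidate:", [round(v, 3) for v in best_tf])
    print("score, first vs. last generation:", history[0], history[-1])
```

Because the survivors are carried over unchanged, the best score in this sketch never decreases from one generation to the next; swapping `stub_evaluator` for a call that scores rendered images is where an MLLM would enter.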
Examples of transfer functions designed using our approach. The first row illustrates text-guided designs, with the corresponding user descriptions shown below each image. The second row illustrates image-guided designs, where each bottom inset shows the user-provided reference.
We demonstrate the practical applicability of our IntuiTF framework through use cases that show how users can leverage multimodal inputs to design transfer functions for various volume visualization scenarios.
Comprehensive examples showcasing the versatility of our approach across different domains and user requirements.
Analysis of evolutionary algorithm parameters on transfer function optimization. (A) Average rankings over iterations for different population sizes (10-200), showing that larger populations achieve more stable improvements with continued iterations while smaller populations show early rapid gains but unstable convergence. (B) Comparison of optimization effectiveness across different computational budgets (measured by LLM API calls), demonstrating that medium-small populations (25-50) perform better under limited resources, while larger populations (100-200) achieve superior results when computational resources are abundant.
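The trade-off between population size and computational budget can be made concrete with a small arithmetic sketch. It assumes, purely for illustration, that each candidate costs one evaluator API call per generation; the actual call pattern of the framework is not stated here, and the function names and the budget of 1000 calls are hypothetical.

```python
def generations_within_budget(budget_calls, population_size, calls_per_candidate=1):
    """Number of full generations affordable under a fixed API-call budget.

    Assumption: every candidate in a generation costs `calls_per_candidate`
    evaluator calls, so one generation costs population_size * calls_per_candidate.
    """
    cost_per_generation = population_size * calls_per_candidate
    return budget_calls // cost_per_generation


def budget_table(budget_calls, population_sizes):
    """Map each population size to the generations it can afford."""
    return {p: generations_within_budget(budget_calls, p) for p in population_sizes}


if __name__ == "__main__":
    for size, gens in budget_table(1000, [10, 25, 50, 100, 200]).items():
        print(f"population {size:>3}: {gens:>3} generations")
```

Under this assumption, a budget of 1000 calls allows 100 generations at population size 10 but only 5 at population size 200, which shows why large populations have little room to iterate when the budget is tight.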
@article{wang2025intuiTF,
title = {IntuiTF: MLLM-Guided Transfer Function Optimization for Direct Volume Rendering},
author = {Wang, Yiyao and Pan, Bo and Wang, Ke and Liu, Han and Mao, Jinyuan and Liu, Yuxin and Zhu, Minfeng and Zhang, Bo and Chen, Weifeng and Huang, Xiuqi and Chen, Wei},
year = {2025},
publisher = {arXiv},
url = {https://wyysteelhead.github.io/TFEvolve.github.io/},
doi = {10.48550/arXiv.2506.18407}
}