
ReDiPrune: Training-Free Token Pruning Before Projection Boosts MLLM Efficiency 6x, Gains 2% Accuracy
Researchers propose ReDiPrune, a plug-and-play method that prunes visual tokens before the vision-language projector in multimodal LLMs. On EgoSchema with LLaVA-NeXT-Video-7B, it achieves a +2.0% accuracy gain while reducing computation by over 6× in TFLOPs.


























