Vol. MMXXVI · Issue 078 · Daily Edition

Artificial
Indifference

Published March 19, 2026
APOD: Launch Plume: SpaceX Jellyfish
arXiv: 8 papers filed

Launch Plume: SpaceX Jellyfish

Even if you live with your head in the clouds, you won't find a jellyfish like this one very often. The featured image shows a SpaceX Falcon 9 rocket launch from Cape Canaveral in Florida on March 4. The launch occurred 52 minutes before sunrise, and the second-stage rocket exhaust plume rose high enough in the sky to catch the light of the rising sun while the photographer remained in the dark. This combination of light and shadow, possible only near dawn or dusk, makes the exhaust, mostly water vapor and carbon dioxide, appear as a glowing cloud. It only looks like it's going down, as the rocket fo...

2026-03-19 · © Michael Seeley · NASA APOD ↗

Research Filed Today

Preprints submitted to arXiv on March 19, 2026. Science before peer review.

01
Token pruning is essential for enhancing the computational efficiency of vision-language models (VLMs), particularly for video-based tasks where temporal redundancy is prevalent. Prior approaches typically prune tokens either (1) within the vision transformer (ViT) exclusively fo...
Jianrui Zhang, Yue Yang, Rohun Tripathi et al. (+5)
02
Multimodal large language models (MLLMs) exhibit strong visual-language reasoning, yet remain confined to their native modalities and cannot directly process structured, non-visual data such as human skeletons. Existing methods either compress skeleton dynamics into lossy feature...
Ziyi Wang, Peiming Li, Xinshun Wang et al. (+3)
03
Multimodal Large Language Models (MLLMs) have made impressive progress in connecting vision and language, but they still struggle with spatial understanding and viewpoint-aware reasoning. Recent efforts aim to augment the input representations with geometric cues rather than expl...
Kevin Qu, Haozhe Qi, Mihai Dusmanu et al. (+3)
04
In this work, we present EchoGen, a unified framework for layout-to-image generation and image grounding, capable of generating images with accurate layouts and high fidelity to text descriptions (e.g., spatial relationships), while grounding the image robustly at the same time. ...
Kai Zou, Hongbo Liu, Dian Zheng et al. (+3)
05
Building LLM-based agents has become increasingly important. Recent works on LLM-based agent self-evolution primarily record successful experiences as textual prompts or reflections, which cannot reliably guarantee efficient task re-execution in complex scenarios. We propose Agen...
Zhang Zhang, Shuqi Lu, Hongjin Qian et al. (+2)
06
We present a training-free framework for continuous and controllable image editing at test time for text-conditioned generative models. In contrast to prior approaches that rely on additional training or manual user intervention, we find that a simple steering in the text-embeddi...
Yigit Ekin, Yossi Gandelsman
07
Tokenization is a fundamental technique in the generative modeling of various modalities. In particular, it plays a critical role in autoregressive (AR) models, which have recently emerged as a compelling option for 3D generation. However, optimal tokenization of 3D shapes remain...
Niladri Shekhar Dutt, Zifan Shi, Paul Guerrero et al. (+4)
08
Synthesizing controllable 6-DOF object manipulation trajectories in 3D environments is essential for enabling robots to interact with complex scenes, yet remains challenging due to the need for accurate spatial reasoning, physical feasibility, and multimodal scene understanding. ...
Huajian Zeng, Abhishek Saroha, Daniel Cremers et al. (+1)

Source: arXiv.org · Cornell University