BEVGen: Street-View Image Generation from a Bird's-Eye View Layout
Webpage | Code | Paper
In this work, we tackle the new task of generating street-view images from a bird's-eye view (BEV) layout. We propose BEVGen, an autoregressive neural model that generates a set of realistic and spatially consistent images. BEVGen has two technical novelties: (i) it incorporates spatial embeddings using camera intrinsics and extrinsics to allow the model to attend to relevant portions of the images and HD map, and (ii) it contains a novel attention bias and decoding scheme that maintains both image consistency and correspondence.
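To make the role of camera intrinsics and extrinsics more concrete, the following is a minimal sketch of how per-pixel viewing directions could be derived from calibration and compared against BEV cell directions. It is an illustration under stated assumptions, not the BEVGen implementation: the function names, the ego-frame convention (x forward, y left, z up), the grid layout, and the use of cosine similarity as an additive attention bias are all hypothetical choices made for this example.

```python
import numpy as np


def pixel_ray_directions(K, cam_to_ego, height, width):
    """Unit viewing direction in the ego frame for every pixel center.

    K is a 3x3 intrinsics matrix and cam_to_ego a 4x4 extrinsics matrix
    (hypothetical names; only the rotation part is needed for directions).
    """
    # Pixel centers in homogeneous image coordinates, shape (H, W, 3).
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)
    # Back-project through the intrinsics to camera-frame directions.
    rays_cam = pixels @ np.linalg.inv(K).T
    # Rotate into the ego frame.
    rays_ego = rays_cam @ cam_to_ego[:3, :3].T
    return rays_ego / np.linalg.norm(rays_ego, axis=-1, keepdims=True)


def bev_cell_directions(grid_size, extent_m):
    """Unit direction from the ego origin to each BEV cell center (z = 0)."""
    coords = (np.arange(grid_size) + 0.5) / grid_size * 2 * extent_m - extent_m
    x, y = np.meshgrid(coords, coords, indexing="ij")
    cells = np.stack([x, y, np.zeros_like(x)], axis=-1)
    norms = np.linalg.norm(cells, axis=-1, keepdims=True)
    # Guard against a cell centered exactly on the ego origin.
    return cells / np.maximum(norms, 1e-8)


def direction_bias(rays, cells):
    """Cosine similarity between every pixel ray and every BEV cell direction.

    Assumption for illustration: a matrix like this could be added to
    attention logits so image tokens favor the map region they look toward.
    """
    return rays.reshape(-1, 3) @ cells.reshape(-1, 3).T
```

With a forward-facing camera, the ray through the principal point maps to the ego x axis, so its largest bias values fall on the BEV cells directly ahead of the vehicle. In practice such directions would be computed per image token (at the latent resolution) rather than per pixel.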
@article{swerdlow2024streetview,
  title={Street-View Image Generation from a Bird's-Eye View Layout},
  author={Alexander Swerdlow and Runsheng Xu and Bolei Zhou},
  year={2024},
  journal={IEEE Robotics and Automation Letters},
}