Intel Labs has introduced a groundbreaking artificial intelligence model called the Latent Diffusion Model for 3D (LDM3D), capable of generating realistic 360-degree panoramic images based on text prompts.
The LDM3D model, developed in collaboration with Blockade Labs, uses generative AI to create immersive 3D visual content. Unlike existing diffusion models, which generate only a 2D RGB image, LDM3D produces both an image and a matching depth map from a single text prompt. This paired depth map is what gives the generated scenes their added realism and sense of immersion.
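For readers who want to experiment, Intel has released LDM3D checkpoints on the Hugging Face Hub, and the diffusers library provides a StableDiffusionLDM3DPipeline. The sketch below is a minimal example, assuming the "Intel/ldm3d-4c" checkpoint, a recent diffusers installation, and a CUDA-capable GPU; checkpoint names and defaults may differ in your setup.

```python
# Minimal sketch: text prompt -> RGB image + aligned depth map with LDM3D.
# Assumes `diffusers` with StableDiffusionLDM3DPipeline and the
# "Intel/ldm3d-4c" checkpoint from the Hugging Face Hub.
import torch
from diffusers import StableDiffusionLDM3DPipeline

pipe = StableDiffusionLDM3DPipeline.from_pretrained(
    "Intel/ldm3d-4c", torch_dtype=torch.float16
).to("cuda")

prompt = "a serene tropical beach at sunset"
output = pipe(prompt)

# The pipeline returns an RGB image and a depth map for the same prompt.
output.rgb[0].save("beach_rgb.jpg")
output.depth[0].save("beach_depth.png")
```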
The LDM3D model was trained on a dataset constructed from a subset of 10,000 samples drawn from the LAION-400M database, which contains over 400 million image-caption pairs. During training, the Dense Prediction Transformer (DPT) large depth-estimation model, also developed by Intel Labs, was used to annotate the training corpus, providing highly accurate relative depth for each pixel in an image.
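To make that annotation step concrete, the sketch below runs a publicly released DPT depth-estimation checkpoint over a single image using the transformers library. It illustrates the same kind of per-pixel relative depth prediction, not Intel's exact training pipeline; the "Intel/dpt-large" checkpoint name and the "sample.jpg" input file are assumptions for the example.

```python
# Minimal sketch: annotating one image with per-pixel relative depth
# using a DPT model, similar in spirit to the corpus annotation above.
# Assumes `transformers`, the "Intel/dpt-large" checkpoint, and a
# hypothetical local image "sample.jpg".
import torch
from PIL import Image
from transformers import DPTImageProcessor, DPTForDepthEstimation

processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

image = Image.open("sample.jpg")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    predicted_depth = model(**inputs).predicted_depth  # shape (1, H', W')

# Resize the prediction back to the original image resolution.
depth = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],  # PIL gives (W, H); interpolate expects (H, W)
    mode="bicubic",
    align_corners=False,
).squeeze()

# Normalize relative depth to 0-255 for storage as a grayscale image.
depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
Image.fromarray(depth.cpu().numpy().astype("uint8")).save("sample_depth.png")
```

Depth maps produced this way can then be paired with the source image and its caption to form the image-plus-depth training examples described above.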
Users can now transform text descriptions into detailed 360-degree panoramas, ranging from serene tropical beaches to futuristic sci-fi universes. The technology has diverse applications, from entertainment and gaming to architecture and design, and could reshape content creation, metaverse experiences, and digital interactions across multiple industries.