The AR Illusion: How to Prompt Screen-in-Screen UI in Pixel Art 📱👾
By pikpoo
With the massive wave of augmented reality nostalgia and the 10th anniversary of location-based mobile gaming taking over our timelines, I wanted to create a piece that bridges our physical world with digital spaces. My goal was to capture the chaotic energy of a midnight park gathering where hundreds of players are hunting virtual creatures.
However, getting an AI image generator to render a massive background landscape and a crisp, readable, close-up user interface on a smartphone screen in the same image is one of the hardest layout challenges you can tackle. If you don't structure your prompt correctly, the AI will either blur the background into an unreadable mess or melt the phone screen into a glowing plastic blob. Here is the workflow I used to solve this scaling issue.
1. The Screen-in-Screen Prompt Hierarchy
To pull off this illusion, you cannot just ask the model for a "person holding a phone." The model will prioritize the person and lose the UI graphics. Instead, you have to build the prompt around a strict focal hierarchy. I forced the phone interface into the immediate foreground by specifying the exact materials and graphics: "In the foreground, a sharply rendered silver smartphone screen displaying a detailed pixelated battle menu and an augmented reality monster." In my experience, anchoring the pixel layout inside the phone screen first makes the generator far more likely to commit its detail to the UI before it builds the rest of the canvas.
2. Balancing Dual-Depth Environments
Once the foreground screen is locked in, you need to manage the mid-ground and background architecture so they don't fight for clarity. I wanted a sprawling night scene at a city waterfront park, which means managing heavy atmospheric lighting. I explicitly separated the background elements by prompting a "soft-focus, distant city skyline under a starry twilight sky." Terms like "soft-focus" or "low-detail background" nudge the model to keep its sharpest detail on the foreground phone screen, creating a cinematic depth-of-field effect.
3. Handling Conflicting Neon Light Sources
The final layer of the puzzle is lighting. In a scene like this, the warm, natural orange glow of a setting sun clashes with the cold, artificial cyan and magenta light radiating from hundreds of glowing mobile devices in the crowd. I specified that the crowd figures should be "backlit silhouettes," which blends them into the landscape while making the vibrant, neon-colored game elements pop right off the screen.
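If you like keeping your prompts in a script, the three layers above can be sketched as an ordered list that gets joined foreground-first. This is only a rough illustration, not the exact prompt I submitted: the `Layer` class and `build_prompt` helper are made-up names, and the connecting words ("Behind it," and "The crowd figures are") are my own filler around the quoted phrases. It also doesn't call any image generator, it just builds the string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One depth layer of the scene (hypothetical helper, not a library type)."""
    name: str
    text: str


def build_prompt(layers):
    """Join layer descriptions in the order given, so the first layer leads the prompt."""
    return " ".join(layer.text.strip() for layer in layers)


# Order matters: the phone UI comes first, then the background, then lighting.
# The quoted phrases come from the sections above; the glue words are assumptions.
LAYERS = [
    Layer(
        "foreground",
        "In the foreground, a sharply rendered silver smartphone screen "
        "displaying a detailed pixelated battle menu and an augmented reality monster.",
    ),
    Layer(
        "background",
        "Behind it, a soft-focus, distant city skyline under a starry twilight sky.",
    ),
    Layer(
        "lighting",
        "The crowd figures are backlit silhouettes.",
    ),
]

prompt = build_prompt(LAYERS)
print(prompt)
```

Keeping each layer as its own entry makes it easy to swap one phrase (say, "low-detail background" instead of "soft-focus") and regenerate without accidentally moving the phone out of the lead position.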
Tags: pixelart, mobilegaming, uimapping, gamedesign, cyberneon