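Prompting Nano Banana Pro with JSON

JSON prompts are a way to give the model structured information. They act as templates that make outputs more repeatable. People rarely write JSON prompts by hand or from scratch; they use a tool (such as a language model) to build a template and then adjust the details. These templates then become a powerful tool for creating variations on a theme.

Nano Banana Pro can work from very minimal prompts. If you ask for a photo of a dog, you will find the results are all different yet similar: usually beautiful, warm, detailed, and slightly rustic and natural. With a JSON prompt, it is easier to take direct control of subject, style, composition, and lighting from the start. You don't have to remember what to define, because the template already lists the fields. This helps you avoid the defaults and get the result you want.

It doesn't have to be JSON. Any structured data will do, and I personally prefer YAML. The community, however, generally leans toward JSON.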

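1. When to use JSON

I recommend using JSON when:

  • the prompt is long or complex
  • you want to change details while keeping the overall style
  • you want to escape the default look by explicitly defining most details
  • you have a JSON template you want to use as a base
  • you want to imitate an existing image or output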
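
Changing a detail while keeping the overall style amounts to overriding a few fields in a template and leaving the rest alone. Here is a minimal sketch of that idea in Python. The `override` helper and the tiny `base` template are my own illustrations, not part of any Nano Banana Pro tooling.

```python
import copy
import json


def override(template: dict, changes: dict) -> dict:
    """Return a copy of template with changes merged in, recursing into nested dicts."""
    result = copy.deepcopy(template)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are objects: merge field by field instead of replacing.
            result[key] = override(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# A tiny stand-in for a full JSON prompt template (hypothetical values).
base = {
    "subject": {"hair": {"style": "Messy high bun", "color": "Dark brown"}},
    "background": {"type": "Studio backdrop", "color": "Solid light grey"},
}

# Change one detail; every other field (the overall style) is left untouched.
variant = override(base, {"subject": {"hair": {"color": "Copper red"}}})

print(json.dumps(variant, indent=2))
```

Because the merge works on a deep copy, the base template stays reusable for the next variation.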

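2. Your first JSON prompt

Here is a template for reference. It avoids many of the model's defaults and shows the level of fidelity that produces good results.

The template defines the following sections:

  • Subject: demographics, facial features, hair, build, pose, and attitude
  • Background: color, texture, and depth of field
  • Style: medium, artistic references, color palette
  • Technical: camera, resolution
  • Lighting: type, source, details
  • Post-processing: look, color grading
  • Constraints: which elements to keep and which to avoid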
{
  "subject": {
    "demographics": {
      "age": "Early 20s",
      "gender": "Female"
    },
    "face": {
      "skin": {
        "tone": "Fair, porcelain complexion",
        "texture": "Smooth, high-end commercial retouch finish, soft natural blush on cheeks",
        "details": "Subtle nose contour, soft highlights on forehead and chin"
      },
      "eyes": {
        "color": "Striking blue-grey",
        "gaze_direction": "Looking upwards and slightly to the left",
        "makeup": "Defined upper lashes (mascara), subtle eyeliner, natural look",
        "eyebrows": "Thick, dark brown, well-groomed but natural arch, distinct individual hairs visible"
      },
      "mouth": {
        "shape": "Soft, relaxed",
        "color": "Natural pinkish-rose",
        "expression": "Neutral to slightly contemplative/whimsical"
      }
    },
    "hair": {
      "style": "Messy high bun/updo",
      "color": "Dark brown/brunette",
      "texture": "Fine but voluminous",
      "details": "Numerous loose flyaways and wisps framing the face and crown, chaotic but aesthetic 'bedhead' look"
    },
    "pose": {
      "head_position": "Front-facing",
      "body_position": "Shoulders slightly angled, arms crossed",
      "energy": "Casual, thinking, daydreaming"
    }
  },
  "attire": {
    "top": {
      "item": "Chunky knit sweater",
      "color": "Heather grey",
      "texture": "Heavy wool or cotton yarn, visible ribbed collar pattern, soft tactile fuzz",
      "fit": "Oversized, cozy"
    }
  },
  "photography": {
    "style": "High-key studio portrait",
    "shot_scale": "Medium close-up (chest up)",
    "lighting": {
      "type": "Soft, diffuse studio lighting",
      "source": "Large softbox frontal/overhead",
      "details": "Prominent catchlights in upper pupils, soft shadowing under the chin and nose, even illumination"
    },
    "camera_gear": {
      "lens": "85mm Portrait Lens",
      "aperture": "f/2.8",
      "focus": "Sharp focus on eyes and eyebrows, slight fall-off on shoulders and hair edges"
    },
    "post_processing": {
      "look": "Commercial clean aesthetic",
      "grading": "Neutral cool tones, true-to-life colors with slight saturation boost in eyes and lips"
    }
  },
  "background": {
    "type": "Studio backdrop",
    "color": "Solid light grey",
    "texture": "Smooth, featureless",
    "depth": "Flat, non-distracting"
  },
  "aesthetic_fidelity": {
    "medium": "Digital Photography",
    "vibe": "Minimalist, clean, cozy, introspection",
    "visual_qualities": [
      "High resolution",
      "Sharp details",
      "Soft color palette",
      "Textural contrast (smooth skin vs. rough knit)"
    ]
  },
  "constraints": {
    "must_keep": [
      "Upward gaze",
      "Messy hair flyaways",
      "Grey knit texture",
      "Blue eye color",
      "Studio grey background"
    ],
    "avoid": [
      "Smiling with teeth",
      "Direct eye contact with camera",
      "Complex background",
      "Harsh shadows",
      "Jewelry",
      "Glasses"
    ]
  },
  "negative_prompt": [
    "teeth",
    "smile",
    "looking at camera",
    "dark background",
    "patterned background",
    "jewelry",
    "earrings",
    "glasses",
    "low resolution",
    "blurry eyes",
    "overexposed",
    "heavy makeup",
    "red lipstick",
    "straight hair",
    "flat hair"
  ]
}

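3. Image to JSON

The easiest way to get an initial JSON prompt is to use an image as a reference. I use Gemini 3 Pro to generate JSON prompts because it has (for now) the best image understanding.

Here are the system instructions I use. Where it says [INSERT YOUR SAMPLE PROMPT HERE], you can use the template prompt above.

System prompt: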

You are an expert prompt engineer for Nano Banana Pro.
Your task is to convert the user's description into a sophisticated, EXTREMELY DETAILED JSON prompt.
You must output a single valid JSON object.

### JSON STRUCTURE GUIDELINES:
1. **Dynamic Fields**: You are encouraged to ADD new fields that capture specific details about the subject (e.g., "plating_style" for food, "architecture_era" for buildings, "glitch_patterns" for abstract art).
2. **Remove Irrelevant Fields**: Do NOT include fields that don't apply. If the subject is a stove, do not include "hair", "skin", or "pose". Remove them entirely rather than setting them to "N/A".
3. **Subject Specificity**:
- **For People**: The example structure (subject, face, skin, hair, clothing) is excellent. Keep it.
- **For Non-Humans**: Create a structure that fits the object. For example, a car might have "chassis", "paint_finish", "wheels".
4. **Standard Fields**: Always include "constraints" (with "must_keep" and "avoid" lists) and "negative_prompt".

### AESTHETIC GOALS:
- **Medium Specificity**: If the user asks for a specific style (e.g. "oil painting"), describe the brushwork, canvas texture, and drying cracks.
- **Lighting**: Be precise (soft, hard, volumetric, golden hour, studio, rim lighting).
- **Camera**: (focal length, depth of field) - ONLY if the style requires photorealism.

Use the following example as a reference for *depth* and *granularity*, but adapt the *keys* to your subject:
[INSERT YOUR SAMPLE PROMPT HERE]

Return ONLY the raw JSON string.
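
If you script this step, two small pieces of glue are useful: filling the placeholder with your sample prompt, and checking that the reply really is JSON with the standard fields the instructions ask for. The sketch below assumes nothing about how you call the model. `build_system_prompt` and `parse_reply` are hypothetical helpers, the instruction text is abbreviated, and `fake_reply` is a hand-written stand-in rather than real model output.

```python
import json

PLACEHOLDER = "[INSERT YOUR SAMPLE PROMPT HERE]"
FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable


def build_system_prompt(instruction_text: str, sample_prompt: dict) -> str:
    """Insert the sample JSON prompt where the placeholder sits."""
    if PLACEHOLDER not in instruction_text:
        raise ValueError("placeholder not found in the instruction text")
    return instruction_text.replace(PLACEHOLDER, json.dumps(sample_prompt, indent=2))


def parse_reply(reply: str) -> dict:
    """Parse a model reply as JSON and check the standard fields."""
    text = reply.strip()
    # Replies are sometimes wrapped in a Markdown fence despite the instructions.
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    prompt = json.loads(text)
    if not isinstance(prompt, dict):
        raise ValueError("reply is not a single JSON object")
    constraints = prompt.get("constraints")
    if not isinstance(constraints, dict):
        constraints = {}
    missing = [k for k in ("must_keep", "avoid") if not isinstance(constraints.get(k), list)]
    if not isinstance(prompt.get("negative_prompt"), list):
        missing.append("negative_prompt")
    if missing:
        raise ValueError(f"reply is missing standard fields: {missing}")
    return prompt


# Abbreviated instruction text; use the full system prompt in practice.
instruction_text = (
    "Use the following example as a reference:\n"
    + PLACEHOLDER
    + "\n\nReturn ONLY the raw JSON string."
)
sample = {"subject": {"demographics": {"age": "Early 20s"}}}
system_prompt = build_system_prompt(instruction_text, sample)

# Hand-written stand-in for a model reply (not real output).
fake_reply = FENCE + "json\n" + json.dumps({
    "subject": {"main": "A red bicycle"},
    "constraints": {"must_keep": ["Red frame"], "avoid": ["People"]},
    "negative_prompt": ["blurry"],
}) + "\n" + FENCE
parsed = parse_reply(fake_reply)
print(parsed["constraints"]["must_keep"])
```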

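Here is an example (a fairly complex input image) and the final output. The JSON that Gemini 3 Pro generated, which was then used to produce the output, is shown below. No reference image was passed in.

We now have:

  • a reusable JSON definition of the input image that we can modify to create variations;
  • an output image similar to the original, showing how well Gemini 3 Pro and Nano Banana Pro capture the essence of a given image.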
{
  "subject": {
    "main": "First-person POV hand holding a smartphone",
    "hand_details": {
      "appearance": "Male hand, light skin tone",
      "grip": "Holding an iPhone vertically, thumb hovering near the bottom bezel",
      "lighting": "Illuminated by the screen's glow and warm ambient room light"
    },
    "device": {
      "type": "Modern smartphone with notch (iPhone style)",
      "case": "Black slim case",
      "screen_state": "On, displaying photo gallery app"
    }
  },
  "screen_content": {
    "interface": {
      "app": "iOS Photos App",
      "header": "Time '22:22', Back arrow '< Gallery', Title 'For You'",
      "footer_tabs": "Gallery, For You (selected), Photos, Search"
    },
    "image_grid": {
      "layout": "3-column grid of thumbnails",
      "subject_matter": "Repeated photos of a woman with curly dark shoulder-length hair",
      "attire": "Cobalt blue textured cardigan, white top, beige pants",
      "activity": "Sitting by a window, sketching in a notebook/pad, holding up a drawing",
      "consistency": "Same woman in various poses, some candid, some looking at camera smiling"
    }
  },
  "environment": {
    "setting": "Cozy living room at night",
    "left_side": {
      "furniture": "Tall dark wood bookshelf packed with colorful book spines",
      "lighting": "Small mushroom-style lamp emitting warm yellow light on shelf",
      "decor": "Two plush toys on a side table: A large Snorlax wearing sunglasses and a smaller Eevee"
    },
    "center_right": {
      "electronics": "Large wall-mounted flat screen TV displaying a night cityscape (Hong Kong skyline aesthetic) and digital clock overlay '22:24'",
      "audio": "Black soundbar mounted below TV",
      "furniture": "Wooden media console, wooden coffee table in foreground (out of focus)"
    },
    "foliage": {
      "plant": "Monstera deliciosa leaves visible near the top, backlit by warm light"
    }
  },
  "photography": {
    "style": "POV Lifestyle Snapshot",
    "focus": "Sharp focus on the smartphone screen, shallow depth of field (strong bokeh) on background",
    "lighting": {
      "type": "Low-light indoor ambience",
      "sources": [
        "Warm tungsten lamp on bookshelf",
        "Cool blue light from TV screen",
        "Bright white light from phone display"
      ],
      "contrast": "High contrast between the bright phone screen and the dim, cozy room"
    },
    "camera_gear": {
      "lens": "24mm or 28mm wide angle",
      "aperture": "f/1.8 to f/2.0 (creating background blur)",
      "iso": "High ISO (slight grain visible in shadows)"
    }
  },
  "aesthetic_fidelity": {
    "medium": "Digital Photography",
    "vibe": "Domestic, cozy, tech-focused, intimate evening",
    "visual_qualities": [
      "Realistic screen reflection",
      "Warm color temperature",
      "Soft background blur",
      "Digital noise in dark areas"
    ]
  },
  "constraints": {
    "must_keep": [
      "Snorlax plushie with sunglasses",
      "Eevee plushie",
      "Time on phone 22:22",
      "Grid of photos showing woman in blue cardigan",
      "Cityscape on TV background",
      "Monstera leaves",
      "Warm bookshelf lighting"
    ],
    "avoid": [
      "Daylight",
      "Empty screen",
      "Blurry phone screen",
      "Third-person perspective",
      "Bright overhead lighting",
      "Vector art style"
    ]
  },
  "negative_prompt": [
    "blurry screen",
    "daylight",
    "sunlight",
    "empty room",
    "turned off tv",
    "cartoon filter",
    "illustration",
    "drawing style",
    "messy grid",
    "cracked screen",
    "bright white background",
    "missing plushies",
    "generic landscape on tv"
  ]
}

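Let's modify this JSON to keep the overall style but move it to the early 2000s. You can edit it by hand or have a language model do it. Notice the changes to the phone, the TV, and the stereo: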

{
  "subject": {
    "main": "First-person POV hand holding a vintage BlackBerry phone",
    "hand_details": {
      "appearance": "Male hand, light skin tone",
      "grip": "Holding the device with both hands, thumbs resting on the physical QWERTY keyboard",
      "lighting": "Illuminated by the small screen's backlight and warm room ambience"
    },
    "device": {
      "type": "Early 2000s BlackBerry (e.g., model 7230 or 6230)",
      "body": "Bulky blue/black plastic casing with side trackwheel",
      "screen_state": "On, displaying a low-resolution pixelated color interface"
    }
  },
  "screen_content": {
    "interface": {
      "os": "Retro BlackBerry OS",
      "header": "Status bar with battery icon, signal bars, and 'GPRS'",
      "context": "Email/Message inbox open"
    },
    "display_content": {
      "type": "Opened email attachment",
      "image_quality": "Highly pixelated, low-bit color depth (16-bit color)",
      "subject_matter": "A single grainy photo of a woman with curly dark shoulder-length hair",
      "attire": "Cobalt blue textured cardigan, white top",
      "text_overlay": "Sender: 'Sarah' - Subject: 'Me'"
    }
  },
  "environment": {
    "setting": "Cozy living room at night, circa 2003",
    "left_side": {
      "furniture": "Dark wood bookshelf packed with books and CD jewel cases",
      "lighting": "Lava lamp or translucent colored desk lamp emitting warm glow",
      "decor": "Two plush toys on a side table: A large Snorlax wearing sunglasses and a smaller Eevee (Gen 2 Pokémon era appropriate)"
    },
    "center_right": {
      "electronics": "Bulky CRT TV (Sony Trinitron style) displaying a music video channel with scanlines visible",
      "audio": "Silver stereo system with stack of components (CD changer, tape deck)",
      "furniture": "Wooden media console with glass doors, DVD cases visible inside"
    },
    "foliage": {
      "plant": "Monstera deliciosa leaves visible near the top, backlit by warm light"
    }
  },
  "photography": {
    "style": "POV Flash Snapshot / Film Photography",
    "focus": "Sharp focus on the BlackBerry physical keyboard and screen, soft background",
    "lighting": {
      "type": "Indoor tungsten mixed with CRT glow",
      "sources": [
        "Warm lamp on shelf",
        "Flickering blueish glow from CRT TV screen",
        "Small backlight from phone"
      ],
      "contrast": "Medium contrast, nostalgic film grain texture"
    },
    "camera_gear": {
      "lens": "35mm film camera simulation",
      "aperture": "f/2.8",
      "film_stock": "Kodak Gold 400 or Fujifilm Superia (grainy, warm tones)"
    }
  },
  "aesthetic_fidelity": {
    "medium": "Analog Photography scan or Early Digital",
    "vibe": "Nostalgic, Y2K era, tech-focused, intimate evening",
    "visual_qualities": [
      "Film grain",
      "CRT scanlines in background",
      "Plastic textures of old tech",
      "Low dynamic range typical of the era"
    ]
  },
  "constraints": {
    "must_keep": [
      "Snorlax plushie with sunglasses",
      "Eevee plushie",
      "Low-res photo of woman in blue cardigan on screen",
      "CRT TV in background",
      "Physical keyboard on phone",
      "Warm bookshelf lighting"
    ],
    "avoid": [
      "Modern smartphones",
      "Touchscreens",
      "HD flat screen TVs",
      "LED lighting strips",
      "High resolution screen display",
      "Vector art style"
    ]
  },
  "negative_prompt": [
    "iPhone",
    "Android",
    "touchscreen",
    "bezel-less screen",
    "4k tv",
    "modern furniture",
    "clean digital look",
    "illustration",
    "drawing style",
    "cracked screen",
    "bright white background",
    "missing plushies",
    "lcd screen"
  ]
}
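
With prompts this long, it can be hard to see exactly what changed between two versions. One way to check, sketched below, is to flatten both prompts into dotted paths and compare them. `flatten` and `diff_prompts` are hypothetical helpers, and the two dictionaries are short excerpts of the prompts above rather than the full JSON.

```python
import json


def flatten(node, prefix=""):
    """Map every leaf value to a dotted path, e.g. 'subject.device.type'."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def diff_prompts(old: dict, new: dict) -> dict:
    """List the paths that were changed, removed, or added between two prompts."""
    a, b = flatten(old), flatten(new)
    return {
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
        "removed": sorted(a.keys() - b.keys()),
        "added": sorted(b.keys() - a.keys()),
    }


plant = "Monstera deliciosa leaves visible near the top, backlit by warm light"

# Excerpt of the modern prompt.
modern = {
    "subject": {"device": {
        "type": "Modern smartphone with notch (iPhone style)",
        "case": "Black slim case",
    }},
    "environment": {"foliage": {"plant": plant}},
}

# Excerpt of the early-2000s prompt.
retro = {
    "subject": {"device": {
        "type": "Early 2000s BlackBerry (e.g., model 7230 or 6230)",
        "body": "Bulky blue/black plastic casing with side trackwheel",
    }},
    "environment": {"foliage": {"plant": plant}},
}

report = diff_prompts(modern, retro)
print(json.dumps(report, indent=2))
```

Paths that appear in none of the three lists, such as the Monstera leaves here, are the parts of the style that carried over unchanged.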

4. Creating variations easily

Nano Banana Pro can create variations for you automatically. Just put an instruction to change the JSON in front of your prompt:

Generate a new image with SIGNIFICANTLY different nouns, objects, color palette and pose compared to the JSON below. CRITICAL: Strictly preserve the original 'vibe', 'aesthetic', and 'mood'. The result should look like a distinct image from the same artistic series.

[INSERT YOUR JSON PROMPT HERE]

You can also steer the changes:

Additional Instruction: [eg Make it night time]

Generate a new image with SIGNIFICANTLY different nouns, objects, color palette and pose compared to the JSON below. CRITICAL: Strictly preserve the original 'vibe', 'aesthetic', and 'mood'. The result should look like a distinct image from the same artistic series.

[INSERT YOUR JSON PROMPT HERE]
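
If you generate variations in bulk, the two layouts above are easy to assemble in code. This is a minimal sketch: the instruction text is copied from above, while `build_variation_prompt` and the small `json_prompt` are hypothetical names used only for illustration.

```python
import json
from typing import Optional

VARIATION_INSTRUCTION = (
    "Generate a new image with SIGNIFICANTLY different nouns, objects, color "
    "palette and pose compared to the JSON below. CRITICAL: Strictly preserve "
    "the original 'vibe', 'aesthetic', and 'mood'. The result should look like "
    "a distinct image from the same artistic series."
)


def build_variation_prompt(json_prompt: dict, additional_instruction: Optional[str] = None) -> str:
    """Prepend the variation instruction (and an optional steer) to a JSON prompt."""
    parts = []
    if additional_instruction:
        parts.append(f"Additional Instruction: {additional_instruction}")
    parts.append(VARIATION_INSTRUCTION)
    parts.append(json.dumps(json_prompt, indent=2))
    # Blank lines separate the blocks, matching the layout shown above.
    return "\n\n".join(parts)


# A tiny stand-in for a full JSON prompt (hypothetical values).
json_prompt = {"aesthetic_fidelity": {"vibe": "Minimalist, clean, cozy"}}

free = build_variation_prompt(json_prompt)
steered = build_variation_prompt(json_prompt, "Make it night time")
print(steered)
```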

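Here are variations of the same JSON prompt generated with these techniques:

Another set of variations, with a different subject:

The downside of this approach is that you trade prompt accuracy for variety. Your prompt will no longer accurately describe the image you get.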

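5. Putting it all together

With AI Studio, you can combine all of this into a single app that lets you:

  • generate a JSON prompt from an image
  • refine a JSON prompt
  • create variations from a JSON prompt

You can run the app in AI Studio or get the code from GitHub.

A demo video of the app is also available.

6. JSON to prose

You don't have to use JSON. For any given JSON prompt there is a prose version that works just as well. If you come across a good JSON prompt, you can ask a language model to convert it to prose: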

Given this JSON, keep all the details and convert it to prose. Use only paragraphs. Be concise.

Here is an example of a complex prompt that doesn't use JSON:

A portrait photo of david duchovny, he is wearing a PSG kit, on his shoulder is a parthenos sylvia, in his hand he is holding lily of the valley, behind him there is a poster for the movie 'Last Year at Marienbad' and a Garchomp poster. On a plinth there is a jeff koons sculpture. The sky through the window shows lenticular clouds. There is also a Terminator 2 cross-stitch on the wall. On his arm there is a tattoo of a Nissan Qashqai. There's an orange lego Christmas tree. The room has a high end decor, a pot is boiling on the NEFF hob. Next to the pot is a little origami chair made from purple and yellow paper. He has a dangling monstera leaf earring. There is a pet marine iguana. In his other hand he is making the ok shape. On that hand there is a gold ring. The whole image gives the impression of a hastily taken photo, an everyday scene.

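7. Hybrid prompts

You don't have to prompt in pure JSON or pure plain text. You can start with plain text and then add JSON for finer control over the output.

For example, if you have a favorite JSON snippet that produces images in a particular style, you can append it to the end of a regular prompt. Nano Banana Pro can handle it. Your prompt doesn't need to be valid JSON; the model doesn't care.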
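
As a rough illustration, a hybrid prompt is just the plain-text description followed by the JSON snippet. The sketch below uses a hypothetical `hybrid_prompt` helper and an invented one-line description. The style fragment reuses values from the portrait template earlier in this article.

```python
import json


def hybrid_prompt(prose: str, style_fragment: dict) -> str:
    """Append a JSON style fragment to a plain-text prompt."""
    return prose.strip() + "\n\n" + json.dumps(style_fragment, indent=2)


# A favorite style snippet, taken from the portrait template's photography section.
style = {
    "photography": {
        "style": "High-key studio portrait",
        "lighting": {"type": "Soft, diffuse studio lighting"},
    }
}

# The combined text is not valid JSON as a whole, which is fine for prompting.
prompt = hybrid_prompt("A golden retriever sitting on a stool.", style)
print(prompt)
```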

8. Closing thoughts

Not every image needs a JSON prompt, but if you have a specific vision or an existing template, and the right tools, it is very powerful.


Original article: Prompting Nano Banana Pro with JSON

Translated and compiled by 汇智网. Please credit the source when reposting.