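Prompting Nano Banana Pro with JSON
JSON prompts are a way to give the model structured information. They act as templates, which makes outputs more repeatable. People rarely write JSON prompts by hand or from scratch; they usually use a tool (such as a language model) to build the template and then adjust the details. These templates then become a powerful tool for creating variations on a theme.
Nano Banana Pro can work from very minimal prompts. If you ask for a photo of a dog, you will find that the results differ from one another yet look alike: usually beautiful, warm, detailed, slightly rustic, and natural. With a JSON prompt, you can more easily take direct control of subject, style, composition, and lighting from the start. You don't have to remember what needs defining. This helps you avoid the defaults and get the result you want.
It doesn't have to be JSON. Any structured data will do, and I personally prefer YAML. The community, however, has largely settled on JSON.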
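To make the idea of spelling out subject, style, composition, and lighting concrete, here is a minimal Python sketch that builds a small structured prompt for the dog example and serializes it as JSON. The field names and values are assumptions chosen for illustration; the article does not prescribe a schema.

```python
import json

# Hypothetical field names and values: the model does not require this schema.
dog_prompt = {
    "subject": {"animal": "dog", "breed": "Border Collie", "pose": "mid-jump, catching a frisbee"},
    "style": {"medium": "35mm film photograph", "palette": "muted, cool tones"},
    "composition": {"shot_scale": "wide shot", "angle": "low angle"},
    "lighting": {"type": "flat overcast daylight"},
}


def to_prompt(spec: dict) -> str:
    """Serialize the structured spec as pretty-printed JSON text to paste in as a prompt."""
    return json.dumps(spec, indent=2, ensure_ascii=False)


prompt_text = to_prompt(dog_prompt)
print(prompt_text)
```

Each top-level key names one decision that a one-line prompt would otherwise leave to the model's defaults.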
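1. When to use JSON
I recommend using JSON when:
- the prompt is long or complex
- you want to change details while keeping the overall style
- you want to break away from the default style by explicitly defining most details
- you have a JSON template you want to use as a base
- you want to imitate an existing image or output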
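Changing details while keeping the overall style amounts to overriding a few fields of a template. The sketch below shows one way to do that in Python; `with_overrides` is a hypothetical helper, and the small `base` dictionary is an illustrative stand-in for a real template.

```python
import copy
import json


def with_overrides(template: dict, overrides: dict) -> dict:
    """Return a deep copy of `template` with `overrides` merged in recursively."""
    result = copy.deepcopy(template)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested objects so sibling fields are preserved.
            result[key] = with_overrides(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Illustrative stand-in for a full template.
base = {
    "attire": {"top": {"item": "Chunky knit sweater", "color": "Heather grey"}},
    "background": {"type": "Studio backdrop", "color": "Solid light grey"},
}

# Change only the sweater color; everything else stays as in the template.
variant = with_overrides(base, {"attire": {"top": {"color": "Forest green"}}})
print(json.dumps(variant, indent=2))
```

Because the helper copies the template, the base stays untouched and can be reused for further variants.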
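2. Your first JSON prompt
Here is a template for reference. It avoids many of the model's defaults and shows the level of fidelity that gets good results.
This template defines the following sections:
- Subject: demographics, facial features, hair, build, pose, and attitude
- Background: color, texture, and depth
- Style: medium, artistic references, color palette
- Technical: camera, resolution
- Lighting: type, source, details
- Post-processing: look, grading
- Constraints: what to keep and what to avoid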
{
"subject": {
"demographics": {
"age": "Early 20s",
"gender": "Female"
},
"face": {
"skin": {
"tone": "Fair, porcelain complexion",
"texture": "Smooth, high-end commercial retouch finish, soft natural blush on cheeks",
"details": "Subtle nose contour, soft highlights on forehead and chin"
},
"eyes": {
"color": "Striking blue-grey",
"gaze_direction": "Looking upwards and slightly to the left",
"makeup": "Defined upper lashes (mascara), subtle eyeliner, natural look",
"eyebrows": "Thick, dark brown, well-groomed but natural arch, distinct individual hairs visible"
},
"mouth": {
"shape": "Soft, relaxed",
"color": "Natural pinkish-rose",
"expression": "Neutral to slightly contemplative/whimsical"
}
},
"hair": {
"style": "Messy high bun/updo",
"color": "Dark brown/brunette",
"texture": "Fine but voluminous",
"details": "Numerous loose flyaways and wisps framing the face and crown, chaotic but aesthetic 'bedhead' look"
},
"pose": {
"head_position": "Front-facing",
"body_position": "Shoulders slightly angled, arms crossed",
"energy": "Casual, thinking, daydreaming"
}
},
"attire": {
"top": {
"item": "Chunky knit sweater",
"color": "Heather grey",
"texture": "Heavy wool or cotton yarn, visible ribbed collar pattern, soft tactile fuzz",
"fit": "Oversized, cozy"
}
},
"photography": {
"style": "High-key studio portrait",
"shot_scale": "Medium close-up (chest up)",
"lighting": {
"type": "Soft, diffuse studio lighting",
"source": "Large softbox frontal/overhead",
"details": "Prominent catchlights in upper pupils, soft shadowing under the chin and nose, even illumination"
},
"camera_gear": {
"lens": "85mm Portrait Lens",
"aperture": "f/2.8",
"focus": "Sharp focus on eyes and eyebrows, slight fall-off on shoulders and hair edges"
},
"post_processing": {
"look": "Commercial clean aesthetic",
"grading": "Neutral cool tones, true-to-life colors with slight saturation boost in eyes and lips"
}
},
"background": {
"type": "Studio backdrop",
"color": "Solid light grey",
"texture": "Smooth, featureless",
"depth": "Flat, non-distracting"
},
"aesthetic_fidelity": {
"medium": "Digital Photography",
"vibe": "Minimalist, clean, cozy, introspection",
"visual_qualities": [
"High resolution",
"Sharp details",
"Soft color palette",
"Textural contrast (smooth skin vs. rough knit)"
]
},
"constraints": {
"must_keep": [
"Upward gaze",
"Messy hair flyaways",
"Grey knit texture",
"Blue eye color",
"Studio grey background"
],
"avoid": [
"Smiling with teeth",
"Direct eye contact with camera",
"Complex background",
"Harsh shadows",
"Jewelry",
"Glasses"
]
},
"negative_prompt": [
"teeth",
"smile",
"looking at camera",
"dark background",
"patterned background",
"jewelry",
"earrings",
"glasses",
"low resolution",
"blurry eyes",
"overexposed",
"heavy makeup",
"red lipstick",
"straight hair",
"flat hair"
]
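}
3. Image to JSON
The easiest way to get an initial JSON prompt is to use an image as a reference. I use Gemini 3 Pro to generate the JSON prompt because it (currently) has the best image recognition.
Below is the system instruction I use. In place of [INSERT YOUR SAMPLE PROMPT HERE], you can use the template prompt above.
System prompt: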
You are an expert prompt engineer for Nano Banana Pro.
Your task is to convert the user's description into a sophisticated, EXTREMELY DETAILED JSON prompt.
You must output a single valid JSON object.
### JSON STRUCTURE GUIDELINES:
1. **Dynamic Fields**: You are encouraged to ADD new fields that capture specific details about the subject (e.g., "plating_style" for food, "architecture_era" for buildings, "glitch_patterns" for abstract art).
2. **Remove Irrelevant Fields**: Do NOT include fields that don't apply. If the subject is a stove, do not include "hair", "skin", or "pose". Remove them entirely rather than setting them to "N/A".
3. **Subject Specificity**:
- **For People**: The example structure (subject, face, skin, hair, clothing) is excellent. Keep it.
- **For Non-Humans**: Create a structure that fits the object. For example, a car might have "chassis", "paint_finish", "wheels".
4. **Standard Fields**: Always include "constraints" (with "must_keep" and "avoid" lists) and "negative_prompt".
### AESTHETIC GOALS:
- **Medium Specificity**: If the user asks for a specific style (e.g. "oil painting"), describe the brushwork, canvas texture, and drying cracks.
- **Lighting**: Be precise (soft, hard, volumetric, golden hour, studio, rim lighting).
- **Camera**: (focal length, depth of field) - ONLY if the style requires photorealism.
Use the following example as a reference for *depth* and *granularity*, but adapt the *keys* to your subject:
[INSERT YOUR SAMPLE PROMPT HERE]
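Return ONLY the raw JSON string.
Here is an example (with a fairly complex input image) and the final output. The JSON that Gemini 3 Pro generated, which was then used to produce the output, is shown below. No reference image was passed in.
We now have:
- a reusable JSON definition of the input image, which we can modify to create variations;
- an output image similar to the original, showing how well Gemini 3 Pro and Nano Banana Pro capture the essence of a given image.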
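Before looking at the generated JSON, note that filling in the placeholder in the system instruction and sanity-checking what comes back are both easy to script. The following Python sketch is only an illustration: the function names are hypothetical, no model is called, and the check covers only the standard fields the system instruction asks for ("constraints" with "must_keep" and "avoid", plus "negative_prompt").

```python
import json

PLACEHOLDER = "[INSERT YOUR SAMPLE PROMPT HERE]"


def build_system_instruction(instruction_text: str, sample_prompt: dict) -> str:
    """Replace the placeholder in the system instruction with a sample JSON prompt."""
    if PLACEHOLDER not in instruction_text:
        raise ValueError("placeholder not found in system instruction")
    return instruction_text.replace(PLACEHOLDER, json.dumps(sample_prompt, indent=2))


def check_generated_prompt(raw: str) -> dict:
    """Parse model output and verify the standard fields are present."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a single JSON object")
    constraints = data.get("constraints")
    if not isinstance(constraints, dict):
        raise ValueError("missing 'constraints' object")
    for key in ("must_keep", "avoid"):
        if not isinstance(constraints.get(key), list):
            raise ValueError(f"'constraints.{key}' should be a list")
    if not isinstance(data.get("negative_prompt"), list):
        raise ValueError("'negative_prompt' should be a list")
    return data


# Shortened stand-in for the full system instruction text.
instruction = "Use the following example as a reference:\n" + PLACEHOLDER
system_text = build_system_instruction(instruction, {"subject": {"main": "example"}})

# Stand-in for a model response.
sample_output = '{"subject": {}, "constraints": {"must_keep": [], "avoid": []}, "negative_prompt": []}'
checked = check_generated_prompt(sample_output)
```

A check like this catches responses that drop the standard fields before they are reused as templates.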
{
"subject": {
"main": "First-person POV hand holding a smartphone",
"hand_details": {
"appearance": "Male hand, light skin tone",
"grip": "Holding an iPhone vertically, thumb hovering near the bottom bezel",
"lighting": "Illuminated by the screen's glow and warm ambient room light"
},
"device": {
"type": "Modern smartphone with notch (iPhone style)",
"case": "Black slim case",
"screen_state": "On, displaying photo gallery app"
}
},
"screen_content": {
"interface": {
"app": "iOS Photos App",
"header": "Time '22:22', Back arrow '< Gallery', Title 'For You'",
"footer_tabs": "Gallery, For You (selected), Photos, Search"
},
"image_grid": {
"layout": "3-column grid of thumbnails",
"subject_matter": "Repeated photos of a woman with curly dark shoulder-length hair",
"attire": "Cobalt blue textured cardigan, white top, beige pants",
"activity": "Sitting by a window, sketching in a notebook/pad, holding up a drawing",
"consistency": "Same woman in various poses, some candid, some looking at camera smiling"
}
},
"environment": {
"setting": "Cozy living room at night",
"left_side": {
"furniture": "Tall dark wood bookshelf packed with colorful book spines",
"lighting": "Small mushroom-style lamp emitting warm yellow light on shelf",
"decor": "Two plush toys on a side table: A large Snorlax wearing sunglasses and a smaller Eevee"
},
"center_right": {
"electronics": "Large wall-mounted flat screen TV displaying a night cityscape (Hong Kong skyline aesthetic) and digital clock overlay '22:24'",
"audio": "Black soundbar mounted below TV",
"furniture": "Wooden media console, wooden coffee table in foreground (out of focus)"
},
"foliage": {
"plant": "Monstera deliciosa leaves visible near the top, backlit by warm light"
}
},
"photography": {
"style": "POV Lifestyle Snapshot",
"focus": "Sharp focus on the smartphone screen, shallow depth of field (strong bokeh) on background",
"lighting": {
"type": "Low-light indoor ambience",
"sources": [
"Warm tungsten lamp on bookshelf",
"Cool blue light from TV screen",
"Bright white light from phone display"
],
"contrast": "High contrast between the bright phone screen and the dim, cozy room"
},
"camera_gear": {
"lens": "24mm or 28mm wide angle",
"aperture": "f/1.8 to f/2.0 (creating background blur)",
"iso": "High ISO (slight grain visible in shadows)"
}
},
"aesthetic_fidelity": {
"medium": "Digital Photography",
"vibe": "Domestic, cozy, tech-focused, intimate evening",
"visual_qualities": [
"Realistic screen reflection",
"Warm color temperature",
"Soft background blur",
"Digital noise in dark areas"
]
},
"constraints": {
"must_keep": [
"Snorlax plushie with sunglasses",
"Eevee plushie",
"Time on phone 22:22",
"Grid of photos showing woman in blue cardigan",
"Cityscape on TV background",
"Monstera leaves",
"Warm bookshelf lighting"
],
"avoid": [
"Daylight",
"Empty screen",
"Blurry phone screen",
"Third-person perspective",
"Bright overhead lighting",
"Vector art style"
]
},
"negative_prompt": [
"blurry screen",
"daylight",
"sunlight",
"empty room",
"turned off tv",
"cartoon filter",
"illustration",
"drawing style",
"messy grid",
"cracked screen",
"bright white background",
"missing plushies",
"generic landscape on tv"
]
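}
Let's modify this JSON, keeping the overall style but moving it to the early 2000s. You can edit it by hand or have a language model do it. Note the changes to the phone, the TV, and the stereo: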
{
"subject": {
"main": "First-person POV hand holding a vintage BlackBerry phone",
"hand_details": {
"appearance": "Male hand, light skin tone",
"grip": "Holding the device with both hands, thumbs resting on the physical QWERTY keyboard",
"lighting": "Illuminated by the small screen's backlight and warm room ambience"
},
"device": {
"type": "Early 2000s BlackBerry (e.g., model 7230 or 6230)",
"body": "Bulky blue/black plastic casing with side trackwheel",
"screen_state": "On, displaying a low-resolution pixelated color interface"
}
},
"screen_content": {
"interface": {
"os": "Retro BlackBerry OS",
"header": "Status bar with battery icon, signal bars, and 'GPRS'",
"context": "Email/Message inbox open"
},
"display_content": {
"type": "Opened email attachment",
"image_quality": "Highly pixelated, low-bit color depth (16-bit color)",
"subject_matter": "A single grainy photo of a woman with curly dark shoulder-length hair",
"attire": "Cobalt blue textured cardigan, white top",
"text_overlay": "Sender: 'Sarah' - Subject: 'Me'"
}
},
"environment": {
"setting": "Cozy living room at night, circa 2003",
"left_side": {
"furniture": "Dark wood bookshelf packed with books and CD jewel cases",
"lighting": "Lava lamp or translucent colored desk lamp emitting warm glow",
"decor": "Two plush toys on a side table: A large Snorlax wearing sunglasses and a smaller Eevee (Gen 2 Pokémon era appropriate)"
},
"center_right": {
"electronics": "Bulky CRT TV (Sony Trinitron style) displaying a music video channel with scanlines visible",
"audio": "Silver stereo system with stack of components (CD changer, tape deck)",
"furniture": "Wooden media console with glass doors, DVD cases visible inside"
},
"foliage": {
"plant": "Monstera deliciosa leaves visible near the top, backlit by warm light"
}
},
"photography": {
"style": "POV Flash Snapshot / Film Photography",
"focus": "Sharp focus on the BlackBerry physical keyboard and screen, soft background",
"lighting": {
"type": "Indoor tungsten mixed with CRT glow",
"sources": [
"Warm lamp on shelf",
"Flickering blueish glow from CRT TV screen",
"Small backlight from phone"
],
"contrast": "Medium contrast, nostalgic film grain texture"
},
"camera_gear": {
"lens": "35mm film camera simulation",
"aperture": "f/2.8",
"film_stock": "Kodak Gold 400 or Fujifilm Superia (grainy, warm tones)"
}
},
"aesthetic_fidelity": {
"medium": "Analog Photography scan or Early Digital",
"vibe": "Nostalgic, Y2K era, tech-focused, intimate evening",
"visual_qualities": [
"Film grain",
"CRT scanlines in background",
"Plastic textures of old tech",
"Low dynamic range typical of the era"
]
},
"constraints": {
"must_keep": [
"Snorlax plushie with sunglasses",
"Eevee plushie",
"Low-res photo of woman in blue cardigan on screen",
"CRT TV in background",
"Physical keyboard on phone",
"Warm bookshelf lighting"
],
"avoid": [
"Modern smartphones",
"Touchscreens",
"HD flat screen TVs",
"LED lighting strips",
"High resolution screen display",
"Vector art style"
]
},
"negative_prompt": [
"iPhone",
"Android",
"touchscreen",
"bezel-less screen",
"4k tv",
"modern furniture",
"clean digital look",
"illustration",
"drawing style",
"cracked screen",
"bright white background",
"missing plushies",
"lcd screen"
]
}
4. Creating variations easily
Nano Banana Pro can create variations for you automatically. Just put an instruction to change the JSON in front of your prompt:
Generate a new image with SIGNIFICANTLY different nouns, objects, color palette and pose compared to the JSON below. CRITICAL: Strictly preserve the original 'vibe', 'aesthetic', and 'mood'. The result should look like a distinct image from the same artistic series.
[INSERT YOUR JSON PROMPT HERE]
You can also steer these changes:
Additional Instruction: [eg Make it night time]
Generate a new image with SIGNIFICANTLY different nouns, objects, color palette and pose compared to the JSON below. CRITICAL: Strictly preserve the original 'vibe', 'aesthetic', and 'mood'. The result should look like a distinct image from the same artistic series.
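[INSERT YOUR JSON PROMPT HERE]
Here are several variations of the same JSON prompt generated with these techniques:
Another set of variations, with a different subject:
The downside of this approach is that you trade prompt accuracy for variety. Your prompt will no longer accurately describe the image you get.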
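Assembling the variation prompt is plain string concatenation, so it can be scripted. The sketch below is a minimal Python version: the instruction text is copied from the prompts above, while the function name and the tiny example JSON are assumptions for illustration.

```python
import json
from typing import Optional

VARIATION_INSTRUCTION = (
    "Generate a new image with SIGNIFICANTLY different nouns, objects, color palette "
    "and pose compared to the JSON below. CRITICAL: Strictly preserve the original "
    "'vibe', 'aesthetic', and 'mood'. The result should look like a distinct image "
    "from the same artistic series."
)


def build_variation_prompt(json_prompt: dict, additional_instruction: Optional[str] = None) -> str:
    """Prepend the variation instruction (and an optional steering line) to a JSON prompt."""
    parts = []
    if additional_instruction:
        parts.append(f"Additional Instruction: {additional_instruction}")
    parts.append(VARIATION_INSTRUCTION)
    parts.append(json.dumps(json_prompt, indent=2))
    return "\n".join(parts)


# Tiny illustrative JSON prompt; a real one would be far more detailed.
example_json = {"aesthetic_fidelity": {"vibe": "Minimalist, clean, cozy"}}

plain = build_variation_prompt(example_json)
night = build_variation_prompt(example_json, "Make it night time")
print(night)
```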
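5. Putting it all together
With AI Studio, you can combine all of this into a single app that lets you:
- generate a JSON prompt from an image
- refine the JSON prompt
- create variations from a JSON prompt
You can run this app in AI Studio or get the code from GitHub.
Click here to watch a demo video of the app.
6. JSON to prose
You don't have to use JSON. For any given JSON prompt, there is a prose version that works just as well. If you come across a good JSON prompt, you can ask a language model to convert it to prose: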
Given this JSON, keep all the details and convert it to prose. Use only paragraphs. Be concise.
Here is an example of a complex prompt that does not use JSON:
A portrait photo of david duchovny, he is wearing a PSG kit, on his shoulder is a parthenos sylvia, in his hand he is holding lily of the valley, behind him there is a poster for the movie 'Last Year at Marienbad' and a Garchomp poster. On a plinth there is a jeff koons sculpture. The sky through the window shows lenticular clouds. There is also a Terminator 2 cross-stitch on the wall. On his arm there is a tattoo of a Nissan Qashqai. There's an orange lego Christmas tree. The room has a high end decor, a pot is boiling on the NEFF hob. Next to the pot is a little origami chair made from purple and yellow paper. He has a dangling monstera leaf earring. There is a pet marine iguana. In his other hand he is making the ok shape. On that hand there is a gold ring. The whole image gives the impression of a hastily taken photo, an everyday scene.
7. Hybrid prompting
You don't have to prompt with only JSON or only plain text. You can start with plain text and then add JSON for finer control over the output.
For example, if you have a favorite JSON snippet that produces a particular style of image, you can append it to the end of a regular prompt. Nano Banana Pro can handle it. Your prompt doesn't need to be valid JSON; the model doesn't care.
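As a rough sketch of the hybrid approach, the Python snippet below appends a reusable JSON style fragment to a plain-text prompt. The function name, the prose sentence, and the choice of fields are assumptions for illustration; the fragment values are borrowed from the portrait template earlier in the article.

```python
import json


def hybrid_prompt(prose: str, style_fragment: dict) -> str:
    """Append a JSON style fragment to a plain-text prompt, separated by a blank line."""
    return prose.strip() + "\n\n" + json.dumps(style_fragment, indent=2)


# A reusable style fragment; values borrowed from the portrait template.
style = {
    "photography": {
        "lighting": {"type": "Soft, diffuse studio lighting"},
        "post_processing": {"look": "Commercial clean aesthetic"},
    }
}

text = hybrid_prompt("A portrait of an elderly fisherman mending a net.", style)
print(text)
```

The combined string is not valid JSON as a whole, which is fine given that the model does not require it to be.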
8. Closing thoughts
Not every image needs a JSON prompt, but if you have a specific vision or an existing template, and the right tools, it is very powerful.
Original article: Prompting Nano Banana Pro with JSON
Translated and compiled by 汇智网; please credit the source when reposting.