Run75
Version Detail
Trained by Tensor
FLUX.1 Schnell
1930s photography mode
Part of my research and an extension of my wooly flux work. Through subsequent testing I've come close to a "one size fits all" solution to a couple of major use cases in training.
This LoRA was trained on 50 images. The first 30 were captioned using my own specialized captioning format, like so:
1930s photography mode, glamorous lighting, satin textures, era 1930s Hollywood fashion, satin gown with flowing train, solo, glamorous eveningwear, Rating SFW, a woman in a luxurious satin gown stands with effortless poise in a vintage living room, the fabric catching the light as her gown’s long train pools elegantly on the floor, the warm glow of the setting amplifying the timeless glamour of her ensemble, dramatic lighting filter.
--
The next 10 images were captioned with natural language like so:
In the dead of night, rain pours down in torrents, creating a loud, rhythmic sound as it hits the hard ground and metallic objects around it. The camera is focused on a large, industrial pillar or pole, its base firmly planted in the wet pavement. The rain cascades down its sides, pooling at the base, where small streams form and race toward the nearest drain. Every surface is slick and shiny, reflecting the dim, hazy light from street lamps further down the road.
The entire street seems abandoned, save for the relentless rain. The occasional flicker of distant light suggests a passing car, but no people are visible. The ground is littered with wet leaves and debris, swirling in small eddies as water rushes past. The scene is eerily quiet except for the constant drumming of the rain, creating an atmosphere of isolation and melancholy.
--
The final 10 images were captioned with a lot of tags like so:
rotary telephone, vintage phone, woman on phone, old telephone, analog phone, black and white, high contrast, soft lighting, woman, sweater, 1930s hairstyle, retro fashion, rotary dial, old technology, casual conversation, candid pose, old-style phone cord
--
I'm still working on it, but this 30/10/10 split seems to be close to a good mix if you want to get started making your own style LoRA.
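If you want to reproduce that split on your own dataset, here is a minimal Python sketch. It assumes the 30/10/10 split above translates to a 60/20/20 ratio for other dataset sizes, which I haven't verified beyond 50 images. The names CAPTION_MIX and assign_caption_styles are made up for illustration and are not part of any training tool.

```python
import random

# Assumed ratios, derived from the 30/10/10 split of 50 images described above.
CAPTION_MIX = {"structured": 0.6, "natural": 0.2, "tags": 0.2}


def assign_caption_styles(image_names, mix=CAPTION_MIX, seed=0):
    """Return {image_name: caption_style}, splitting images by the given ratios."""
    names = sorted(image_names)
    # Shuffle with a fixed seed so the plan is reproducible between runs.
    random.Random(seed).shuffle(names)

    styles = list(mix)
    # Whole-image counts per style; the last style absorbs any rounding leftovers.
    counts = [round(len(names) * mix[style]) for style in styles[:-1]]
    counts.append(max(0, len(names) - sum(counts)))

    plan, start = {}, 0
    for style, count in zip(styles, counts):
        for name in names[start:start + count]:
            plan[name] = style
        start += count
    return plan


if __name__ == "__main__":
    images = [f"img_{i:02d}.png" for i in range(50)]
    plan = assign_caption_styles(images)
    for style in CAPTION_MIX:
        print(style, sum(1 for s in plan.values() if s == style))
```

The plan only tells you which caption style to write for each image; the captions themselves still have to be written by hand or with whatever captioner you prefer.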
The captioning style in particular is my baby:
Understanding Caption Structure and Its Role
The model is trained to interpret visual prompts through structured captions that define styles, characters, and scene elements in a precise manner. Each section of the caption contributes specific information to guide the model’s output, ensuring accurate and cohesive results.
Mode:
What it does: Specifies the artistic medium or style in which the image should be rendered (e.g., oil painting, digital, 3D render).
Purpose: Sets the overall aesthetic of the output by determining how colors, textures, and shapes will be represented.
Additional Tags:
What they do: Describe techniques used within the chosen mode (e.g., smooth gradients, bold outlines).
Purpose: Refines the artistic approach, allowing for customization of brushwork, shading, or texture application.
Era:
What it does: Defines a specific time period or artistic movement (e.g., 1600s Baroque, 2020s Cyberpunk).
Purpose: Establishes the visual atmosphere by referencing historical or futuristic styles, influencing character design, architecture, and mood.
Fashion Style:
What it does: Describes the clothing or costume worn by subjects (e.g., streetwear, medieval armor).
Purpose: Helps in constructing the appearance and identity of characters by focusing on attire, reflecting the theme or setting.
Subject Count:
What it does: Specifies the number of characters or subjects in the scene (e.g., solo, duo).
Purpose: Controls the composition and dynamics of the scene, indicating if it’s focused on a single subject or involves interactions between multiple characters.
Unique Style Identifier:
What it does: Identifies a distinctive visual theme or style that sets the image apart (e.g., whimsical fantasy, futuristic warrior).
Purpose: Adds a signature element to the scene, guiding the model towards a specific mood or creative vision.
Rating:
What it does: Indicates the content rating (e.g., Rating SFW, Rating NSFW).
Purpose: Ensures the generated image adheres to appropriate standards based on intended usage, either safe-for-work or otherwise.
Prompt:
What it does: Describes the scene itself, broken down into detailed visual elements (e.g., “A character standing in a neon-lit city, wielding a plasma sword”).
Purpose: Provides the core description of what should be generated, focusing on characters, objects, and their interactions within the scene.
Filter:
What it does: Defines a visual effect to apply to the final image (e.g., soft light filter, sepia tone).
Purpose: Alters the appearance of the output by adding specific visual treatments, such as changes in color balance, contrast, or atmosphere.
How It Works Together:
Each part of the caption plays a distinct role in guiding the model. The mode sets the foundation for the art style, while the tags and era help fine-tune the specifics of the scene. Fashion style and subject count shape the subjects, while the unique style identifier ensures a clear and cohesive theme. Finally, the prompt and filters add narrative and finishing touches, creating a well-rounded, detailed output based on the desired visual direction.
This structured approach ensures flexibility and precision in generating artwork, allowing for a wide range of creative possibilities.
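To make the field order concrete, here is a small Python sketch that assembles a caption in the order listed above. The literal words "mode", "era", "Rating", and "filter" wrapped around the values are inferred from the example caption earlier on this page, and so is the mapping of that example's pieces onto fields; treat both as assumptions. build_caption is a hypothetical helper, not part of any trainer.

```python
def build_caption(mode, prompt, tags=(), era="", fashion="", subject_count="",
                  style_id="", rating="", filter_name=""):
    """Join the caption fields in order: mode, additional tags, era, fashion
    style, subject count, unique style identifier, rating, prompt, filter."""
    parts = [
        f"{mode} mode",
        *tags,
        f"era {era}" if era else "",
        fashion,
        subject_count,
        style_id,
        f"Rating {rating}" if rating else "",
        prompt,
        f"{filter_name} filter" if filter_name else "",
    ]
    # Skip empty fields so optional sections do not leave stray commas.
    return ", ".join(part.strip() for part in parts if part.strip())


if __name__ == "__main__":
    print(build_caption(
        mode="1930s photography",
        tags=["glamorous lighting", "satin textures"],
        era="1930s Hollywood fashion",
        fashion="satin gown with flowing train",
        subject_count="solo",
        style_id="glamorous eveningwear",
        rating="SFW",
        prompt="a woman in a luxurious satin gown stands in a vintage living room",
        filter_name="dramatic lighting",
    ))
```

Keeping the fields as separate arguments makes it easy to leave one out per image (for example, no fashion style on a still life) without changing the order of the rest.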
Project Permissions
Use in TENSOR GREEN Online
As an online training base model on TENSOR GREEN
Use without crediting me
Share merges of this model
Use different permissions on merges
Use Permissions
Sell generated contents
Use on generation services
Sell this model or merges
Commercial Use