Fantasy Vision Model Overview
I’m excited to introduce my latest checkpoint model, based on HunyuanDiT-v1.2. It has been trained over 60,000 steps to generate high-quality, fantasy-themed images with vibrant details.
Model Details:
Type: Photorealistic, fantasy-themed, vibrant details
Trigger Words: None required
Chinese language support: No
Output: High-detail, high-resolution images that closely resemble real-life photographs
Configuration Used for Training:
GPU: A6000
Dataset: Combination of 2 stock photos and my own custom dataset
Batch Size: 1
Optimizer: AdamW
Scheduler: Cosine
Learning Rate: 1e-5
Epochs: Target of 100 epochs
Captioning: GPT-4
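To give a feel for what a cosine scheduler does with the learning rate listed above, here is a minimal sketch in plain Python. It assumes the schedule spans exactly 60,000 steps, has no warmup, and decays to zero; none of that is confirmed by the configuration, and the function name `cosine_lr` is made up for illustration.

```python
import math

BASE_LR = 1e-5        # learning rate from the configuration above
TOTAL_STEPS = 60_000  # assumption: the schedule covers the full 60,000 steps


def cosine_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps (no warmup assumed)."""
    step = min(max(step, 0), total_steps)  # clamp to the schedule range
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


if __name__ == "__main__":
    # Print the learning rate at a few points along the (assumed) schedule.
    for s in (0, 15_000, 30_000, 45_000, 60_000):
        print(f"step {s:>6}: lr = {cosine_lr(s):.3e}")
```

Under these assumptions the rate starts at 1e-5, passes through half that value at the midpoint, and flattens out near zero at the end. The actual training run may have used warmup or a nonzero floor.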
Quick Guide and Parameters V2:
V2 differs from V1 in several significant ways. For example, V1 had issues with eyes, which I tried to fix in V2. I also worked on improving color tones and prompt accuracy. As we know, longer prompts don't perform well on DiT; I haven't fully resolved that yet, but further training might help. So far, it's looking good, one of the best DiT models out there😛
VAE: SDXL
Sampler: dpmpp_2m
Scheduler: sgm_uniform (Recommended for best results) | Karras (Now supported)
Sampling Steps: 25+
CFG Scale: 7
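If you script your generations, the V2 settings above can be kept in one place and checked before a run. The sketch below is only an illustration: the dictionary keys and the `check_settings` helper are hypothetical names, not part of any UI or library, and the scheduler names are compared in lowercase.

```python
from typing import List

# Recommended V2 settings copied from the list above; key names are hypothetical.
RECOMMENDED_V2 = {
    "vae": "SDXL",
    "sampler": "dpmpp_2m",
    "schedulers": ("sgm_uniform", "karras"),  # sgm_uniform is the recommended one
    "min_steps": 25,
    "cfg_scale": 7.0,
}


def check_settings(sampler: str, scheduler: str, steps: int, cfg: float) -> List[str]:
    """Return a warning for each setting that differs from the recommendations."""
    warnings = []
    if sampler != RECOMMENDED_V2["sampler"]:
        warnings.append(f"sampler '{sampler}' is not {RECOMMENDED_V2['sampler']}")
    if scheduler.lower() not in RECOMMENDED_V2["schedulers"]:
        warnings.append(f"scheduler '{scheduler}' is not sgm_uniform or karras")
    if steps < RECOMMENDED_V2["min_steps"]:
        warnings.append(f"{steps} steps is below the suggested 25+")
    if cfg != RECOMMENDED_V2["cfg_scale"]:
        warnings.append(f"CFG {cfg} differs from the suggested 7")
    return warnings


if __name__ == "__main__":
    for message in check_settings("euler", "normal", 20, 7.0):
        print("warning:", message)
```

These are starting points rather than hard limits, so the helper returns warnings and never raises.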
Quick Guide and Parameters:
VAE: SDXL
Sampler: dpmpp_2m
Scheduler: sgm_uniform (Recommended for best results)
Sampling Steps: 25+
CFG Scale: 7
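For readers unsure what the CFG scale controls, here is a minimal sketch of the usual classifier-free guidance formula, which pushes the model's prediction away from the unconditional output and toward the prompt-conditioned one. It uses plain Python lists in place of real latents, and `apply_cfg` is a hypothetical name; real samplers work on tensors and may add their own tweaks.

```python
from typing import List


def apply_cfg(uncond: List[float], cond: List[float], scale: float = 7.0) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # Toy values standing in for the model's two noise predictions.
    print(apply_cfg([0.0, 1.0], [1.0, 1.0], scale=7.0))
```

A scale of 1 returns the conditioned prediction unchanged. Higher values follow the prompt more strongly, usually at some cost in variety.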
Important: Please avoid using NSFW/mature content in your prompts, as it may lead to unreliable results. Additionally, shorter prompts tend to work better with both SD3 and DiT models.
Note:
This is not a merged or modified model. It is the original Realistic Vision fine-tuned model. Some users have been spreading incorrect information in the model's comment section. If you have any questions or want to know more, join my Discord server or share your thoughts in the comment section. Thank you for your time.