AnySomnium XL

CHECKPOINT
Original



[Proudly introducing AnySomniumXL v3, an SDXL model]

You can support me on Ko-Fi

This SDXL model has a 2D (cartoonish) style. It was trained from the base SDXL model (SDXL Base v1.0) together with text encoder training, so that it generates a 2D style from natural-language prompts and is unlikely to produce the realistic style inherent in SDXL Base.

The model was trained on 133,000+ images curated from hundreds of thousands of images gathered from various sources. The dataset keeps only images with an aesthetic score of at least 17 and at most 50 on our proprietary aesthetic scoring scale (the upper bound keeps the model cartoonish rather than too realistic), and that contain no text or watermarks, such as signatures or comic/manga pages. Images scoring below 17 or above 50, and images with watermarks or text, are discarded.
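The filtering rule described above can be sketched as a small predicate. This is only an illustration: the `Candidate` record and its field names are hypothetical, and the scores and watermark/text flags are assumed to come from the separate scoring and detection steps, which are proprietary and not shown here.

```python
from dataclasses import dataclass

# Thresholds taken from the description above.
MIN_SCORE = 17.0
MAX_SCORE = 50.0


@dataclass
class Candidate:
    """Hypothetical record for one scraped image (field names are assumptions)."""
    path: str
    aesthetic_score: float  # assumed to be precomputed by the aesthetic scorer
    has_watermark: bool     # assumed to be precomputed by the watermark detector
    has_text: bool          # assumed to be precomputed by a text detector


def keep_image(candidate: Candidate) -> bool:
    """Return True if the image passes the dataset filter."""
    if candidate.has_watermark or candidate.has_text:
        return False
    return MIN_SCORE <= candidate.aesthetic_score <= MAX_SCORE


def curate(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that pass the filter, preserving order."""
    return [c for c in candidates if keep_image(c)]
```

Both bounds are treated as inclusive here, matching "at least 17" and "a maximum of 50".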

AnySomniumXL v3 Technical Specifications:

  • Trained for 16 epochs (the AnySomniumXL results use epoch 16)

  • Captioned by a proprietary multimodal LLM, better than LLaVA

  • Trained with a bucket size of 1280x1280

  • Shuffle Caption: Yes

  • Clip Skip: 2

  • Trained with 2x NVIDIA A100 80GB

The dataset was built with a combination of the CLIP model and the MLP scoring method by christophschuhmann, which we modified. It uses ViT-L/14 to produce aesthetic scores on a scale of -1 to 100, and we added our own watermark detection.
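As a rough picture of how an embedding-plus-MLP scorer works, the sketch below takes an image embedding, L2-normalizes it, and passes it through a stack of linear layers to get a single score. It is a toy in pure Python, not the actual predictor: the weights are placeholders supplied by the caller, the use of plain linear layers with no activation is a simplifying assumption, the CLIP ViT-L/14 embedding step is assumed to happen elsewhere, and the rescaling to the -1 to 100 range is not shown because it is not described.

```python
import math

Vector = list[float]
Matrix = list[list[float]]  # one row of weights per output unit


def l2_normalize(vec: Vector) -> Vector:
    """Scale the embedding to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def linear(vec: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Apply one fully connected layer: out[i] = dot(weights[i], vec) + bias[i]."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def score_embedding(embedding: Vector, layers: list[tuple[Matrix, Vector]]) -> float:
    """Normalize the embedding, run it through the layers, return a scalar score.

    The final layer is assumed to have exactly one output unit.
    """
    hidden = l2_normalize(embedding)
    for weights, bias in layers:
        hidden = linear(hidden, weights, bias)
    return hidden[0]
```

In a real pipeline the embedding would be the image feature vector from the CLIP ViT-L/14 encoder and the layer weights would be loaded from a trained checkpoint.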

Achievements:

✓ Produces 2D-style images from natural language by default, without the need for excessive negative or positive prompts

✓ More likely to produce better fingers than the average Stable Diffusion model, without ADetailer or inpainting

✓ Produces a more authentic 2D style without the need for negative prompts

✓ Does not produce images with random watermarks or text

Limitations:

✓ Characters holding objects such as weapons or items are sometimes rendered incorrectly

✓ Still requires broader dataset training

✓ There are still some gaps in the text encoder, so there is room for improvement

✓ Text cannot be generated correctly

✓ The model is optimized for generating humans or mutated humans. Non-human subjects such as SCPs, ponies, and others may not turn out the way you expect

AnySomniumXL v3 Pro tips:

Because AnySomniumXL v3 was trained at 1280x1280, the best resolution for many aspect ratios may differ from that of a standard SDXL model.

Best resolutions (swap the two numbers to switch between landscape and portrait):

  • 1280x1280

  • 1472x1088

  • 1152x1408

  • 1536x1024

  • 1856x832

  • 1024x1600
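For picking one of the resolutions above automatically, a small helper can expand the list into both orientations and choose the entry whose aspect ratio is closest to a target. This is a convenience sketch, not part of the model: the function names are hypothetical, and comparing ratios in log space is just one reasonable choice of distance.

```python
import math

# Resolutions from the list above, stored as (width, height).
RECOMMENDED = [
    (1280, 1280),
    (1472, 1088),
    (1152, 1408),
    (1536, 1024),
    (1856, 832),
    (1024, 1600),
]


def all_orientations(resolutions):
    """Return each resolution plus its flipped version, without duplicates."""
    seen = []
    for width, height in resolutions:
        for pair in ((width, height), (height, width)):
            if pair not in seen:
                seen.append(pair)
    return seen


def closest_resolution(target_ratio):
    """Pick the (width, height) whose width/height ratio is nearest the target.

    Ratios are compared in log space so that 2:1 and 1:2 are treated
    as equally far from 1:1.
    """
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    target = math.log(target_ratio)
    return min(
        all_orientations(RECOMMENDED),
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )
```

For example, `closest_resolution(16 / 9)` selects the landscape flip of 1024x1600, and `closest_resolution(1.0)` selects 1280x1280.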

More versions are coming, with broader datasets and a trained text encoder. Our target is to build the largest clean dataset possible for our training. We recommend using this model with the Automatic1111 web UI.

Version Detail

SDXL 1.0
328940
20
Clip Skip 2, Lower Learning Rate, Text Encoder Trained, Natural Language Captioning

Project Permissions

    Use Permissions

  • Use in TENSOR GREEN Online

  • As an online training base model on TENSOR GREEN

  • Use without crediting me

  • Share merges of this model

  • Use different permissions on merges

    Commercial Use

  • Sell generated content

  • Use on generation services

  • Sell this model or merges
