
SD — News 23

StableSwarmUI 0.5.9 Alpha 👍

A modular Stable Diffusion web user interface, with an emphasis on making power tools easily accessible, high performance, and extensibility.

This project was built with a C# backend server to maximize performance while adding only minimal code complexity. While most ML projects are typically written in Python, that language simply isn't sufficient to reach this project's performance goals (namely, providing a very fast, responsive, multi-user-capable multi-backend service); in particular, it lacks true multithreading (due to Python's GIL), which was deemed essential for StableSwarmUI (it must be able to use the available CPU cores to serve user requests and manage internal data concurrently, so that it can respond to all requests as quickly as possible).
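The GIL point above can be illustrated with a small, self-contained Python sketch (not from the StableSwarmUI codebase): CPU-bound threads all compute the correct answer, but CPython executes their bytecode one thread at a time, so they gain essentially nothing over a serial loop.

```python
import threading

def count_primes(limit):
    """Naive CPU-bound work: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

results = []
lock = threading.Lock()

def worker(limit):
    r = count_primes(limit)
    with lock:
        results.append(r)

# Four threads, each doing the same CPU-bound job. All finish with the
# correct result, but wall-clock time is roughly 4x a single run because
# the GIL serializes Python bytecode execution across threads.
threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

This is exactly the workload shape (many concurrent requests touching shared state) where a language with real OS-level thread parallelism, such as C#, pays off.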



LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

This live demo lets you generate high-quality 3D content from text prompts. The outputs are a 360° rendered 3D Gaussian video and a training-progress visualization.


Civitai SDXL Safetensor Cog template


An attempt to use TensorRT with ComfyUI

Best suited for RTX 20xx, 30xx, and 40xx GPUs


SD-Turbo Model Card

SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation. We release SD-Turbo as a research artifact, and to study small, distilled text-to-image models. For increased quality and prompt understanding, we recommend SDXL-Turbo.


ComfyUI Custom Nodes

Custom nodes that extend the capabilities of ComfyUI



Experimental usage of stable-fast and TensorRT.


MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

We release the inference code and a Gradio demo. We are working to improve MagicAnimate, so stay tuned!



MotionDirector: Motion Customization of Text-to-Video Diffusion Models

Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse video generation. Given a set of video clips of the same motion concept, the task of Motion Customization is to adapt existing text-to-video diffusion models to generate videos with this motion. For example, generating a video with a car moving in a prescribed manner under specific camera movements to make a movie, or a video illustrating how a bear would lift weights to inspire creators.



Introducing SDXL Turbo: A Real-Time Text-to-Image Generation Model

SDXL Turbo achieves state-of-the-art performance with a new distillation technology, enabling single-step image generation with unprecedented quality, reducing the required step count from 50 to just one.

See our research paper for specific technical details regarding the model's new distillation technique, which leverages a combination of adversarial training and score distillation.

Download the model weights and code on Hugging Face, currently being released under a non-commercial research license that permits personal, non-commercial use.

Test SDXL Turbo on Stability AI's image editing platform Clipdrop, with a beta demonstration of its real-time text-to-image generation capabilities.
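A hedged sketch of how the single-step generation above is typically invoked, assuming the Hugging Face diffusers library, a CUDA GPU, and the stabilityai/sdxl-turbo weights (the heavy imports sit inside the function so nothing is downloaded until it is actually called):

```python
def generate_single_step(prompt, seed=0):
    """Sketch (not run here): one-step SDXL Turbo generation via the
    Hugging Face diffusers library. Turbo models are distilled for
    single-step sampling, so classifier-free guidance is disabled
    (guidance_scale=0.0) and num_inference_steps=1."""
    import torch
    from diffusers import AutoPipelineForText2Image  # lazy import: heavy dependency

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(
        prompt, num_inference_steps=1, guidance_scale=0.0, generator=generator
    ).images[0]
    return image
```

Disabling guidance matters: with one step there is no room for the usual guidance-scale trade-off, and the model card recommends `guidance_scale=0.0`.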



A minimalistic implementation of Robust Video Matting (RVM) in ComfyUI


Goodbye cold boot — how we made LoRA Inference 300% faster

We swap the Stable Diffusion LoRA adapters per user request while keeping the base model warm, allowing fast LoRA inference across multiple users. You can experience this by browsing our LoRA catalogue and playing with the inference widget.
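The adapter-swapping idea can be sketched with a toy low-rank update (plain Python, not the actual Hugging Face serving code): the expensive base weight W stays resident in memory ("warm"), and each request only swaps in a small (A, B) pair so the effective weight becomes W + A·B.

```python
# Toy sketch of per-request LoRA swapping. A LoRA adapter replaces a
# weight W with W + A @ B, where A and B are small low-rank factors;
# only (A, B) changes between users, never the base weight.

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

base_w = [[1.0, 0.0], [0.0, 1.0]]               # "warm" base weight (identity here)

adapters = {                                     # hypothetical per-user (A, B) pairs
    "user_a": ([[1.0], [0.0]], [[0.0, 2.0]]),   # rank-1: adds 2 at position (0, 1)
    "user_b": ([[0.0], [1.0]], [[3.0, 0.0]]),   # rank-1: adds 3 at position (1, 0)
}

def effective_weight(user):
    a, b = adapters[user]                        # swap in this user's adapter
    return add(base_w, matmul(a, b))             # W + A @ B

print(effective_weight("user_a"))
print(effective_weight("user_b"))
```

Because A and B are tiny compared to W, swapping them per request is cheap, which is what removes the cold-boot cost of reloading a full model per user.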


😴 LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes




Turn your ideas into emojis in seconds. Generate your favorite Slack emojis with just one click.



Fooocus is a rethinking of Stable Diffusion's and Midjourney's designs


ReActor for Stable Diffusion

The fast and simple FaceSwap extension, with many improvements and without an NSFW filter (uncensored; use it at your own risk)



Some wyrde workflows for ComfyUI


Custom Nodes, Extensions, and Tools for ComfyUI



sd-webui-comfyui is an extension for the A1111 webui that embeds ComfyUI workflows in different sections of the webui's normal pipeline. This makes it possible to create ComfyUI nodes that interact directly with parts of the webui's normal pipeline.



ComfyBox is a frontend to Stable Diffusion that lets you create custom image generation interfaces without any code. It uses ComfyUI under the hood for maximum power and extensibility.



Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold



We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.


ComfyUI ExLlama Nodes


Generative AI for Krita

Generate images from within Krita with minimal fuss: Select an area, push a button, and new content that matches your image will be generated. Or expand your canvas and fill new areas with generated content that blends right in. Text prompts are optional. No tweaking required!

Local. Open source. Free.



DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior


Inference of Stable Diffusion in pure C/C++

Plain C/C++ implementation based on ggml, working in the same way as llama.cpp
16-bit and 32-bit float support
4-bit, 5-bit, and 8-bit integer quantization support
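As a minimal sketch of what the integer quantization support above amounts to (symmetric round-to-nearest quantization; an illustration of the idea, not stable-diffusion.cpp's actual code):

```python
# Symmetric integer quantization: scale values into the signed integer
# range for the chosen bit width, round, and store the integers plus a
# single float scale; dequantization multiplies back by the scale.

def quantize(values, bits):
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8-bit
    scale = max(abs(v) for v in values) / qmax or 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.1, -0.5, 0.25, 1.0]                  # toy float weights
q8, s8 = quantize(weights, 8)
restored = dequantize(q8, s8)
print(q8, s8)
print(restored)
```

Lower bit widths (5-bit, 4-bit) shrink memory further at the cost of a coarser grid, which is why ggml-style formats also store per-block scales rather than one global scale; the sketch above keeps a single scale for clarity.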



Next generation face swapper and enhancer.



The most powerful and modular Stable Diffusion GUI and backend.


ComfyUI Examples

This repo contains examples of what is achievable with ComfyUI. All the images in this repo contain metadata, which means they can be loaded into ComfyUI with the Load button (or dragged onto the window) to recover the full workflow that was used to create the image.
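ComfyUI stores the workflow as JSON inside the image's PNG text chunks. A stdlib-only sketch of reading such metadata (the chunk layout follows the PNG specification; the example image and its "workflow" key are constructed here purely for illustration):

```python
import struct, zlib, json

# Walk a PNG's chunk stream and collect every tEXt chunk as key -> value.
def png_text_chunks(data):
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    out, pos = {}, 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":                      # keyword \x00 text
            key, _, value = body.partition(b"\x00")
            out[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length                        # length + type + body + CRC
    return out

# Helper to build a chunk (length, type, body, CRC over type+body).
def chunk(ctype, body):
    return (struct.pack(">I", len(body)) + ctype + body
            + struct.pack(">I", zlib.crc32(ctype + body)))

# Hypothetical example: a bare 1x1 PNG skeleton carrying a tiny workflow.
workflow = json.dumps({"nodes": [{"id": 1, "type": "KSampler"}]})
png = (b"\x89PNG\r\n\x1a\n"
       + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
       + chunk(b"tEXt", b"workflow\x00" + workflow.encode("latin-1"))
       + chunk(b"IEND", b""))

meta = png_text_chunks(png)
print(json.loads(meta["workflow"]))
```

Because the metadata rides along in standard PNG chunks, it survives copying and downloading, which is why dragging an example image onto the ComfyUI window is enough to restore the graph.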


WAS Node Suite

A node suite for ComfyUI with many new nodes, such as image processing, text processing, and more.




ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting

Diffusion-based image super-resolution (SR) methods are mainly limited by their low inference speed due to the requirement of hundreds or even thousands of sampling steps. Existing acceleration sampling techniques inevitably sacrifice performance to some extent, leading to over-blurry SR results. To address this issue, we propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps, thereby eliminating the need for post-acceleration during inference and its associated performance deterioration.
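The residual-shifting idea can be sketched as a noise-free scalar toy (a schematic of the paper's forward/reverse chain, not its implementation): the forward process shifts the residual between the high-resolution value x0 and the low-resolution value y along a short monotone schedule, so only a few steps are needed to traverse the whole chain.

```python
# Schematic, noise-free residual shifting on a single "pixel".
# Forward: x_t = x0 + eta_t * (y - x0), with eta_0 = 0 and eta_T = 1,
# so the chain starts at the HR value and ends at the LR value in just
# T steps (T = 4 here, vs. hundreds for vanilla diffusion SR).

T = 4
x0, y = 2.0, 6.0                       # toy "HR" and "LR" pixel values
eta = [t / T for t in range(T + 1)]    # monotone schedule, eta_0=0, eta_T=1

states = [x0 + eta_t * (y - x0) for eta_t in eta]
print(states)

# Deterministic reverse sweep: each step shifts the current state back
# toward the (here, perfectly) predicted x0 using the previous eta.
recovered = y
for t in range(T, 0, -1):
    recovered = x0 + eta[t - 1] * (recovered - x0)
print(recovered)
```

In the real model the chain carries noise and x0 is predicted by a network at every step, but the short-schedule structure is what removes the need for post-hoc acceleration.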


ControlNet and T2I-Adapter Examples

If you want good results, each ControlNet/T2I adapter needs the image passed to it to be in a specific format, such as depth maps or Canny edge maps, depending on the specific model.
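As a toy stand-in for such a preprocessor (a real pipeline would use e.g. OpenCV's Canny; this crude gradient threshold is only illustrative of turning an input photo into the edge-map format a model expects):

```python
# Crude edge-map "preprocessor": threshold horizontal/vertical pixel
# differences of a grayscale image into a binary (0/255) edge map.

def edge_map(img, threshold=50):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for yy in range(h - 1):
        for xx in range(w - 1):
            gx = img[yy][xx + 1] - img[yy][xx]   # horizontal difference
            gy = img[yy + 1][xx] - img[yy][xx]   # vertical difference
            if abs(gx) + abs(gy) >= threshold:
                out[yy][xx] = 255
    return out

# Tiny grayscale test image: a bright square on a dark background.
gray = [
    [0,   0,   0,   0],
    [0, 200, 200,   0],
    [0, 200, 200,   0],
    [0,   0,   0,   0],
]
print(edge_map(gray))
```

The point of the conditioning-format requirement is exactly this: the model was trained against a specific kind of map, so feeding it a raw photo instead of the preprocessed map gives poor results.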



InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.



With Auto-Photoshop-StableDiffusion-Plugin, you can directly use the capabilities of Automatic1111 Stable Diffusion in Photoshop without switching between programs. This allows you to easily use Stable Diffusion AI in a familiar environment. You can edit your Stable Diffusion image with all your favorite tools and save it right in Photoshop.


DWPose: New pose detection method


Installation: https://github.com/IDEA-Research/DWPose/blob/main/INSTALL.md

How to Download FFmpeg

Written Tutorial: https://www.nextdiffusion.ai

63 Photographers to use in SDXL prompts for AI art

AI-generated images to provide some inspiration for prompts in Stable Diffusion XL 1.0; every image was created with the prompt "Dragon on beach by [Photographer Name]". Photographer names were rejected from inclusion if they didn't produce good-quality results (not too much weird anatomy), weren't distinct in style from others included, or produced NSFW results.

Photographers included @ youtube.com/watch?v=6gKReFVqSLM
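A hypothetical sketch of how such a test grid could be generated, formatting the fixed prompt template with each name (the names below are illustrative stand-ins, not the article's list):

```python
# Build one prompt per photographer from the fixed template used above.
photographers = ["Ansel Adams", "Annie Leibovitz", "Steve McCurry"]  # illustrative

prompts = [f"Dragon on beach by {name}" for name in photographers]
print(prompts)
```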


Adds a JavaScript SVG editor (SVG-Edit) as a tab to Stable-Diffusion-Webui Automatic 1111.
Adds an interactive vectorizer (monochrome and color: "SVGCode") as a further tab.
Adds post-processing using the POTRACE executable to mass-convert your prompts from PNG to SVG.


Intel® Embree

is a high-performance ray tracing library developed at Intel, which supports x86 CPUs under Linux, macOS, and Windows; ARM CPUs on macOS; as well as Intel® Arc™ GPUs under Linux and Windows.



Temporally Coherent Stable Diffusion Videos via a Video Codec Approach

New research from China has used the well-established precepts of video frame encoding as a central approach to a new method for creating Stable Diffusion videos that are temporally consistent (i.e., that do not show jarring changes throughout the video).


DeOldify for Stable Diffusion WebUI

This is an extension for Stable Diffusion's AUTOMATIC1111 web UI that allows colorization of old photos. It is based on DeOldify.




Stable Diffusion Evaluation

This HuggingFace Space lets you compare Stable Diffusion v1.5 vs. SDXL image quality.

