
Image Variants: Only 90 Lines of Python


I saw a video claiming that the code behind most 'AI' image generators is so basic that it's almost fraudulent for companies to sell it. That may or may not be true, but I don't like paying for credits for anything, so I wondered what would happen if I asked ChatGPT for Python code that creates image variants locally using freely available libraries. And ... it worked, in only 90 lines of Python. Admittedly the results are very low grade, but it's free and local. The model pipeline is loaded with StableDiffusionDepth2ImgPipeline.from_pretrained("stabilityai/stable-diffusion-2-depth").

These are the libraries needed in the Python environment, installed from the Windows Command Prompt (CMD):

pip install torch
pip install diffusers
pip install Pillow
pip install numpy
pip install matplotlib
pip install opencv-python
pip install timm

Here is the script (with the usual disclaimer that it's for fun and study only, and if you use it, that's on you).
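Before running the script, it can help to confirm that the packages are actually importable from the interpreter you intend to use. The following is a minimal sketch, not part of the original script. The helper name `missing_modules` is made up for illustration, and the `transformers` entry is an assumption on my part: Stable Diffusion pipelines in diffusers normally rely on it for the text encoder, but it is not in the list above.

```python
import importlib.util

# Import names differ from pip names: Pillow imports as PIL, opencv-python as cv2.
REQUIRED = ["torch", "diffusers", "PIL", "numpy", "matplotlib", "cv2", "timm"]

# Assumption: not in the original install list, but usually needed by diffusers pipelines.
ASSUMED_EXTRA = ["transformers"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("assumed extra", ASSUMED_EXTRA)):
        missing = missing_modules(names)
        if missing:
            print(f"Missing {label} modules: {', '.join(missing)}")
        else:
            print(f"All {label} modules found.")
```

Only top-level module names are checked here, and nothing is imported, so the check is quick even when torch is installed.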

Silly Image Generator [Python]

import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import cv2

TORCH_AVAILABLE = False
DIFFUSERS_AVAILABLE = False
TIMM_AVAILABLE = False

try:
    import torch
    TORCH_AVAILABLE = True
except ImportError:
    pass

try:
    from diffusers import StableDiffusionDepth2ImgPipeline
    DIFFUSERS_AVAILABLE = True
except ImportError:
    pass

try:
    import timm
    import warnings
    warnings.filterwarnings("ignore", category=FutureWarning, module="timm.models.layers")
    TIMM_AVAILABLE = True
except ImportError:
    pass


def main():
    """Prompt for the inputs, generate the variants, and display them."""
    if not TORCH_AVAILABLE or not DIFFUSERS_AVAILABLE or not TIMM_AVAILABLE:
        print("Missing dependency: torch, diffusers and timm must all be installed.")
        return

    input_image_path = input("Enter the path to the input image: ").strip()

    if not os.path.exists(input_image_path):
        print(f"Input image not found: {input_image_path}")
        return

    content_prompt = input("Enter content keywords (e.g., 'A beautiful landscape'): ").strip()
    style_prompt = input("Enter style keywords (e.g., 'in the style of a woodcut.'): ").strip()

    # Save next to the input image; abspath avoids an empty directory name for bare filenames
    output_dir = os.path.dirname(os.path.abspath(input_image_path))
    variants = generate_variants(input_image_path, content_prompt, style_prompt, output_dir)

    # Display the generated variants
    if variants:
        fig, axes = plt.subplots(1, 4, figsize=(16, 4))
        for ax, variant_path in zip(axes, variants):
            ax.imshow(Image.open(variant_path))
            ax.axis("off")
        plt.show()


def generate_variants(input_image_path, content_prompt, style_prompt, output_dir):
    """Generates 4 variants of the input image based on content and style prompts."""
    os.makedirs(output_dir, exist_ok=True)

    try:
        # Load the model (float16 on "cuda" needs a CUDA-capable GPU)
        pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
            "stabilityai/stable-diffusion-2-depth",
            torch_dtype=torch.float16,
        ).to("cuda")
    except Exception as exc:
        print(f"Could not load the model: {exc}")
        return []

    # Convert to RGB so inputs with a palette or alpha channel are handled
    input_image = Image.open(input_image_path).convert("RGB")

    # Generate 4 variants
    variants = []
    for i in range(4):
        try:
            result = pipe(
                prompt=f"{content_prompt}, {style_prompt}",
                image=input_image,
                strength=0.8,
            )
            output_path = os.path.join(output_dir, f"variant_{i + 1}.png")
            result.images[0].save(output_path)
            variants.append(output_path)
        except Exception as exc:
            print(f"Variant {i + 1} failed: {exc}")
            break

    return variants


if __name__ == "__main__":
    main()
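The script hard-codes .to("cuda") with float16, so on a machine without an NVIDIA GPU the model load fails and no variants are produced. A possible workaround, sketched here under my own assumptions rather than taken from the original, is to pick the device and precision at runtime. The helper names `choose_device_and_dtype` and `load_pipeline` are made up for illustration, and generation on a CPU is likely to be very slow.

```python
def choose_device_and_dtype(cuda_available):
    """Return (device, dtype_name): half precision on GPU, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    # float16 is generally intended for GPU use, so fall back to float32 on CPU
    return "cpu", "float32"


def load_pipeline(model_id="stabilityai/stable-diffusion-2-depth"):
    """Load the depth-to-image pipeline on whichever device is available."""
    # Imported here so the helper above can be used without torch installed
    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, dtype_name),  # e.g. torch.float16
    )
    return pipe.to(device)


if __name__ == "__main__":
    print(choose_device_and_dtype(False))
```

In the script, the pipe = ... assignment inside generate_variants could then be replaced with pipe = load_pipeline().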

If running it from IDLE, you'll be prompted for the path to the input image and then for the content and style keywords; after that you'll see the image generating in steps, as shown below.

The first image in this post, above, was the input image for testing.


The images below are outputs from the Python script:


Specifically prompting for medieval colours:

It may be worth trying different diffusion models that are more recent and powerful, although the steps needed to get them working may be annoying. Browse https://huggingface.co/models?pipeline_tag=image-to-image&sort=trending for candidates. The Python script does image-to-image generation, so bear that in mind when choosing an alternative model.

Clicking 'Use this model' on a model page pops up some info on the Python code needed to load the desired model.
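As a rough idea of what swapping in another model could look like, here is a minimal sketch using the generic AutoPipelineForImage2Image loader from diffusers instead of the depth-specific class. This is an assumption on my part, not something I tested for this post: the function names `build_prompt` and `generate_with_model` are made up for illustration, the model ID has to be supplied by you, and not every model in that listing will load through this class, so the snippet on the model page takes precedence.

```python
def build_prompt(content_prompt, style_prompt):
    """Join content and style keywords the way the script does, skipping empty parts."""
    parts = [p.strip() for p in (content_prompt, style_prompt) if p and p.strip()]
    return ", ".join(parts)


def generate_with_model(model_id, image_path, content_prompt, style_prompt, strength=0.8):
    """Run one image-to-image pass with a user-chosen model (assumes a CUDA GPU)."""
    # Imported here so build_prompt can be used without the heavy dependencies
    import torch
    from PIL import Image
    from diffusers import AutoPipelineForImage2Image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        model_id,                    # copied from the model page, not chosen here
        torch_dtype=torch.float16,
    ).to("cuda")

    image = Image.open(image_path).convert("RGB")
    result = pipe(
        prompt=build_prompt(content_prompt, style_prompt),
        image=image,
        strength=strength,           # same setting as in the script above
    )
    return result.images[0]


if __name__ == "__main__":
    print(build_prompt("A beautiful landscape", "in the style of a woodcut."))
```

The returned object is a PIL image, so it can be saved with .save("variant.png") just like the outputs in the script.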



