Dataset Management

Some convenience functions for getting data to play with.

CelebA

Celebrity faces. A useful test dataset, since the faces are roughly aligned and thus easier to generate.


source

tensor_to_image

 tensor_to_image (t)
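The implementation isn't shown on this page; here's a minimal sketch of the usual conversion, assuming (as the `*2-1` in the usage cells suggests) that the function expects a `(C, H, W)` tensor with values in `[-1, 1]`:

```python
import torch
from PIL import Image

def tensor_to_image(t):
    # Assumed behaviour: map a (C, H, W) tensor in [-1, 1] to an 8-bit PIL image.
    t = (t.clamp(-1, 1) + 1) / 2                     # rescale to [0, 1]
    arr = (t * 255).byte().permute(1, 2, 0).cpu().numpy()  # to (H, W, C) uint8
    return Image.fromarray(arr)

img = tensor_to_image(torch.rand(3, 64, 64) * 2 - 1)
print(img.size)  # (64, 64)
```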

source

get_celebA_dl

 get_celebA_dl (img_size=128, batch_size=32)

source

CelebADataset

 CelebADataset (img_size=128)

One option: custom Dataset class

And in action:

dl = get_celebA_dl()
images, texts = next(iter(dl))
tensor_to_image(images[0]*2-1)
Using custom data configuration huggan--CelebA-faces-8a807f0d7d4912ca
Downloading and preparing dataset image_folder/default (download: 1.29 GiB, generated: 1.06 GiB, post-processed: Unknown size, total: 2.35 GiB) to /root/.cache/huggingface/datasets/parquet/huggan--CelebA-faces-8a807f0d7d4912ca/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901...
Dataset parquet downloaded and prepared to /root/.cache/huggingface/datasets/parquet/huggan--CelebA-faces-8a807f0d7d4912ca/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901. Subsequent calls will reuse this data.

ImageWoof

Dogs - what’s not to love?


source

get_imagewoof_dl

 get_imagewoof_dl (img_size=128, batch_size=32)

source

ImageWoof

 ImageWoof (img_size=128)

One option: custom Dataset class

dl = get_imagewoof_dl()
images, texts = next(iter(dl))
print(texts[0])
tensor_to_image(images[0]*2-1)
Using custom data configuration johnowhitaker--imagewoof2-320-6229576297321d90
Reusing dataset parquet (/root/.cache/huggingface/datasets/parquet/johnowhitaker--imagewoof2-320-6229576297321d90/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901)
A photo of a Dingo
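The "A photo of a Dingo" caption is presumably built from the class label. A sketch of that pattern, where the breed list is an assumption based on the ten imagewoof2 classes rather than taken from the library source:

```python
# Hypothetical label list for the ten ImageWoof breeds (an assumption):
BREEDS = ["Australian terrier", "Border terrier", "Samoyed", "Beagle",
          "Shih-Tzu", "English foxhound", "Rhodesian ridgeback", "Dingo",
          "Golden retriever", "Old English sheepdog"]

def caption_for(label_idx):
    # Turn an integer class label into a text prompt.
    return f"A photo of a {BREEDS[label_idx]}"

print(caption_for(7))  # A photo of a Dingo
```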

Conceptual Captions 12M

These images come with associated text, which makes them useful for text-to-image testing and so on. I’ve been meaning to add LAION as well, but for now this works.


source

get_cc12m_dl

 get_cc12m_dl (img_size=128, batch_size=32, url=None, num_workers=8)
dl = get_cc12m_dl()
images, texts = next(iter(dl))
tensor_to_image(images[0]*2-1)
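A minimal sketch of what iterating such an (image, caption) loader yields, with dummy data standing in for the CC12M download; the `collate_pairs` helper is illustrative, not part of the library:

```python
import torch
from torch.utils.data import DataLoader

def collate_pairs(batch):
    # Stack the image tensors into one batch; keep captions as a list of strings.
    images = torch.stack([img for img, _ in batch])
    texts = [txt for _, txt in batch]
    return images, texts

# Dummy (image, caption) pairs in place of the real dataset:
pairs = [(torch.rand(3, 128, 128), f"caption {i}") for i in range(8)]
dl = DataLoader(pairs, batch_size=4, collate_fn=collate_pairs)

images, texts = next(iter(dl))
print(images.shape, len(texts))  # torch.Size([4, 3, 128, 128]) 4
```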

Paired VQGAN reconstructions

I also tried an image repair task, which requires a low-quality and a high-quality version of each image. The low-quality version is a 256px image that has been encoded and then decoded with VQGAN; the target is the 512px reference image.


source

get_paired_vqgan

 get_paired_vqgan (batch_size=32)
dl = get_paired_vqgan()
lq, hq = next(iter(dl))
print(lq.shape, hq.shape)
tensor_to_image(lq[0]*2-1)
torch.Size([32, 3, 256, 256]) torch.Size([32, 3, 512, 512])
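The pairing idea can be sketched as a dataset that returns (low-quality, high-quality) tuples. The real version runs the 256px image through a VQGAN encode/decode round trip; here plain bilinear resizing stands in for that step so the example runs without the model, and the class name is illustrative:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

class PairedReconstructionDataset(Dataset):
    """Sketch: yield (lq, hq) pairs from 512px reference images."""
    def __init__(self, images_512):
        self.images = images_512  # list of (3, 512, 512) tensors

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        hq = self.images[idx]
        # Stand-in for the VQGAN round trip: downsample to 256px.
        lq = F.interpolate(hq[None], size=256, mode="bilinear",
                           align_corners=False)[0]
        return lq, hq

ds = PairedReconstructionDataset([torch.rand(3, 512, 512) for _ in range(2)])
lq, hq = ds[0]
print(lq.shape, hq.shape)  # torch.Size([3, 256, 256]) torch.Size([3, 512, 512])
```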