Dataset Management

Some convenience functions for getting data to play with.

CelebA

Celebrity faces. A useful test dataset, since the faces are roughly aligned and thus easier to generate.


source

tensor_to_image

 tensor_to_image (t)
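The implementation isn't shown on this page; here's a minimal sketch of the usual conversion, assuming (as the `*2-1` in the usage cells suggests) that the function expects a `(C, H, W)` tensor with values in `[-1, 1]`:

```python
import torch
from PIL import Image

def tensor_to_image(t):
    # Assumed behaviour: map a (C, H, W) tensor in [-1, 1] to an 8-bit PIL image.
    t = (t.clamp(-1, 1) + 1) / 2                     # rescale to [0, 1]
    arr = (t * 255).byte().permute(1, 2, 0).cpu().numpy()  # to (H, W, C) uint8
    return Image.fromarray(arr)

img = tensor_to_image(torch.rand(3, 64, 64) * 2 - 1)
print(img.size)  # (64, 64)
```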

source

get_celebA_dl

 get_celebA_dl (img_size=128, batch_size=32)

source

CelebADataset

 CelebADataset (img_size=128)

One option: custom Dataset class

And in action:

dl = get_celebA_dl()
images, texts = next(iter(dl))
tensor_to_image(images[0]*2-1)
Using custom data configuration huggan--CelebA-faces-8a807f0d7d4912ca
Downloading and preparing dataset image_folder/default (download: 1.29 GiB, generated: 1.06 GiB, post-processed: Unknown size, total: 2.35 GiB) to /root/.cache/huggingface/datasets/parquet/huggan--CelebA-faces-8a807f0d7d4912ca/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901...
Dataset parquet downloaded and prepared to /root/.cache/huggingface/datasets/parquet/huggan--CelebA-faces-8a807f0d7d4912ca/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901. Subsequent calls will reuse this data.

ImageWoof

Dogs - what’s not to love?


source

get_imagewoof_dl

 get_imagewoof_dl (img_size=128, batch_size=32)

source

ImageWoof

 ImageWoof (img_size=128)

One option: custom Dataset class

dl = get_imagewoof_dl()
images, texts = next(iter(dl))
print(texts[0])
tensor_to_image(images[0]*2-1)
Using custom data configuration johnowhitaker--imagewoof2-320-6229576297321d90
Reusing dataset parquet (/root/.cache/huggingface/datasets/parquet/johnowhitaker--imagewoof2-320-6229576297321d90/0.0.0/0b6d5799bb726b24ad7fc7be720c170d8e497f575d02d47537de9a5bac074901)
A photo of a Dingo
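The "A photo of a Dingo" caption is presumably built from the class label. A sketch of that pattern, where the breed list is an assumption based on the ten imagewoof2 classes rather than taken from the library source:

```python
# Hypothetical label list for the ten ImageWoof breeds (an assumption):
BREEDS = ["Australian terrier", "Border terrier", "Samoyed", "Beagle",
          "Shih-Tzu", "English foxhound", "Rhodesian ridgeback", "Dingo",
          "Golden retriever", "Old English sheepdog"]

def caption_for(label_idx):
    # Turn an integer class label into a text prompt.
    return f"A photo of a {BREEDS[label_idx]}"

print(caption_for(7))  # A photo of a Dingo
```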

Conceptual Captions 12M

These images come with associated text, which makes them useful for text-to-image testing and so on. I’ve been meaning to add LAION as well, but for now this works.


source

get_cc12m_dl

 get_cc12m_dl (img_size=128, batch_size=32, url=None, num_workers=8)
dl = get_cc12m_dl()
images, texts = next(iter(dl))
tensor_to_image(images[0]*2-1)
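A minimal sketch of what iterating such an (image, caption) loader yields, with dummy data standing in for the CC12M download; the `collate_pairs` helper is illustrative, not part of the library:

```python
import torch
from torch.utils.data import DataLoader

def collate_pairs(batch):
    # Stack the image tensors into one batch; keep captions as a list of strings.
    images = torch.stack([img for img, _ in batch])
    texts = [txt for _, txt in batch]
    return images, texts

# Dummy (image, caption) pairs in place of the real dataset:
pairs = [(torch.rand(3, 128, 128), f"caption {i}") for i in range(8)]
dl = DataLoader(pairs, batch_size=4, collate_fn=collate_pairs)

images, texts = next(iter(dl))
print(images.shape, len(texts))  # torch.Size([4, 3, 128, 128]) 4
```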

Paired VQGAN reconstructions

I also tried an image repair task, which requires a low-quality and a high-quality version of each image. The low-quality version is a 256px image that has been encoded and then decoded with VQGAN; the target is the 512px reference image.


source

get_paired_vqgan

 get_paired_vqgan (batch_size=32)
dl = get_paired_vqgan()
lq, hq = next(iter(dl))
print(lq.shape, hq.shape)
tensor_to_image(lq[0]*2-1)
torch.Size([32, 3, 256, 256]) torch.Size([32, 3, 512, 512])
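The pairing idea can be sketched as a dataset that returns (low-quality, high-quality) tuples. The real version runs the 256px image through a VQGAN encode/decode round trip; here plain bilinear resizing stands in for that step so the example runs without the model, and the class name is illustrative:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

class PairedReconstructionDataset(Dataset):
    """Sketch: yield (lq, hq) pairs from 512px reference images."""
    def __init__(self, images_512):
        self.images = images_512  # list of (3, 512, 512) tensors

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        hq = self.images[idx]
        # Stand-in for the VQGAN round trip: downsample to 256px.
        lq = F.interpolate(hq[None], size=256, mode="bilinear",
                           align_corners=False)[0]
        return lq, hq

ds = PairedReconstructionDataset([torch.rand(3, 512, 512) for _ in range(2)])
lq, hq = ds[0]
print(lq.shape, hq.shape)  # torch.Size([3, 256, 256]) torch.Size([3, 512, 512])
```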