markweber/vqgan_plus_12bit


---
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
model-index:
- name: VQGAN+
  results:
  - task:
      type: image-generation
    dataset:
      name: ILSVRC/imagenet-1k
      type: ILSVRC/imagenet-1k
    metrics:
    - name: rFID
      type: rFID
      value: 1.39
    - name: InceptionScore
      type: InceptionScore
      value: 193.9
    - name: LPIPS
      type: LPIPS
      value: 0.315
    - name: PSNR
      type: PSNR
      value: 21
    - name: SSIM
      type: SSIM
      value: 0.55
    - name: CodebookUsage
      type: CodebookUsage
      value: 1.0
---

This model is the VQGAN+ tokenizer with a 12-bit vocabulary, i.e. 2^12 = 4096 codebook entries. It uses a downsampling factor of 16 and was trained on ImageNet at a resolution of 256×256, so each image maps to a 16×16 grid of tokens.
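
The numbers above fix the shape of the token sequence. The following is a minimal sketch of that arithmetic, not code from the VQGAN+ release: the `TokenizerSpec` class and its field names are hypothetical, and it assumes square inputs and one codebook index per latent position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenizerSpec:
    """Hypothetical helper holding the three numbers stated in this card."""

    image_size: int = 256  # input resolution (assumed square)
    downsample: int = 16   # spatial downsampling factor
    vocab_bits: int = 12   # vocabulary size in bits

    @property
    def codebook_size(self) -> int:
        # A b-bit vocabulary can index 2**b distinct codes.
        return 2 ** self.vocab_bits

    @property
    def grid_side(self) -> int:
        # Side length of the latent token grid.
        if self.image_size % self.downsample != 0:
            raise ValueError("image_size must be divisible by downsample")
        return self.image_size // self.downsample

    @property
    def num_tokens(self) -> int:
        # Assumes one token per latent position on a square grid.
        return self.grid_side ** 2

    @property
    def bits_per_image(self) -> int:
        # Size of the discrete code for one image, before any entropy coding.
        return self.num_tokens * self.vocab_bits


spec = TokenizerSpec()
print(f"codebook entries: {spec.codebook_size}")
print(f"token grid:       {spec.grid_side} x {spec.grid_side}")
print(f"tokens per image: {spec.num_tokens}")
print(f"bits per image:   {spec.bits_per_image}")
```

Under these assumptions a 256×256 image becomes 256 tokens drawn from 4096 codes, which is 3072 bits (384 bytes) per image.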

You can find more details on the [project page](https://weber-mark.github.io/projects/maskbit.html) and in the [paper](https://arxiv.org/abs/2409.16211).