dataset package

Submodules

dataset.cifar module

class dataset.cifar.CIFAR10(root, train=True, transform=None, target_transform=None, download=False, coarse=False)

Bases: torch.utils.data.dataset.Dataset

CIFAR10 Dataset.

Parameters
  • root (string) – Root directory of the dataset, where the directory cifar-10-batches-py exists or will be saved if download is set to True.

  • train (bool, optional) – If True, creates dataset from training set, otherwise creates from test set.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

base_folder = 'cifar-10-batches-py'
download()
filename = 'cifar-10-python.tar.gz'
meta = {'filename': 'batches.meta', 'key': 'label_names', 'md5': '5ff9c542aee3614f3951f8cda6e48888'}
test_list = [['test_batch', '40351d587109b95175f43aff81a1287e']]
tgz_md5 = 'c58f30108f718f92721af3b95e74349a'
train_list = [['data_batch_1', 'c99cafc152244af753f735de768cd75f'], ['data_batch_2', 'd4bba439e000b95fd0a9bffe97cbabec'], ['data_batch_3', '54ebc095f3ab1f0389bbae665268c751'], ['data_batch_4', '634d18415352ddfa80567beed471001a'], ['data_batch_5', '482c414d41f54cd18b22e5b47cb7c3cb']]
url = 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'

class dataset.cifar.CIFAR100(root, train=True, transform=None, target_transform=None, download=False, coarse=False)

Bases: dataset.cifar.CIFAR10

CIFAR100 Dataset.

This is a subclass of the CIFAR10 Dataset.

base_folder = 'cifar-100-python'
filename = 'cifar-100-python.tar.gz'
meta = {'filename': 'meta', 'key': 'fine_label_names', 'md5': '7973b15100ade9c7d40fb424638fde48'}
test_list = [['test', 'f0ef6b0ae62326f3e7ffdfab6717acfc']]
tgz_md5 = 'eb9058c3a382ffc7106e4002c42a8d85'
train_list = [['train', '16019d7e3df5f24257cddd939b257f8d']]
url = 'https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz'
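
The meta attribute on both classes names a metadata file and the key under which class names are stored. A minimal sketch of how such a record might be consumed follows. It assumes the metadata file is a pickled dict; load_class_names is a hypothetical helper, not part of this package, and the demo writes a stand-in file rather than reading the real archive.

```python
import os
import pickle
import tempfile

# Mirrors the shape of the `meta` class attribute (md5 omitted for brevity).
CIFAR10_META = {"filename": "batches.meta", "key": "label_names"}


def load_class_names(folder, meta):
    """Hypothetical helper: return the list stored under meta['key']."""
    path = os.path.join(folder, meta["filename"])
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 read pickles written by Python 2.
        data = pickle.load(f, encoding="latin1")
    return list(data[meta["key"]])


# Stand-in metadata file so the sketch runs without downloading anything.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "batches.meta"), "wb") as f:
    pickle.dump({"label_names": ["airplane", "automobile"]}, f)

names = load_class_names(demo_dir, CIFAR10_META)
class_to_idx = {name: i for i, name in enumerate(names)}
```

For CIFAR100 the same helper would be called with the 'meta' filename and the 'fine_label_names' key listed above.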

dataset.folder module

class dataset.folder.DatasetFolder(root, loader, extensions, transform=None, target_transform=None)

Bases: torch.utils.data.dataset.Dataset

A generic data loader where the samples are arranged in this way:

root/class_x/xxx.ext
root/class_x/xxy.ext
root/class_x/xxz.ext

root/class_y/123.ext
root/class_y/nsdf3.ext
root/class_y/asd932_.ext

Parameters
  • root (string) – Root directory path.

  • loader (callable) – A function to load a sample given its path.

  • extensions (list[string]) – A list of allowed extensions.

  • transform (callable, optional) – A function/transform that takes in a sample and returns a transformed version, e.g. transforms.RandomCrop for images.

  • target_transform – A function/transform that takes in the target and transforms it.
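
The class-per-folder layout above can be turned into (path, class index) pairs with a short directory scan. The sketch below is an illustration only: find_samples is a hypothetical name, and assigning indices by sorted folder name is an assumption rather than documented behavior of DatasetFolder.

```python
import os
import tempfile


def find_samples(root, extensions):
    """Hypothetical helper: return (samples, class_to_idx) for a class-per-folder tree."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        class_dir = os.path.join(root, name)
        for dirpath, _, filenames in sorted(os.walk(class_dir)):
            for fname in sorted(filenames):
                # Keep only files whose lowercase name ends with an allowed extension.
                if fname.lower().endswith(tuple(extensions)):
                    samples.append((os.path.join(dirpath, fname), class_to_idx[name]))
    return samples, class_to_idx


# Build a tiny tree matching the documented layout.
demo_root = tempfile.mkdtemp()
for cls, files in {"class_x": ["xxx.ext", "notes.txt"], "class_y": ["123.ext"]}.items():
    os.makedirs(os.path.join(demo_root, cls))
    for fname in files:
        open(os.path.join(demo_root, cls, fname), "w").close()

samples, class_to_idx = find_samples(demo_root, [".ext"])
```

Each entry in samples pairs a file path with its class index; a loader callable would then be applied to the path when the sample is requested.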

dataset.folder.IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', 'webp']

class dataset.folder.ImageFolder(root, transform=None, target_transform=None, loader=<function default_loader>)

Bases: dataset.folder.DatasetFolder

A generic data loader where the images are arranged in this way:

root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png

Parameters
  • root (string) – Root directory path.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

  • loader (callable, optional) – A function to load an image given its path.

dataset.folder.accimage_loader(path)
dataset.folder.default_loader(path)
dataset.folder.has_file_allowed_extension(filename, extensions)

Checks whether a file has an allowed extension.

Parameters
  • filename (string) – path to a file

  • extensions (iterable of strings) – extensions to consider (lowercase)

Returns

True if the filename ends with one of given extensions

Return type

bool
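
A minimal sketch of such a check is shown below. It assumes the filename is lowercased before comparison, which the "(lowercase)" note on extensions suggests; has_allowed_extension is a hypothetical stand-in, not the package's implementation.

```python
def has_allowed_extension(filename, extensions):
    """Sketch: case-insensitive suffix check against lowercase extensions."""
    # str.endswith accepts a tuple of candidate suffixes.
    return filename.lower().endswith(tuple(extensions))


upper_png = has_allowed_extension("cat/IMG_01.PNG", [".jpg", ".png"])
text_file = has_allowed_extension("cat/notes.txt", [".jpg", ".png"])
```

Because the match is a plain suffix test, an extension listed without its leading dot would also match any filename that merely ends in those letters.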

dataset.folder.is_image_file(filename)

Checks whether a file has an allowed image extension.

Parameters

filename (string) – path to a file

Returns

True if the filename ends with a known image extension

Return type

bool

dataset.folder.make_dataset(dir, class_to_idx, extensions)
dataset.folder.pil_loader(path)

dataset.mnist module

class dataset.mnist.EMNIST(root, split, **kwargs)

Bases: dataset.mnist.MNIST

EMNIST Dataset.

Parameters
  • root (string) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • split (string) – The dataset has 6 different splits: byclass, bymerge, balanced, letters, digits and mnist. This argument specifies which one to use.

  • train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

download()

Download the EMNIST data if it doesn’t exist in processed_folder already.

splits = ('byclass', 'bymerge', 'balanced', 'letters', 'digits', 'mnist')
url = 'https://cloudstor.aarnet.edu.au/plus/index.php/s/54h3OuGJhFLwAlQ/download'

class dataset.mnist.FashionMNIST(root, train=True, transform=None, target_transform=None, download=False)

Bases: dataset.mnist.MNIST

Fashion-MNIST Dataset.

Parameters
  • root (string) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
urls = ['http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz']

class dataset.mnist.KMNIST(root, train=True, transform=None, target_transform=None, download=False)

Bases: dataset.mnist.MNIST

Kuzushiji-MNIST Dataset.

Parameters
  • root (string) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

classes = ['o', 'ki', 'su', 'tsu', 'na', 'ha', 'ma', 'ya', 're', 'wo']
urls = ['http://codh.rois.ac.jp/kmnist/dataset/kmnist/train-images-idx3-ubyte.gz', 'http://codh.rois.ac.jp/kmnist/dataset/kmnist/train-labels-idx1-ubyte.gz', 'http://codh.rois.ac.jp/kmnist/dataset/kmnist/t10k-images-idx3-ubyte.gz', 'http://codh.rois.ac.jp/kmnist/dataset/kmnist/t10k-labels-idx1-ubyte.gz']

class dataset.mnist.MNIST(root, train=True, transform=None, target_transform=None, download=False)

Bases: torch.utils.data.dataset.Dataset

MNIST Dataset.

Parameters
  • root (string) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.

  • download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

property class_to_idx
classes = ['0 - zero', '1 - one', '2 - two', '3 - three', '4 - four', '5 - five', '6 - six', '7 - seven', '8 - eight', '9 - nine']
download()

Download the MNIST data if it doesn’t exist in processed_folder already.

static extract_gzip(gzip_path, remove_finished=False)
property processed_folder
property raw_folder
property test_data
test_file = 'test.pt'
property test_labels
property train_data
property train_labels
training_file = 'training.pt'
urls = ['http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', 'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', 'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', 'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz']

dataset.mnist.get_int(b)
dataset.mnist.read_image_file(path)
dataset.mnist.read_label_file(path)
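
These three helpers are undocumented here. The file names in urls (idx3-ubyte, idx1-ubyte) indicate the IDX format, so the following sketch shows how such buffers could be parsed. It is an assumption about what the helpers do, not their actual code: the function bodies and the parse_idx_* names are hypothetical, and the demo uses tiny synthetic buffers instead of the real, gzip-compressed files.

```python
import struct


def get_int(b):
    """Interpret a bytes object as a big-endian unsigned integer."""
    return int.from_bytes(b, "big")


def parse_idx_labels(raw):
    """Hypothetical parser for an uncompressed IDX label buffer."""
    if get_int(raw[:4]) != 2049:  # magic number of IDX label files
        raise ValueError("not an IDX label buffer")
    count = get_int(raw[4:8])
    return list(raw[8:8 + count])


def parse_idx_images(raw):
    """Hypothetical parser: returns images as nested lists of pixel values."""
    if get_int(raw[:4]) != 2051:  # magic number of IDX image files
        raise ValueError("not an IDX image buffer")
    count, rows, cols = (get_int(raw[i:i + 4]) for i in (4, 8, 12))
    size = rows * cols
    pixels = raw[16:16 + count * size]
    return [
        [list(pixels[n * size + r * cols:n * size + (r + 1) * cols]) for r in range(rows)]
        for n in range(count)
    ]


# Synthetic buffers: three labels, and one 2x2 image.
raw_labels = struct.pack(">II", 2049, 3) + bytes([5, 0, 4])
raw_images = struct.pack(">IIII", 2051, 1, 2, 2) + bytes([0, 255, 128, 64])

labels = parse_idx_labels(raw_labels)
images = parse_idx_images(raw_images)
```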

dataset.utils module

dataset.utils.check_integrity(fpath, md5=None)
dataset.utils.download_url(url, root, filename=None, md5=None)

Download a file from a url and place it in root.

Parameters
  • url (str) – URL to download file from

  • root (str) – Directory to place downloaded file in

  • filename (str) – Name to save the file under. If None, use the basename of the URL

  • md5 (str) – MD5 checksum of the download. If None, do not check
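
The md5 parameter, like the checksums stored in the dataset classes, implies an integrity check on the downloaded file. A minimal sketch of such a check using hashlib follows; file_md5 and check_file are hypothetical names, and treating a missing checksum as "file exists" is an assumption based on the "If None, do not check" wording.

```python
import hashlib
import os
import tempfile


def file_md5(fpath, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(fpath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def check_file(fpath, md5=None):
    """Hypothetical check: file must exist and, if md5 is given, match it."""
    if not os.path.isfile(fpath):
        return False
    if md5 is None:
        return True
    return file_md5(fpath) == md5


# Demo on a small temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "payload.bin")
with open(demo_path, "wb") as f:
    f.write(b"hello")

expected = hashlib.md5(b"hello").hexdigest()
ok = check_file(demo_path, expected)
bad = check_file(demo_path, "0" * 32)
unchecked = check_file(demo_path)
```

Reading in chunks keeps memory use flat for large archives such as the CIFAR tarballs.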

dataset.utils.gen_bar_updater()
dataset.utils.list_dir(root, prefix=False)

List all directories at a given root.

Parameters
  • root (str) – Path to directory whose folders need to be listed

  • prefix (bool, optional) – If true, prepends the path to each result, otherwise only returns the name of the directories found

dataset.utils.list_files(root, suffix, prefix=False)

List all files ending with a suffix at a given root.

Parameters
  • root (str) – Path to the directory whose files need to be listed

  • suffix (str or tuple) – Suffix of the files to match, e.g. ‘.png’ or (‘.jpg’, ‘.png’). The value is passed directly to Python’s str.endswith method

  • prefix (bool, optional) – If true, prepends the path to each result, otherwise only returns the name of the files found
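
The behavior described by these parameters can be sketched with os.listdir and str.endswith. This is an illustration under assumptions, not the package's code: list_files_sketch is a hypothetical name, and sorting the results is an added choice made here so the output is deterministic.

```python
import os
import tempfile


def list_files_sketch(root, suffix, prefix=False):
    """Return names of files directly under root whose names end with suffix."""
    root = os.path.expanduser(root)
    names = sorted(
        f for f in os.listdir(root)
        # suffix may be a str or a tuple of str; str.endswith accepts both.
        if os.path.isfile(os.path.join(root, f)) and f.endswith(suffix)
    )
    if prefix:
        names = [os.path.join(root, f) for f in names]
    return names


demo_root = tempfile.mkdtemp()
for fname in ("a.png", "b.jpg", "c.txt"):
    open(os.path.join(demo_root, fname), "w").close()
os.mkdir(os.path.join(demo_root, "sub.png"))  # a directory, so it is skipped

only_png = list_files_sketch(demo_root, ".png")
images = list_files_sketch(demo_root, (".jpg", ".png"))
with_prefix = list_files_sketch(demo_root, ".txt", prefix=True)
```

Swapping os.path.isfile for os.path.isdir and dropping the suffix test gives the corresponding directory listing.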

dataset.utils.makedir_exist_ok(dirpath)

Python2 support for os.makedirs(.., exist_ok=True)
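
On Python 2, os.makedirs has no exist_ok argument, so the usual workaround is to catch the "already exists" error. The sketch below shows that pattern; it is an assumption about how such a helper is typically written, and makedir_exist_ok_sketch is a hypothetical name.

```python
import errno
import os
import tempfile


def makedir_exist_ok_sketch(dirpath):
    """Create dirpath and any parents; do nothing if it already exists."""
    try:
        os.makedirs(dirpath)
    except OSError as e:
        # Re-raise anything other than "already exists".
        if e.errno != errno.EEXIST:
            raise


target = os.path.join(tempfile.mkdtemp(), "raw", "processed")
makedir_exist_ok_sketch(target)
makedir_exist_ok_sketch(target)  # second call is a no-op
created = os.path.isdir(target)
```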

Module contents