I am having trouble understanding how to apply data augmentation with TensorFlow. I have a dataset of images split into two subsets: training and testing. After I create an ImageDataGenerator with various parameters, do I need to save the images (for example with flow()), or will TensorFlow augment my data while the model is training?

Here is the code I implemented:

# necessary imports

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    brightness_range=(0.3, 1.0),
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='nearest',
    validation_split=0.2
)

training_directory = '/tmp/dataset/training'
testing_directory = '/tmp/dataset/testing'

training_set = train_datagen.flow_from_directory(
    training_directory,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    subset='training'
)

test_set = train_datagen.flow_from_directory(
    testing_directory,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    subset='validation'
)

# creating a sequential model
...
# fitting and data plotting

Model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0
_________________________________________________________________
dropout (Dropout)            (None, 17, 17, 128)       0
_________________________________________________________________
flatten (Flatten)            (None, 36992)             0
_________________________________________________________________
dense (Dense)                (None, 512)               18940416
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513
=================================================================
Total params: 19,034,177
Trainable params: 19,034,177
Non-trainable params: 0
_________________________________________________________________
listout May 12, 2021, 15:03

2 answers

Best answer

You do not need to save the data. The augmented data (train/test) is fed directly into the model during the training or evaluation steps by the train and test data generators.

Here is your code updated with all the steps, using the data generators train_generator and test_generator.

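from tensorflow.keras.preprocessing.image import ImageDataGenerator

# One generator object holds the augmentation settings; nothing is written to disk
datagenerator = ImageDataGenerator(
    rescale=1. / 255,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    brightness_range=(0.3, 1.0),
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='nearest',
    validation_split=0.2
)
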
training_directory = '/tmp/dataset/training'
testing_directory = '/tmp/dataset/testing'

train_generator = datagenerator.flow_from_directory(
    training_directory,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    subset='training'
)

test_generator = datagenerator.flow_from_directory(
    testing_directory,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    subset='validation'
)

# Build and compile the model
....
# Get the number of steps per epoch for each of the data generators
train_steps_per_epoch = train_generator.n // train_generator.batch_size
test_steps_per_epoch = test_generator.n // test_generator.batch_size

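# Fit the model; fit() accepts generators directly (fit_generator is deprecated since TF 2.1)
model.fit(train_generator, steps_per_epoch=train_steps_per_epoch, epochs=your_nepochs)

# Evaluate the model (evaluate_generator is deprecated in the same way)
model.evaluate(test_generator, steps=test_steps_per_epoch)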
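
One detail about the `n // batch_size` step count: it uses floor division, so a final partial batch is skipped in every epoch. A small arithmetic sketch of the difference, assuming a made-up dataset size of 1000 images (that figure is not from the question, and the helper names are only illustrative):

```python
import math


def steps_floor(n_samples, batch_size):
    """Count full batches only; leftover samples are skipped."""
    return n_samples // batch_size


def steps_ceil(n_samples, batch_size):
    """Count the final, smaller batch as well."""
    return math.ceil(n_samples / batch_size)


n_samples, batch_size = 1000, 32  # hypothetical dataset size
print(steps_floor(n_samples, batch_size))  # 31
print(steps_ceil(n_samples, batch_size))   # 32
```

As far as I know, `len(train_generator)` gives the rounded-up count, and `fit()` falls back to it when `steps_per_epoch` is left out, so passing the step counts explicitly should be optional here.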
saloua May 12, 2021, 12:31

You do not need to save the new data.

When you call the flow method, the data is augmented on the fly and served as input to the model.

So the data is generated in real time and fed directly into your model.
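
To make "on the fly" concrete, here is a toy, library-free sketch of the idea. It is not how Keras implements it; `augment` and `flow` are hypothetical names, and the "images" are tiny nested lists. The point is only that each batch is a freshly transformed copy built in memory, while the originals stay untouched and nothing is saved.

```python
import random


def augment(image, rng):
    """Return a randomly flipped copy of a 2-D image (a list of rows)."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]   # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1]                    # vertical flip
    return [list(row) for row in image]        # always hand back a copy


def flow(images, batch_size, rng):
    """Yield augmented batches forever; nothing is written to disk."""
    while True:
        order = list(range(len(images)))
        rng.shuffle(order)                     # reshuffle at every epoch
        for start in range(0, len(order), batch_size):
            batch_idx = order[start:start + batch_size]
            yield [augment(images[i], rng) for i in batch_idx]


images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # two tiny 2x2 "images"
batches = flow(images, batch_size=2, rng=random.Random(0))
first = next(batches)    # epoch 1: one random variant of each image
second = next(batches)   # epoch 2: new random variants, same originals
```

A training loop would simply keep calling `next(batches)`, which is roughly the role the Keras generator plays when it is passed to the model.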

George May 12, 2021, 12:15