A blog about programming, data and the languages of art and technology

Posts about programming

Hyperparameter optimization with Keras generators and Talos

Published:

In my last post about training on multispectral images I was able to improve my classification accuracy on the EuroSat dataset from 75% to 90%, using a very simple model. While 90% is much better, there are likely further improvements possible by refining the model.

When seeking to improve the model there are two overall strategies one might attempt. One is to analyze the problem and try to find a model that nicely adapts to that understanding; this is theoretically nice, but the promise of machine learning is that the computer can do some of this work for us. The other approach is to try different combinations and see what works. Doing this manually is rather tedious, so we can let the computer do it for us. This approach of letting the computer explore different ways of setting up our model is called hyperparameter optimization, where "hyperparameter" refers to things like the number of layers, the choice of activation functions and the number of nodes per layer, to differentiate them from the ordinary parameters, which are the weights of the model. Analysis and hyperparameter optimization can of course be combined; it can be very powerful to alternate them, using analysis and understanding of the problem to create the broad strokes of the model and then hyperparameter optimization to find the exact parameter values within that model.

In this article I will extend my earlier code for classifying the EuroSat dataset and show how you can use hyperparameter optimization to help find a better model. I will use a library called Talos to do this, but it can be done fairly easily without any dependencies as well if there is a reason to.
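To make that last point concrete before bringing in Talos, here is a minimal dependency-free sketch of random search over a parameter space. Everything in it is illustrative: build_and_train is a hypothetical stand-in for the model construction and training code shown later, and the parameter names and values are placeholders.

import random

def build_and_train(choice):
    # hypothetical stand-in: build a model from the chosen values,
    # train it, and return its validation accuracy
    ...

params = {                       # illustrative search space
    'hidden_layers': [1, 2, 3],
    'first_neuron': [32, 64],
    'dropout': [0.05, 0.1],
}

best_acc, best_params = 0.0, None
for _ in range(30):              # like a round_limit: sample 30 random combinations
    choice = {k: random.choice(v) for k, v in params.items()}
    val_acc = build_and_train(choice)
    if val_acc > best_acc:
        best_acc, best_params = val_acc, choice
print(best_acc, best_params)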

This article is intended to demonstrate how to write code for hyperparameter optimization; it is not an attempt to create an optimal solution to the classification of the EuroSat images. For much better results on that task you can read the article EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification by Patrick Helber, Benjamin Bischke, Andreas Dengel and Damian Borth.

To implement hyperparameter optimization we need to define our parameter space and then tell our code to explore it, in our case using the Talos library.

In this example we define the parameters with the following Python code:

params = {
    'epoch': [120],
    'batch_size': [32],
    'activation': [relu],
    # convolution layers in the beginning
    'conv_hidden_layers': [1,2,3],
    'conv_depth_shape': ['brick'],
    'conv_size_shape': ['brick'],
    'conv_depth_first_neuron': [20,40,60],
    'conv_depth_last_neuron': [20,40,60],
    'conv_size_first_neuron': [5,7],
    'conv_size_last_neuron': [3,5],
    # fully connected layers at the end
    'first_neuron': [32,64],
    'last_neuron': [32,64],
    'shapes': ['brick'],
    'hidden_layers': [1,2,3],
    'dropout': [0.05]
}
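Note the size of the space these lists define: the full grid is the product of the list lengths. A quick check (a sketch, using the params dict defined above):

from functools import reduce

n_combinations = reduce(lambda a, b: a * b, [len(v) for v in params.values()])
print(n_combinations)  # 1296, so a round_limit of 30 samples only a small fraction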

And then pass it to Talos like this:

verbose=True
round_limit=30 # NOTE Set this to however many rounds you want to test with
model=create_model(training_set,validation_set,verbose)
dummyX,dummyY=training_set.__getitem__(0)
testX,testY=validation_set.__getitem__(0)
validation_set.on_epoch_end()
tt = talos.Scan(
    x=dummyX,
    y=dummyY,
    params=params,
    model=model,
    x_val=testX,
    y_val=testY,
    experiment_name='example.csv',
    print_params=True,
    round_limit=round_limit
)

Note the call to the create_model function. This is where we have moved our Keras model definition; it is defined like this:

class __Config__(object): pass
config=__Config__()
config.optimizer=Adam
config.optimizer_parameters={'lr':0.0001,'decay':0.001}
config.loss='categorical_crossentropy'
config.metric=['accuracy']

def create_model(training_set,validation_set,verbose=False):
    def _create_conv_shape_(params):
        def shape(params):
            if params['hidden_layers']==1:
                return [params['first_neuron']]
            if params['hidden_layers']==2:
                return [params['first_neuron'],params['last_neuron']]
            else:
                params=params.copy()
                params['hidden_layers']-=2
                s_list=network_shape.network_shape(params,params['last_neuron'])
                return [params['first_neuron'],*s_list,params['last_neuron']]
        conv_depth_params={
            'hidden_layers': params['conv_hidden_layers'],
            'shapes': params['conv_depth_shape'],
            'first_neuron': params['conv_depth_first_neuron'],
            'last_neuron': params['conv_depth_last_neuron'],
        }
        conv_size_params={
            'hidden_layers': params['conv_hidden_layers'],
            'shapes': params['conv_size_shape'],
            'first_neuron': params['conv_size_first_neuron'],
            'last_neuron': params['conv_size_last_neuron'],
        }
        conv_depth_shape = shape(conv_depth_params)
        conv_size_shape = shape(conv_size_params)
        conv_shape=zip(conv_depth_shape,conv_size_shape)
        return conv_shape

    def model(dummyXtrain,dummyYtrain,dummyXval,dummyYval,params):
        conv_shape=_create_conv_shape_(params)
        model = Sequential()
        for i,(depth,size) in enumerate(conv_shape):
            if i==0:
                model.add(Conv2D(depth, size, input_shape=training_set.shape))
            else:
                model.add(Conv2D(depth, size))
            model.add(Activation('relu'))
        model.add(Flatten())
        hidden_layers(model, params, params['last_neuron'])
        model.add(Dense(training_set.num_classes))
        model.add(Activation('softmax'))

        global config
        optimizer=config.optimizer(**config.optimizer_parameters)
        model.compile(loss=config.loss, optimizer=optimizer, metrics=config.metric)

        training_set.batch_size=params['batch_size']
        validation_set.batch_size=params['batch_size']
        history = model.fit_generator(
            training_set,
            validation_data=validation_set,
            epochs=params['epoch'],
            verbose=int(params['verbose']),
        )
        return history,model
    return model

While this is a bit more complex than our model definition was previously, note that half the code exists just to let us vary the number of convolutional layers, something that Talos at the time of writing does not automate.

As this runs the training over and over again, the code takes quite a while to complete (for me it ran overnight), so I think it is prudent to save the results of the run to a file rather than printing them directly. Doing this allows us to analyze the output in new ways without having to rerun the code. To do this I have defined some helper functions that select the parts of the Talos output (stored in the tt object) to keep and write them to disk. The call looks like this:

t = project_object(tt,'params','saved_models','saved_weights','data','details','round_history')
save_object(t,'example.pickle')

And the helper functions are implemented like this:

def save_object(obj, filename):
    with open(filename, 'wb') as output:
        pickle.dump(obj, output, protocol=2)

def project_object(obj,*attributes):
    out={}
    for a in attributes:
        out[a]=getattr(obj,a)
    return out

To analyze the output I made another script, called example_inspect.py, with the following code:

#!/usr/bin/env python3
import pickle
import pandas as pd
from prettytable import PrettyTable, PLAIN_COLUMNS

def save_object(obj, filename):
    with open(filename, 'wb') as output:
        pickle.dump(obj, output, protocol=2)

def load_object(filename):
    with open(filename, 'rb') as f:
        return pickle.load(f)

def print_hyperparameter_search_stats(t):
    print(" *** params: ",{ p:(v if len(v)<200 else [v[0],v[1],v[2],'...',v[-1]]) for p,v in t['params'].items()})
    print()
    print(" *** data ",type(t['data']),len(t['data']))
    print(t['data'].sort_values('val_acc',ascending=False).to_string())
    print()
    distinct_data=t['data']
    nunique = distinct_data.apply(pd.Series.nunique)
    cols_to_drop = nunique[nunique == 1].index
    distinct_data = distinct_data.drop(cols_to_drop, axis=1)
    print(nunique,cols_to_drop)
    print(" *** distinct data ",type(distinct_data),len(distinct_data))
    print(distinct_data.sort_values('val_acc',ascending=False).to_string())
    print()
    print(" *** details ",type(t['details']),len(t['details']))
    print(t['details'])
    print()

tt = load_object('example.pickle')
print(tt['details'])
for ttt in tt['round_history']:
    table = PrettyTable()
    table.set_style(PLAIN_COLUMNS)
    iterations=max([len(x) for x in ttt.values()])
    table.add_column('epoch',range(1,iterations+1))
    for key,val in sorted(ttt.items()):
        table.add_column(key, sorted(val))
    print(table)
print_hyperparameter_search_stats(tt)

Running this gets us (among other things) the following table (sorted on val_acc):

val_loss val_acc loss acc conv_depth_first_neuron conv_depth_last_neuron conv_hidden_layers conv_size_first_neuron conv_size_last_neuron first_neuron hidden_layers last_neuron
0.357977 0.910370 0.144818 0.95325 60 60 3 7 5 32 1 64
0.304265 0.902963 0.237622 0.92300 60 60 3 5 3 64 3 64
0.317458 0.901296 0.206647 0.93250 60 40 3 5 5 64 2 64
0.407395 0.900926 0.016828 0.99700 60 60 2 5 5 64 1 64
0.359424 0.898519 0.211113 0.93300 40 60 3 5 5 32 2 32
0.374605 0.888889 0.197840 0.93875 60 60 2 7 5 32 2 64
0.400512 0.879444 0.160062 0.94700 20 20 3 7 5 64 1 64
0.430621 0.876111 0.036848 0.99050 40 40 1 5 5 64 1 64
0.442248 0.873889 0.110187 0.96850 60 40 1 7 5 64 3 64
0.458507 0.872037 0.115722 0.96400 60 60 1 7 3 64 3 64
0.458475 0.868704 0.076465 0.97725 60 20 1 7 3 32 1 64
0.513024 0.867963 0.060912 0.98375 40 40 1 7 3 64 2 32
0.425103 0.864630 0.221249 0.92350 20 40 2 7 3 64 3 64
0.451033 0.863889 0.179650 0.94750 20 20 2 7 3 64 1 64
0.469304 0.862222 0.206574 0.93975 60 20 1 7 3 32 3 32
0.453625 0.861852 0.088379 0.97775 40 20 2 7 3 64 1 32
0.444886 0.861481 0.128833 0.96150 60 40 1 7 3 32 1 64
0.507459 0.858148 0.315459 0.90150 40 60 2 5 3 64 2 32
0.486624 0.852963 0.354109 0.89050 20 20 2 5 5 64 1 64
0.485049 0.851111 0.145122 0.96050 20 20 1 5 5 64 1 64
0.559950 0.849815 0.027219 0.99625 20 40 1 7 5 64 1 32
0.494588 0.847963 0.150461 0.95500 20 60 1 5 3 64 1 32
0.548905 0.824444 0.496872 0.83700 20 40 1 5 5 32 2 32
0.613613 0.807037 0.564718 0.80950 60 60 1 7 3 32 2 32
0.626898 0.798519 0.563555 0.81325 60 40 1 7 5 32 2 64
1.310021 0.495741 1.370861 0.46675 40 20 1 7 5 32 3 32
2.302581 0.114630 2.302587 0.09625 40 60 1 7 5 32 2 64
2.302576 0.114630 2.302588 0.10000 60 60 1 5 5 32 3 32
14.327196 0.111111 14.506286 0.10000 60 20 1 7 5 64 1 32
2.302568 0.110556 2.302587 0.09900 20 40 3 5 3 32 3 64

As can be seen from the table, the best result of this run is 91% validation accuracy, only a small improvement over the 90% we got earlier. This could be because I was lucky with my initial guess of parameters, or because we are varying the wrong parameters of the model in this experiment. As this article is meant to demonstrate the technique I will not delve deeper into that here. I nevertheless hope this gives you an overview of how hyperparameter optimization can be realized.

Our new source now looks like this (example.py) and the inspect code like this (example_inspect.py). The generators code is unchanged.

Feel free to use this code for any and all purposes; consider it in the public domain, or if that is not workable for you, you can use it under the terms of the MIT License.

Training on multispectral images using Keras

Published:

Edited:

In my recent post about using Keras generators I was able to achieve 75% classification accuracy on the EuroSat dataset using a very simple model. While there is a lot that could be done to improve the model, there is a simple change that can be made without the analysis work an improved model would need.

In my generators post I elected to use the JPEG variant of the dataset, so as not to introduce too many new concepts into that post. Alternatively, we can use the multispectral TIFF images from the dataset, thus gaining access to much more information for the machine learning to base its conclusions on.

This turned out to be relatively simple to do, which surprised me as very little information on it was available online; I mostly found blog posts of people asking how to get it working.

Starting with the code in my post on generators (generators.py, example.py), we can simply replace the read_image function and the code will be able to process multispectral images. Code below:

import numpy as np
import rasterio

read_image_cache={}
def read_image(path, rescale=None):
    key="{},{}".format(path,rescale)
    if key in read_image_cache:
        return read_image_cache[key]
    else:
        with rasterio.open(path) as img:
            data=img.read()
        data=np.moveaxis(data,0,-1)
        if rescale!=None:
            data=data*rescale
        read_image_cache[key]=data
        return data

This code stops using the Keras load_img function and instead uses the Rasterio library to read images directly into numpy arrays. The function returns a 3D array with a depth equal to the number of bands in the image.
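To see what that means in practice, here is a quick check; the path is a hypothetical example from the TIFF variant of the dataset, and the expected shape assumes the 13-band Sentinel-2 tiles EuroSat is built from:

# hypothetical path into the multispectral (TIFF) EuroSat data
data = read_image('data/EuroSat/tif/Forest/Forest_1.tif')
print(data.shape)  # expected (64, 64, 13): 64x64 pixels, 13 Sentinel-2 bands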

Making that change and running the same test as in the generators post, we get the following results:

Epoch 1/120 125/125 [==========] - 49s 394ms/step - loss: 2.7323 - acc: 0.2477 - val_loss: 1.7014 - val_acc: 0.3857
Epoch 2/120 125/125 [==========] - 49s 392ms/step - loss: 1.3841 - acc: 0.4800 - val_loss: 1.2359 - val_acc: 0.5559
Epoch 3/120 125/125 [==========] - 49s 393ms/step - loss: 1.0834 - acc: 0.5998 - val_loss: 1.1012 - val_acc: 0.5928
Epoch 4/120 125/125 [==========] - 49s 392ms/step - loss: 0.8800 - acc: 0.6778 - val_loss: 0.8057 - val_acc: 0.7107
Epoch 5/120 125/125 [==========] - 49s 393ms/step - loss: 0.7929 - acc: 0.7115 - val_loss: 0.7359 - val_acc: 0.7394
Epoch 6/120 125/125 [==========] - 49s 392ms/step - loss: 0.7211 - acc: 0.7380 - val_loss: 0.7304 - val_acc: 0.7544
Epoch 7/120 125/125 [==========] - 49s 393ms/step - loss: 0.6667 - acc: 0.7578 - val_loss: 0.7604 - val_acc: 0.7031
Epoch 8/120 125/125 [==========] - 49s 393ms/step - loss: 0.6208 - acc: 0.7830 - val_loss: 0.6004 - val_acc: 0.7833
Epoch 9/120 125/125 [==========] - 49s 392ms/step - loss: 0.6095 - acc: 0.7867 - val_loss: 0.6019 - val_acc: 0.7885
Epoch 10/120 125/125 [==========] - 49s 393ms/step - loss: 0.5913 - acc: 0.7905 - val_loss: 0.5670 - val_acc: 0.7961
...
Epoch 90/120 125/125 [==========] - 48s 384ms/step - loss: 0.2038 - acc: 0.9375 - val_loss: 0.3243 - val_acc: 0.8854
Epoch 91/120 125/125 [==========] - 48s 382ms/step - loss: 0.2064 - acc: 0.9315 - val_loss: 0.3140 - val_acc: 0.8943
Epoch 92/120 125/125 [==========] - 48s 384ms/step - loss: 0.2059 - acc: 0.9325 - val_loss: 0.3232 - val_acc: 0.8870
Epoch 93/120 125/125 [==========] - 48s 382ms/step - loss: 0.1994 - acc: 0.9345 - val_loss: 0.3165 - val_acc: 0.8900
Epoch 94/120 125/125 [==========] - 48s 382ms/step - loss: 0.2030 - acc: 0.9375 - val_loss: 0.3013 - val_acc: 0.8970
Epoch 95/120 125/125 [==========] - 48s 381ms/step - loss: 0.1952 - acc: 0.9400 - val_loss: 0.3164 - val_acc: 0.8917
Epoch 96/120 125/125 [==========] - 48s 381ms/step - loss: 0.1961 - acc: 0.9380 - val_loss: 0.3295 - val_acc: 0.8878
Epoch 97/120 125/125 [==========] - 48s 381ms/step - loss: 0.2003 - acc: 0.9387 - val_loss: 0.3145 - val_acc: 0.8920
Epoch 98/120 125/125 [==========] - 48s 381ms/step - loss: 0.1886 - acc: 0.9400 - val_loss: 0.3096 - val_acc: 0.8926
Epoch 99/120 125/125 [==========] - 48s 381ms/step - loss: 0.1983 - acc: 0.9323 - val_loss: 0.3287 - val_acc: 0.8907
Epoch 100/120 125/125 [==========] - 48s 380ms/step - loss: 0.1923 - acc: 0.9338 - val_loss: 0.3190 - val_acc: 0.8887
Epoch 101/120 125/125 [==========] - 48s 382ms/step - loss: 0.1927 - acc: 0.9313 - val_loss: 0.3107 - val_acc: 0.8957
Epoch 102/120 125/125 [==========] - 47s 376ms/step - loss: 0.1788 - acc: 0.9375 - val_loss: 0.3131 - val_acc: 0.8941
Epoch 103/120 125/125 [==========] - 47s 377ms/step - loss: 0.1932 - acc: 0.9370 - val_loss: 0.3008 - val_acc: 0.8978
Epoch 104/120 125/125 [==========] - 48s 380ms/step - loss: 0.1894 - acc: 0.9405 - val_loss: 0.3049 - val_acc: 0.9019
Epoch 105/120 125/125 [==========] - 47s 377ms/step - loss: 0.1821 - acc: 0.9420 - val_loss: 0.3138 - val_acc: 0.8915
Epoch 106/120 125/125 [==========] - 47s 379ms/step - loss: 0.1811 - acc: 0.9400 - val_loss: 0.3159 - val_acc: 0.8924
Epoch 107/120 125/125 [==========] - 47s 375ms/step - loss: 0.1797 - acc: 0.9400 - val_loss: 0.3079 - val_acc: 0.8972
Epoch 108/120 125/125 [==========] - 47s 378ms/step - loss: 0.1826 - acc: 0.9382 - val_loss: 0.3215 - val_acc: 0.8935
Epoch 109/120 125/125 [==========] - 47s 378ms/step - loss: 0.1798 - acc: 0.9393 - val_loss: 0.3031 - val_acc: 0.8972
Epoch 110/120 125/125 [==========] - 47s 376ms/step - loss: 0.1763 - acc: 0.9455 - val_loss: 0.3588 - val_acc: 0.8776
Epoch 111/120 125/125 [==========] - 47s 379ms/step - loss: 0.1723 - acc: 0.9445 - val_loss: 0.3039 - val_acc: 0.8965
Epoch 112/120 125/125 [==========] - 47s 376ms/step - loss: 0.1822 - acc: 0.9407 - val_loss: 0.3099 - val_acc: 0.8978
Epoch 113/120 125/125 [==========] - 47s 378ms/step - loss: 0.1831 - acc: 0.9412 - val_loss: 0.3140 - val_acc: 0.8917
Epoch 114/120 125/125 [==========] - 47s 376ms/step - loss: 0.1674 - acc: 0.9455 - val_loss: 0.3166 - val_acc: 0.8898
Epoch 115/120 125/125 [==========] - 48s 381ms/step - loss: 0.1734 - acc: 0.9475 - val_loss: 0.3126 - val_acc: 0.8965
Epoch 116/120 125/125 [==========] - 47s 377ms/step - loss: 0.1677 - acc: 0.9430 - val_loss: 0.3025 - val_acc: 0.8954
Epoch 117/120 125/125 [==========] - 47s 377ms/step - loss: 0.1788 - acc: 0.9463 - val_loss: 0.3092 - val_acc: 0.8920
Epoch 118/120 125/125 [==========] - 47s 377ms/step - loss: 0.1622 - acc: 0.9472 - val_loss: 0.2990 - val_acc: 0.9004
Epoch 119/120 125/125 [==========] - 47s 376ms/step - loss: 0.1629 - acc: 0.9465 - val_loss: 0.3225 - val_acc: 0.8900
Epoch 120/120 125/125 [==========] - 47s 378ms/step - loss: 0.1800 - acc: 0.9397 - val_loss: 0.3025 - val_acc: 0.8981

As you can see from the program output we are getting a much better result of approximately 90%. We can also see that from about epoch 100 we are mostly oscillating around this value, which tells us that we are likely at the limit of how good our simple model can become, necessitating a more thought-out one for better results (potentially more and/or better training data might also be needed). We could keep running for more epochs but that would most likely lead to overtraining.
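One common way to guard against that, not used in the code of this post, is Keras' EarlyStopping callback, which stops training once the validation loss stops improving. A minimal sketch, assuming the training_set and validation_set generators from this post:

from keras.callbacks import EarlyStopping

# stop when val_loss has not improved for 10 epochs, keeping the best weights
early = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history = model.fit_generator(
    training_set,
    validation_data=validation_set,
    epochs=120,
    callbacks=[early],
)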

Our new source now looks like this (example.py). The generators code is unchanged.

Feel free to use this code for any and all purposes; consider it in the public domain, or if that is not workable for you, you can use it under the terms of the MIT License.

Utilizing generators to use Keras training with existing file structure

Published:

I recently wanted to use Keras, a deep learning framework, to solve an image classification problem and ran into an issue. Keras' built-in image loading functions assume that my training data is organized in a single folder with a subfolder for each class of images. This is then replicated for the validation data unless Keras' automatic validation split is used. In my case the data were spread out over several folders (an artifact of how the data was sourced) and it would have been impractical to copy the data, which was already taking up a significant part of the total disk space on the development system.

The solution to this is to use Keras generators. There are two kinds of generators in Keras: either a simple Python generator using yield, or a class inheriting from keras.utils.Sequence. The latter is the more flexible one and is what this post focuses on.
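For contrast, here is a minimal sketch of the first kind, a plain Python generator over in-memory arrays; X, y and batch_size are illustrative names, not part of the code in this post:

def simple_batches(X, y, batch_size=32):
    # Keras expects such a generator to loop forever; it draws
    # steps_per_epoch batches from it each epoch
    while True:
        for i in range(0, len(X), batch_size):
            yield X[i:i+batch_size], y[i:i+batch_size]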

My initial attempt did work but was rather messy to use, and when I needed to extend it to handle splitting the data into three parts (test, validation and training), doing that in the original design would have been very messy. So I took a step back and figured out that I wanted the following operations:

  • create empty generator
  • add a directory with files to the generator
    this could be extended to add data from other sources or directory structures
  • shuffle the data
  • split the generator into new generators using a list of split-points (real numbers between 0 and 1)
  • a way to get the class names of the generator
  • a way to get the filename of images yielded by the generator

Of these, the key operations are the splitting and the mapping of generated images to filenames. The splitting is important as it lets us control how many sets we split our data into and how large they are, allowing for training, validation and test sets or more. The mapping of images back to filenames is important as it allows us to use the generators for prediction, as well as to generate lists of images that the network gets wrong, for manual analysis of the network's behaviour.
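As an example of that last point, here is a sketch of how such an error list could be produced using the generator API described below (get_filenames and __getitem__); it assumes a trained model and one-hot encoded labels:

X, y = test_set[0]                      # first batch of the current epoch
pred = model.predict(X).argmax(axis=1)  # predicted class numbers
true = y.argmax(axis=1)                 # actual class numbers
wrong = [i for i in range(len(pred)) if pred[i] != true[i]]
print(test_set.get_filenames(wrong))    # filenames of the misclassified images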

In addition to these, some further operations were included later as their need became apparent:

  • A function to set constructor properties after the fact, such as verbosity
  • A function to preload the images into a cache
  • Controls for the batch size used
  • Controls for restricting the maximum number of images per class each epoch

While not central to the functioning of the generator, these proved necessary in practical application.


To create a generator based on keras.utils.Sequence we are required to provide a few methods:

class SplitSetImageGenerator(keras.utils.Sequence):
    def __getitem__(self,index):
        # gets the batch for the supplied index
        # return a tuple (numpy array of images, numpy array of labels) or None at epoch end
    def __len__(self):
        # gets the number of batches
        # return the number of batches in this epoch (do not change in the middle of an epoch)
    def on_epoch_end(self):
        # performs auto shuffle if enabled
        # do what we need to do between epochs

Adding our methods we arrive at:

class SplitSetImageGenerator(keras.utils.Sequence):
    def __init__(self):
        # do initialization
    def set(self,**attributes):
        # set some config property, eg batch_size, verbose or max_per_class_and_epoch
    def add_dir(self,image_dir_reader,*paths):
        # add the directories in paths to this generator as image sources
        # image_dir_reader should be a function returning a tuple of lists:
        #   names - filenames of images
        #   classes - class of each image as a number
        #   classnames - names of all the classes in the directory
        #   classindices - companion list to classnames mapping each name to its number
    def shuffle(self):
        # shuffle the contents without losing filename associations
    def preload(self):
        # load all images, which will cache them if caching is configured
    def split(self,*splitpoints):
        # splits the generator at the provided fractions of all images; duplicate fractions
        # generate empty child generators, and non-increasing fractions are disallowed
    def get_filenames(self,indices):
        # returns the filenames of the images corresponding to the indices in the current epoch
    def __getitem__(self,index):
        # gets the batch for the supplied index
        # return a tuple (numpy array of images, numpy array of labels) or None at epoch end
    def __len__(self):
        # gets the number of batches
        # return the number of batches in this epoch (do not change in the middle of an epoch)
    def on_epoch_end(self):
        # performs auto shuffle if enabled
        # do what we need to do between epochs

With these methods in place we can start to write useful code. If we adopt the convention that all methods except split, get_filenames and the methods from keras.utils.Sequence return self, we can now do:

training,validation=SplitSetImageGenerator().add_dir(*paths).shuffle().preload().split(0.8)
model.fit_generator(training,validation_data=validation,epochs=10)

Once we have this in place we will not add any more external methods. We will, however, define some useful properties on the generator that a user can access. The primary ones are:

  • filenames - a list of all filenames known to the generator
  • classes - a corresponding list of class numbers for each filename
  • classnames - a list where class names can be looked up from class numbers

These are the ones most useful to access. Some further properties, mostly for configuring the behaviour of the generator (using __init__ or the set method), are:

  • batch_size - the number of images returned on each call of __getitem__
  • verbose - to spam or not to spam stdout
  • max_per_class_and_epoch - a limit on how many images of each class to return
  • auto_shuffle - if the generator should be shuffled between epochs
  • scale - a number to scale all pixel values in an image with
  • image_load_function - a function that can load an image into a numpy array
  • image_cache - a cache object that can be passed to the image load function

I think most of these are rather obvious; the one I want to comment on is max_per_class_and_epoch. I added it after running into problems with training: it turned out that I had many more examples of one of my classes, so the training got stuck in a local maximum where it always predicted that class. This option solved that by ensuring that in each epoch the generator produces the same number of images of each class, as long as its value is set lower than the number of images in the smallest class in the training set.
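For instance, balancing the classes could look like the following (the cap of 1000 is an illustrative value, not one from the original runs):

# cap each class at 1000 images per epoch and reshuffle between epochs
training_set.set(max_per_class_and_epoch=1000, auto_shuffle=True)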

I will not go through the implementation in detail, if you are interested you can look at the source yourself. I will however show some examples of how to use the code.


To use the generator some steps are required and others are recommended. The following example shows how to read images from a folder in the same manner as Keras' built-in image data generator and then split that dataset in a consistent way. I will be using the EuroSAT dataset available at https://github.com/phelber/eurosat in this example.

# build the data generators
test_validation_train_split=[0.2,0.4]
test_set,validation_set,training_set=[dataset.set(verbose=False)
    for dataset in SplitSetImageGenerator(image_load_function=read_image,scale=1.0/255)
        .add_dir(image_data_generator_dir_reader,'data/EuroSat/jpg/')
        .shuffle()
        .split(*test_validation_train_split)]

# preload images to speed up training
for s in [validation_set,training_set]:
    s.set(verbose=True).preload().set(verbose=False).shuffle()

As can be seen from the code, we start by creating the image generator, passing it an image load function (to be defined later) and a scale factor (here used to scale pixels into the range 0-1). We then add a directory of data to the generator by passing a reader function (also to be defined) as well as a path to a directory of images. At this point we have a generator capable of being used in training.

In the next step we shuffle the generator to avoid the risk of all the images of some class ending up in the same part of the data when we split into test, training and validation sets. We follow the shuffle by splitting the data, placing the range 0%-20% in the first set, 20%-40% in the next and 40%-100% in the last. We then disable verbosity for all sets and store them as test, validation and training.

The final step is preloading the images in the validation and training set to avoid slowdowns caused by disk access during training.

To make this work we need to define the functions for reading a directory and for reading the individual image files. We do that using the following code.

read_image_cache={}
def read_image(path, rescale=None):
    key="{},{}".format(path,rescale)
    if key in read_image_cache:
        return read_image_cache[key]
    else:
        img=image.load_img(path)
        data=image.img_to_array(img)
        if rescale!=None:
            data=data*rescale
        read_image_cache[key]=data
        return data

# function to return filenames and classes of images
# also returns a list of class names and a list of class indices corresponding to the class names
def image_data_generator_dir_reader(path):
    sys.stdout=sys.stderr # redirect problematic output
    # here we use the keras ImageDataGenerator to get a list of filenames and classes
    ig = image.ImageDataGenerator()
    gen = ig.flow_from_directory(path)
    sys.stdout=sys.__stdout__ # restore stdout
    names=[os.path.normpath(path+'/'+n.replace('\\','/')).replace('\\','/') for n in gen.filenames]
    return (names,gen.classes,*zip(*gen.class_indices.items()))

The first of these functions reads a single image using Keras' load_img function, applies any supplied rescaling and caches the result.

The second function uses the Keras ImageDataGenerator to get filenames and their classes from a directory. If the data is stored in some organisation other than the one handled by Keras' ImageDataGenerator, we only need to supply a function of this type that can read that format to add_dir; the rest of the code keeps working unchanged and no data needs to be reorganized on disk. Also, as was the original motivation, we are not restricted to one call to add_dir but can add many directories if we have several datasets we want to combine. A sketch of a custom reader follows.
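As an illustration, here is a hypothetical reader for a flat directory where the class name is encoded in each filename before an underscore; both the layout and the function are assumptions, but it follows the (names, classes, classnames, classindices) contract described above:

import os

def flat_dir_reader(path):
    # hypothetical layout: path/<classname>_<number>.jpg
    names = [os.path.join(path, f) for f in sorted(os.listdir(path))]
    classnames = sorted({os.path.basename(n).split('_')[0] for n in names})
    index = {c: i for i, c in enumerate(classnames)}
    classes = [index[os.path.basename(n).split('_')[0]] for n in names]
    return (names, classes, classnames, list(range(len(classnames))))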

Having read the data we can then define a simple model and train a network using the code below (full source: example.py).

################### MODEL DEFINITION ###################
# this is not an optimized model, just a simple example
# for good results this model needs some thought
model = Sequential()
model.add(Conv2D(60, 5, input_shape=training_set.shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(20, 5, input_shape=training_set.shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(120))
model.add(Dense(60))
model.add(Dense(training_set.num_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.0001,decay=0.001),
              metrics=['accuracy'])

################### TRAINING ###################
history = model.fit_generator(
    training_set,
    validation_data=validation_set,
    epochs=30
)

Running this produces output as follows, and as we can see, even with this basic model we reach validation accuracies around 75%.

Epoch 1/120 125/125 [===========] - 39s 311ms/step - loss: 2.3054 - acc: 0.1133 - val_loss: 2.2371 - val_acc: 0.1220
Epoch 2/120 125/125 [===========] - 39s 312ms/step - loss: 2.0607 - acc: 0.1675 - val_loss: 1.8251 - val_acc: 0.2598
Epoch 3/120 125/125 [===========] - 39s 311ms/step - loss: 1.6701 - acc: 0.3575 - val_loss: 1.5512 - val_acc: 0.4070
Epoch 4/120 125/125 [===========] - 39s 310ms/step - loss: 1.4910 - acc: 0.4300 - val_loss: 1.4137 - val_acc: 0.4667
Epoch 5/120 125/125 [===========] - 39s 312ms/step - loss: 1.4091 - acc: 0.4610 - val_loss: 1.3520 - val_acc: 0.5046
Epoch 6/120 125/125 [===========] - 40s 318ms/step - loss: 1.3175 - acc: 0.5067 - val_loss: 1.3070 - val_acc: 0.4922
Epoch 7/120 125/125 [===========] - 39s 314ms/step - loss: 1.3077 - acc: 0.5090 - val_loss: 1.3010 - val_acc: 0.4761
Epoch 8/120 125/125 [===========] - 39s 315ms/step - loss: 1.2480 - acc: 0.5327 - val_loss: 1.2288 - val_acc: 0.5443
Epoch 9/120 125/125 [===========] - 39s 311ms/step - loss: 1.2073 - acc: 0.5588 - val_loss: 1.2555 - val_acc: 0.5157
Epoch 10/120 125/125 [===========] - 39s 310ms/step - loss: 1.2273 - acc: 0.5618 - val_loss: 1.1627 - val_acc: 0.5794
...
Epoch 110/120 125/125 [===========] - 31s 249ms/step - loss: 0.7036 - acc: 0.7505 - val_loss: 0.7085 - val_acc: 0.7424
Epoch 111/120 125/125 [===========] - 31s 248ms/step - loss: 0.7144 - acc: 0.7535 - val_loss: 0.7177 - val_acc: 0.7413
Epoch 112/120 125/125 [===========] - 31s 249ms/step - loss: 0.7088 - acc: 0.7630 - val_loss: 0.7053 - val_acc: 0.7535
Epoch 113/120 125/125 [===========] - 31s 249ms/step - loss: 0.6910 - acc: 0.7620 - val_loss: 0.6994 - val_acc: 0.7513
Epoch 114/120 125/125 [===========] - 31s 249ms/step - loss: 0.7053 - acc: 0.7518 - val_loss: 0.6969 - val_acc: 0.7531
Epoch 115/120 125/125 [===========] - 31s 249ms/step - loss: 0.6863 - acc: 0.7655 - val_loss: 0.6980 - val_acc: 0.7544
Epoch 116/120 125/125 [===========] - 31s 248ms/step - loss: 0.6859 - acc: 0.7600 - val_loss: 0.7182 - val_acc: 0.7433
Epoch 117/120 125/125 [===========] - 31s 249ms/step - loss: 0.7222 - acc: 0.7460 - val_loss: 0.6948 - val_acc: 0.7528
Epoch 118/120 125/125 [===========] - 31s 248ms/step - loss: 0.7032 - acc: 0.7602 - val_loss: 0.7140 - val_acc: 0.7444
Epoch 119/120 125/125 [===========] - 31s 248ms/step - loss: 0.6917 - acc: 0.7615 - val_loss: 0.6946 - val_acc: 0.7496
Epoch 120/120 125/125 [===========] - 31s 247ms/step - loss: 0.6862 - acc: 0.7562 - val_loss: 0.6945 - val_acc: 0.7502

That's all for this post. I hope to write more about machine learning in the future; if I do, you should be able to find those posts using the tags on this one.

All source code for this post

generator: generators.py
example: example.py

Feel free to use this code for any and all purposes; consider it in the public domain, or if that is not workable for you, you can use it under the terms of the MIT License.

Predicate combinators in input validation

Published:

I have spent some time coding in Python and ran across the problem of parsing command line parameters and validating them. The Python argparse library proved to be great at parsing but at first glance did not provide obvious means for validating the parameters.

It turns out however that there is a feature in the argparse library we can exploit to easily add that.

Consider the following example from the argparse documentation:

import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

args = parser.parse_args()
print(args.accumulate(args.integers))

Consider especially the parameter type=int to the first argument.

On the surface, what this does is pass a function that converts the argument to the desired type, which was not what I was trying to do. But since this is an arbitrary function call, it is a great place for injecting my own validation code.

The naive approach would be to write a validation function for each parameter, pass it to the type argument, and do the checks there (a sketch of this follows). This is quite good, but it can be improved.
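A minimal sketch of such a per-parameter validator; positive_int is a hypothetical example, and argparse.ArgumentTypeError is the exception type argparse reports cleanly to the user:

def positive_int(value):
    # convert first, then validate
    n = int(value)
    if n <= 0:
        raise argparse.ArgumentTypeError("expected a positive integer")
    return n

parser.add_argument('count', type=positive_int, help='a positive integer')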

Consider this function

def create_checked(predicate,error="ERROR: value does not meet constraints"):
    def fun(value):
        if predicate(value):
            return value
        raise Exception(error)
    return fun

which is then used like this

parser.add_argument('integers', metavar='N', type=create_checked(int), nargs='+',
                    help='an integer for the accumulator')

In this example the situation is actually not improving: we are checking whether the argument can be converted to an int, and if a falsy value is returned we are throwing an exception. This is a bit redundant, as the int function already throws.

But lets consider another example

parser.add_argument('config', metavar='FILENAME', type=create_checked(os.path.isfile),
                    help='a config file')

This is more useful: now we can use standard functions to check whether input files exist, or any other existing function that returns a boolean based on a string such as a pathname (os.path.isdir comes to mind).

We are still not done though.

What if we want to check more than one thing about a parameter, or check that the file does not exist?

Enter the following functions

def And(*predicates):
    def inner(obj):
        for p in predicates:
            if not p(obj):
                return False
        return True
    return inner

def Or(*predicates):
    def inner(obj):
        for p in predicates:
            if p(obj):
                return True
        return False
    return inner

def Not(predicate):
    def inner(obj):
        return not predicate(obj)
    return inner

These functions allow us to combine several checks on the same parameter, or to negate the value of a checking function.

parser.add_argument('output', metavar='FILENAME', type=create_checked(Not(os.path.isfile)),
                    help='an output file')
parser.add_argument('input-zip', metavar='FILENAME', type=create_checked(And(os.path.isfile,valid_zip)),
                    help='an input zip file')

The code above first checks that the output file does not exist, and then, for the input parameter, checks that the file exists and is a valid zip file (assuming the valid_zip function exists); one possible definition is sketched below.
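The post leaves valid_zip undefined; a minimal sketch using the standard library could be:

import zipfile

def valid_zip(path):
    # True if the path points at a readable zip archive
    return zipfile.is_zipfile(path)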

That's all for this post, I hope you find it useful.

Source code for this: create_checked.py

Feel free to use this code for any and all purposes; consider it in the public domain, or if that is not workable for you, you can use it under the terms of the MIT License.

Using decorators to emulate ad-hoc inheritance in Java

Published:

Consider Java code based on the rather common pattern of factories producing instances of some interface you want to use. In the simple case this works fine, but what do you do when you want to modify a method in the returned instance? If there were no factory involved you could just inherit from the class anonymously and override that method, but given that the object is created via a factory, that option is closed to us.

Luckily there is a solution for us in the Java API: the java.lang.reflect.Proxy class's newProxyInstance method. This however has a rather clunky interface where we are expected to handle Method objects and InvocationHandler instances.

We can do better than that.

After a bit of experimenting I came up with the following API

public class Decorator {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public static @interface Override {}

    public static <I, T extends I, D> I decorate(T proxyBase, Class<I> asInterface, D decorations)
            throws NoSuchMethodException {
        // ... Implementation
    }
}

Which is used like this

MyInterface m=Decorator.decorate(MyInterfaceFactory.create(), MyInterface.class, new Object(){
    @Decorator.Override
    void close(MyInterface i){
        System.err.println("Closing down");
        i.close();
    }
});
m.close(); // Will print "Closing down"

source code: Decorator.java
test case: Test.java

Enjoy

Feel free to use this code for any and all purposes; consider it in the public domain, or if that is not workable for you, you can use it under the terms of the MIT License.